Python Performance for Data Science
Episode Deep Dive
Guest Introduction and Background
Stan Seibert is a manager at Anaconda, where he oversees open-source development teams working on projects such as Numba, Jupyter, and other tools that push the boundaries of Python’s performance. With a deep background in physics and scientific computing, Stan has spent nearly a decade focusing on accelerating Python for data scientists and developers. His work at Anaconda includes guiding the Numba team to help users speed up core numerical algorithms in Python while maintaining a highly productive, Pythonic workflow.
What to Know If You're New to Python
If you're just getting started with Python but want to follow the discussion about data science and performance, here are some targeted resources:
- Data Science Jumpstart with 10 Projects: Offers hands-on projects to learn Python for data science essentials and helps you handle real data workflows.
- Python Memory Management and Tips: Focuses on how Python handles memory behind the scenes, a crucial concept when optimizing for speed.
Key Points and Takeaways
Numba for High-Performance Python
- Numba is a JIT compiler that transforms Python functions into fast, machine-level code for numerical work. It excels at accelerating custom algorithms, so developers do not have to rewrite them in C or another lower-level language.
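As a rough illustration of the decorator-based workflow described above, the sketch below compiles a plain Python loop with Numba's `njit` decorator. The function name `harmonic_sum` is made up for this example, and the fallback decorator is an assumption added only so the snippet still runs (uncompiled) where Numba is not installed.

```python
try:
    from numba import njit  # compiles the function to machine code on first call
except ImportError:
    # Assumption for this sketch: fall back to plain Python if Numba is missing.
    def njit(func):
        return func


@njit
def harmonic_sum(n):
    """Sum 1/k for k = 1..n using an explicit loop (hypothetical example)."""
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / k
    return total


# The first call includes compilation time; later calls with the same
# argument types reuse the compiled code.
print(harmonic_sum(1_000_000))
```

Explicit numeric loops like this one are where the interpreter overhead is largest, which is why they tend to benefit most from compilation.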
Profiling and Identifying Bottlenecks
- Before optimizing, measure your code’s performance with a profiler. The code you suspect is slow often turns out not to be the actual bottleneck.
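One way to follow the "measure first" advice is the standard library's `cProfile` module. The sketch below is a minimal example; the workload functions (`slow_square_sum`, `fast_lookup`, `workload`) and the helper `profile_report` are hypothetical names invented for illustration.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Deliberately loop-heavy function that should dominate the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_lookup(n):
    """Cheap function that one might wrongly suspect of being slow."""
    return n + 1


def workload():
    fast_lookup(10)
    return slow_square_sum(200_000)


def profile_report(func, top=5):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


result, report = profile_report(workload)
print(report)
```

Sorting by cumulative time shows which call trees cost the most overall, which is usually the first question to answer before deciding what to optimize.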
Python 3.13 and the Copy-and-Patch JIT
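- The Faster CPython team introduced an experimental JIT in Python 3.13 that uses a technique known as copy-and-patch. Machine-code templates for common execution patterns are generated when CPython itself is built; at runtime the JIT copies those templates and patches in the specific values, so LLVM is not needed at runtime.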
- Links and Tools:
- Anthony Shaw’s Article on the 3.13 JIT (Summary of upcoming CPython changes)
“Free-Threaded” Python and the Future of the GIL
- PEP 703 makes the global interpreter lock (GIL) optional, allowing true multithreading in Python without resorting to subprocesses. The free-threaded build is experimental for now, but it could eventually reduce the need for workarounds in CPU-intensive parallel code.
- Links and Tools:
- PEP 703 - Making the GIL Optional
- py-free-threading (free-threading progress and compatibility tracking; see the links from the show)
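To make the free-threading discussion concrete, here is a small sketch that reports whether the running interpreter has the GIL enabled and then spreads CPU-bound work across threads. `sys._is_gil_enabled()` only exists on Python 3.13 and later, so the code guards for it; `count_primes` and `parallel_prime_counts` are hypothetical helpers for this example. On a regular build the threads still take turns holding the GIL, so any speedup is only expected on a free-threaded build.

```python
import sys
import sysconfig
from concurrent.futures import ThreadPoolExecutor


def gil_status():
    """Return (build_supports_free_threading, gil_currently_enabled)."""
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    check = getattr(sys, "_is_gil_enabled", None)  # added in Python 3.13
    gil_enabled = check() if check is not None else True
    return free_threaded_build, gil_enabled


def count_primes(limit):
    """CPU-bound toy task: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_prime_counts(limits, workers=4):
    """Run count_primes for each limit on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


print("free-threaded build, GIL enabled:", gil_status())
print(parallel_prime_counts([10_000, 20_000, 30_000, 40_000]))
```

The same code runs unchanged on both kinds of build, which is part of the appeal: the threading model in the source stays the same and only the interpreter changes.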
Comparing Numba to Cython
- Unlike Cython, which generally requires adding type declarations, Numba infers types from the arguments when a function is first called and needs only a single decorator to compile it. In its “no-Python” (nopython) mode, the compiled function runs without going through the interpreter, which keeps your code Pythonic while reaching C/Fortran-level speeds for numeric algorithms.
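The runtime type inference mentioned above can be observed directly: a Numba-compiled function keeps one compiled specialization per combination of argument types it has been called with. The sketch below assumes Numba is installed for the interesting part; `weighted_total` and `compiled_signatures` are hypothetical names, and without Numba the fallback simply reports no compiled signatures.

```python
try:
    from numba import njit
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False

    # Assumption for this sketch: run as plain Python when Numba is missing.
    def njit(func):
        return func


@njit
def weighted_total(n, weight):
    """No type declarations: types are inferred from the call arguments."""
    total = 0.0
    for i in range(n):
        total += i * weight
    return total


def compiled_signatures(func):
    """List the type signatures compiled so far (empty without Numba)."""
    return list(getattr(func, "signatures", []))


weighted_total(4, 2)    # triggers a specialization for an integer weight
weighted_total(4, 0.5)  # triggers a second one for a float weight
print(compiled_signatures(weighted_total))
```

In Cython the equivalent step would typically be writing the types into the source; here they are discovered from how the function is actually used.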
GPU Acceleration with CUDA
- Numba has mature support for NVIDIA CUDA GPUs. You can write Python code that gets compiled into GPU kernels, offering a powerful option to speed up large-scale computations, especially after validating correctness on a CPU first.
- Links and Tools:
- NVIDIA CUDA
- CuPy (NumPy-compatible GPU array library)
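The "validate on a CPU first" workflow might look like the sketch below: a pure-Python reference implementation, plus a Numba CUDA kernel that is only defined and used when a CUDA GPU is detected. The operation (`a * x + y`), the names `saxpy_cpu`, `saxpy_kernel`, and `saxpy_gpu`, and the block size of 128 are all choices made for this example rather than anything from the episode.

```python
try:
    import numpy as np
    from numba import cuda
    GPU_READY = cuda.is_available()
except ImportError:
    GPU_READY = False  # no Numba/NumPy: only the CPU reference is usable


def saxpy_cpu(a, x, y):
    """CPU reference: element-wise a * x + y on plain lists."""
    return [a * xi + yi for xi, yi in zip(x, y)]


if GPU_READY:
    @cuda.jit
    def saxpy_kernel(a, x, y, out):
        i = cuda.grid(1)      # global thread index
        if i < out.size:      # guard threads past the end of the array
            out[i] = a * x[i] + y[i]


def saxpy_gpu(a, x, y):
    """Run the kernel on the GPU, or return None if no GPU is available."""
    if not GPU_READY:
        return None
    x_dev = cuda.to_device(np.asarray(x, dtype=np.float64))
    y_dev = cuda.to_device(np.asarray(y, dtype=np.float64))
    out_dev = cuda.device_array_like(x_dev)
    threads = 128
    blocks = (x_dev.size + threads - 1) // threads
    saxpy_kernel[blocks, threads](a, x_dev, y_dev, out_dev)
    return out_dev.copy_to_host().tolist()


x, y = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
expected = saxpy_cpu(2.0, x, y)
gpu_result = saxpy_gpu(2.0, x, y)
if gpu_result is None:
    print("No CUDA GPU detected; CPU result:", expected)
else:
    print("GPU matches CPU:",
          all(abs(g - e) < 1e-9 for g, e in zip(gpu_result, expected)))
```

Keeping the CPU version around as a reference makes it easy to compare results whenever the kernel changes.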
Memory Profiling
- Python’s dynamic nature can hide significant memory usage in temporary objects (especially arrays). Memory profilers can help identify these hotspots, which is critical when scaling up data-intensive applications.
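The effect of temporary objects on peak memory can be seen even with the standard library's `tracemalloc`, used here as a lightweight stand-in for the dedicated memory profilers listed in the show links. In this sketch a plain list plays the role of a temporary array; `peak_bytes`, `with_temporary_list`, and `with_generator` are hypothetical names for illustration.

```python
import tracemalloc


def peak_bytes(func):
    """Return the peak traced memory (in bytes) while func runs."""
    tracemalloc.start()
    try:
        func()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def with_temporary_list(n=100_000):
    squares = [i * i for i in range(n)]  # whole temporary list held in memory
    return sum(squares)


def with_generator(n=100_000):
    return sum(i * i for i in range(n))  # values produced one at a time


list_peak = peak_bytes(with_temporary_list)
gen_peak = peak_bytes(with_generator)
print(f"temporary list peak: {list_peak:,} bytes")
print(f"generator peak:      {gen_peak:,} bytes")
```

Both functions return the same answer, but the first one's peak is dominated by a temporary that never appears in the final result, which is the kind of hotspot a memory profiler is meant to surface.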
Static Python Research (SPy)
- Although very much R&D, there’s work exploring a “two-phase” Python, where dynamic magic like metaprogramming happens first, and then a static phase locks everything down for compiling. This could open up even more optimization potential in Python without losing flexibility.
- Links and Tools:
- PyPy (Related JIT research in Python)
- SPy (presented at PyCon 2024; see the links from the show)
Other Mentioned Tools and Services
- The discussion also touched on bridging to C and Rust for advanced optimizations or security-sensitive contexts, and mentioned containerized builds of free-threaded Python.
Interesting Quotes and Stories
- “You don’t want to optimize the wrong thing. Measure before you do anything else.” – Stan Seibert, emphasizing the importance of profiling.
- “Numba compiles two functions: a wrapper for the Python interpreter boundary and the no-Python core. Once you’re inside that core, it’s basically like writing C.” – Stan, describing how Numba separates the interpreter-facing wrapper from the compiled core.
Key Definitions and Terms
- Numba: A Python JIT compiler that optimizes numerical functions for CPU and GPU execution.
- Global Interpreter Lock (GIL): A mutex in CPython that prevents multiple threads from executing Python bytecode simultaneously.
- CUDA: NVIDIA’s parallel computing platform for GPUs, usable from Python via libraries like Numba or CuPy.
- Cython: A superset of Python that compiles to C for performance, but often requires explicit type declarations.
- Profiling: The process of measuring which parts of code consume the most time or memory.
Learning Resources
Here are a few places to learn more about Python performance, data science, and best practices:
- Data Science Jumpstart with 10 Projects – Dive into real-world data science problems as you level up your Python.
- Python Memory Management and Tips – Gain deeper insight into how Python uses memory and how to optimize it.
- Move from Excel to Python with Pandas – Helps Excel-centric folks transition to efficient Python data workflows.
Overall Takeaway
Python’s readability and massive ecosystem can coexist with serious performance gains if you choose the right tools and techniques. From optimizing a small numerical function with Numba to experimenting with truly parallel, free-threaded builds of Python, there’s no shortage of ways to speed up your data science workflows. By combining solid profiling habits with specialized libraries (and keeping an eye on new developments in the Python community), you can retain Python’s flexibility while harnessing the power of modern hardware.
Links from the show
Anaconda: anaconda.com
High Performance Python with Numba training: learning.anaconda.cloud
PEP 0703: peps.python.org
Python 3.13 gets a JIT: tonybaloney.github.io
Numba: numba.pydata.org
LanceDB: lancedb.com
Profiling tips: docs.python.org
Memray: github.com
Fil: a Python memory profiler for data scientists and scientists: pythonspeed.com
Rust: rust-lang.org
Granian Server: github.com
PIXIE at SciPy 2024: github.com
Free threading Progress: py-free-threading.github.io
Free Threading Compatibility: py-free-threading.github.io
caniuse.com: caniuse.com
SPy, presented at PyCon 2024: us.pycon.org
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm