Pydantic v2 - The Plan
Episode Deep Dive
Guest Background
Samuel Colvin is the creator and lead maintainer of Pydantic, a popular Python data validation library. He has been working on Pydantic for over five years and recently began a major rewrite of its internals to improve performance and design. Samuel is an active Python developer who came to Rust as a way to optimize low-level, compute-heavy parts of the ecosystem, especially the core of Pydantic. He has also developed related tools such as watchfiles and rtoml using Rust bindings for Python.
What to Know If You're New to Python
Before diving into advanced data validation topics, it helps to understand a few Python fundamentals:
- Familiarity with Python classes: Pydantic leverages classes for structuring data.
- Basic knowledge of type hints (e.g., int, str, list) and Python 3.7+ features: Pydantic ties deeply into type annotations.
- Some exposure to JSON data exchange and web-related usage will help you follow why performance and validation matter.
Key Points and Takeaways
Pydantic v2’s Core Rewrite in Rust
Pydantic v2 introduces an internal engine called pydantic-core, written in Rust with PyO3 bindings to expose a Python-friendly API. This rewrite targets major performance boosts and a cleaner design. Rust offers safe, low-level control over data handling and error-checking, which is especially beneficial for repeatedly validating large volumes of data.
Performance Gains and Environmental Impact
Early benchmarks show 4x to 50x speed improvements (commonly around 17x) for validation tasks. This can significantly reduce CPU usage across large-scale systems, many of which rely on Pydantic to validate millions of requests daily. Reduced compute often translates to lower operational costs and even environmental benefits due to decreased energy consumption.
Strict Mode vs. Coercion
Pydantic has always allowed “loose” validation, automatically converting compatible data (like "123" to int). Pydantic v2 formalizes a strict mode so that, when enabled, fields refuse to coerce data types (e.g., a string passed to an int field raises an error). This solves use cases where data integrity demands zero unexpected conversions.
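The difference between the two modes can be sketched in plain Python. This is only an illustration of the idea, not Pydantic's implementation or API; the `validate_int` helper and its `strict` flag are hypothetical names invented for this example.

```python
def validate_int(value, strict=False):
    """Hypothetical helper: return an int or raise ValueError."""
    # bool is a subclass of int in Python; reject it so True never passes as 1.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    if strict:
        # Strict mode: anything that is not already an int is an error.
        raise ValueError(f"strict mode: expected int, got {type(value).__name__}")
    # Lax mode: coerce strings that look like integers, e.g. "123" -> 123.
    if isinstance(value, str):
        try:
            return int(value.strip())
        except ValueError:
            pass
    raise ValueError(f"cannot interpret {value!r} as an int")


print(validate_int("123"))  # lax mode coerces the string
try:
    validate_int("123", strict=True)
except ValueError as exc:
    print(exc)  # strict mode refuses the same input
```

The same input is accepted or rejected depending only on the mode, which is the trade-off the episode describes: convenience in lax mode, no surprising conversions in strict mode.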
Built-in JSON Parsing
Previously, JSON parsing was done in Python before passing data to Pydantic. With v2, you can parse JSON bytes/strings directly through Rust-based logic. This not only increases speed but also smoothly handles strict-mode scenarios (e.g., ISO date strings remain valid for date fields when coming from JSON).
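Why JSON input needs special handling in strict mode can be shown with a small standard-library sketch. It parses with Python's `json` module rather than Rust, so it says nothing about speed; it only illustrates the rule that a date string is acceptable when the data came from JSON. The `validate_date` helper and its `strict` and `from_json` parameters are assumptions made up for this example.

```python
import json
from datetime import date


def validate_date(value, *, strict, from_json):
    """Hypothetical helper: validate one date field."""
    if isinstance(value, date):
        return value
    # JSON has no date type, so an ISO 8601 string is the only way to
    # express one; accept it for JSON input even in strict mode.
    if isinstance(value, str) and (from_json or not strict):
        return date.fromisoformat(value)
    raise ValueError(f"expected a date, got {type(value).__name__}")


raw = b'{"released": "2023-06-30"}'
payload = json.loads(raw)
print(validate_date(payload["released"], strict=True, from_json=True))
```

Passing the same string with `from_json=False` and `strict=True` raises an error, because Python callers are able to supply a real `date` object.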
Validation Without a Python Class
Pydantic’s v1 approach often created hidden “model classes” behind the scenes. In v2, pydantic-core allows direct schema definitions (e.g., validating a TypedDict or individual fields) without defining a Python BaseModel. This opens up more flexible micro-validation patterns for advanced or lower-level usage.
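As a rough illustration of validating data against a TypedDict without building a model object, the sketch below reads the type hints and checks a plain dict against them. It is not how pydantic-core works internally; `validate_typed_dict` is a hypothetical helper, and it only handles simple classes such as int or str.

```python
from typing import TypedDict, get_type_hints


class Point(TypedDict):
    x: int
    y: int


def validate_typed_dict(schema, data):
    """Hypothetical helper: check a plain dict against a TypedDict's hints."""
    hints = get_type_hints(schema)
    errors = []
    for name, expected in hints.items():
        if name not in data:
            errors.append(f"{name}: field required")
        elif not isinstance(data[name], expected):
            errors.append(f"{name}: expected {expected.__name__}")
    if errors:
        raise ValueError("; ".join(errors))
    # The result is still a plain dict; no model instance is created.
    return dict(data)


print(validate_typed_dict(Point, {"x": 1, "y": 2}))
```

The point of the pattern is that the schema drives validation directly, so the validated value keeps its original shape.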
Aliases and Deep Flattening
The new alias system lets you pull data from nested locations via a path-like notation. For instance, you could flatten foo["bar"]["baz"] onto a top-level field. This is extremely helpful when dealing with large or inconsistent JSON structures, letting you unify how data is accessed without extra pre-processing steps.
- Links and Tools:
  - Pydantic alias documentation (v2 updates forthcoming)
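The flattening idea can be sketched with a small path lookup over nested dicts and lists. The tuple-based path notation, the `ALIASES` mapping, and the helper names are assumptions for this illustration and are not taken from Pydantic's alias syntax.

```python
def get_by_path(data, path):
    """Hypothetical helper: follow keys/indexes in `path` through nested data."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            raise ValueError(f"missing value at path {path!r}") from None
    return current


# Field name -> path into the raw input (an assumed notation for this sketch).
ALIASES = {"baz": ("foo", "bar", "baz"), "first_tag": ("tags", 0)}


def flatten(data):
    """Build a flat dict by resolving every alias path against the input."""
    return {field: get_by_path(data, path) for field, path in ALIASES.items()}


raw = {"foo": {"bar": {"baz": 42}}, "tags": ["rust", "python"]}
print(flatten(raw))  # {'baz': 42, 'first_tag': 'rust'}
```

Declaring the path next to the field keeps the reshaping logic out of separate pre-processing code, which is the benefit described above.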
Improved Error Messages and Documentation Links
Pydantic v2 aims to provide more thorough error messages, including references to online docs for further clarification. Borrowing inspiration from Rust’s error-handling approach, you’ll have targeted help links for each validation error. This helps users quickly track down where and why validation fails.
“From Attributes” Replaces “From ORM”
Pydantic v1 had a method called from_orm, mainly for ORMs like SQLAlchemy. It’s being replaced with “from attributes,” a generalized approach to read Python objects’ attributes (including properties) for validation. You can validate any class instance, not just database models, making the feature far more flexible.
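A minimal sketch of the "from attributes" idea, using only `getattr`, is shown here. The `UserRow` class, the `FIELDS` mapping, and `validate_from_attributes` are hypothetical stand-ins; the example only demonstrates that attribute access treats ordinary attributes and properties alike.

```python
class UserRow:
    """Stand-in for an ORM row or any other plain object (hypothetical)."""

    def __init__(self, first, last, age):
        self.first = first
        self.last = last
        self.age = age

    @property
    def full_name(self):
        return f"{self.first} {self.last}"


# Field name -> expected type (simple classes only in this sketch).
FIELDS = {"full_name": str, "age": int}


def validate_from_attributes(obj):
    """Hypothetical helper: read each field with getattr, then type-check it."""
    result = {}
    for name, expected in FIELDS.items():
        try:
            value = getattr(obj, name)  # works for properties too
        except AttributeError:
            raise ValueError(f"{name}: attribute missing") from None
        if not isinstance(value, expected):
            raise ValueError(f"{name}: expected {expected.__name__}")
        result[name] = value
    return result


print(validate_from_attributes(UserRow("Ada", "Lovelace", 36)))
```

Nothing in the helper knows about databases, which is why the generalized name fits better than "from ORM".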
Wrap Validators / Middleware-Style Logic
A new “wrap validator” approach mimics the onion/middleware pattern used in web frameworks. Developers can write before-and-after logic around core field validation. This allows skipping redundant checks for already-valid data or gracefully catching specific errors in a layered, composable way.
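The onion pattern can be sketched with two plain functions, where the outer one receives the inner validator as a `handler` argument. The function names and the fallback to `None` are assumptions chosen for this illustration, not Pydantic's validator signature.

```python
def core_int_validator(value):
    """Inner ("core") validator: convert to int or raise."""
    return int(value)


def wrap_validator(value, handler):
    """Outer layer: code runs before and after the core validator."""
    # Before: skip the core validator for data that is already valid.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    # Before: normalise raw input, e.g. "1,024" -> "1024".
    if isinstance(value, str):
        value = value.replace(",", "")
    try:
        return handler(value)
    except (TypeError, ValueError):
        # After: catch a specific failure and fall back instead of raising.
        return None


print(wrap_validator("1,024", core_int_validator))  # 1024
print(wrap_validator("n/a", core_int_validator))    # None
```

Because the wrapper decides whether and when to call `handler`, several such layers can be stacked, much like middleware around a request handler.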
WebAssembly and Browser Testing
With help from Pyodide, all of Pydantic’s tests run directly in the browser as WebAssembly, verifying cross-platform reliability. The demonstration highlights the future potential of Python and Rust code in the browser and widens the set of environments in which Pydantic is tested.
Namespace and Method Cleanup
There will be several renamed or reorganized methods to make Pydantic’s API clearer (model_validate_python, model_validate_json, etc.). Deprecated methods will likely raise warnings for a while, but silent changes in behavior (like how sets are or aren’t coerced) can break code if not addressed.
Licensing and Documentation Considerations
Samuel discussed how the MIT license for Pydantic remains intact, but the docs licensing might shift. The goal is to prevent out-of-date or duplicated documentation from floating around under the same terms. This step ensures official references stay authoritative and accurate.
Interesting Quotes and Stories
- Samuel on building Pydantic initially: “I literally built Pydantic for me and put it on PyPI just to see what would happen.”
- On environment and performance: “If we reduce Pydantic’s CPU usage by 10x, that might actually have an environmental impact given how often it’s called across big companies.”
- Regarding strict type checks: “For me, it was obvious that a string '123' should become an int. But I also see the value in sometimes saying, ‘No, that’s not an int if it’s a string.’”
Key Definitions and Terms
- Strict Mode: A configuration that disallows automatic data type coercion (e.g., no conversion of "5" to an integer).
- Alias Flattening: A feature letting you specify how deeply nested data paths map onto a top-level field name.
- PyO3: A library enabling Rust and Python interoperability, allowing Rust code to be compiled as Python modules.
- Wrap Validator: A new validation approach that wraps the validation chain, letting you add or skip logic before and after the core validator runs.
Learning Resources
If you want to grow your Python skills and foundational knowledge:
- Python for Absolute Beginners: For those new to coding in Python.
- Rock Solid Python with Python Typing: Learn how to effectively use and apply Python’s type hints, a major pillar of Pydantic’s design.
- Modern APIs with FastAPI and Python: See how Pydantic gets used in real-world API development with FastAPI.
Overall Takeaway
Pydantic v2 heralds a significant leap forward for Python data validation. By moving its core to Rust, it achieves the 4x to 50x validation speedups seen in early benchmarks while enhancing clarity around strict typing, JSON parsing, and custom validation. Teams can look forward to cleaner, faster, and more reliable validation pipelines, potentially with broad benefits from both a productivity and environmental standpoint.
Links from the show
Pydantic v2 plan: pydantic-docs.helpmanual.io
PyO3: pyo3.rs
FastAPI: fastapi.tiangolo.com
Beanie: github.com
SQLModel: sqlmodel.tiangolo.com
Speedate: docs.rs
Pydantic tests running via pytest in the browser: githubproxy.samuelcolvin.workers.dev
JSON to Pydantic tool: jsontopydantic.com
Pyscript: pyscript.net
Michael's Pyscript + WebAssembly: Python Web Apps video: youtube.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm