
Migrating 3.8 Million Lines of Python

Episode #401, published Thu, Feb 2, 2023, recorded Wed, Jan 18, 2023

At some point, you've probably migrated an app from one framework or major runtime version to another. For example, Django to Flask, Python 2 to Python 3, or even Angular to Vue.js. This can be a big challenge. If you have hundreds of active devs and millions of lines of code, it's a huge challenge. We have Ben Bariteau from Yelp here to recount their story of moving 3.8M lines of code from Python 2 to Python 3. But this is not just a 2-to-3 story. It has lessons on how to migrate code in many situations, and there are plenty of gems to take from his experience.



Episode Deep Dive

Guest Introduction and Background

Ben Bariteau is a Python developer on the Core Services team at Yelp. In this episode, he shares the detailed story of migrating Yelp’s Python codebase, all 3.8 million lines of it, from Python 2 to Python 3. Ben’s team is responsible for essential infrastructure at Yelp, including internal tooling for deployments, testing, and managing their private PyPI server. His first major project, and the one he is best known for in this context, was guiding Yelp through its complex Python 2 to 3 migration.

What to Know If You're New to Python

If you’re just getting started with Python but want to learn from this episode, here are a few items that will help you follow along:

  • Python 2 vs. Python 3: Much of the conversation centers on the differences between the Python 2 and Python 3 runtimes. It helps to know that Python 3 is the modern standard, with new features and better performance.
  • Virtual Environments: Yelp ran its tests in two separate environments (Python 2 and Python 3). Tools like venv and virtualenv are the common way to keep each environment’s dependencies isolated.
  • Automated Testing: Large applications rely heavily on tests (pytest, or internal tools) to ensure that code changes don’t break existing functionality.
  • Incremental Changes: Ben and the team emphasize small, incremental updates to reduce risk and allow teams to keep shipping features.
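
To make the Python 2 versus Python 3 gap concrete, here is a minimal sketch of code written to behave the same under both interpreters. The function name and sample data are hypothetical and not taken from Yelp's codebase; the only detail drawn from the episode is the move from dict.iteritems() to items().

```python
# Hypothetical example: code intended to behave the same on Python 2.7 and Python 3.


def summarize_ratings(ratings):
    """Return (business, average score) pairs sorted by business name."""
    summary = []
    # .items() exists on both versions; .iteritems() is Python 2 only.
    for business, scores in ratings.items():
        # float() forces true division on Python 2, where int / int truncates.
        average = sum(scores) / float(len(scores))
        summary.append((business, average))
    return sorted(summary)


# Works unchanged under either interpreter.
print(summarize_ratings({"cafe": [4, 5], "bakery": [3, 4, 5]}))
```

A change like this is easy to ship and easy to revert on its own, which is the property the incremental approach depends on.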

Key Points and Takeaways

  1. Large-Scale Python 2-to-3 Migration: The heart of the episode is Yelp’s decision and process to move 3.8 million lines of code from Python 2 to Python 3 with minimal downtime. By systematically fixing Python 3 syntax issues, making dependencies compatible, and carefully rolling out live traffic to the new runtime, they proved even huge monoliths could be migrated incrementally.
  2. Maintaining Development Velocity: Yelp had hundreds of active developers making dozens of pushes per day. Ben highlights how they designed the migration to be as non-disruptive as possible so other teams could continue shipping new features. They avoided a “big freeze” by ensuring that every migration step was rollback-safe.
    • Tools & Links:
      • Git documentation (for version control practices)
      • Yelp’s internal “Jolt” test runner (mentioned in the show)
  3. Emphasis on Incremental Changes and Rollback Safety: Rather than flipping a switch for millions of lines at once, the team introduced changes in small, workable chunks. Each change, such as removing Python 2-only imports or switching from dict.iteritems() to items(), was made safe to revert, so no single step could permanently break the system.
  4. Testing Infrastructure and Strategies: Yelp’s codebase had around 100,000 tests, which would take over 30 hours to run sequentially. By using a distributed test runner (Jolt) and focusing first on areas affected by code changes, the team kept feedback loops manageable. Full test runs remained essential but were automated and parallelized for speed.
    • Tools & Links:
      • pytest as a common Python testing framework
      • Yelp’s in-house Jolt runner (mentioned in the show)
  5. Dealing with Dependencies: Yelp used 700 Python packages in its monolith, some of which were no longer maintained or had no Python 3 support. They replaced or upgraded most libraries, occasionally switching to forks that supported Python 3. In rare cases, they deleted code segments to remove outdated packages entirely.
  6. Caching and Serialization Challenges: A major hurdle was moving cached data from Python 2 pickles to a format both runtimes could read. The team migrated from pickling objects into memcached to JSON-based caching. This allowed Python 2 and Python 3 services to share caches without corrupting data or requiring a mass cache flush.
  7. Using Reverse Proxies for Incremental Rollouts: The team configured an OpenResty (NGINX + Lua) reverse proxy to direct certain URL endpoints to Python 3 services while leaving the rest on Python 2. This approach enabled them to send a small percentage of live requests to Python 3, verify stability, and then expand coverage while avoiding mass downtime.
  8. Performance Improvements and Resource Savings: Moving from Python 2 to Python 3 netted a 15–20% speed improvement and reduced memory usage by about 20% in some cases. Because of these efficiency gains, Yelp was able to reduce server allocations. Ben emphasizes that “base-level infrastructure” work can create tangible cost and speed benefits.
  9. Managing a Monolith Versus Splitting Services: Yelp’s single large codebase, “Yelp Main,” still exists, but over time they’ve extracted many services. A monolith complicates large upgrades (e.g., package version pins or migrations), but microservices can increase overhead in other ways. They’ve found a balance by gradually splitting critical features into separate repos while still maintaining some monolithic functionality.
  10. Takeaways for Other Major Transitions: While this story is specifically about Python 2-to-3, the process and lessons apply to any large technology migration, such as moving between web frameworks (Flask to FastAPI) or front-end frameworks (Angular to Vue.js). Key lessons include thorough tests, incremental rollout, rollback safety, and letting developers continue building features without a development freeze.
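
The caching change in point 6 can be illustrated with a minimal sketch of a version-neutral cache codec. This is not Yelp's implementation: FakeCache is a hypothetical in-memory stand-in for a memcached client, and cache_set and cache_get are illustrative names. The idea it shows is that values stored as UTF-8 JSON bytes can be decoded by either interpreter.

```python
import json


class FakeCache(object):
    """Hypothetical in-memory stand-in for a memcached client."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)


def cache_set(cache, key, value):
    """Store a JSON-serializable value as UTF-8 bytes."""
    # json.dumps raises TypeError for unsupported types, so bad values
    # fail at write time instead of surfacing later in a reader.
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    cache.set(key, payload)


def cache_get(cache, key):
    """Return the decoded value, or None on a cache miss."""
    payload = cache.get(key)
    if payload is None:
        return None
    return json.loads(payload.decode("utf-8"))


cache = FakeCache()
cache_set(cache, "biz:1", {"name": "Cafe", "stars": 4.5, "tags": ["coffee"]})
restored = cache_get(cache, "biz:1")
```

One trade-off of this approach is that JSON only round-trips basic types: tuples come back as lists and non-string dictionary keys come back as strings, so cached values may need to be reshaped before the switch.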

Interesting Quotes and Stories

  • “You can’t just wave a wand and say, ‘Stop shipping code for six months while we do this migration.’ We had to keep pushing new features out.” — Ben
  • “I had people say, ‘I never thought we’d actually get Python 3 done.’ But we did.” — Ben
  • “We discovered a lot of code that was basically dead code. This migration forced us to clean house.” — Ben

Key Definitions and Terms

  • Monolith: A single codebase or application that contains many interwoven components or services.
  • Incremental Migration: The strategy of systematically modifying and shipping small changes rather than big-bang upgrades.
  • Rollout/Canary Release: Gradually exposing parts of your application (e.g., certain endpoints) to the new version in production to limit risk.
  • Backport Packages: Libraries that bring Python 3 functionality into Python 2 environments (e.g., futures, functools32).
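
The gradual rollout idea can be sketched as deterministic percentage bucketing. This is an illustration in Python, not Yelp's OpenResty configuration; bucket_for, choose_backend, the backend labels, and the request key are all hypothetical.

```python
import hashlib


def bucket_for(request_key):
    """Map a request key to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(request_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_backend(request_key, py3_percent):
    """Route roughly py3_percent of keys to the Python 3 backend."""
    # Buckets below the threshold go to Python 3; the rest stay on Python 2.
    if bucket_for(request_key) < py3_percent:
        return "python3"
    return "python2"


# Start with a small slice of traffic, then raise the percentage over time.
backend = choose_backend("/biz/some-endpoint", 5)
```

Because a given key always lands in the same bucket, raising the percentage only ever moves traffic toward Python 3, and setting it back to 0 acts as the rollback.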


Overall Takeaway

Migrating millions of lines of code from Python 2 to Python 3 while hundreds of commits land every day might seem impossible. Ben and Yelp’s story shows that with rigorous testing, a commitment to rollback safety, and creative use of incremental rollouts, even large-scale migrations can succeed without grinding product development to a halt. Their approach offers a blueprint for any major technology transition: plan carefully, automate what you can, and keep shipping.

Links from the show

Ben on Twitter: @benbariteau
Ben's Talk at PyCon 2022: youtube.com
python-modernize: github.com
python-future: github.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy
