#401: Migrating 3.8 Million Lines of Python Transcript

Recorded on Wednesday, Jan 18, 2023.

00:00 At some point, you've probably migrated an app from one framework or major runtime version

00:04 to another.

00:05 For example, Django to Flask, Python 2 to 3, or even Angular to Vue.js.

00:12 This can be a big challenge.

00:14 If you had hundreds of active devs and millions of lines of code, it's a huge challenge.

00:20 We have Ben Bariteau from Yelp here to recount their story of moving 3.8 million lines of

00:26 code from Python 2 to 3.

00:28 But this is not just a two to three story.

00:30 It has many lessons on how to migrate code

00:33 in many situations.

00:34 There are plenty of gems to take from his experience.

00:38 This is Talk Python to Me,

00:39 episode 401, recorded January 18th, 2023.

00:44 (upbeat music)

00:47 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy. Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython, both on fosstodon.org. Be careful with impersonating accounts on other instances; there are many. Keep up with the show and listen to over seven years of past episodes at talkpython.fm.

00:58 We've started streaming most of our episodes live on YouTube.

01:22 Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and

01:28 be part of that episode.

01:30 This episode is brought to you by Cox Automotive.

01:33 Join their team and use your technical skills to transform

01:36 the way the world buys, sells, and owns cars.

01:39 Find an exciting position that's right for you at

01:41 talkpython.fm/cox

01:43 and it's also brought to you by User Interviews.

01:46 Earn extra income for sharing your software development

01:49 opinion at User Interviews.

01:51 Head over to talkpython.fm/userinterviews

01:54 to participate today.

01:56 Ben, welcome to Talk Python to Me.

01:58 Thank you. Thank you so much for having me, Michael.

02:00 We're going to talk a little bit of legacy code,

02:02 a little bit of very, very large code bases,

02:05 and how you might not have to permanently live in the past,

02:09 which I think would be really welcome to a lot of people.

02:12 I just talked a little bit about this before I hit record, but I,

02:15 even though your topic is specifically

02:17 the story of the move from Python 2 to 3,

02:20 and this, like, making your whole code base modern,

02:23 I do think that this idea of how do I move from one code base to another code base

02:27 is super relevant to lots of folks who might not be going from Python 2 to 3,

02:31 but maybe from Flask to FastAPI or vice versa,

02:34 or those types of things. I think the techniques that you're going to talk about here are

02:39 more broadly applicable than just a 2 to 3 migration.

02:43 And it's really cool how you all migrated

02:46 3.8 million lines of code without interrupting development. That's kind of nuts.

02:50 Yeah, I did it and it still seems ridiculous.

02:53 You lived it, and it seems like a dream. Amazing.

02:55 Before we get to all that, though, let's start with your story.

02:57 How did you get into programming and Python?

02:59 Took a job at Yelp. Yelp was a Python shop.

03:01 Before that, I had a couple internships,

03:03 and I went to Georgia Tech, and I mostly did Java.

03:06 So it was sort of a new experience for me.

03:08 You know, Python is one of those beginner languages

03:11 that everyone loves to throw around.

03:13 So I had done, you know, I dabbled in a little bit,

03:16 but I first started really getting deep into the language

03:20 when I started at Yelp.

03:22 And I've definitely made it,

03:25 it's sort of become like,

03:26 I've become sort of a local expert on it.

03:28 So I've been able to build up a lot of knowledge about like,

03:31 you know, a lot of weird edge cases and, you know, stuff like that.

03:35 You may be familiar with it.

03:36 There's a t-shirt that's kind of a joke,

03:38 a meme that says, "I learned Python. It was a great weekend."

03:41 Yeah.

03:42 And yet, I've been doing Python for many years,

03:45 and I'm still learning new stuff.

03:46 Even today, I learned some interesting new Python things.

03:50 So which is it?

03:51 Do you learn in a day or is it like this deep journey?

03:54 I think most programming languages have some amount of, you know,

03:58 width and depth. I think, you know, Python definitely has the advantage of being

04:02 a relatively straightforward language.

04:05 One of the nice things, obviously, is that like,

04:07 instead of using a lot of weird keywords, it has, like, you know,

04:11 words, like, you know, instead of being like,

04:13 – Oh, like, "or double pipe." – Yeah, or instead of double pipe and stuff like that.

04:19 Those are the things that I think definitely help people.

04:22 Personally, you know, me coming into it,

04:24 that wasn't as big of a deal for me because I was already familiar with all that stuff.

04:27 You're coming from a very symbol-heavy world of

04:30 Java, which is not as symbol-heavy as C++,

04:33 but it's got a lot of abstractions in what it builds, for sure.

04:36 Yeah, I will say, I think there are certain things that Python does.

04:40 I will say Python is not a perfect language by any stretch of the imagination.

04:44 But I will say one thing about Python that I think is really cool is that it

04:48 sort of, before many other languages,

04:50 managed to integrate a lot of functional paradigms.

04:53 Like, list comprehensions or comprehensions in general

04:56 are, I think, one of those features where it's just like,

04:58 this doesn't exist in a lot of other, sort of, more popular languages,

05:02 and they're really, really fluent and powerful in a way that

05:05 you kind of miss when you don't have it, right?

05:08 And so, I think that's something that's really cool about Python.
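
For readers who have not used the comprehensions mentioned here, a minimal sketch may help. The review data and variable names are invented for illustration and do not come from the episode or from Yelp's code.

```python
# Hypothetical data, for illustration only.
reviews = [
    {"business": "Cafe A", "stars": 5},
    {"business": "Cafe B", "stars": 2},
    {"business": "Cafe C", "stars": 4},
]

# List comprehension: filter and transform in a single expression.
well_rated = [r["business"] for r in reviews if r["stars"] >= 4]

# Dict comprehension: build a lookup table from the same data.
stars_by_business = {r["business"]: r["stars"] for r in reviews}

# The equivalent explicit loop, for comparison.
well_rated_loop = []
for r in reviews:
    if r["stars"] >= 4:
        well_rated_loop.append(r["business"])

print(well_rated)
print(stars_by_business)
```

The comprehension and the loop produce the same list; the comprehension just states the filter and the transformation in one place.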

05:11 But Python itself, being a language with the legacy that it has,

05:15 I mean, we're going to be talking about the two to three differences which have their own nuances to them,

05:20 but it's always going to have some weirdnesses to it.

05:22 And some of those things are just like, "Oh, someone made a decision 30 years ago that still reverberates today."

05:29 And so that means that it has this depth to it.

05:32 You have to really learn the depth in order to fully understand all of the problems that exist.

05:38 Not necessarily problems with the language, but when you're building software, you run into problems that you have to solve.

05:44 And so that's like the main, I think that's

05:46 true of all languages to an extent, especially popular ones and older ones.

05:51 But yeah, I do think that

05:53 Python more than a lot of languages really is able to straddle the line of

05:58 being like, oh, it's approachable, but also you can do a lot of

06:00 really interesting and powerful stuff with it.

06:02 Yeah, you compare that with like Java.

06:04 Java, you've got to understand functions,

06:07 you've got to understand classes,

06:09 possibly namespaces, like

06:11 just to write the first line of code,

06:13 Whereas Python, you can work with it for,

06:16 like, you know what, this is really clumsy to repeat this.

06:18 Maybe I'll learn what a function is

06:20 and then I can start using that,

06:21 but you don't know what a class is,

06:22 you don't care about it.

06:23 You kind of like slowly layer on the stuff as you need it

06:27 rather than you've got to jump in

06:29 and go with it all at once.

06:30 - Yeah, for sure. - Yeah, interesting.

06:32 So you're still at Yelp?

06:33 - Yes. - I presume, yeah.

06:35 - I'm in a conference room

06:37 in our San Francisco office right now.

06:38 - Excellent, and what are you doing there?

06:40 - So I work on a team that's called Core Services.

06:42 Our team is responsible for a lot of infrastructure,

06:47 mostly Python focused, though not exclusively.

06:51 We do a lot of our sort of like internal

06:53 Python infrastructure, so we'll be responsible

06:57 for making sure that we can upgrade to new Python versions.

07:00 We've got, on our internal PyPI,

07:02 we recently, this is a cool thing

07:05 that has happened since my talk,

07:08 so I didn't mention it, is we recently built a system

07:11 to automatically import certain packages from a public PyPI.

07:16 And that has saved us some headaches.

07:18 And then we own some other stuff, like,

07:20 you know, we've dealt a lot with sort of the general service contract at Yelp.

07:24 So like, being like, okay, what does it mean to be a service?

07:27 How do you be a good citizen there?

07:28 And a lot of other things like testing tools,

07:31 a lot of like, sort of like, oh, I need to test against multiple services.

07:36 We have a testing tool that like,

07:38 automates a lot of the steps to like, sort of get those all connected together

07:41 so you can test against them.

07:42 - Yeah, it sounds like a really fun

07:44 sort of task you're doing there.

07:46 So you said you have an internal,

07:47 we're going to dive into the code and stuff,

07:49 all this whole migration, but, you know,

07:51 kind of sidebar, like you said,

07:52 you have this internal private PyPI server.

07:55 What's the details around that?

07:57 Like how, obviously, you're whitelisting

08:00 things that can be brought inside,

08:02 saying we're going to put those onto our server

08:04 and you can request and you can choose

08:06 when to let the new one in and so on.

08:08 But what's the software

08:10 How do you put that together?

08:12 A number of years ago now, we

08:15 switched to a piece of software

08:17 that I don't think is used pretty much

08:19 anywhere else, which was built by

08:21 one of my teammates who

08:23 is named Chris Kuehl. So Chris Kuehl

08:25 built essentially a

08:27 PyPI implementation called

08:29 dumb-pypi.

08:31 It's called dumb-pypi because

08:33 unlike some other PyPI servers,

08:35 the way that it works is you

08:37 just sort of give it a list of distributions

08:39 and then it just generates all of the HTML pages.

08:42 So instead of being like, "Oh, I'm a server,

08:45 and I'm going to handle this request and blah, blah, blah,

08:48 and run some code," it's literally like, "Okay, here are the HTML pages."

08:51 I see. It's like a static site generator

08:55 for a PyPI backend.
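
To make the "static site generator for a package index" idea concrete, here is a rough sketch of the general technique. It is not the actual code of the tool discussed above: the function names, the simplified filename parsing, and the example.invalid URL are all assumptions made for illustration. The page layout loosely follows the "simple" repository format described in PEP 503.

```python
import html
import re
from collections import defaultdict


def normalize(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_name(filename):
    """Guess the project name from a wheel or sdist filename (simplified)."""
    if filename.endswith(".whl"):
        # Wheel filenames start with the project name, followed by a dash.
        return filename.split("-")[0]
    # Assume an sdist named name-version.tar.gz or name-version.zip.
    stem = re.sub(r"\.(tar\.gz|zip)$", "", filename)
    return stem.rsplit("-", 1)[0]


def build_pages(filenames, base_url):
    """Return {relative path: HTML text} for a static 'simple' index."""
    by_project = defaultdict(list)
    for filename in filenames:
        by_project[normalize(project_name(filename))].append(filename)

    pages = {}
    # Root page: one link per project.
    root_links = "\n".join(
        '<a href="{0}/">{0}</a><br>'.format(project) for project in sorted(by_project)
    )
    pages["simple/index.html"] = (
        "<!DOCTYPE html>\n<html><body>\n" + root_links + "\n</body></html>\n"
    )

    # One page per project: one link per distribution file.
    for project, files in by_project.items():
        links = []
        for filename in sorted(files):
            url = html.escape(base_url + "/" + filename)
            links.append('<a href="{}">{}</a><br>'.format(url, html.escape(filename)))
        pages["simple/{}/index.html".format(project)] = (
            "<!DOCTYPE html>\n<html><body>\n" + "\n".join(links) + "\n</body></html>\n"
        )
    return pages


pages = build_pages(
    ["requests-2.28.1.tar.gz", "My_Package-1.0-py3-none-any.whl"],
    "https://example.invalid/packages",
)
print(sorted(pages))
```

Because the output is just a dictionary of HTML files, it can be written to any static file host, which is consistent with the point made here that there is no request-handling code to fail at runtime.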

08:57 Yeah, yeah. And we've been using that for a really long time,

09:00 and maybe four or five years now.

09:03 And it's served us pretty well.

09:06 It's really nice in my opinion, because it's the only service that my team actually owns.

09:11 So it, and it never pages us.

09:13 So that's great.

09:13 I love that it never pages us.

09:15 It can't go down.

09:16 Yeah.

09:17 Not really.

09:18 Well, it can. It wouldn't be good if it goes down, but also it doesn't.

09:22 That's the great part is it just, it just doesn't.

09:24 So Chris's, Chris's software works great.

09:26 Yeah.

09:26 That's a really cool idea.

09:28 And so you tell it certain versions or, or do you just limit it to the libraries and let it pick the latest versions of whatever's on real PyPI?

09:35 So the way that we do it is we have a whole system which imports packages.

09:42 We actually rebuild all of our wheels.

09:45 It's kind of for hard-to-explain reasons.

09:48 So what we'll do is we'll like, someone will say, "Hey, I want this version of this package."

09:52 Or maybe they'll just say, "Hey, I want this package," and then we'll just pull down the newest one at the time.

09:56 And we'll do some security vetting on it.

10:00 So we have some automated security stuff.

10:02 And basically just make sure that it's not malicious.

10:04 And then we build the wheels and then we upload those

10:09 to the S3 bucket that backs our PyPI.

10:12 So we just do that. In terms of how we decide, it's basically just sort of like,

10:17 we make sure, we do the security check, we do like,

10:21 there's a few other things, like we make sure we have all the dependencies, we make sure we have

10:25 that it has a license that we're okay with using internally.

10:30 And so all of those things are checked.
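
The checks described above (a security scan, dependency availability, and an acceptable license) can be pictured as a small gate that runs before anything is built and uploaded. The sketch below is only an illustration of that shape. The `Candidate` class, the `vet` function, the license allow-list, and the boolean standing in for a security scan are all hypothetical and are not Yelp's actual tooling.

```python
from dataclasses import dataclass, field

# Hypothetical license policy, for illustration only.
ALLOWED_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0"}


@dataclass
class Candidate:
    """A package version someone asked to import (hypothetical model)."""
    name: str
    version: str
    license: str
    requires: list = field(default_factory=list)
    scan_clean: bool = True  # stand-in for the result of a real security scan


def vet(candidate, internal_index):
    """Return a list of problems; an empty list means OK to build and upload."""
    problems = []
    if not candidate.scan_clean:
        problems.append("failed security scan")
    missing = [dep for dep in candidate.requires if dep not in internal_index]
    if missing:
        problems.append("missing dependencies: " + ", ".join(sorted(missing)))
    if candidate.license not in ALLOWED_LICENSES:
        problems.append("license not approved: " + candidate.license)
    return problems


# Names of packages assumed to already exist on the internal index.
internal_index = {"urllib3", "idna"}

ok = Candidate("requests", "2.28.1", "Apache-2.0", ["urllib3", "idna"])
bad = Candidate("example-pkg", "0.1", "Proprietary", ["not-mirrored"])

print(vet(ok, internal_index))
print(vet(bad, internal_index))
```

Returning the full list of problems, rather than stopping at the first one, makes it easier to tell a requester everything that has to change before a package can be imported.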

10:32 And then we have, as I mentioned, we have this sort of like automated import system.

10:37 So like certain packages, we'll just we'll try to download them.

10:40 They might fail, you know, one of those checks, and then we won't upload it.

10:43 But like, you know, so we'll just like import it, we'll try to import it.

10:46 And so certain packages, we'll try to get the newest one.

10:49 Some packages, you know, we haven't set that up for one reason or another,

10:53 some packages, there's certain packages that are just like difficult to build.

10:56 And so we avoid importing them.

10:58 Right. You just we got this one working. It's fine.

11:02 Yes, yes. And so like some are difficult to build, some are just like,

11:06 oh, this is a package we've never used before.

11:08 So we just like don't use it.

11:10 So, or we don't have it.

11:12 You talked about how many dependencies your projects

11:14 have and stuff, and that'll be fun.

11:16 But let's maybe take a step back and just talk about

11:18 you know, Python at Yelp,

11:20 you've, this main project that you have,

11:23 it was running on Python 2.

11:25 It's kind of obvious, but

11:27 some of the reasons are obvious, some are not.

11:29 Like, why did you care what version of Python it's on?

11:31 That's a good question. I mean, I think the main reason was just sort of like we saw the writing on the wall. The writing on the wall was the end of life for Python 2, right?

11:40 And I think everyone else, we knew that other people were going to follow that, right? There was, I remember in 2019 when I was looking into this, there was a thing I think was called like the Python pledge or something like that, where basically like packages would, like open source packages would say like, hey, we're going to drop Python 2, you know, after end of life at some like, you know, either the day of or

12:00 a few months later or something like that.

12:03 And so we were sort of looking at that and being like, well, we use some of those packages,

12:06 you know, and eventually we might want to upgrade them.

12:09 Yeah, you're about to get frozen in time around mid 2020.

12:13 So you're, you maybe don't want that.

12:15 By the time that I did my talk, I remember, I think it was early 2021 or something,

12:21 you know, pip had dropped support for Python 2.

12:23 So that was like one of those things where it was just sort of like,

12:26 Yeah, there's not a realistic ecosystem in which you are able to use open source and upgrade your stuff

12:33 for security patches or whatever.

12:35 I want this new feature. Oh, sorry, that's Python 3 only, you know, kind of thing.

12:40 So that was the main motivation.

12:42 And then I think some secondary stuff was just sort of like,

12:45 as you build, as time marches on and people stop

12:49 being familiar with Python 2 and it has some quirks compared to Python 3,

12:53 you definitely have the problem of like, okay, now you have to like,

12:56 if you're hiring people and they're working on Python 2, you have to train them up on those quirks

13:00 in a way that like, you wouldn't necessarily have to do if you're using a modern language that other places are using.

13:06 Those are the, I think, the main motivations.

13:09 Personally, I think I had like a small motivation myself, which was just sort of like,

13:13 I hate seeing things like be left behind like this, you know.

13:16 Yeah, sure.

13:17 Very emotional thing, but yeah, that's part of the reason I pushed for it.

13:21 Well, there's the training thing.

13:23 I mean, there's obviously just the infrastructure stopping,

13:25 stopping the updates, but there's the training side of helping people who are

13:31 new come, but there's also the, how do you hire the very best engineers?

13:35 It's really hard to get an amazing Python engineer to come and say, you're

13:40 going to do amazing work from 2008.

13:43 You're going to love it.

13:43 You know what I mean?

13:44 Right?

13:44 Like if they're working on some new package that they're inspired about,

13:48 Instead of trying to bring that in and like,

13:49 help make that better and also boost what you're doing.

13:52 It's like, well, we can't use that because that,

13:54 you only do it in Python 3.

13:55 Well, of course, I created this, you know, two years ago.

13:57 Why wouldn't it be Python 3 only?

13:59 There's a lot of knock on effects like that, right?

14:01 Yeah.

14:01 Did you see the performance stuff from 2.11, sorry,

14:06 3.11, or even 3.10,

14:09 where you're like, you know, there might actually be

14:12 fewer servers as well if we do this?

14:14 That's definitely something that we are,

14:17 you know, we want to do. That specific issue is something that we're

14:21 sort of, we're trying to move towards being able to use those versions of Python right now.

14:25 It's always a process just because of various, you know, internal things. But it's definitely

14:30 something that has been talked about. We're like, yeah, if we could use new versions of Python,

14:35 maybe things will be faster, things will be more efficient. Trying not to spend too much money

14:40 is definitely a thing that we think about. So that's definitely exciting.

14:43 - Yeah, when I did the episode on 3.11,

14:46 we talked a lot about the performance there.

14:48 And it's impressive.

14:49 It's, you know, 40, 50, 60%.

14:51 And I won't steal your thunder.

14:53 I know at the end, you've got some nice performance boosts

14:57 that you got even from the changes that you made.

14:59 But there was somebody in the audience that pointed out,

15:01 like, not only is this faster, which is nice for us,

15:04 right, it's nice that we have to pay less for servers,

15:06 or it's nice that our code runs a little bit faster,

15:09 but it's also good for the planet, right?

15:11 If we just all start using newer, faster foundations, then necessarily we just

15:16 use less energy to do the same thing that we're already doing.

15:19 Right.

15:19 Yeah, that's definitely a, I like that.

15:22 This portion of Talk Python to Me is brought to you by Cox Automotive. With

15:28 brands like Kelley Blue Book, Autotrader, Dealer.com, and more, Cox Automotive

15:33 flips the script on how we buy, sell, own, and use our cars.

15:38 And now the team at Cox Automotive is looking for software engineers, data

15:43 scientists, scrum masters, and other tech experts to help create meaningful

15:47 change in the industry.

15:49 Do you want to be part of a collaborative workplace that values

15:52 your time and work-life balance?

15:54 Consider joining Cox Automotive.

15:56 Visit talkpython.fm/cox today.

16:00 Thank you to Cox Automotive for sponsoring the show.

16:03 Let's talk about Python at Yelp.

16:06 So you've, you've got this repo, this, this big project called Yelp main.

16:11 Let's start there.

16:12 Sure.

16:12 Yelp main is what it sounds like.

16:15 It is sort of the original repo at Yelp.

16:18 That's when you're a startup in 2004, you're, you kind of just make a repo, right?

16:23 And it's your web app.

16:24 And that's how we made a Subversion repo.

16:26 'Cause it wasn't none of that CVS stuff; we're doing Subversion.

16:30 I don't remember when we switched from Subversion, but we, it was Subversion.

16:35 I don't know if it was before...

16:37 I don't know if we actually started out in Subversion.

16:40 But I didn't start until 2014.

16:43 But yeah, it's definitely Subversion was...

16:45 I know there was some old Subversion stuff.

16:47 So you have this one sort of web app,

16:49 and the web app is a server that serves...

16:54 that originally served everything.

16:56 So there's a bunch of stuff,

16:58 like you can sort of think of as like Yelp.com, right?

17:01 Like if you go to www.yelp.com,

17:04 then you're looking at what you think of as Yelp, right?

17:07 It's like, "Oh, I can search for businesses, I can look at their reviews,

17:10 I can write my own reviews," that kind of stuff.

17:13 So it's that, but it's also other stuff.

17:14 It's also our business owner site, so biz.yelp.com,

17:18 which is where business owners look at their own businesses

17:21 and are able to see the metrics and buy ads and stuff like that.

17:25 There's our admin site, which is where a lot of anybody,

17:29 we have our user operations people

17:31 whose job is at least partially to do some moderation and stuff like that.

17:35 So like, we need to be able to have those tools.

17:37 And then there's also what we call internal API.

17:42 And internal API is a way for internal stuff to get the data that's in Yelp main.

17:49 So that's what that is.

17:51 And that's like its own separate sort of site.

17:53 But these are all in the same repo.

17:55 They all run in the same process.

17:57 That's... Sorry.

17:58 No, I was just going to ask, is this kind of the, the mono repo style, or it's,

18:02 it's truly a monolith in the sense that it's kind of all the same app.

18:06 It's truly a monolith.

18:07 There is some amount of stuff where it's like, Oh, we have like different

18:11 containers running like different entry points, but like the code is all

18:15 kind of tangled up together.

18:16 So there's not really a meaningful delineation between different components.

18:21 In a way that you could really separate them out in any meaningful way.

18:24 So like my understanding of like what I would define as a monorepo, I

18:28 I wouldn't really call it that.

18:29 I would just call it, I would call it--

18:31 - Just a large app.

18:33 - Yeah, it's a huge, huge, huge app, huge repo.

18:35 - Yeah, yeah, okay.

18:37 So in your talk, you said that you have six different sites

18:40 with 2000 different endpoints, which it's a lot.

18:44 I don't think it's completely excessive or anything

18:46 like 2000 URL endpoints for all those different services

18:50 and like all those different admin apps.

18:52 It seems it's a lot, but it's not insane.

18:55 And then you have these background batch services.

18:57 What's the story of those?

18:59 It's just sort of anything that you need done.

19:01 As I said, this was just sort of like, this is the one repo, right?

19:04 And so there's a lot of things that you want done

19:07 that aren't necessarily done in the context of a web request

19:10 or don't make sense to do synchronously.

19:13 So a lot of that is just sort of like, okay,

19:16 I need to do this really complex report or something, right?

19:19 I want to get some metrics that involve collating a bunch of data,

19:23 doing a bunch of joins against a bunch of tables.

19:25 "Okay, well, I'm not gonna have just like a web request do that, I'm gonna put that in like a separate process."

19:30 And we originally, as you can imagine, for that type of application,

19:35 we just called those batches.

19:37 Yeah.

19:37 Like a batch job, right?

19:39 That name is stuck, despite the fact that now batches don't necessarily do that type of work.

19:44 They're just sort of anything you want to do in the background.

19:47 Right.

19:47 And that could be something like, "Oh, the first of the month we do our like ad billing."

19:52 or we might have some process where it's just sort of like, oh, we want to like update this cache

19:58 based on like data like stuff, but we don't want to do it inline in a web request.

20:04 We can do it asynchronously.

20:05 So it's really anything that is not in the context of a web request.

20:10 I suspect most major apps, most companies have that kind of stuff too, right? They've got to.

20:16 I mean, everyone has some version of it.

20:18 whether or not they do it exactly the way that we do it is

20:21 a separate question that I'm not really sure.

20:23 Yeah, I think part of the story is, do you deploy them all out of the same codebase?

20:27 Or are they a bunch of different jobs

20:29 and repos, or how does that fit together? That's probably where it varies.

20:33 Yeah, I mean, for us, we have,

20:35 I mean, I said 800 batches, and I was referring specifically to the batches that are still in the Yelp main repo.

20:41 And like I said, all these things are kind of tangled together.

20:43 So it's not like, oh, you can just pull a batch out.

20:46 That's like talking about, well, how do you get the data that it needs?

20:50 What does that look like?

20:52 How does it get the data access layer?

20:54 How does it get a hold of the logging thing that's over here?

20:56 And all that kind of stuff, right?

20:57 Exactly. So those 800 batches, but we also have tons and tons of batches

21:02 that are in services.

21:04 So they live in service repos,

21:06 and they run, and they're in totally separate code base.

21:09 I don't know what that number is.

21:10 I'd have to figure it out. It's definitely a lot.

21:12 It's almost certainly more than we have in Yelp main at this point.

21:16 But that paradigm exists all over Yelp and not just in this repo.

21:20 Yeah, well, I think there's a lot of value to having that code together, right?

21:25 If you break this out into a whole bunch of different repos,

21:28 you've got dependency management, versioning, deployment,

21:31 there is some value to just saying, just let it live together.

21:34 We'll upgrade it together.

21:36 But it does make for some striking headlines when you talk about how many lines of code got upgraded at once, right?

21:41 Yeah, I mean, I think that when you're talking about, like,

21:44 why do we still have a monolith? The answer is mostly just because

21:47 it's really hard to not have one once you have one.

21:51 You have to do all the work to move it out.

21:53 And there are disadvantages. Like you said, it's sort of like,

21:55 okay, now as soon as you have a new repo,

21:57 it has its own set of dependencies that you have to keep up to date,

22:00 and you have to do other sort of maintenance on it.

22:04 Generally speaking, we consider that better though still,

22:07 because it's sort of like, it's always better.

22:09 Like imagine if I'm in this giant monolith,

22:12 and I'm like, "Oh, I need to upgrade this package."

22:14 And it's like, "Okay, well, you want to do a major upgrade,

22:17 and this package is imported in a thousand places.

22:20 Now you need to deal with that migration."

22:22 Whereas if it's like, "Oh, I'm in my service,

22:25 and I need to do this package upgrade,

22:28 and it's imported in ten places."

22:30 That's like an afternoon instead of like,

22:32 you know, a quarter of a year or something, right?

22:35 So there's definitely advantages to that.

22:38 It does add, it's sort of like more work overall,

22:41 but you can do it in a more granular way,

22:43 so it allows you to unblock people faster, essentially.
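
The "imported in a thousand places versus ten places" comparison is something a team can measure before committing to an upgrade. The following is a minimal sketch of one way to count import sites using the standard library's `ast` module. The function names are made up for this example, and it assumes the code parses under the Python version running the script, so Python 2-only files are simply skipped.

```python
import ast
from pathlib import Path


def imports_package(source, package):
    """Return True if the module source imports `package` or a submodule of it."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are ignored.
            names = [node.module]
        else:
            continue
        if any(n == package or n.startswith(package + ".") for n in names):
            return True
    return False


def count_import_sites(root, package):
    """Count the .py files under `root` that import `package`."""
    count = 0
    for path in Path(root).rglob("*.py"):
        try:
            if imports_package(path.read_text(encoding="utf-8"), package):
                count += 1
        except (SyntaxError, UnicodeDecodeError):
            # Files that do not parse (for example, Python 2-only syntax) are skipped.
            continue
    return count


print(imports_package("import simplejson\n", "simplejson"))
print(imports_package("import simplejson2\n", "simplejson"))
```

Running `count_import_sites` once against a monolith and once against a single service gives a rough sense of the difference in migration effort described here.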

22:47 So we definitely want to move away from monolith,

22:50 and we have been doing that.

22:51 Like, compared to when I started at Yelp,

22:53 we have way, way less code that is important running in Yelp main.

22:57 There is still a ton that's important in there,

22:59 like I mentioned in my talk,

23:00 almost inevitably someone has to call into an internal API

23:04 to get data out of it.

23:05 So that's something that we definitely want to fix at some point,

23:08 but it is a process, and that process is, generally speaking,

23:12 we're getting to a point where some people who work at Yelp

23:15 don't really work in Yelp main anymore.

23:17 They just don't have to deal with it, especially not on a day-to-day basis.

23:20 Right, sure.

23:21 And you mentioned your talk, I don't know if I said this at the beginning,

23:24 but you gave a talk at PyCon 2022, which definitely was a very popular one

23:28 and highlights some of these things there as well.

23:30 So I'll be sure to link to that so people can check it out.

23:33 And you talk about people developing in Yelp main,

23:35 some of them not, but there's still a lot happening there.

23:38 You said 20 pushes a day, 800 simultaneous developers.

23:43 And yeah, that's no joke.

23:47 That's a lot of traffic on a repo.

23:49 - Yeah, I think since I did that talk

23:51 we've been trending down

23:53 in terms of number of changes per day,

23:55 but it's still probably like somewhere in the eight,

23:58 like 15 to 25 a day.

24:01 So it's less.

24:02 Like that's an appreciable percentage less, but it's still a lot of changes per day.

24:06 Yeah.

24:08 And you also said you have 700 Python package dependencies.

24:11 We talked about the private PyPI.

24:13 So when you say you have 700 dependencies, that's if I go into the virtual environment and type,

24:18 you know, pip list, I see 700 things.

24:20 Okay.

24:21 It's a lot.

24:21 It's a lot.

24:22 It was an ordeal dealing with that.

24:24 Especially coming from a long time ago until present, right?

24:28 In terms of code, code compatibility, right?

24:31 Some of those things you depended on,

24:33 maybe their new versions have moved to Python 3,

24:36 but maybe with breaking changes.

24:37 Others, they might just not have a Python 3 version.

24:41 And how'd you deal with that?

24:42 - There were basically, in terms of like open source stuff,

24:46 there were basically like three ways that we dealt with that.

24:48 So one is just like upgrade

24:51 and like deal with whatever the upgrade entails.

24:54 I don't think we really ran into any issues

24:56 where we were like, oh no,

24:57 we have to do this massive breaking change migration.

25:01 That wasn't really a problem that we ran into, thankfully.

25:04 So a lot of those were just sort of like

25:06 figuring out what packages need to be upgraded,

25:08 and just sort of doing the upgrade, making sure that the tests pass and that kind of stuff.

25:11 So that wasn't too bad.
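
One way to get a first pass at "figuring out what packages need to be upgraded" is to inspect the metadata of everything installed in the environment. The sketch below uses the standard library's `importlib.metadata` (Python 3.8 and later) and looks at trove classifiers. This is an assumption about how such a triage could be done, not a description of what Yelp did, and classifiers are self-declared and often missing, so packages without them are reported as "unknown" rather than as Python 2 only.

```python
from importlib.metadata import distributions

PY3_PREFIX = "Programming Language :: Python :: 3"


def supports_python3(classifiers):
    """Return True if any trove classifier declares Python 3 support."""
    return any(c.startswith(PY3_PREFIX) for c in classifiers)


def triage_installed():
    """Group installed distributions by whether they declare Python 3 support."""
    report = {"py3": [], "unknown": []}
    for dist in distributions():
        name = dist.metadata["Name"]
        if not name:
            # Skip distributions with broken or missing metadata.
            continue
        classifiers = dist.metadata.get_all("Classifier") or []
        bucket = "py3" if supports_python3(classifiers) else "unknown"
        report[bucket].append(name)
    return report


report = triage_installed()
print(len(report["py3"]), "declare Python 3 support")
print(len(report["unknown"]), "need a manual look")
```

The "unknown" bucket is where the manual work described in this part of the conversation would start: checking for a newer release, a maintained fork, or dead code that can be deleted.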

25:12 The other one, which was a little bit more annoying, was

25:16 like you said, some packages just stopped updating before

25:19 they got Python 3 support.

25:22 And we were relying on them, so we had to be like,

25:24 "Okay, well, can we replace these with something that, you know, fixes that?"

25:29 And there were a few examples of packages

25:32 where it sort of stopped getting development and then someone was like,

25:35 "Oh, I see where the problem lies.

25:38 That's a problem for me, so I'm going to fix that."

25:40 And so luckily, a lot of people had already done that work and they,

25:43 there were like forks or sort of drop-in replacements.

25:46 Sometimes not exactly drop-in replacements, but like, you know, close enough that we could like

25:50 do the small amount of work that was needed.

25:52 It's one of the advantages of being a little bit later to the party is

25:55 it lets other people bump into those problems and maybe they fixed them for you, right?

26:00 That probably happened most of the time, honestly.

26:01 That was definitely a good chunk of the time.

26:03 I couldn't tell you, I'd have to go back and run the numbers on what percentage of the time that was.

26:08 But we definitely, yeah, there was definitely a good chunk of things where we were just sort of like,

26:11 "Oh, someone already made a fork or whatever, and we can just use that."

26:14 And that was nice. It was, "Okay, that one checked off."

26:19 And then the final sort of grouping was stuff where that wasn't available.

26:25 So it was like, oh, this package is Python 2 only,

26:28 and no one ever made a replacement.

26:31 So we need to deal with that.

26:32 Luckily, none of those were in a position where we were completely

26:37 unable to deal with it.

26:38 Like, we didn't run into anything where we were like, oh, this is just like,

26:41 this is like a blocker.

26:42 But there were things where we were like, oh, this thing needs to be replaced

26:45 with something else that does something similar.

26:48 or maybe thrown away.

26:51 Very often we ran into code where it was like,

26:53 "Oh, this is using this thing."

26:55 And then you start looking into it and you're like,

26:57 "Oh, actually this code is like,

26:59 this branch or whatever that uses this package

27:02 isn't actually used anymore.

27:04 So we can just delete all that code

27:06 and not have to think about it."

27:08 So that's how we dealt with it.

27:09 Yeah, that's a nice way to upgrade it,

27:11 just get rid of it.

27:12 Were there any packages that were out there

27:15 that didn't have Python 3 support,

27:17 like, "We really depend on this one,"

27:20 that you upgraded and contributed back?

27:23 Or were you able to just move on?

27:24 There was nothing that we ran into that was like

27:27 an absolute blocker like that.

27:29 So we didn't end up contributing anything

27:32 in terms of open source, other than

27:35 there were some packages that are on our GitHub,

27:38 like the Yelp GitHub, that we did do upgrades for.

27:41 So that was the only sort of open source work

27:44 that I think we really ended up doing.

27:46 So luckily, I mean, I don't know if this is lucky or not,

27:49 but it's definitely, it happened

27:50 so that we didn't have to do that.

27:52 - Yeah, that's good.

27:53 I mean, it would be nice if you ran across that

27:55 and helped solve it for someone,

27:57 but if you don't have to, even better.

27:59 Testing, one of the challenges of,

28:01 well, first it's good to have tests,

28:03 but one of the challenges of these upgrades

28:05 is you wanted to do this without disrupting development.

28:09 You wanted to keep adding new features.

28:12 You didn't want to say, hey, everyone,

28:14 stop making any progress or bug fixes for six months

28:18 and we're all just gonna do this until we're done.

28:20 All right, you wanted to keep it moving.

28:21 But in order to do so, you gotta run the tests

28:24 'cause you're making wholesale changes

28:26 to millions of lines of code.

28:27 So that's pretty nerve wracking, right?

28:29 And you're swapping out its dependencies in big ways.

28:32 And yet running tests, you all have a lot of tests

28:35 and they take a while to run, right?

28:36 - Yeah, we have about 100,000 tests in Yelp main,

28:39 little under.

28:40 And yeah, if you were to run them serially,

28:43 at least when I wrote my talk, it was about 35 hours total.

28:46 But we have a test runner framework called Jolt

28:50 that we run internally.

28:51 And what it does is it basically like

28:53 puts those tests up into bundles

28:55 and then runs those across a bunch of machines.

28:58 And so you're basically able to get

29:00 all of the tests run for Yelp main

29:02 in about give or take an hour and a half.
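
The bundling idea described here can be sketched in a few lines of Python. Jolt is internal to Yelp and its actual algorithm is not described in the conversation, so the function name `make_bundles`, the greedy longest-first strategy, and the sample durations are all assumptions made for illustration.

```python
import heapq

def make_bundles(durations, n_workers):
    """Greedily assign tests to workers, longest test first.

    durations: dict mapping test name -> estimated seconds (hypothetical data).
    Returns one (total_seconds, [test names]) tuple per worker.
    """
    # Min-heap keyed on the time assigned so far; the worker index breaks ties.
    heap = [(0.0, i, []) for i in range(n_workers)]
    heapq.heapify(heap)
    for name, secs in sorted(durations.items(), key=lambda kv: -kv[1]):
        total, idx, names = heapq.heappop(heap)
        names.append(name)
        heapq.heappush(heap, (total + secs, idx, names))
    return [(total, names) for total, _, names in sorted(heap, key=lambda b: b[1])]

durations = {"test_a": 30, "test_b": 20, "test_c": 20, "test_d": 10}
bundles = make_bundles(durations, n_workers=2)
```

As a rough check on the numbers quoted: finishing about 35 hours of serial test time in about an hour and a half implies at least 35 / 1.5, or roughly 23, bundles running in parallel.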

29:05 - Okay, that's pretty good for running 100,000 tests.

29:07 That's still a long time to have a test run though, right?

29:10 So you probably need it.

29:11 You can't just get immediate feedback.

29:13 Minor change, how'd that go?

29:14 Minor change, how'd that go?

29:15 You gotta be a little more thoughtful than that, right?

29:17 - Yeah, I mean, I think that in terms of,

29:20 and this sort of gets into testing theory,

29:22 is that you start to get an idea of what changes

29:26 are like, affect what other things.

29:29 Sometimes you're not gonna have a perfect idea,

29:32 but if you're like, oh, this is a thing

29:35 that just affects everything,

29:36 then you're gonna run all the tests.

29:38 But we did have the ability to run tests

29:40 if we were like, okay, we want to just run tests under Python 3,

29:43 we could do like, oh, I'm just going to run this test module

29:47 under Python 3. I can do that.

29:48 And so if you were literally just like,

29:50 oh, I'm checking, I'm fixing this test under Python 3,

29:54 then you could just do that.

29:56 You could just be like, oh, I'm iterating very quickly

29:58 by changing the code and then running the test under Python 3.

30:01 And then, oh, it passes.

30:03 Okay, let me double check it passes under Python 2 as well.

30:06 And then you can commit that and then

30:09 put that into PR and then we do require,

30:12 so one of the things is we do require running all,

30:14 doing a full jolt run for every pull request to Yelp.

30:17 So in order to do, so you have to run that anyway,

30:21 but like, while you're waiting for that to run,

30:23 you can work on something else.

30:24 - Sure. - And you're really high confidence.

30:26 You run a couple, okay, that makes sense.

30:27 So run a couple local tests,

30:29 ten, a hundred, five hundred, whatever.

30:31 Once you're happy with that, then you put it as a PR and

30:34 CI figures out what happens.

30:36 - Yeah, and I think that ultimately,

30:39 when we were really early on,

30:40 and we were working on the really foundational stuff,

30:43 that that was causing the most issues,

30:45 that was the time when we were like,

30:47 oh, we really gotta run all the tests.

30:48 But once you get down to the nitty gritty,

30:50 pretty early on, actually,

30:52 you really don't need to think about

30:54 how it affects other things.

30:55 It's mostly just sort of like, yeah,

30:57 this module affects its own tests,

30:59 and that's pretty much it.

31:00 - Okay, yeah, I'm sure you get a feel for it over time.

31:02 You're like, these are the kinds of far-reaching changes,

31:05 And these are the kinds of things I can stay really focused on.

This portion of Talk Python to Me is brought to you by User Interviews. As a developer, how often do you find yourself talking back to products and services that you use? Sometimes it may be frustration over how it's working poorly: if they just did such and such, it would work better, and it's easy to do. Other times, it might be delight. Wow, they autofill that section for me? How do they even do that? Wonderful. Thanks.

31:34 While this verbalization might be great to get the thoughts out of your head, did you

31:39 know that you can earn money for your feedback on real products?

31:43 User Interviews connects researchers with professionals that want to participate in

31:46 research studies.

31:48 There is a high demand for developers to share their opinions on products being created for

31:53 developers.

31:54 Aside from the extra cash, you'll talk to people building products in your space.

31:58 You will not only learn about new tools being created, but you'll also shape the future

32:02 of the products that we all use.

32:04 It's completely free to sign up and you can apply to your first study in under five minutes.

32:09 The average study pays over $60.

32:12 However, many studies specifically interested in developers pay several hundreds of dollars

32:17 for a one-on-one interview.

32:19 Are you ready to earn extra income from sharing your expert opinion?

32:23 Head over to talkpython.fm/userinterviews to participate today.

32:28 The link is in your podcast player show notes.

32:30 Thank you to User Interviews for supporting the show.

32:32 The other requirement you said that you had was that any changes must be rollback safe.

32:40 Can you speak to that? I'm thinking like database migrations or that, right? What are you thinking here?

32:47 Yeah, I mean, it's, I think database migrations are a good example of that type of thing.

32:51 We didn't really run into a situation where we actually had to do any schema changes to databases.

32:57 although there was a thing where we had to do,

33:00 we had to make some changes to some data

33:02 such that it would be parsed properly under both

33:05 Python 2 and 3.

33:06 But yeah, what you always want to do is you want to say like,

33:09 "Okay, if I undo this later,

33:12 maybe like a week later, someone realizes,

33:14 'Oh, this change made a problem, has a problem,'

33:16 we don't want to be in this position where we say,

33:18 'Oh, we can't undo that.'

33:19 Something else, you know, it depends on it,

33:21 and we can't undo it."

33:23 And so that was like a main thing.

33:25 and it was just sort of like, don't do these things

33:27 where you're just sort of like,

33:28 oh, well, once we do this, we can't go back.

33:30 Like, no, don't do that.

33:31 If you need to like do some extra work

33:34 where you like build up scaffolding or whatever,

33:36 then like do that work instead.

33:38 And it might take a little bit longer in the long run,

33:42 but it makes us have less risk.

33:44 - Yeah, I'll save diving into this

33:46 for later in our conversation.

33:47 But one of the things that you were able to do

33:49 because of that is you were able to run

33:51 apps simultaneously in two and three

33:53 and use a URL reverse proxy like Nginx or something to say,

33:58 this part of the web app runs Python 3,

34:00 and this one over here is running Python 2,

34:02 and filter the traffic and switch it

34:05 based on how it's performing or behaving.

34:07 If it goes wrong, you can switch it back quick.

34:09 If you didn't have that compatibility,

34:11 it would be like, all right, today we pulled a switch,

34:14 chunk, and then you deal with the consequences

34:16 for how long Yelp is down, right?

34:18 So that's an interesting consequence of this idea

34:21 that it should be able to be rollbackable

34:25 is that you can actually run both versions

34:26 and then sort of migrate more cautiously.

34:29 You had a cool picture,

34:31 and let me put it on the screen for us here,

34:34 where you talked about the four different steps,

34:37 the phases and timelines,

34:40 and how much time you spent in there.

34:42 You wanna talk us through this?

34:43 - Yeah, sure.

34:44 So this is just sort of like,

34:45 if you wanna think about,

34:47 okay, you've got some Python 2 code,

34:50 and you want to get it to Python 3.

34:52 It's very easy to think about it in a sort of atomic way,

34:54 as you sort of like, "Oh, make it Python 3 compatible."

34:57 And it's like, okay, it makes sense on small stuff.

35:00 You know, if you're like, "Oh, I got my 500 lines back,

35:02 and I'm gonna migrate to Python 3 today," you know.

35:04 But on big stuff, when you're talking about millions of lines of code,

35:08 you want to think about it in terms of, in sort of level of compatibility.

35:12 And so the three levels that we had to deal with here

35:15 were parsability, which basically just means

35:18 if you try and run this module with Python 3,

35:22 will it fail with a syntax error or not?

35:24 - Okay. - And so that's the main thing.

35:28 And parsability, it turns out, is pretty easy to fix

35:30 because there were not a huge number of syntax changes,

35:33 and they're pretty easy to detect and fix in an automated way.
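
A parsability check of the kind described can be sketched with the built-in `compile`, which parses source without running it. This is an illustration only, not the tooling Yelp used; the helper name `is_parsable` and the two sample snippets are assumptions.

```python
def is_parsable(source, filename="<module>"):
    """Return True if `source` compiles under the interpreter running this check."""
    try:
        compile(source, filename, "exec")  # parses and compiles, does not execute
        return True
    except SyntaxError:
        return False

# The print statement is Python 2 only syntax.
py2_only = 'print "hello"\n'
# This form parses on both Python 2 and Python 3.
py2_and_py3 = 'from __future__ import print_function\nprint("hello")\n'
```

Run under Python 3, the first snippet is rejected and the second is accepted, which is the "does it fail with a syntax error or not" test applied to a single module.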

35:35 Yeah, did you use some tooling like PyUpgrade or any of those types of things?

35:40 PyUpgrade we used a little bit.

35:41 It's not super designed for this,

35:43 but there was one specific thing that was really nice about it,

35:46 which is that it could detect octal literals.

35:50 So if you put like zero and then a number,

35:53 that's an octal literal in Python 2,

35:55 that's not allowed in Python 3,

35:56 so you have to do zero, O, number.

35:59 It was able to detect those really easily and fix them, which was really nice.

36:04 >> Those things, it sounds like,

36:05 "Oh, well, that's not that much work or that much help."

36:07 But when you're doing it across millions of lines of code,

36:10 anything you can automate,

36:11 it's got to be really welcome, right?

36:13 >> There's a relatively common pattern.

36:15 The reason that I remember that one is there was a relatively common pattern where like people would

36:19 create like date time objects and then they would write

36:22 year, month, day. And if the month

36:24 or day was single digit, they would prefix it with a zero.

36:28 Which works in Python 2. They probably didn't mean to make it octal, but that's what they did.

36:33 And so it kind of worked.

36:36 And so that existed in a lot of places, and it was like a popular pattern.
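
The octal-literal rewrite can be illustrated with a deliberately naive regular expression. This is not how PyUpgrade works internally; the pattern and the helper name `fix_octal_literals` are assumptions, and the sketch does not skip string literals or comments.

```python
import re

# Matches old-style octal literals such as 0755 or 01,
# but not 0, 0o755, 0x1f, 0.5, or the fractional part of 10.07.
OLD_OCTAL = re.compile(r"(?<![\w.])0([0-7]+)(?![\w.])")

def fix_octal_literals(source):
    """Rewrite 0NNN as 0oNNN. Naive: strings and comments are not excluded."""
    return OLD_OCTAL.sub(r"0o\1", source)

before = "datetime.date(2014, 01, 07)"
after = fix_octal_literals(before)   # "datetime.date(2014, 0o1, 0o7)"
```

The zero-padded date pattern "kind of worked" in Python 2 because the octal values 01 through 07 equal the decimal values 1 through 7. A month or day written as 08 or 09 was already a syntax error in Python 2, since 8 and 9 are not octal digits.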

36:41 But yeah, so PyUpgrade was useful in that way.

36:43 It was useful later on when we were like,

36:45 like, blowing away all the six stuff,

36:47 because it's able to fix all those things automatically,

36:49 which is nice, or most of them.

36:51 Python Modernize was where a lot of,

36:54 most of our automation went,

36:55 because it could fix a lot of this stuff.

36:57 Yeah.

36:57 So that was parsability.

36:59 Importability is similar, is that

37:01 you try to import it, and then you say,

37:03 you say like, "Okay, this is

37:05 failing with an import error," like,

37:07 or something is making it fail to import, like,

37:10 usually running code at the top level.

37:12 And that was a little bit longer.

37:16 A lot of that was fixing standard lib imports,

37:19 making most of those use six shims.

37:22 If they changed, there's a six shim for that.

37:24 And then some of it was also upgrading the packages

37:28 so they could be used under Python 3 and imported.

37:31 But there was a little bit of top-level stuff

37:34 where it was like, "Oh, this top-level thing is

37:36 calling dict.iteritems or something, and you gotta fix that."
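
An importability check along these lines could look like the sketch below. It is an assumption about how such a check might be written, not the tool the team used; `check_importable` is a hypothetical helper name.

```python
import importlib

def check_importable(module_names):
    """Try to import each module; return {name: None or an error description}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, or anything raised by top-level code
            results[name] = "{}: {}".format(type(exc).__name__, exc)
    return results

# cPickle is a Python 2 standard library name; on Python 3 the import is expected to fail.
report = check_importable(["json", "cPickle"])
```

Catching `Exception` rather than only `ImportError` matters here, because a module can also fail to import when code at its top level raises, for example by calling `dict.iteritems`.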

37:40 Yeah, so that's, that probably gets maybe a little into the functional parity,

37:45 which if people look at your talk, they'll see there's a couple weeks of the parsability,

37:50 maybe a month or two of the importability, then a whole bunch of the functional parity.

37:55 And it reminds me when I was learning C++ way, way, way back.

37:59 And I got really excited because I finally got some complicated code to compile.

38:03 Not really knowing like, oh, no, no, no, no, no,

38:07 you're only at the beginning of figuring out what's wrong with this.

38:09 The compile is the part where it shows you what's wrong.

38:11 Now it's like the mystery tour.

38:13 And this is after that, right?

38:15 This is like kind of once you get past parsing and importing,

38:18 then you're into the how are they different behaviorally.

38:21 Yeah, and this is, it's just sort of like,

38:23 I alluded to this earlier, but basically the idea of

38:27 you run all of your tests.

38:29 And luckily we had already built up a lot of infrastructure

38:32 that was really useful to us.

38:33 So one of the things that Jolt was able to do

38:36 is it was able to do some normalization of tracebacks,

38:40 and then be like, "Oh, these tracebacks are similar enough

38:44 that I'm going to group them together as a single error."

38:47 And say, "Oh, this many tests are failing with this traceback."

38:51 That was really useful because we were able to be like,

38:55 "Okay, here's where the error is,

38:57 and it's going to fix this many tests."

38:59 Or at least unblock this many tests.
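
The traceback grouping can be approximated as follows. Jolt's real normalization rules are not given in the conversation, so the two substitutions, the function names, and the sample tracebacks here are assumptions chosen for illustration.

```python
import re
from collections import Counter

def normalize_traceback(tb_text):
    """Strip details that vary between otherwise identical failures."""
    tb_text = re.sub(r"line \d+", "line N", tb_text)        # line numbers
    tb_text = re.sub(r"0x[0-9a-fA-F]+", "0xADDR", tb_text)  # object addresses
    return tb_text.strip()

def group_failures(tracebacks):
    """Count failing tests per normalized traceback, most common first."""
    return Counter(normalize_traceback(tb) for tb in tracebacks).most_common()

# Hypothetical failures collected from a test run.
failures = [
    'File "cache.py", line 10, in get\nTypeError: a bytes-like object is required, not \'str\'',
    'File "cache.py", line 12, in get\nTypeError: a bytes-like object is required, not \'str\'',
    'File "views.py", line 7, in render\nAttributeError: \'dict\' object has no attribute \'iteritems\'',
]
groups = group_failures(failures)
```

Sorting by count puts the failure that blocks the most tests at the top, which matches the "it's going to fix this many tests" prioritization described.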

39:01 So that was about a year of basically going through all of those test failures

39:06 and figuring out, okay, why do they fail under Python 3?

39:10 And just fixing them.

39:12 So a lot of it was like, oh, this thing's supposed to be a string,

39:16 but it's bytes or vice versa.

39:19 They're calling .items, but that used to be a list and it's not a list anymore.

39:24 Yeah, so you can't index it.

39:25 Yeah, so there's all sorts of nitty gritty things

39:28 that you just have to go through and fix.

39:31 Some of them are automatable,

39:34 but not everything is, and some of them are more subtle.
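
Two of the behavioral differences mentioned, text versus bytes and `dict.items()` no longer being a list, can be shown in a short Python 3 sketch. The variable names and sample values are illustrative.

```python
# Text vs. bytes: Python 3 keeps them apart, so conversions must be explicit.
raw = b"caf\xc3\xa9"          # bytes, e.g. read from a socket or a cache
text = raw.decode("utf-8")    # str
assert text.encode("utf-8") == raw

# dict.items() returns a view in Python 3, not a list, so it cannot be indexed.
counts = {"reviews": 3}
try:
    counts.items()[0]
    indexable = True
except TypeError:
    indexable = False

first_item = list(counts.items())[0]  # this form works on both Python 2 and 3
```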

39:37 What was your target Python 3 version?

39:39 So we originally targeted 3.6.

39:42 At the time, it was the newest version.

39:45 When we started the project, it was like the newest version that we had available.

39:48 That was, that we were like, we're like, you know, we're sort of ready for, if you will.

39:53 Yeah.

39:53 During the project, because obviously a long project, we were able to get 3.7 available.

39:58 And it was actually really great because we were like, I don't know,

40:01 less than a month out from when we were like, oh, we're going to start doing the rollout.

40:05 And my coworker, Chris, who wrote dumb-pypi,

40:07 was working on this project at the time. And he was like,

40:10 "You know what, I bet we could migrate this to Python 3.7."

40:13 And I'm like, "Go for it. Let's see how hard it is."

40:16 And he did it in like a day.

40:17 So it was just like, oh, he just upgraded it.

40:19 It was like, I think there were like maybe a few...

40:22 3.7 does have like that one backwards incompatibility

40:25 where it makes async a keyword.

40:26 So there were a few packages where he needed to upgrade,

40:29 but he was able to do it really quickly,

40:31 and we were just like, "Okay, and now we're going to roll out to 3.7."

40:34 So that was nice that it was sort of like we were working on 3.6 for most of it.

40:38 We switched to 3.7 near the end, and it just sort of worked.
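
The 3.7 incompatibility mentioned here is that `async` and `await` became reserved keywords, so code using `async` as an identifier stops parsing. A small sketch, with a made-up function `send` as the example:

```python
import ast

def parses(source):
    """Return True if `source` parses under the running interpreter."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# Valid through Python 3.6, a syntax error on Python 3.7 and newer.
uses_async_as_name = "def send(payload, async=False):\n    pass\n"
# Renaming the parameter is the usual fix; `is_async` is an arbitrary choice.
renamed = "def send(payload, is_async=False):\n    pass\n"
```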

40:41 It sets the foundation for going to the next version after that, right?

40:44 It's actually really weird.

40:45 Chris is working on that again.

40:47 He's going to be trying to upgrade us to 3.8 this week, basically.

40:51 Okay, cool. That's really excellent.

40:54 What was the emotional state of you and the team

40:58 as you were going through that year of fixing?

41:00 No, it's list of dict.items, not list.items

41:04 or dictionary.items.

41:05 And probably excited in the beginning,

41:07 but six months, then what was that like?

41:10 Were we making progress or like,

41:11 "Oh God, it's still here, we're not done."

41:13 I knew what it was going to be like going into it.

41:15 Like I was like, I mean, not exactly,

41:17 but like I was like,

41:18 "I know that it's going to be this thing

41:19 where it's like, we're going to make some progress

41:21 and then it's going to taper off

41:22 because of the way that these things work.

41:25 But it was definitely like,

41:27 it was sort of like you were just sort of doing your tasks every day,

41:30 and each task in and of itself was not valuable, right?

41:33 It was sort of like, "Oh, well, I fixed three tests today," you know, kind of thing.

41:37 But ultimately, I was able to see where the end was.

41:42 So for me, I was like, "Yeah, we're going to do this. We're going to do it."

41:45 I think not everyone on my team was necessarily as, sort of, eyes on the prize as I was,

41:51 which is fine. I think we ended up swapping out...

41:54 Basically, everyone on the team ended up working on it at one point or another,

41:57 but it was only me and another one of my colleagues

42:00 who worked on it basically the whole time.

42:03 So I think part of it was that some people

42:06 were like, "Okay, I'll work on that a little bit,

42:09 but I don't want to only work on that." And that's totally understandable.

42:12 I think that this type of work is kind of tedious.

42:14 And this is sort of like...

42:16 This is another argument against monoliths.

42:18 It's sort of saying, if you have to do this

42:21 when you need to do these separate migrations,

42:23 it becomes really punishing on software engineers.

42:25 >> If you haven't done linting ever,

42:28 and then five years into it,

42:29 you're like, "Oh, let's see what's wrong with it."

42:31 You run into a hundred thousand errors,

42:33 like, "You know what? We're not doing that.

42:34 We're just going to ignore those.

42:35 Let's just stay." Because you can't just stop and go do 100,000 fixes.

42:40 There's more value on the other side of this.

42:42 So it makes a lot of sense.

42:44 But it must have felt pretty good to get it all done though.

42:47 >> It really did. I mean, it's weird how

42:50 these projects work, in that you're doing the work, you're doing the work, and then one day you're just sort of like, "And we're done." And it's been a year and a half of my life, you know, but it's exciting. I had multiple people tell me, "Hey, it's so cool that we were able to do that, because I never thought it would happen." Yeah, it's kind of amazing to do something that some people are like, "This won't ever happen." But it did happen, and we made it happen. And I think that was, you know, really great.

43:20 Let's talk a little bit about how you were able to run this on Python 2 and 3.

43:25 What did you do? Did you basically create two virtual environments, one for each version,

43:28 and then run tests there, try it out there?

43:34 Yep, that's basically it. I mean, there is a technique to having like code that runs under Python 2 and 3,

43:40 which is that, you know, you basically have to make sure that you're using compatibility layers,

43:45 and we use six for that.

43:47 that was something that me and most of the people on my team had some pretty significant experience doing,

43:54 because that's basically how we wrote all of our libraries, our internal libraries,

43:58 and actually a lot of our open source ones as well,

44:01 because for a long time there, you were like, "Okay, I want to have Python 2 and 3 compatibility."

44:06 So having code that works under both was pretty normal.

44:09 Making sure that that code can run under Python 2 and 3,

44:13 and then building the virtualenv.

44:15 There was a little bit of nuance, or like, there was a snag there,

44:18 which is that, like, something that comes up every once in a while when you're doing this kind of stuff,

44:23 and this is still a thing that happens to this day, is you have to deal with backport packages.

44:28 So there's like the futures backport, which was

44:32 concurrent.futures, which was added in, it might have been 3.0.

44:36 I don't remember exactly what version of Python 3 it was added in. And there were like a few other backports:

44:42 functools32 is a backport for adding some of the stuff in Python 3.2's functools, like lru_cache,

44:50 which is something we used a lot of. So those were both packages where we needed to actually

44:55 install them in Python 2, because there are packages somewhere in our dependency tree that need

45:01 them. So what we ended up doing is we made this silly little script that just took our requirements

45:08 and then filtered out the things that don't install under Python 3

45:13 and just spit out a new one, and that's the one we built our Python 3 virtualenv with.

45:17 And so then, now we have Python 2 and Python 3 virtualenvs,

45:20 and they're like really similar, not exactly the same, but close enough,

45:24 and then we can run the test against either one of them.
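
The "silly little script" is not shown in the talk, but a minimal version might look like this. The function name, the skip list (limited to the two backports named above), and the pinned versions in the sample input are all assumptions.

```python
import re

# Backports named in the conversation; a real list would likely be longer.
PY2_ONLY = {"futures", "functools32"}

def filter_requirements(lines, skip=PY2_ONLY):
    """Drop requirement lines whose package name is in `skip`."""
    kept = []
    for line in lines:
        # The package name is everything before a version specifier, extra, or marker.
        name = re.split(r"[=<>!~\[; ]", line.strip(), maxsplit=1)[0].lower()
        if name not in skip:
            kept.append(line)
    return kept

# Illustrative pins, not Yelp's actual requirements.
py2_requirements = ["six==1.12.0", "futures==3.2.0", "functools32==3.2.3.post2", "requests>=2.0"]
py3_requirements = filter_requirements(py2_requirements)
```

Generating the Python 3 list from the Python 2 one keeps a single source of truth, so the two environments stay "really similar" without being maintained by hand.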

45:27 And then eventually we would do the rollout.

45:30 It sounds like one of the challenges had to do with caching,

45:33 And you have a way in which you were using pickle

45:37 to stuff some results into memcached.

45:40 Is it memcached D or memcached like past tense?

45:44 I never know how to pronounce that one right.

45:46 - I looked at their website like a year ago

45:49 or like I guess two years ago or something

45:51 when I was actually working on this.

45:53 And I'm sure that I know, I'm sure that I read it

45:56 'cause it said it there I remember

45:58 but I don't remember what the answer is.

45:59 - Yeah, no worries.

46:00 So let's go with memcached, I'll call it memcached.

46:02 So you were, previously you were pickling things.

46:06 You were CPickling, but then that just became pickling.

46:09 But at some point, it's one thing to say

46:11 at the database query level,

46:13 well, deserialize and serialize an ORM object

46:16 to match the schema.

46:17 It's a whole nother to say the binary shape of this thing

46:20 is the same across Python versions,

46:22 which is highly unlikely, right?

46:24 Which is pickle.

46:24 - It's basically impossible.

46:26 That was like the, yeah,

46:27 that was one of the big problems that we had.

46:28 So we were basically taking,

46:29 we were basically like, so we would pickle,

46:31 There's a cache key and then there's a cache value.

46:33 We'd pickle both, and then the cache key we would like hash

46:37 so that it could be a specific binary sequence.

46:41 And then we'd key into that

46:45 in order to get stuff out of the cache.

46:49 But it turns out that for a multitude of reasons,

46:53 both the key, like you said, the key is not going to be binary the same.

46:56 So that's one of the problems.

46:57 And the other problem is that there's a lot of weirdnesses

47:00 when you end up like either reading Python 2 Pickles in Python 3,

47:06 or reading Python 3 Pickles in Python 2.

47:08 It seems like Pickles are kind of meant to be transient.

47:11 They're not meant to be long-term storage

47:13 because there's not a lot of guarantees around their parsability.

47:17 We were like, okay, well, what's a thing that we can do

47:20 where we don't have to start being like,

47:21 okay, now we have to write complicated serialization and stuff.

47:26 And we were like, well, probably JSON. JSON will work.

47:29 And so this is something I worked on for about three months or something,

47:33 was just migrating all of our caches to use JSON instead of Pickles.
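
Why a hashed pickle makes a fragile cache key, and why a canonical JSON encoding is more stable, can be sketched as follows. The hash function, the key layout, and the helper names are assumptions; the two pickle protocols stand in for the byte-level differences between interpreters.

```python
import hashlib
import json
import pickle

def pickle_key(obj, protocol):
    """Hash the pickled bytes of `obj`, in the style of the old key scheme."""
    return hashlib.sha256(pickle.dumps(obj, protocol=protocol)).hexdigest()

def json_key(obj):
    """Hash a canonical JSON encoding: sorted keys, fixed separators, ASCII only."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("ascii")
    return hashlib.sha256(encoded).hexdigest()

key_parts = {"business_id": 42, "locale": "en_US"}

# The same object pickled with two protocols gives different bytes, so different keys.
keys_differ = pickle_key(key_parts, 2) != pickle_key(key_parts, 3)
# Key order does not affect the JSON-based key.
keys_match = json_key(key_parts) == json_key({"locale": "en_US", "business_id": 42})
```

If the two interpreters compute different keys for the same logical lookup, each one effectively sees an empty cache, which is the problem being described.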

47:38 Yeah, you had this kind of fallback mechanism or this slow upgrade mechanism

47:43 that said, try to get the JSON version from Memcache.

47:46 And if you got it, awesome, go with that.

47:49 But then fall back and try to get the binary Pickle,

47:52 but then immediately replace it with the JSON version

47:55 so that it just grows over time.
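
The fallback read path described here can be sketched with a plain dict standing in for memcached. The key prefixes and the function name are hypothetical, and a real client would use get and set calls rather than dict access.

```python
import json
import pickle

def cache_get(cache, key):
    """Read with gradual migration; `cache` is any dict-like store."""
    json_key, pickle_key = "json:" + key, "pickle:" + key
    if json_key in cache:
        return json.loads(cache[json_key])
    if pickle_key in cache:
        value = pickle.loads(cache[pickle_key])
        cache[json_key] = json.dumps(value)  # backfill so the next read hits JSON
        return value
    return None  # cache miss

fake_memcached = {"pickle:biz:1": pickle.dumps({"stars": 5})}
first = cache_get(fake_memcached, "biz:1")   # served from the pickle entry
second = cache_get(fake_memcached, "biz:1")  # served from the JSON entry
```

Because each pickle hit writes a JSON copy, the JSON side of the cache warms up under normal traffic and nothing has to be flushed at cutover.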

47:57 I mean, thinking about that much code and that many services,

48:00 there must just be a ton of startup cost

48:03 if you just kick all the servers over and clean the cache.

48:05 We've never tried it.

48:07 I think everyone's a little bit too scared.

48:09 But it's definitely not something we wanted to do.

48:11 And we wanted to be able to be like,

48:13 okay, if we're, when we cut over to Python 3,

48:15 we're not just going to lose all of our caches.

48:16 I think this is actually a really great example

48:18 of something we were discussing before the recording started

48:21 of doing a sort of incremental upgrade.

48:24 And one of the other things I didn't super get into

48:29 in my talk is that,

48:31 one of the things that I felt was a really cool technique,

48:34 and this really depends on whether or not this is worth it,

48:36 depends on like how you end up,

48:38 what the value of your like uptime is basically

48:41 compared to your dev time.

48:42 But what I did is I sort of logged,

48:46 what I would do is I would like,

48:47 for every cache, I'd be like, okay, I'm gonna try to log this to JSON.

48:50 And then if it failed, I wouldn't just fail,

48:52 I'd do the normal stuff, I'd do all the pickle stuff, whatever,

48:55 but then I'd log it somewhere.

48:57 And so that way I could just look at this log and be like,

48:59 "Oh, here's where my errors are."

49:01 So it wasn't just like, "Oh, I would like ship changes

49:05 and then see if there were actual errors in production."

49:08 It's like, there are no errors in production,

49:10 there's just errors in this log that I can fix and iterate on,

49:15 and no user ever sees a 500.

49:19 Not everything's going to fit into that,

49:20 But I think that's a really useful technique.
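
The "try it, log the failure, carry on with the old path" technique might look like this in outline. The logger name and the function are assumptions, not code from the talk.

```python
import json
import logging
import pickle

log = logging.getLogger("cache_migration")  # hypothetical logger name

def serialize_for_cache(value):
    """Keep using pickle, but record values that JSON could not handle yet."""
    try:
        json.dumps(value)  # result is discarded; only success or failure matters
    except (TypeError, ValueError) as exc:
        log.warning("not JSON-serializable: %s (%s)", type(value).__name__, exc)
    return pickle.dumps(value)

ok = serialize_for_cache({"stars": 4.5})
# Sets are not valid JSON, so this call logs a warning but still succeeds.
also_ok = serialize_for_cache({"tags": {"pizza", "vegan"}})
```

The caller always gets a usable pickle back, so incompatibilities show up in the log for developers to fix rather than as errors for users.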

49:23 Yeah, that is really cool because

49:24 no matter how much testing you do on something this big,

49:27 it's not until you really put it out there

49:29 that you're 100% sure it's going to hang together.

49:31 But if it can fail silently in a way that people don't see,

49:36 but you get notified about this and can start working on it,

49:39 that's really valuable.

49:41 Another thing that you did that I thought was pretty clever

49:43 was the way that you did the rollouts

49:46 where you were able to say,

49:48 Even though this is one huge monolith of code, it doesn't mean it breaks evenly.

49:54 Right? Once you get it past the parsability stage, there could be some URL

49:58 endpoint that's going to fail if you request it.

50:00 And another that works totally fine.

50:01 All right.

50:02 So what you were, what you all did is you created a reverse proxy.

50:06 And I was imagining Nginx.

50:08 What were you actually using here?

50:10 So it's kind of Nginx.

50:11 It's OpenResty, which is a framework where you can write

50:16 plugins for Nginx in Lua.

50:20 So you can do some sort of general logic in that.

50:24 - So you're basically able to say,

50:26 when you go to yelp.com/something

50:28 or api.yelp.com or whatever it is,

50:30 as far as a user, it's the same.

50:32 But some of those URLs are hitting the Python 3 version

50:36 of this large monolith app running,

50:38 and some are hitting the Python 2.

50:39 and you could move it URL by URL, an endpoint at a time, right?

50:44 Yeah.

50:45 Talk to us about that. That's pretty clever.

50:46 Yeah, this was a super cool technique.

50:49 So we already had the reverse proxy layer, we had the routing service.

50:53 This is something that we had built for just sort of consolidating a bunch of logic

50:57 in a general place where like everything could rely on it.

51:01 But it was a really great place for us to be able to put this logic as well.

51:06 And I'm going to say him again, Chris Kuehl, my colleague on my team,

51:12 came up with this idea as well.

51:14 So it's such a great idea, and it applies, I think, really generally.

51:19 Like, you can just sort of say, "Okay, anytime I'm doing some sort of rollout

51:22 where the setup is in such a way that I can't do it within my application,

51:27 like there's something about the application setup,

51:29 if you have this external layer, then you can pretty easily do it."

51:34 And yeah, it was basically just sort of like,

51:37 we would have a configuration and it would say,

51:39 "Okay, this endpoint prefix

51:43 would go to Python 2, or this one would go to Python 3."

51:47 And we could actually even be a little bit more granular than that.

51:50 We could actually give it a percentage of the time.

51:52 So basically, 20% of the time it goes to Python 2,

51:55 80% of the time it goes to Python 3.

51:57 And so we could do these sort of slower rollouts

52:00 if people wanted to be more careful.
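
The routing rule, a prefix match plus a traffic percentage, can be sketched in Python. The real logic ran in OpenResty as Lua; the configuration format, the prefixes, and the backend names below are assumptions.

```python
import random

# Hypothetical config: URL prefix -> share of traffic sent to the Python 3 pool.
ROLLOUT = {"/biz/": 1.0, "/search": 0.2}

def choose_backend(path, rollout=ROLLOUT, rng=random.random):
    """Pick a backend pool for one request; unlisted prefixes stay on Python 2."""
    # Longest matching prefix wins, so specific rules override general ones.
    matches = [prefix for prefix in rollout if path.startswith(prefix)]
    if not matches:
        return "py2"
    share = rollout[max(matches, key=len)]
    return "py3" if rng() < share else "py2"
```

Rolling an endpoint back is then a configuration change, setting its share to 0.0, rather than a deploy.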

52:02 I see.

52:02 So maybe it goes like, it's on Python 2.

52:04 Now 1% of the traffic goes to Python 3.

52:07 Is it dying? No, it seems okay.

52:09 It seems okay.

52:10 All right.

52:10 Now 20, now 80. Like, you could slowly move it over.

52:14 So if it fails, at least it fails just for a few people.

52:18 And you don't even roll it back.

52:20 You just stop sending traffic there and fix it, which is really good.

52:24 Yeah.

52:24 Yeah.

52:24 Very clever.

52:25 And it certainly makes sense for large projects, but what's great

52:28 is that it lets you start getting your Python 3 version in production

52:31 way earlier, right? You're not waiting on the last endpoint.

52:34 You just need the first endpoint. I mean, probably you didn't do this very,

52:37 like, one URL works, put it out there, but like,

52:39 you could do it much sooner than you would otherwise, right?

52:42 For various sort of practical reasons,

52:46 we didn't want to actually start the rollout until we were like,

52:49 "Oh, all of the tests pass under Python 3."

52:52 Because we didn't want people to be like,

52:54 "Oh, I'm running my tests and they're not passing and that's bad,

52:58 and I'm either going to ignore them

53:00 or try to fix them in a bad way," and stuff like that.

53:03 But, I mean, it was like a two-month process.

53:05 From the first endpoint

53:07 to the last endpoint, it was like two months.

53:09 And so that was really nice

53:11 because we would detect issues,

53:14 but we would keep rolling out other stuff,

53:18 and then, you know, the teams

53:19 or we could try and fix the issues.

53:22 - Very neat, that's great.

53:24 All right, let's wrap this up.

53:25 We're getting short on time here.

53:27 You had some clear benefits,

53:29 even though you went to Python 3.6,

53:31 which I think you'll see those benefits again

53:34 if you go to 3.11, right?

53:36 But even so, where'd you go from? 2.7 to 3.6?

53:40 - We went from 2.7 to 3.7.

53:42 - 2.7 to 3.7, right on, okay, cool.

53:44 And you said it got faster and used less memory.

53:47 That's pretty good.

53:48 - I don't remember the exact numbers.

53:50 It was in my talk.

53:50 - I stole them from your talk.

53:51 15 to 20% speed up and 20% less memory.

53:54 That's tangible.

53:56 That's right. Yeah. This is something I didn't mention in my talk, but I think it's kind of interesting. We have some stuff that is more CPU heavy, which we send to what we call VIP instances. The VIP containers have more memory and more CPU allocated to them.

54:19 And so I remember I talked to someone

54:21 who was involved in doing a lot of that sort of operational stuff.

54:26 And after that migration,

54:28 they looked at numbers and they were like, "Oh, we can now

54:31 scale down the VIP to what the old normal one was."

54:36 And the normal one is now scaled down even more.

54:38 Oh, that's cool.

54:38 Yeah, which was super cool. So it was super good to do that.

54:42 And I think that beyond just sort of like,

54:45 "Oh, this gave us this immediate,

54:48 or this gave us this outcome," it's like, we weren't

54:51 going for this outcome, but I think it really shows how

54:53 this type of work, if you're like,

54:56 I think very often it's easy to look at

54:58 base level infrastructure work as like, oh, well,

55:00 it's just maintenance and you just need to do it and blah, blah, blah, and

55:04 it's not really benefiting anything. And it's like,

55:06 no, we did this and it saved money on our bottom line.

55:10 And not necessarily everything is going to be like that, but I think that

55:13 thinking about base level infrastructure is like,

55:16 it does have a benefit.

55:18 It might not necessarily be obvious

55:20 before you do it,

55:22 but this is an example of

55:24 okay, if you're doing your upgrades,

55:26 you get to take advantage of all the really hard work

55:29 that all of the people on the Python language team have done

55:31 to make it more efficient and faster.

55:35 Yep, and it probably opens up other possibilities.

55:38 The previous show I just did, which is

55:39 not out yet, so you wouldn't know, but I was talking about Ruff,

55:42 the linter written in Rust for Python.

55:45 But you have this ability to integrate with more modern tools and modern languages.

55:49 It's like, oh, if we've got to rewrite this section in Rust for that computation,

55:53 it's trivial now, where probably it wasn't before,

55:56 I would imagine. I haven't tried integrating Python 2.7 with things like that,

56:01 but I bet it's not as easy as the new tools.

56:04 >> Yeah, for sure.

56:05 >> All right. Well, let's leave it here.

56:06 I think that's all the time we got to talk about it.

56:09 But you must be enjoying it,

56:11 enjoying work on the projects and the features more now.

56:14 You can just... the world is your oyster again.

56:17 - Yeah, I love using,

56:19 the project that I've been working on actually lately

56:21 is we've been adding a lot of type annotations internally.

56:24 That's a Python 3 feature right there.

56:27 - Yeah, absolutely.

56:27 You can use f-strings, you can use type annotations,

56:30 you can start using tools like mypy,

56:33 not just standard type annotations, just for editors,

56:35 but yeah, Pydantic, for example, all those things, right?

56:40 Very cool.
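
As a small illustration of the two Python 3 features mentioned here, type annotations and f-strings, the sketch below uses made-up function names and data. It sticks to `typing.List` and `typing.Optional` so it also runs on Python 3.7, the version discussed in the episode.

```python
from typing import List, Optional


def average_rating(ratings: List[float]) -> Optional[float]:
    """Return the mean rating, or None when there are no ratings."""
    if not ratings:
        return None
    return sum(ratings) / len(ratings)


def describe(name: str, ratings: List[float]) -> str:
    """Build a one-line summary using an f-string."""
    avg = average_rating(ratings)
    if avg is None:
        return f"{name}: no ratings yet"
    return f"{name}: {avg:.1f} stars from {len(ratings)} reviews"
```

A checker such as mypy can use the `Optional[float]` return type to flag callers that forget the `None` case.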

56:41 All right, now before you get out of here,

56:42 I've got two questions,

56:43 as always at the end of the show.

56:46 If you're gonna write some Python code,

56:47 what editor are you using these days?

56:48 - I'm a Vim person.

56:50 Do all my development in Vim.

56:52 - Right on.

56:52 And then notable PyPI package.

56:55 You gave a shout out to Python Modernize,

56:57 but anything that stands out

56:58 you wanna give a shout out to like that?

57:00 - I mentioned Python Modernize,

57:01 great, excellent tool for what it is.

57:04 I also give a shout out in my,

57:06 I give a shout out to a couple other things in my talk,

57:09 but I think they're definitely worth

57:10 giving a shout out to still,

57:11 which is PyUpgrade, which we mentioned earlier.

57:15 It has a lot of really nice features,

57:17 but one of the other things is that it's the other half of Python Modernize:

57:22 it can take your six-shim-filled code

57:25 and turn it into normal Python 3 code.
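
To make the idea concrete, here is a hedged before-and-after sketch of the kind of cleanup described: code written against the six compatibility library, and the plain Python 3 equivalent. The class and method names are invented, and the exact output of any given tool depends on its version and flags, so treat this as an illustration and not as literal tool output.

```python
# Before: Python 2/3 compatible code written against six (shown as comments
# so this block runs without six installed).
#
#   import six
#
#   class Greeter(object):
#       def greet(self, names):
#           for key, value in six.iteritems(names):
#               if isinstance(value, six.text_type):
#                   yield "{}: {}".format(key, value)

# After: the same logic as plain Python 3, with no compatibility shims.
class Greeter:
    def greet(self, names):
        for key, value in names.items():
            if isinstance(value, str):
                yield f"{key}: {value}"
```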

57:28 And then another tool by a former colleague of mine, Anthony Sottile,

57:33 is pre-commit.

57:35 Modernize is a pre-commit hook, PyUpgrade is a pre-commit hook.

57:38 That's a thing that we use extensively internally.

57:40 It's super, super nice to be able to do all those sorts of things and do it in an incremental way, which is something that was really valuable.

57:48 And then the last one, this is just a completely random one, but I just love it, is

57:53 more-itertools, one of my favorite packages. We have an internal package

57:58 that has a lot of the functions that are in more-itertools, but they're worse,

58:02 or quirky in some way that I don't like.

58:06 And so that's something that was like,

58:08 when I first found out about it, I was like, "Oh man, this is great."

58:10 And I think it's pretty popular now, but like,

58:13 I think it's one of those things where it's just sort of like,

58:15 "Oh, there's all these little functions that you're like,

58:17 'Oh, I could write that,' and it's like, you could, but you'd probably write it

58:20 with bad edge cases or something."

58:22 It's just, it's a great library.
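
A small illustration of the "bad edge cases" point, using only the standard library so it runs anywhere. `chunked_naive` is a hypothetical quick hand-rolled helper; `chunked` is a more careful version written for this example. more-itertools ships its own `chunked` function that covers the same ground.

```python
from itertools import islice


def chunked_naive(items, n):
    # Relies on len() and slicing, so it only works on sequences such as lists.
    return [items[i:i + n] for i in range(0, len(items), n)]


def chunked(iterable, n):
    """Yield lists of up to n items from any iterable, including generators."""
    if n < 1:
        raise ValueError("n must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return
        yield chunk
```

The naive version raises `TypeError` as soon as it is handed a generator, which is exactly the sort of gap that is easy to miss when writing these helpers by hand.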

58:24 And it's better if you don't have to write it, that's for sure.

58:26 - Yes. - All right.

58:27 Well, Ben, thanks for being on the show.

58:29 Final call to action, maybe some other people are out there facing

58:32 this transformation they gotta do.

58:34 Like I said, not necessarily Python 2 to 3, but

58:36 moving from some major foundation in their code base to another.

58:39 What do you tell them?

58:40 Figure out a way to make it incremental.

58:42 That's really, I think, the main takeaway for me is that

58:45 incremental changes

58:47 have multiple benefits.

58:49 They make the change less risky. You're able to

58:52 do these types of changes

58:54 in a way where you don't necessarily have to be like,

58:57 "Oh, we have to schedule two years of work."

58:58 It's like, no, you can do it a chunk at a time

59:01 when you have time.

59:03 And also it just generally makes you have fewer errors.

59:08 Absolutely.

59:09 Alright, well, thanks so much for being here.

59:11 It's been great to have you on the show. Appreciate it.

59:13 Thanks so much for having me.

59:15 This has been another episode of Talk Python to Me.

59:18 Thank you to our sponsors.

59:20 Be sure to check out what they're offering. It really helps support the show.

59:23 Join Cox Automotive and use your technical skills

59:26 to transform the way the world buys, sells, and owns cars.

59:30 Find an exciting position that's right for you at talkpython.fm/cox.

59:35 Earn extra income from sharing your software development opinion at User Interviews.

59:40 Head over to talkpython.fm/userinterviews to participate today.

59:45 Want to level up your Python?

59:47 We have one of the largest catalogs of Python video courses over at Talk Python.

59:51 Our content ranges from true beginners to deeply advanced topics like memory and async.

59:56 And best of all, there's not a subscription in sight.

59:59 Check it out for yourself at training.talkpython.fm.

01:00:02 Be sure to subscribe to the show, open your favorite podcast app,

01:00:06 and search for Python. We should be right at the top.

01:00:08 You can also find the iTunes feed at /iTunes,

01:00:11 the Google Play feed at /play,

01:00:13 and the Direct RSS feed at /rss on talkpython.fm.

01:00:18 We're live streaming most of our recordings these days.

01:00:21 If you want to be part of the show and have your comments featured on the air,

01:00:24 be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

01:00:29 This is your host, Michael Kennedy.

01:00:31 Thanks so much for listening.

01:00:32 I really appreciate it.

01:00:33 Now get out there and write some Python code.

01:00:35 (upbeat music)
