
#415: Future of Pydantic and FastAPI Transcript

Recorded on Sunday, Apr 23, 2023.

00:00 The release of Pydantic 2.0, its partial rewrite in Rust, and its refactoring into Pydantic Core and top-level Pydantic in Python is big news.

00:09 In fact, the alpha of Pydantic 2 was just released.

00:12 Of course, these changes will have potentially wide-ranging and positive effects on the libraries that are built upon Pydantic, such as FastAPI, Beanie, and others.

00:22 That's why this chance I had to catch up with Samuel Colvin from Pydantic and Sebastian Ramirez from FastAPI together live from PyCon 2023 was really timely. It's a super fun and wide ranging interview I'm sure you'll enjoy. Plus, there's a bit of an Easter egg in the middle. This is Talk Python to Me episode 415 recorded on location at PyCon in Salt Lake City on April 23rd, 2023.

00:50 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy.

01:05 Follow me on Mastodon, where I'm @mkennedy and follow the podcast using @talkpython, both on fosstodon.org.

01:12 Be careful with impersonating accounts on other instances.

01:15 There are many.

01:16 Keep up with the show and listen to over seven years of past episodes at talkpython.fm.

01:20 We've started streaming most of our episodes live on YouTube.

01:24 Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.

01:31 This episode is sponsored by Sentry.

01:34 Don't let those errors go unnoticed.

01:36 Use Sentry.

01:37 Get started today at talkpython.fm/sentry.

01:40 And it's brought to you by InfluxDB.

01:42 InfluxDB is a database purpose-built for handling time series data at a massive scale for real-time analytics.

01:49 Try it for free at talkpython.fm/influxdb.

01:53 Samuel, Sebastian, super nice to see you here at PyCon.

01:56 Welcome to the show.

01:57 - Thank you very much for having us.

01:59 It's strange and exciting to do this live and to see you in person. - It is, yes, I know.

02:04 - Normally it's remote over screen share over half the world or something like that.

02:09 - Yeah, I've been able to talk with you, like also with Samuel here, like it's super cool, super cool to be here.

02:14 - Yeah, thank you very much, yes.

02:15 It's great to be here.

02:16 It's really fun being at PyCon and then doing this is like, yeah, even more fun.

02:19 I've done my talk, so I'm much more relaxed than I would have been if it had been this time yesterday.

02:23 - I was just thinking, talking to someone else, like one of the best parts about giving a talk is that when it's over, you can really relax.

02:30 You know what I mean?

02:31 You're like, okay, now I can enjoy the conference.

02:33 - Absolutely, yeah.

02:34 - And the parties, 'cause you can't go too big on the parties if you gotta talk.

02:37 - I feel like all the best ones were last night.

02:38 I feel like--

02:39 - I'm afraid, we were at a pretty good one last night.

02:40 (laughing)

02:41 Yeah.

02:42 - Yeah.

02:43 - But that was excellent.

02:44 All right, well, really good to have you both on the show.

02:46 I guess you hardly need introductions.

02:49 You both are doing such cool work.

02:51 We've had you on the show several times each.

02:53 So maybe just, let's start with a catch up.

02:56 Like, you both have lots of big news.

02:58 Don't wanna necessarily spoil too much, but you know, what have you been up to?

03:01 Yeah, so I raised money earlier this year.

03:04 Well, it was all sorted last year.

03:05 Money came in in January this year.

03:07 Started a company around Pydantic.

03:08 So I've been busy hiring.

03:10 Got a team of seven now.

03:12 One more going to join in June.

03:14 And yeah, currently we're working full-time on Pydantic version two, getting that released.

03:18 And then after that, we're gonna move on to the commercial plans, which I'm not talking about too much, mostly because they're up in the air a bit.

03:25 Also because if you start talking about them, you have to finish talking about them, and then that's like, I'll just take over the whole podcast.

03:30 So I'll say that, yeah, working on Pydantic V2 for now and then moving on soon.

03:35 Well, first from the whole community, congratulations.

03:37 You must be really thrilled.

03:38 Yeah, it's amazing.

03:40 It's very surreal, right?

03:41 Because I was going to say, did you see this coming?

03:43 No, I didn't.

03:45 My plan had been to start a different company once Pydantic V2 was done.

03:50 And then in November, Bogomil from Sequoia, who Sebastian knew, Sebastian recommended he chat to me.

03:56 We had a call.

03:57 We had another call two weeks later.

03:59 And then he said, let's have the final meeting with a few more partners to decide whether to invest in two weeks time.

04:04 So I thought, oh, I should probably go and speak to some other VCs.

04:07 So Sebastian very kindly got me lots of intros.

04:09 My girlfriend also got me some intros.

04:10 I had like five meetings lined up.

04:12 And then the floodgates opened and I got another 20 or so VCs emailing me being like, please can I call?

04:18 Starting to hear about, oh, why are we not part of this?

04:21 Right.

04:22 And then I got COVID.

04:23 So I spent a week locked in the bedroom upstairs doing VC calls, most of them with the camera off, feeling absolutely horrific.

04:30 And yeah, and then came back full circle, back and had the big call with Sequoia and took their money.

04:35 And they've been amazing.

04:36 - So it was Sequoia that invested?

04:37 - Yeah.

04:38 - Wow, awesome.

04:39 That's a big name to have behind you.

04:40 - So it's Sequoia and Partech, who are the smaller VC, who are like French American, and then Irregular Expressions, which is this really cool CTO network based kind of again like New York and Paris, and then a bunch of angels.

04:52 - Yeah, last time we spoke it was about Pydantic V2 and then all of this broke and yeah.

04:58 I feel like I'm just back up to, as in, problem was, although I was doing it to speed up, that was within two months of basically doing meetings and doing legals.

05:06 So I think I've now got a team sufficiently that I'm like caught up to where I would have been if I had just sat there and written code all along.

05:12 - Yeah.

05:12 That's how it goes, right?

05:15 You gotta put a little more sand in the gears to grow, I guess.

05:20 And Sebastian, how about you?

05:21 What have you been up to?

05:22 How have you been?

05:23 - Oh, I've been good.

05:25 Very excited about what they are doing at Pydantic.

05:27 Like, the team they are assembling is just amazing.

05:30 And, yeah, just recently working a bunch on FastAPI and Typer, and actually also on some of the low-level things of FastAPI, and not just FastAPI but the things that go underneath. Right now, one of the things that I am pushing for is having documentation of the API reference: the reference for each one of the parameters, what it is for, each one of the methods, all that stuff.

05:56 Yeah, and I want to do it in a better way that is more maintainable, so that I can test the actual documentation for those parameters and the consistency between them. There's a bunch of things that I'm trying to do, and it also goes to the low levels of typing and interacting with the people who are handling typing, and all that stuff is super cool, super exciting. I think it can work. The API reference for the tools is something that a lot of people have been requesting, and being able to have that in a way that is easy to maintain, that can work well, and that I can handle, I think that's super exciting on that side.

06:34 And on the other side, of course, the integration with Pydantic V2 is super exciting now that they have the first alpha available.

06:41 - It is, I mean, here you are going along, working on FastAPI.

06:45 Everyone I talked to was just universally impressed with it.

06:49 Honestly, I've never heard a bad thing about FastAPI, and people are really enjoying it.

06:54 And then here comes Samuel just changing the foundation, changing up Pydantic.

06:59 - Breaking FastAPI.

07:01 - No, no, no, I'm just teasing.

07:03 So how much work is that actually going to be to kind of make this change? Is it kind of nothing or is it some work?

07:09 - No, there will be some work in FastAPI.

07:11 The thing is, for final users, it will be like almost transparent.

07:15 They will probably, like if they are doing like weird stuff, complex things like that touches the corner cases or things like that, like they will probably have to update some things.

07:25 But for most of the use cases, it will be pretty much transparent for people.

07:29 And they will just get, like, I don't know, 10, 20X performance from Pydantic V2.

07:34 And also like the--

07:35 - I was gonna suggest on the performance, I'm sorry to interrupt you.

07:37 One of the big things that we will be able to, you'll be able to drop from FastAPI is the, I'm gonna call it hack, but it's not your fault, it's my fault.

07:45 of like don't ask the type problem of serialization.

07:49 So I think that the speed up on serialization in FastAPI could exceed, like it could be even bigger than that.

07:54 I don't know that yet, but I'm really hopeful for some massive improvements because of fixes in Pydantic that make FastAPI simpler and more elegant.

08:03 - Yeah, and since they're turning off the lights, we'll see how long we last here.

08:07 (laughing)

08:08 We'll stay as long as we can.

08:10 If you hear any noise in the background, that's 'cause they're trying to tear down PyCon, but we're gonna work for it.

08:14 - They're tearing down PyCon around us, it seems.

08:16 - We will not let it be torn down, it will live on.

08:19 - It's not because of Pydantic V2.

08:20 (laughing)

08:22 - So, yeah, we may have to pause a minute, but we'll find out.

08:25 Anyway, for a user of Pydantic, Samuel, if you haven't deeply gone into, like, root object validation and all that kind of stuff, you probably won't even know, right?

08:36 - So I think the hardest thing, yeah, you're right, the vast majority of your code will either continue to work or we'll have a, >> We'll have-- >> We're going to crush it.

08:45 >> If you get run over by a forklift, it's going to really slow down the development of Pydantic, by the way.

08:50 >> We're going to have a codemod tool to change the name of methods.

08:55 So, with luck, the vast majority of the changes should be automated.

09:00 I suspect that, and I was saying this earlier in the open space, the hardest thing is probably going to be where your API subtly changes in its restrictions because of effectively edge cases that Pydantic has fixed.

09:13 Like, so for example, in Pydantic V1, we would coerce an int to a string.

09:17 If you ever passed an int to a string field, we would coerce it.

09:22 I think that that's wrong and we shouldn't have done it, and so now we don't.

09:24 But I was saying in the example, if for some reason you stored your IDs as strings, and therefore your API had the ID field as a string, but your user was just like pumping them into your API as integers 'cause that seemed to make sense to them, that's going to break.

09:38 And you probably haven't gotten a unit test that tests that because you know your ID field is a string.

09:42 So I feel sorry for those people.
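
The string-ID case described here can be reproduced in a few lines. This is a minimal sketch, assuming Pydantic v2 is installed; the `Item` model, its `id` field, and the `accepts` helper are hypothetical names chosen for illustration.

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    # Hypothetical model: the ID is stored as a string.
    id: str


def accepts(payload: dict) -> bool:
    """Return True if the payload validates against Item."""
    try:
        Item.model_validate(payload)
        return True
    except ValidationError:
        return False


# A client that sends the ID as a string keeps working.
print(accepts({"id": "123"}))
# A client that sends the ID as an integer is rejected under v2's default rules.
print(accepts({"id": 123}))
```

A unit test that posts an integer ID is one cheap way to find out whether an API depended on the old coercion.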

09:44 And my biggest request would be if you're a user, try Pydantic v2 as soon as possible.

09:51 I know if you use it via FastAPI, you can't yet, but like all the other libraries.

09:55 But the sooner you can try it, the sooner you can tell us, and the more easily we can fix things.

09:59 And we are prepared to add compatibility shims.

10:02 - Okay, well, I mean, in Python, we have sort of a from __future__ import. Well, would there be a from-history-import type of reverse thing to slow that down, or is it going to be a deprecation, or is it just...

10:14 So we're doing deprecation warnings everywhere we can or deprecation errors saying this has gone away you probably want to replace it with this thing.

10:21 We're working really hard on that.

10:22 We haven't got a, like, from __future__ import or a compat layer yet for actual validation logic, but if we have to, we will.
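
As an illustration of the deprecation-warning approach, the sketch below calls a v1-style method next to its v2 replacement. It assumes a Pydantic v2 release in which the old `dict()` method is still present as a deprecated alias of `model_dump()`; the `User` model is a hypothetical example.

```python
import warnings

from pydantic import BaseModel


class User(BaseModel):
    name: str


user = User(name="sam")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = user.dict()        # v1 spelling, assumed to be a deprecated alias
    new_style = user.model_dump()  # v2 spelling

# Both calls return the same data; only the old spelling should warn.
print(old_style == new_style)
print([w.category.__name__ for w in caught])
```

Running a test suite with warnings turned into errors is one way to surface every call site that still uses the old names.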

10:30 Yeah okay you'll see right?

10:32 See how much screaming...

10:34 What we didn't want to do was try and guess at what the problems were and build a compatibility layer that people didn't need.

10:39 Yeah, of course.

10:40 So that's why we're doing it this way.

10:42 Yeah, that makes a lot of sense.

10:43 You want to go as minimal, backwards, trying to fill those gaps as possible, right?

10:48 And if I'm brutal about it, if insert-name-of-big-bank, who use Pydantic lots and never engage with the open source community, get stung by this, they've never paid me a penny and they've never engaged, then I'm sorry for them, but I'm not as sorry as I would be if they had come and reported an issue and tried to help along the way.

11:06 Yeah, yeah.

11:07 Can we work with you to smooth this over?

11:09 You know, worst case, pin it to, you know, pydantic==1.10.

11:14 - I think we'll carry on supporting critical security fixes for a year.

11:18 - Okay, so there's something of an LTS type of thing you're thinking?

11:21 - Yeah, for a while we have to, right?

11:23 For a while, and yeah, we'll see, look at the download numbers and play it by ear.

11:27 - Yeah, all right, cool.

11:28 While we're talking about compatibility, if people are like doing a lot of the overriding functions and stuff in their Pydantic models, like what do they, what should they expect?

11:37 Too many changes or pretty similar?

11:39 One of the biggest changes is that the init method of a model is now no longer called unless you literally call init.

11:47 So if you call model_validate or if your model is nested inside another model, init is no longer called.

11:53 The solution for that is to use a wrap validator or a model validator, but that's going to be one of the pain points for people.

12:00 But it turns out with the Rust API, it's literally impossible without a massive performance hit to do that.
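
A minimal sketch of the suggested alternative, assuming Pydantic v2's `model_validator` decorator: logic that might have lived in an overridden init goes into a validator, so it runs whether the model is built directly, through `model_validate`, or as a nested field. The `Window` and `Booking` models and the `is_valid` helper are hypothetical.

```python
from pydantic import BaseModel, ValidationError, model_validator


class Window(BaseModel):
    start: int
    end: int

    @model_validator(mode="after")
    def check_order(self):
        # Runs after field validation, on every validation path.
        if self.start > self.end:
            raise ValueError("start must not be after end")
        return self


class Booking(BaseModel):
    # Window is nested here, so it is validated as part of Booking.
    window: Window


def is_valid(model, data: dict) -> bool:
    """Return True if data validates against the given model class."""
    try:
        model.model_validate(data)
        return True
    except ValidationError:
        return False


print(is_valid(Window, {"start": 1, "end": 5}))
print(is_valid(Window, {"start": 5, "end": 1}))
print(is_valid(Booking, {"window": {"start": 5, "end": 1}}))
```

The exact behavior of a custom init may differ between the alpha discussed here and later releases, so the validator is the safer place for cross-field checks.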

12:07 This portion of Talk Python to Me is brought to you by Sentry.

12:10 You know that Sentry captures the errors that would otherwise go unnoticed.

12:14 Of course, they have incredible support for basically any Python framework.

12:19 They have direct integrations with Flask, Django, FastAPI, and even things like AWS Lambda and Celery. But did you know they also have native integrations with mobile app frameworks?

12:30 Whether you're building an Android or iOS app or both, you can gain complete visibility into your application's correctness both on the mobile side and server side.

12:41 We just completely rewrote Talk Python's mobile apps for taking our courses, and we massively benefited from having Sentry integration right from the start.

12:50 We used Flutter for our native mobile framework, and with Sentry, it was literally just two lines of code to start capturing errors as soon as they happen.

12:59 Of course, we don't love errors, but we do love making our users happy.

13:03 Solving problems as soon as possible with Sentry on the mobile Flutter code and the Python server side code together made understanding error reports a breeze. So whether you're building Python server side apps, or mobile apps, or both, give Sentry a try to get a complete view of your app's correctness. Thank you to Sentry for sponsoring the show and helping us ship more reliable mobile apps to all of you.

13:28 Do you already have a roadmap?

13:30 Have you already tried the alpha on FastAPI?

13:33 Like what's the story for you guys?

13:35 - So like, - Pydantic wise.

13:37 - Yeah, yeah.

13:38 So like, we have actually been interacting a lot, like with what are the changes that are needed?

13:42 Like what is it gonna be?

13:43 And like, as someone was saying, like I have a lot of code that is quite hacky.

13:48 I was actually surprised it didn't break much.

13:51 It just like really worked.

13:53 And like, it's for this particular use case where you can have like, they are so loud, they really want to tear us down.

13:59 - I know, we might have to be really silent, but let's go ahead and finish this up, yeah.

14:03 - So--

14:04 - Wait, what are those like concentration challenges?

14:05 (laughing)

14:07 - Validation error, can't concentrate.

14:09 So imagine this use case where you have user model, and then this, you want to return this user model, but then you have an authenticated user model, as you were showing in the talk, in the PyCon talk.

14:22 And then this authenticated user model has a field that is a password.

14:26 If you return that user, the authenticated user, directly, FastAPI does a lot of tricks to make sure that what you receive in the client side is the actual user without the password.

14:38 That is the thing that you declared that we were going to return.

14:41 But by default, if you don't do it through FastAPI, but you do it with just plain Pydantic, it will just check like, hey, is this an instance of the other?

14:48 And then it will include the field.

14:51 Because like, you know, because in thinking about types, it makes sense, like, oh, this is a subclass of that, so it makes sense that it's valid.

14:58 But when you think about data in an API, it doesn't make sense that it will include more data than what it should.

15:03 - Right, right, right, 'cause you don't wanna either have a mass injection attack on the inbound data or an over-exposure on the way out, right?

15:13 - You know, give away the password from users.

15:14 - Is that bad?

15:15 - I think it's pretty bad.

15:16 - Okay, yeah, all right.

15:18 - You know, for example, some months ago or years or something like I remember that the Kaggle API was returning some of the hashes of the experiments.

15:28 So like, you know, it's a mistake, but it's sad that it could end up just leaking more data than what it should be returning.

15:36 And it's something that can happen very easily.

15:38 It can happen very easily to FastAPI applications if people don't specify what is the response model, the thing that they want to return, and they just return a bunch of data directly.

15:47 So FastAPI does a lot of things to make sure that when you declare a response model that should filter this data, the data is filtered.

15:55 But that's a lot of code in FastAPI to make it compatible.

15:58 With the new Pydantic V2, that's going to be pretty much transparent.
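
The user and authenticated-user case can be sketched with plain Pydantic. This assumes Pydantic v2's default behavior of serializing a nested field according to its declared type; the `User`, `AuthenticatedUser`, and `Envelope` models are hypothetical stand-ins for a response model, not FastAPI's internals.

```python
from pydantic import BaseModel


class User(BaseModel):
    name: str


class AuthenticatedUser(User):
    password: str


class Envelope(BaseModel):
    # The declared type is User, even if a subclass instance is passed in.
    user: User


auth = AuthenticatedUser(name="sam", password="hunter2")
envelope = Envelope(user=auth)

# The subclass instance is accepted as a valid User...
print(type(envelope.user).__name__)
# ...but serialization follows the declared field type, so the
# subclass-only password field is left out of the output.
print(envelope.model_dump())
```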

16:02 So that's amazing.

16:04 That is amazing.

16:04 There's going to be a bunch of things that require some refactoring and also making sure that the Pydantic V1 and V2 are compatible at the same time in some way so that people can have a migration path.

16:17 But yeah, we have been making sure that all the things that need to be changed or that need to be updated, or all the things that need to be exposed from the Pydantic side are actually available.

16:26 Yeah, it's awesome that you guys are working so closely together.

16:29 I mean, it's going to make it--

16:30 Yeah, absolutely.

16:31 So--

16:31 In my mind, these two projects are pretty closely tied.

16:34 I know that they're not, but that's a big use case.

16:36 Yeah, I think that's true.

16:37 And we know that FastAPI is by far our biggest dependent.

16:41 But also Django Ninja, which is, I think, now second or third, maybe third after SQLModel by stars: Vitaly, who maintains that, has been engaging a lot with us on V2.

16:52 So yeah, lots of other projects are interested in it.

16:54 And I think, yeah, lots of people will be able to remove messy code because of that problem.

16:57 Because yeah, like the invariance of the response interface problem.

17:01 - That's fantastic.

17:02 - Coming back to your previous question about--

17:04 - Before you go to that, I think we should probably find out what do you think?

17:07 - Yeah, I think you might be right.

17:08 - Yeah, not even necessarily, I think the audio may be okay, but just for a concentration, it's very loud with the trucks around us.

17:13 I feel like I'm on the deck of the aircraft carrier, so I throw things off the side.

17:16 - Yeah, okay, let's pause this for a moment.

17:17 We'll be back, hold on.

17:19 So we have survived the disassembly.

17:23 We have returned to continue.

17:24 We were talking about the integration of FastAPI and Pydantic, and that was really cool.

17:29 I think something I'd like to kind of move to real quick is this big announcement, alpha of version two.

17:35 Samuel, last time you were on the show, we spoke about the plan for version two, and now you're at least in an alpha stage.

17:42 Tell people where we are with this.

17:43 - Yeah, so we're, yeah, we've had two alphas, maybe three alphas now out.

17:47 We're basically pretty close to a feature freeze, and the plan is to release the package, release the beta, and effectively, we hope that we can then release the full release, say two weeks after that, but there'll be bugs, and we'll fix them, and we'll have more betas, but like, effectively, once we get to beta, the plan is that like, active development is stopping bar fixing bugs.

18:07 - Now it's performance and bugs, right?

18:09 - Yeah, and obviously one of the big things will be, once that's out, there'll be a lot more work on, say, FastAPI, Django Ninja, etc., etc.

18:16 And that might come back with, we really need this thing.

18:18 Either this is broken or we really need that to make it, you know, to reduce the overhead of upgrading.

18:24 One of the things I did for Pydantic 1.10, which was super valuable in beta, was to go through packages that use Pydantic, initially sorting by star, but then looking at what they actually do and trying to upgrade them.

18:37 And that found me a bunch of bugs in either libraries or in Pydantic.

18:41 So we're not promising we're going to go and upgrade the whole of GitHub to Pydantic v2, but we'll do a bit of that mostly to try and find bugs.

18:50 One of the things that would be really useful is if anyone has an application that uses Pydantic, preferably without FastAPI or another library, that they would like help upgrading, we would love to come and help.

19:01 And it might be a really powerful way of us, again, seeing the real pain points and identifying bugs before we get to v2.

19:07 Yeah, and I guess another thing to mention that is a real headline and I also want to get your thoughts on this, Sebastian, is the performance numbers, right?

19:15 I mean, you put out some pretty impressive performance numbers and Sebastian gets to piggyback on that, right?

19:20 Yeah, I mean, yeah, I'm really proud of it, right?

19:23 Yeah, you should be.

19:24 To go in a change of release, in a bump of release, to be in the ballpark of 22 times faster, not 22%, but 22 times faster, I don't know of another package that's made an upgrade of that kind of order of magnitude.

19:38 What's crazy is it's not numerical computing, right? It's general purpose. If you look at the example I gave in my talk earlier, it's a completely standard Pydantic model with four fields. And that's where we're seeing that kind of like 22 times speed up. So I think it's going to be massive. I have my own cynicism about people who hype about performance as being the most important thing. I don't think most applications, it's actually the thing that matters most.

20:01 But I think it matters, A, it matters to everyone and everyone wants it to go in the same direction.

20:05 And two, it matters to the whole world and to the whole community that we effectively reduce the energy that we consume, like doing compute, basically.

20:14 Right. That's absolutely true. And also, even if people don't actually need it, there's a group of people who are like, "Well, I'm going to leave for Go because it's not fast enough or I can't do enough concurrency or whatever." And if they don't have that feeling, even if they didn't actually need that percentage increase, that's still really good for the Python community.

20:30 Even me, I was saying I had a gigabyte of data from web analytics that I needed to load into a Polars DataFrame.

20:38 And for that, I needed to A, extract some attributes from nested fields and B, parse dates and things like that.

20:44 And I use Pydantic v2.

20:45 And like, it was vastly faster with v2.

20:48 It went from like, toodling, tiddling, what's the word I'm looking for?

20:51 Twiddling your thumbs to like, it happens virtually instantly, right?

20:56 And that's fantastic.

20:57 And that'll be even more true when you have an order of magnitude or two orders of magnitude more data.

21:02 - Yeah, for sure.

21:03 Sebastian, how about you?

21:04 What's the knock-on effect for FastAPI?

21:06 - So I think one of the coolest things is that people won't have to change the code to get that performance benefit.

21:12 It's just gonna be like a bump suddenly.

21:14 And because of the new ways that Pydantic can handle the data, we're gonna be able to, there's something that needs to be done in FastAPI, but we're gonna be able to let the parsing of the data, let Pydantic handle that in the Rust side.

21:28 So Pydantic will be able to just read the JSON bytes instead of reading them in the Python side and let Pydantic do that, and then Pydantic give the models back to the rest of the code of FastAPI.

21:39 That alone will boost performance a lot, but the fact that it's being done in Rust, in the Rust side, it's just gonna be amazing.
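
To make the two paths concrete, here is a small sketch comparing parsing JSON in Python first with handing the raw bytes to Pydantic. It assumes Pydantic v2's `model_validate_json`; the `Item` model and the payload are made up for illustration, and how FastAPI will wire this in is not shown.

```python
import json

from pydantic import BaseModel


class Item(BaseModel):
    name: str
    price: float


raw = b'{"name": "widget", "price": 9.5}'

# Two steps: parse JSON in Python, then validate the resulting dict.
via_python = Item.model_validate(json.loads(raw))

# One step: the JSON bytes are parsed and validated inside pydantic-core.
via_core = Item.model_validate_json(raw)

print(via_python == via_core)
print(via_core)
```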

21:46 Like, one of the other things that I want to do that is on the plans is to let users define how they want to serialize data and not have like, this is just like by default, it's just like Pydantic models and it converts automatically to JSON.

21:59 But I want to allow users to decide how to serialize the objects and the data so that they can--

22:04 - Wait, like data classes or something like that?

22:06 - Yeah, for example, they can say, oh, I don't want to serialize with the standard JSON serializer.

22:10 I want to serialize with orjson, which is like the Rust-based implementation to serialize JSON. - Ah, got it.

22:16 - Or they can say, I want Pydantic to be the one that serializes this because I know that this is just a model that can handle that.

22:22 They can also, and this is one of the things that I think is super cool, they will be able to create a way to say, "I want to serialize this response to XML," or something like that, or to YAML, and to let Pydantic do the validation, decoupled, but then do the serialization in a way that they can customize the whole thing without having to do it directly in the code.

22:43 >> Maybe even some of these crazy stream buffer protocols.

22:47 >> Yeah, like protocol buffers with crazy stream buffers, or even MessagePack, or a bunch of these things. There's no obvious way and there's no native way to have support for that, for reading the data and for exporting the data.

23:00 And that's one of the things that I have in plans.

23:02 I'm probably saying too much, and then I'm going to be held accountable for all the things I'm saying on the recording.

23:06 And now they're like, you know what, you promised this.

23:08 You did promise it.

23:10 Well, can I just come back on serialization for a minute?

23:12 Yeah, please.

23:12 Yeah, so I've worked from October, putting to one side the whole funding round in the middle of it, was working solidly on serialization.

23:21 So we have, there's almost as much code in Pydantic Core now for serialization as there is for validation.

23:26 Yeah.

23:27 Wow.

23:27 We can serialize directly to JSON using the same underlying library that orjson uses, Serde.

23:33 But one, and also we can, you can customize how your serialization goes on a per field basis rather than on a per type basis, which is like incredibly powerful, but we also allow you to effectively serialize to Python.

23:46 So not just what used to be the dict method, but basically do JSONable Python.

23:51 So you effectively set the mode when serializing to Python to JSON, and you will only get the like, whatever it is, seven primitive JSON types in Python, which is super valuable if you want your output to be XML, because then, you know, your XML encoder only needs to bother, needs to take in dictionaries, lists, ints, floats, none, bool, rather than whatever complex data you have.
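
The per-field customization and the JSON-mode dump can be sketched together. This assumes Pydantic v2's `field_serializer` decorator and `model_dump(mode="json")`; the `Event` model and the tiny `to_xml` encoder are hypothetical, and the encoder only handles the primitive types named above.

```python
from datetime import datetime
from xml.etree import ElementTree as ET

from pydantic import BaseModel, field_serializer


class Event(BaseModel):
    name: str
    when: datetime
    tags: set[str]

    @field_serializer("name")
    def upper_name(self, value: str) -> str:
        # Per-field customization: applies to this field only, not to every str.
        return value.upper()


def to_xml(tag: str, value) -> ET.Element:
    """Encode JSON-style primitives (dict, list, str, int, float, bool, None) as XML."""
    node = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            node.append(to_xml(key, child))
    elif isinstance(value, list):
        for child in value:
            node.append(to_xml("item", child))
    elif value is not None:
        node.text = str(value)
    return node


event = Event(name="pycon", when=datetime(2023, 4, 23, 12, 0), tags={"python"})

plain = event.model_dump()                # keeps datetime and set objects
jsonable = event.model_dump(mode="json")  # only JSON-compatible primitives

print(type(plain["when"]).__name__, type(jsonable["when"]).__name__)
print(ET.tostring(to_xml("event", jsonable), encoding="unicode"))
```

Because the encoder only ever sees dictionaries, lists, and scalars, it stays small no matter how rich the original model's types are.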

24:11 So there's an, yeah, I'm like super proud of lots of the advantages of serialization.

24:15 My 45 minute talk earlier, I was able to touch about half of the big new features, which kind of talks about quite how much has changed.

24:23 - Yeah, that's really exciting.

24:24 - I think we definitely, I was saying earlier, if I had known how long it was gonna take, I would never have set out on this journey.

24:30 So the best thing about it is I didn't think about how long it was gonna take because we didn't try and do a bit more.

24:35 We tried to do everything, or I tried to do everything.

24:38 And that has its disadvantages.

24:40 It's taken longer than we had hoped, but here we are.

24:41 - But here you are, you're pretty much here.

24:43 That's really good, that's really good.

24:44 And so when you think about performance, right, obviously the 22 times faster is awesome.

24:49 The FastAPI speed up is awesome.

24:51 But if you do something like SQLModel and FastAPI, or Beanie and FastAPI, you're getting it on both ends.

25:00 You end up with the Beanie integration and the FastAPI integration.

25:04 So you're kind of putting Pydantic in both those layers.

25:08 And so those speed ups are like twice as good or something like that.

25:11 Yeah.

25:12 Right.

25:12 I think you, well, they're probably, yeah, they're like, they're the same relatively, but more in absolute terms.

25:17 Yeah.

25:18 Yeah.

25:18 Yeah, absolutely.

25:19 So yeah, I think it's, you know, the fact that so many things have been built upon Pydantic means you've just sped up a bunch of projects without them doing too much.

25:28 Yeah.

25:29 We get the, like the win.

25:30 It's like CPython itself getting faster, helps everyone.

25:32 This is like the next layer down, but we, you know, as a dependency of lots of packages, we get to speed up lots of the community with one package, like one person devoting a year to it.

25:41 Does this surprise you to see all these projects coming out?

25:44 Like here's another project based on Pydantic plus, you know, name your other thing that it's integrated with.

25:48 It's been crazy.

25:49 Particularly in the machine learning space where, you know, LangChain, who are one of the big names right now in these large language models, is all based on Pydantic, right?

25:59 Yeah.

26:00 You were saying, I think, on Twitter that OpenAPI use a bunch of FastAPI, right?

26:04 OpenAI?

26:04 Sorry, OpenAI, not OpenAPI. Marvin, I think, from Prefect, is built on Pydantic again. So the wave of machine learning stuff seems to have leveraged Pydantic a whole lot, DocArray being another big example.

26:18 some for Elastic and some other things as well.

26:21 This portion of Talk Python to Me is brought to you by Influx Data, the makers of InfluxDB.

26:29 InfluxDB is a database purpose-built for handling time series data at a massive scale for real-time analytics.

26:37 Developers can ingest, store, and analyze all types of time series data, metrics, events, and traces in a single platform.

26:44 So, dear listener, let me ask you a question.

26:46 How would boundless cardinality and lightning fast SQL queries impact the way that you develop real-time applications? InfluxDB processes large time series data sets and provides low latency SQL queries, making it the go-to choice for developers building real-time applications and seeking crucial insights. For developer efficiency, InfluxDB helps you create IoT, analytics, and cloud applications using time-stamped data rapidly and at scale. It's designed to ingest billions of data points in real time with unlimited cardinality.

27:19 InfluxDB streamlines building once and deploying across various products and environments from the edge on premise and to the cloud.

27:27 Try it for free at talkpython.fm/influxdb.

27:31 The link is in your podcast player show notes.

27:34 Thanks to InfluxData for supporting the show.

27:38 I'm sure one of the big things you're thinking about going forward with FastAPI is, like, how do you guys work together and make this a seamless change.

27:45 What else you got?

27:46 What else you working on?

27:47 What else do you see in the future?

27:48 - I have a bunch of things.

27:50 - Are they secret or can you tell us?

27:51 - No, no, I can tell.

27:53 Most of it I can tell.

27:54 I just feel the accountability.

27:57 But I can tell.

27:59 So I have a bunch of projects.

28:01 The funny thing is that in some way it's kind of a dependency graph of things that I should work on.

28:07 So for example, I have this project generator for FastAPI to generate a project with a SQL database.

28:12 I haven't updated it in a long time, and it uses SQLAlchemy.

28:16 I built SQLModel for that project to use it there, but I haven't updated it there, because first, I want to upgrade more things in SQLModel.

28:23 I want to finish the documentation, finish the story about migration.

28:26 But then for the story about migration, I need Typer for SQLModel.

28:30 So I need to update things in Typer.

28:32 And then for Typer, I want to add support for Annotated, which is actually one of the big things, one of the recent big things in FastAPI is that now there's support for Annotated.

28:41 So Annotated is this feature from Python.

28:44 It's in the standard Python typing module.

28:46 You write from typing import Annotated, and then you can use that to add information to the types that you define for parameters.

28:55 - Like what? I haven't used this.

28:56 I love typing, I use it all the time, and here I'm learning more about typing.

28:59 - The thing is, it exists there in the standard library, but it doesn't have like a canonical use in Python itself.

29:05 It's there mainly for FastAPI and Pydantic to use it.

29:09 You know, like, it's just that, like, I hadn't pushed for that before.

29:13 But the thing is, you write from typing import Annotated, and then you create a function that takes, like, a username, and then this parameter will normally be of type string.

29:24 So the parameter of the function will be username colon str.

29:28 Now you can say username colon Annotated, and then open square brackets, as if it was like a list or like a dict or something, and then the first thing that you put there, that's the actual type.

29:39 So you will say Annotated, str, and then you can pass additional metadata afterwards.

29:44 And the additional metadata is the thing that you will use to tell FastAPI, this should be extracted from the query parameters or from the cookies or from headers.

29:51 Before, and up until recently in FastAPI, the only way to do that was using the default value of the parameter.

29:57 - Right, you would set the default to like a Depends or--

29:59 - Yeah, to Depends, or equals Cookie, or equals Header, or something like that.

30:04 And then FastAPI can take the information from that to give you the data in your function.

30:08 But the thing is, if you call that function manually somewhere else, the editor and Python won't complain that you are not passing some parameter that is required.

30:17 And then you're gonna end up with some strange value internally that is just for FastAPI.

30:20 - Right, or the type checker complains, "You're not passing a Depends." Like, "No, I'm passing a string.

30:25 "That's what it's supposed to be." But that's-- - Yeah, exactly.

30:27 - Something weird thing like that.

30:28 - Exactly. - Yeah, okay.

30:29 - So for those cases, having Annotated, like, the type is exactly what it is.

30:32 And if it has a default value, it's the actual default value, instead of some strange internal concept in FastAPI.
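The difference described here can be sketched with only the standard library. This is a rough illustration, not FastAPI's implementation: `FromQuery` and `describe_parameters` are hypothetical names invented for the example, standing in for markers such as FastAPI's Query, Cookie, or Header. The point is that the metadata rides along inside Annotated while the parameter's type and default value stay exactly what they appear to be.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class FromQuery:
    """Hypothetical marker meaning 'read this parameter from the query string'."""

    alias: str = ""


def read_user(
    username: Annotated[str, FromQuery()],
    limit: Annotated[int, FromQuery()] = 10,  # a real default, not a framework object
) -> str:
    return f"{username}:{limit}"


def describe_parameters(func):
    """Return {name: (real_type, metadata)} the way a framework might read it."""
    hints = get_type_hints(func, include_extras=True)
    hints.pop("return", None)
    described = {}
    for name, hint in hints.items():
        if get_origin(hint) is Annotated:
            real_type, *metadata = get_args(hint)
            described[name] = (real_type, tuple(metadata))
        else:
            described[name] = (hint, ())
    return described


# Calling the function by hand behaves like any other Python function.
print(read_user("alice"))
print(describe_parameters(read_user))
```

Because the default of `limit` is a plain `10`, an editor or type checker sees an ordinary optional integer parameter, and `username` is reported as required when the function is called directly.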

30:40 And having support for that allows having much better support for typings, for editors, auto-completion, inline errors, all these things, reusing the same functions in other places.

30:49 And it will also, having support for that in Typer will allow users to have the same function being used for FastAPI and Typer, having the custom metadata necessary for each one of the parameters for FastAPI and for Typer, and things like that.

31:04 So it's something that is super powerful and super interesting.

31:07 - I'm gonna come in on Annotated 'cause I'm excited about it too.

31:09 So, in Pydantic V2, we use Annotated for all of our custom types.

31:13 So for example, the PositiveInt type is just Annotated of int, and then we use the annotated-types package, which is some reusable metadata for Annotated.

31:23 So we would use, like, PositiveInt is Annotated of int, and then greater than zero.

31:28 And what's even cooler is that will be used by Pydantic, of course.

31:33 Hypothesis is going to get support for that really soon.

31:36 So it will only pass a positive value in if it sees greater than zero there.

31:40 And then Typer, I guess, could--

31:42 even if it's still based on Click, it can go and take that greater than and infer it as it must be greater than zero.

31:48 So I think it was one of the things that typing guys, when they first created Annotated, hoped was going to build a rich network of libraries that interchanged metadata.

31:56 It's taken a bit longer than they expected, but we're getting there.
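As a minimal sketch of the idea of shared, reusable metadata: the `Gt` class below is a hypothetical stand-in written for this example (the real annotated-types package is not imported here), and `check` is an invented helper showing how any library could read the same constraint from the same type alias.

```python
from dataclasses import dataclass
from typing import Annotated, Any, get_args, get_origin


@dataclass(frozen=True)
class Gt:
    """Hypothetical 'greater than' marker, standing in for shared metadata."""

    gt: int


# The alias carries both the real type and the constraint.
PositiveInt = Annotated[int, Gt(0)]


def check(value: Any, annotated_type: Any) -> Any:
    """Validate a value against the base type and any Gt markers found."""
    if get_origin(annotated_type) is not Annotated:
        raise TypeError("expected an Annotated[...] type")
    base_type, *metadata = get_args(annotated_type)
    if not isinstance(value, base_type):
        raise TypeError(f"expected {base_type.__name__}")
    for constraint in metadata:
        if isinstance(constraint, Gt) and not value > constraint.gt:
            raise ValueError(f"value must be greater than {constraint.gt}")
    return value


print(check(5, PositiveInt))
```

A validator, a test-data generator, and a command-line parser could each interpret the same `Gt(0)` marker in their own way, which is the interchange of metadata being described.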

32:00 - I hear the two of you are kind of doing that a little bit, right?

32:02 That's cool.

32:02 That's really cool.

32:03 One of the areas where I feel like typing is a little janky is on ORMs and ODMs.

32:08 When you define a class, you say, for example, it's like a SQLAlchemy column or it's a Beanie column or something like that.

32:17 And the type is it's a string column, but really it's a string.

32:22 It's not a string column.

32:23 And so there's this weirdness of using types to kind of drive behavior.

32:29 - That's a perfect case for using annotated.

32:31 - That's what I was thinking, yeah.

32:31 - What it doesn't do is the other case where there is a context where you'd want to get the column object of some sort rather than the integer in a row.

32:40 So it does mean two different things, the kind of .objects in the Django context, but yeah, absolutely.

32:45 - Yeah.

32:46 - It's there precisely to solve this kind of problem.

32:48 - And it's also because like, currently as far as I remember, there's no way, so the thing is that this is all based on something called descriptors, and is that when you call, like, I don't know, class user.name is actually the attribute in the class.

33:03 But when you create an instance of that user and then say user.name, that is the attribute on the actual instance.

33:08 And the way that these ODMs or ORMs or these things work is that they have a special way to say like, when someone calls the actual class, this is a different thing than when someone calls an instance attribute.

33:20 - Right.

33:21 - And there's-- - It's sort of two behaviors - Yeah, yeah. - in the context, right?

33:23 - And it's super powerful.

33:24 That's how SQLAlchemy works.

33:26 And it's super powerful because then all the queries and all the stuff is actually consistent with how Python works.

33:32 And you can say greater than or equals to using Python syntax, which is great.
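The two behaviors can be sketched with a plain Python descriptor. This is a toy illustration, not how SQLAlchemy or Beanie are implemented: `Column`, `Comparison`, and `User` are names made up for the example.

```python
class Comparison:
    """Toy query fragment produced by comparing a column at class level."""

    def __init__(self, column_name, operator, value):
        self.column_name = column_name
        self.operator = operator
        self.value = value

    def __repr__(self):
        return f"{self.column_name} {self.operator} {self.value!r}"


class Column:
    """Descriptor: a column object on the class, a plain value on instances."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed as User.age: give back the column itself
        return instance.__dict__[self.name]  # accessed as user.age: the value

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

    def __ge__(self, other):
        return Comparison(self.name, ">=", other)


class User:
    name = Column()
    age = Column()

    def __init__(self, name, age):
        self.name = name
        self.age = age


user = User("sam", 30)
print(User.age >= 18)  # builds a query fragment using Python syntax
print(user.age >= 18)  # ordinary comparison on the stored integer
```

The same attribute name yields a `Column` on the class and an `int` on the instance, which is the dual meaning that is hard to express in a single annotation.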

33:37 But then currently, as far as I know, there's no way to define that with type annotations in a standard way.

33:42 I think it's something that will probably be improvable, but I think there's currently no way.

33:46 There will probably be a way at some point, but to be able to say, hey, this SQLAlchemy column is a column when it's accessed at the class level, but this is gonna be a string when it's accessed at the instance level.

33:58 - A scope level in the annotated, you know.

34:00 - Yeah, yeah, yeah, something like that.

34:00 - The class is this and this, and the instance is that and that.

34:04 That's really interesting.

34:05 While we're talking types, and I know you both are really big fans 'cause it's such a central part, both your libraries, what do you wanna see coming?

34:13 It feels like this is an area of Python changing fast, but is there something like, if they could just...

34:17 And I have another question on types otherwise after that, if I remember.

34:20 - So I gave a talk at the Typing Summit asking for certain things.

34:24 So now we're going to test my "can I remember PEP numbers" challenge, which I'm going to fail, but PEP 472 is that keyword args to... so one option would be to allow keyword arguments to __getitem__, which would make Annotated even more powerful because then you could use keyword arguments to describe the meaning of your metadata, rather than having to have these kind of identifier types like greater than. One of the big things that I hope we're going to persuade, so I think one of the things that's happened recently is that everyone gets that runtime use of type hints is a legitimate thing.

34:54 They might not want to do it themselves, but they get that it's a legitimate thing to do.

34:58 - How much pushback was there when you first came out with Pydantic there?

35:01 - I think we were like the black sheep of Python.

35:03 I was a black sheep of Python.

35:04 - This is supposed to have no meaning.

35:05 What are you doing?

35:06 You're doing it wrong.

35:07 - And I think nowadays that's changed and everyone gets it's a real thing.

35:10 So for example, the hash of a union is independent of the order of the members of the union.

35:16 That makes sense in the context of static typing where the union of int float is exactly the same as a union of float int.

35:23 It turns out in runtime typing, particularly when you're doing coercion, there are some cases where that is not the case.

35:29 And so it's really difficult right now that effectively, unions are fine on their own, but if you have a union, say, within a List, the capital List square brackets, the order will match the order from the first time you called that, not what you actually call, unless you use lowercase list, when it is the right order, except there's a PR open right now to break it on lowercase list too.

35:52 So, list as well.

35:53 So anyway, we are on lowercase list as well.

35:56 Anyway, so things like that, where I do think that like, we'll see what happens on that particular case, but I feel like the voice of people doing runtime use of types, we're not the only people, are being heard better.
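The union behavior mentioned here can be observed directly. The first half of this sketch relies on documented behavior: unions compare equal and hash equal regardless of member order, while each one still remembers the order it was written in. The last two prints are left without an expected result on purpose, because whether the second List reuses the first one's member order depends on the Python version and its internal caching.

```python
from typing import List, Union, get_args

int_first = Union[int, float]
float_first = Union[float, int]

# Equality and hashing ignore the order of the members.
unions_equal = int_first == float_first
hashes_equal = hash(int_first) == hash(float_first)
print(unions_equal, hashes_equal)

# Each bare union still reports its members in the order written.
int_first_order = get_args(int_first)
float_first_order = get_args(float_first)
print(int_first_order)
print(float_first_order)

# Nested inside the capital List, the result may come from a cache keyed on
# the (equal, same-hash) union, so the reported order is version dependent.
first_list = List[Union[int, float]]
second_list = List[Union[float, int]]
print(get_args(get_args(first_list)[0]))
print(get_args(get_args(second_list)[0]))
```

For a library that tries each union member in order when coercing data, the difference between int first and float first changes the result, which is why the order matters at runtime even though it is irrelevant to a static checker.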

36:08 And like, yeah, I think things are going to continue to improve.

36:11 Yeah, there was a PEP that proposed an optimization for typing that kind of broke the runtime behaviors of it a little bit for both of y'all.

36:18 Yeah, it did in some edge cases, and that's going to be fixed soon by the successor PEP.

36:24 Absolutely. So that's really good.

36:26 Generic alias is another thing that kills us internally in Pydantic.

36:30 I won't go into all of the details of it, but yeah, we would...

36:33 The high-level takeaway is that the typing community seemed happy with the idea that they might make a change to typing to make it easier for us.

36:41 And I think that's also for the Pydantic team to engage better.

36:44 And instead of spending ages...

36:46 Problem is, right, like you have a problem, you see a solution in typing, you submit the PR, even if it gets accepted in a week, which it won't, you wait five years before we can remove the code that deals with the other case.

36:56 So it's very tempting not to engage with typing, but just go and write the work around where we should be better Python citizens and go and submit the PR to CPython to try and fix it properly.

37:05 Yeah, what's your wish list for typing?

37:07 So, well, the first thing is like this thing that I have been trying to work on and like trying to do to have better ways to do the documentation of the APIs.

37:14 That's also related to typing and to the annotations.

37:17 Like, let's see if I can pull it off.

37:19 The other thing is, it's actually not that necessarily related to the things that we have been talking about, but it's quite relevant for the data science and machine learning community.

37:28 That is that there are many APIs for many libraries that decorate in some way some function, and then the function is expected to give some types of values to the body of the function internally, but to be able to receive different types of values.

37:45 That sounds a bit abstract, but that's the core idea that is replicated across several libraries.

37:52 And this will apply, for example, to Ray, the distributed machine learning or computing system, to Dask, to, I think Dagster also uses something like that, Modal, this system for deploying workloads and machine learning and things like that, also uses these types of ideas.

38:07 So there are many of these libraries where the way that they are designed is that you create some function, and then you're going to tell something to call this function. And then in the function, you say, I want to expect this value. And instead of you calling all the functions that will generate that value, you tell it like, hey, distributed system, blah, blah, blah, give me this value, execute this for me. But that means that you will have either no type annotations, or invalid type annotations, or red squiggly lines in some places, or no auto-completion, or auto-completion for the wrong things, just because there's currently no way to define this specific thing of saying, hey, this function after being decorated is gonna be able to receive a different type than what it's gonna give to the body internally.

38:52 So I think that's something that, and it's probably quite challenging and a big thing to tackle, but it's something that is replicated across several libraries, in particular for these things that do distributed processing of data.

39:04 I think that's something that will be great to improve.
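A toy version of this pattern makes the typing gap concrete. Nothing here is the API of Ray, Dask, or any other library mentioned: `Ref` and `task` are hypothetical names for a deferred-value handle and a decorator that resolves such handles before calling the function body.

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass
class Ref(Generic[T]):
    """Hypothetical handle to a value that a scheduler would produce later."""

    producer: Callable[[], T]

    def resolve(self) -> T:
        return self.producer()


def task(func: Callable[..., Any]) -> Callable[..., Any]:
    """Let callers pass either plain values or Ref handles for any argument."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        resolved_args = [a.resolve() if isinstance(a, Ref) else a for a in args]
        resolved_kwargs = {
            key: (value.resolve() if isinstance(value, Ref) else value)
            for key, value in kwargs.items()
        }
        return func(*resolved_args, **resolved_kwargs)

    return wrapper


@task
def double(x: int) -> int:
    # Inside the body, x is always a plain int.
    return x * 2


print(double(21))
print(double(Ref(lambda: 21)))  # callers may pass a Ref[int] instead of an int
```

The body is annotated with `x: int`, yet the decorated function also accepts `Ref[int]`. In this sketch the decorator falls back to `Callable[..., Any]`, so editors lose the parameter information entirely, which is the trade-off being described.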

39:07 Does ParamSpec fix some of that?

39:10 Very close, but ParamSpec only does it for being able to sort of copy the params that come from one function to another.

39:20 And actually I use all that for, for example, for Asyncer and for other things to be able to get like auto completion for the decorated function or for the generated function or things like that.

39:30 And it will probably, like, the change will probably be somewhere around ParamSpec to be able to say like, not just the ParamSpec, but like this function will not only have the ParamSpec, but will receive a modification like this of the parameters.

39:46 Almost making ParamSpec generic.

39:48 All right.

39:51 One more typing question.

39:52 Do you all think typing is going too far with like the generic stuff?

39:56 And is it going too much like C++ and C# and Java?

39:59 Or is it still good?

40:02 I think the way Python is growing is super interesting because we all have to agree that Python 3.12 is not the same as Python 2.7.

40:12 It's quite different.

40:14 And I think it's different in a good way.

40:15 - The users are different and the focus of the runtime is different, yeah.

40:21 - And the things that we can do with types now, and the fact that in Python we can access these types at runtime, which means, I don't know, I was always confused with the term runtime.

40:32 It's like, what does that mean?

40:33 It's like, when you execute Python, the same Python code can inspect and, like, see what those types are.

40:39 That's what FastAPI and Pydantic do.

40:40 It's just like seeing like, what are those types?

40:42 We can do that in Python.

40:43 You cannot do that in things like TypeScript or you cannot do it in Java.

40:48 You cannot do it in many other languages.

40:49 You get access to this typing information to be able to do additional things with that, like validation, data serialization, documentation, all that stuff.
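Reading type hints at runtime can be sketched in a few lines of standard-library code. This is a deliberately naive illustration, far simpler than what Pydantic or FastAPI do: `Account` and `build` are made-up names, and the check only handles plain classes such as `str` and `int`.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Account:
    name: str
    age: int


def build(cls, data: dict):
    """Check data against the class's annotations, then construct the object."""
    errors = {}
    for field, expected in get_type_hints(cls).items():
        if field not in data:
            errors[field] = "missing"
        elif not isinstance(data[field], expected):
            found = type(data[field]).__name__
            errors[field] = f"expected {expected.__name__}, got {found}"
    if errors:
        raise ValueError(errors)
    return cls(**data)


print(get_type_hints(Account))
print(build(Account, {"name": "sam", "age": 30}))
```

The same annotations an editor uses for auto-completion are ordinary objects that the running program can inspect, which is what makes validation, serialization, and documentation generation possible from one declaration.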

40:57 So I think that's, to start, that's super powerful in Python.

41:00 The language in Python for typings is not as powerful as, for example, TypeScript.

41:06 There is just like so much stuff that you can do with that.

41:09 Nevertheless, I feel that in Python it's just like, it's growing and it's growing organically.

41:14 And like, we have growing pains, you know?

41:16 Like, there are some things that are like, "Oh, this little thing here is slightly incorrectly named." But like, now there's a better way to do that in Python 3.10, so we don't care much about that slightly incorrect name, things like that.

41:30 Yeah, I feel like there's some tensions of people who are on the mypy side and they want perfect validation of, I want to prove my code hangs together like a static compiler.

41:40 And folks like you all who are like, we want to leverage typing to make it behave in interesting ways.

41:46 And maybe that behavior expression doesn't exactly match what it looks statically like, but it is, everybody wants it, but it's, it might trip up mypy.

41:55 I feel like there's this tension between those two things.

41:57 That's kind of what I was thinking when I asked that question.

41:59 I think there's a little bit of that, but at the same time, there's, it's much less than you could imagine.

42:05 There are so many people that are so close to, you know, core mypy and these things that are actually very excited about the things that we are doing.

42:13 So, you know, like, it's actually quite friendly, all the communication.

42:17 It's just that there is some people that just don't really care about runtime types, and that's fine.

42:22 But I feel like it's much more closer together and much more stronger, the relationship, I think.

42:30 Yeah, yeah, it's great.

42:31 Yeah, I think actually we've gone in the...

42:33 Typing's got better for someone who's not...

42:35 Like, it's actually got less verbose, cleaner, easier to understand.

42:39 You don't have to import Union, you can do the pipe operator.

42:41 You don't have to import List from typing, you can use list, which makes complete sense.

42:46 Any is an unfortunate one, but I also understand why the any function would not make sense.

42:51 I give up, I just I can't deal with this part.

42:54 So yeah.

42:55 No, in general, I think it's got much better.

42:57 I do think that the interchange between runtime, there's a PEP open now to add to dataclass_transform a converter function.

43:06 I forget exactly how it works.

43:07 But I think that is awareness in the static typing space that the data gets converted when you construct something that looks like a data class.

43:16 So no, I think I think it's really positive.

43:18 I think we're incredibly lucky that we're, like I say, TypeScript is the other, is in some ways the best untyped language typing system.

43:26 But the fact that they're not available at runtime means we're killing it, I think.

43:29 I spoke to someone who maintains a library that does type analysis at runtime in TypeScript, and all his types are strings.

43:38 And they're valid TypeScript, but they're strings.

43:40 And that's, you know, he was saying that doesn't matter, and it's all fine.

43:44 I tend to feel like it probably does a bit.

43:46 We're really lucky to have them at runtime.

43:48 Then you go to the other end, where I've been writing a lot of Rust.

43:51 I am like, Rust's great, it has many advantages, but if you want to just get something done and not have to think too hard about what the types are, it's really nice.

43:58 I write a lot of Python that's untyped when I'm just trying to get something to work.

44:02 I'm not a like, everything must have a type on it kind of person.

44:05 So no, I think we're in a really great place, and I think most of the advantages are actually cleaning it up.

44:09 So the new six something, 649, the new generics syntax, to me, 695.

44:18 (laughing)

44:20 There are two types of people in the world.

44:22 There are people who know the numbers of their PEPs and there's everyone sane.

44:26 That for me cleans up generics, right?

44:30 Yes, it's a fundamental change to the language.

44:32 Yes, it makes the syntax of a function look a bit more like Rust or something, but if you look at it independently of our experience, it's a heck of a lot more elegant than importing TypeVar.
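For comparison, here is the older TypeVar style as runnable code, with the PEP 695 form shown only in a comment so the snippet still runs on interpreters older than Python 3.12. The function name `first` is just an example.

```python
from typing import TypeVar

# Older style: declare a type variable at module level, then use it.
T = TypeVar("T")


def first(items: list[T]) -> T:
    """Return the first element, preserving the element type for checkers."""
    return items[0]


# With the PEP 695 syntax (Python 3.12+), the type parameter is declared
# inline and neither the import nor the module-level T is needed:
#
#     def first[T](items: list[T]) -> T:
#         return items[0]

print(first([1, 2, 3]))
print(first(["a", "b"]))
```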

44:43 - Yeah, yeah.

44:44 And let me ask you one that's purely theoretical, because I don't think it'll get adopted, but we have int pipe None, we have Optional of int.

44:52 A lot of languages have question mark for nullable types.

44:55 Like it could be int?

44:56 You could even say like, int?

44:57 I'm not sure.

44:58 Is it an int?

44:59 It could be an int.

45:00 It might be nullable.

45:01 I don't know, right?

45:02 Or use int.

45:03 It's an int.

45:04 You just know.

45:05 There's no question mark.

45:06 And those types, what are your thoughts about, like, null coalescing, you don't care?

45:08 I'm really happy with the new situation and not having the Optional that isn't optional.

45:13 That's been a problem for a long time, so not needing to use Optional and being able to use pipe None is great.

45:19 I actually think one of the things that's gonna happen, particularly with the advent of the match syntax and with increased use of TypedDict, we're gonna need a new union type that operates much more like an enum in Rust.

45:32 So basically a union that keeps track of which member of the enum you have an instance of.

45:38 I keep meaning to build a package to demonstrate what I mean, and I haven't got around to it. Like, if you have a union of TypedDicts, which is a legitimate thing to do, it's effectively impossible without starting to do effectively validation to work out which member you're on.

45:51 So I think we need, and it would be really neat if you could use a match expression to process each branch of your union.

45:57 - Sebastian?

45:58 - You already said everything.

45:59 (laughing)

46:00 - No, but like, you know, like, I feel that way.

46:03 I was saying that, like, I feel Python is just, like, growing and, like, the typing system is growing.

46:08 I feel it's growing in a very healthy way because it's not like just some academics hidden in some corner somewhere saying, like, this is how it should be done.

46:17 - I did my thesis on this type system and here we are.

46:20 - And then like everyone should just use it.

46:21 It's just like a lot of hearing everyone and just receiving the feedback from everyone and just like growing in the ways that it should grow.

46:29 I think that's amazing.

46:30 I think like we are, you know, it's like a kind of renaissance of like typing in Python and like how we can build all these things.

46:37 I think that's amazing.

46:38 - I think it absolutely is.

46:39 It absolutely is.

46:39 All right, I think we're pretty much out of time.

46:41 We've used up all the various places we've escaped to at a shutting down conference here.

46:45 Final question for you both.

46:46 Just, you know, what's your big takeaway?

46:47 What's the experience like here at PyCon?

46:49 Like, how's it been 2023?

46:51 - For me, it's been amazing.

46:53 It's my first PyCon in the US.

46:55 - Oh, it is?

46:56 - Yeah, like I have never been in a PyCon in the US.

46:58 I have been in PyCons in like many other places, but not in the US.

47:01 And like, I got to see, I got to put faces to so many handles in Twitter and GitHub.

47:06 I got to meet you in person.

47:08 - Yeah, you as well, it's great.

47:09 - And like a bunch of other people that I only knew, you know, just like on the internet.

47:14 A bunch of core developers and like, that's so cool, they are so cool.

47:17 Like I knew they were super cool, but just like, you know, talking on Twitter and like then seeing in person, that's amazing.

47:23 - I really, that's my favorite part of the whole conference.

47:25 It's just the people and the getting together.

47:27 - Definitely, I think I attended like two talks.

47:31 So I was just on the hallways talking to everyone.

47:34 - You did the hallway track.

47:35 - Yeah, yeah, I was all the way on the hallway track.

47:38 - Awesome, well that's great.

47:39 - Yeah, I absolutely love it.

47:40 I remember Sebastian and I joined the language summit remotely two years ago, the year when there was no PyCon.

47:45 And the most interesting bit of the like four hour Zoom call was the five minutes between talks when people just chatted.

47:50 And I remember then thinking how cool PyCon must be to have that same group of people like in a room rather than on a Zoom call.

47:57 So, no, I love it.

47:58 I think it's, I've really enjoyed it.

48:00 Last year was my first year.

48:02 This year is even more fun.

48:03 Yeah, I really enjoy it.

48:04 - Awesome, yeah, it's been great to meet you both in person.

48:06 - Of course, meeting you has been the best bit of all.

48:07 - Thank you very much.

48:08 No, no, it's been really great to spend some time with you all here and thanks for coming on the podcast.

48:12 Part two now here to wrap things up.

48:15 So it's, thanks for taking the time and congrats both on the success of your projects.

48:18 They're amazing.

48:19 - Thanks so much.

48:20 - Thank you very much.

48:21 Thanks for having us and thanks for seeing us.

48:23 - Yeah, bye guys.

48:24 This has been another episode of Talk Python to Me.

48:26 Thank you to our sponsors.

48:28 Be sure to check out what they're offering.

48:29 It really helps support the show.

48:31 Take some stress out of your life.

48:32 Get notified immediately about errors and performance issues in your web or mobile applications with Sentry.

48:38 Just visit talkpython.fm/sentry and get started for free.

48:43 And be sure to use the promo code, talkpython, all one word.

48:46 InfluxData encourages you to try InfluxDB.

48:50 InfluxDB is a database purpose-built for handling time series data at a massive scale for real-time analytics.

48:56 Try it for free at talkpython.fm/influxdb.

49:00 Want to level up your Python?

49:02 We have one of the largest catalogs of Python video courses over at Talk Python.

49:06 Our content ranges from true beginners to deeply advanced topics like memory and async.

49:11 And best of all, there's not a subscription in sight.

49:13 Check it out for yourself at training.talkpython.fm.

49:16 Be sure to subscribe to the show, open your favorite podcast app, and search for Python.

49:21 We should be right at the top.

49:22 You can also find the iTunes feed at /iTunes, the Google Play feed at /play, and the Direct RSS feed at /rss on talkpython.fm.

49:31 We're live streaming most of our recordings these days.

49:33 If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

49:41 This is your host, Michael Kennedy.

49:43 Thanks so much for listening.

49:44 I really appreciate it.

49:45 Now get out there and write some Python code.

49:47 (upbeat music)

49:50 [Music]
