#313: Automate your data exchange with Pydantic Transcript
00:00 Data validation and conversion is one of the truly tricky parts of getting external data into your app. This data might come from a REST API, a file on disk, or somewhere else. It includes checking for required fields and correct data types, and converting from potentially compatible types, for example from strings to numbers if you have the quoted string '7' but not the value 7, and much more. Pydantic is one of the best ways to do this in modern Python, using dataclass-like constructs and type annotations to make it all seamless and automatic. We welcome Samuel Colvin, creator of Pydantic, to the show. We'll dive into the history of Pydantic and its many uses and benefits. This is Talk Python To Me, episode 313, recorded April 14, 2021.
10:00 Marshmallow is undoubtedly the biggest, most obvious competitor to Pydantic, and it's great, I'm not going to say otherwise here. It's been around for longer, and it does a lot of things really well. Pydantic just overtook Marshmallow a few months ago in terms of popularity, in terms of GitHub stars; whether you care about that or not is another matter. There's also attrs, which kind of predates dataclasses, and it's closer to dataclasses. But the big difference between Pydantic and Marshmallow and most of the other competitors is that Pydantic uses type hints. That means you don't have to learn a whole new kind of micro-language to define types. You just write your classes, and it works. It works with mypy and with your static type analysis. It works with your IDE, like PyCharm, because there's an amazing extension (I forgot the name of the guy who wrote it) that I use the whole time with PyCharm. That means PyCharm works seamlessly with Pydantic. And there's some exciting stuff happening at Microsoft: one of their technical fellows emailed me two days ago about extending Pyright, their language server or the front end for their language server, to work with Pydantic and other such libraries. So because you're using standard type hints, all the other stuff, including your brain, should click into place. Yeah, that's really neat. I do think it makes sense to separate out the serialization, the save-a-file, load-a-file type of thing. I really love the way the type hints work in there, because you can almost immediately understand what's happening. It's not like, oh, this is the way you describe the schema and the way you describe the transformations. It's just: here's a class, it has some fields, those fields are typed, that's all you need to know. And Pydantic will make the magic happen. What would you say the big difference between Pydantic and Marshmallow is? I haven't used Marshmallow that frequently, so I don't know it super well. I would first give the same proviso that I haven't used it that much either. If I were more disciplined, I probably would have sat down and used it for some time before building Pydantic, but that's not always the way things work. The main difference is that it doesn't use type hints, or doesn't primarily use type hints, as its source of data about what type something is. Pydantic is, from my memory (I can check), significantly more performant than Marshmallow. Yeah, you actually have some benchmarks on the site, and we could talk about that in a little bit and compare those. Yeah. So just briefly, it's about two and a half times faster. The advantage of Marshmallow at the moment is that it has more logic around customizing how you serialize types. So when you're going back from a class to a dictionary or list of dictionaries, and then out to JSON or whatever, Marshmallow has some really cool tools there, which Pydantic doesn't have yet. I'm hoping to build some more powerful ways of customizing serialization into v2. Okay, fantastic.
12:50 This portion of Talk Python To Me is brought to you by 45Drives. 45Drives offers the only enterprise data storage servers powered by open source. They build their solutions with off-the-shelf hardware and use software-defined, open source designs that are unmatched in price and flexibility. The open source solutions 45Drives uses are powerful, robust, and completely supported from end to end. And best of all, they come with zero software licensing fees and no vendor lock-in. 45Drives offers servers ranging from four to 60 bays and can guide your organization through any sized data storage challenge. Check out what they have to offer over at talkpython.fm/45drives. If you get in touch with them and say you heard about their offer from us, you'll get a chance to win a custom front plate. So visit talkpython.fm/45drives, or just click the link in your podcast player.
13:44 Let's dive into it. I want to talk about some of the core features here. Maybe we could start with you walking us through a simple example of creating a class and then taking some data and parsing it. You've got this nice example right here on the homepage. I think this is so good to just look at; there are a bunch of little nuances and cool things that happen here that I think people will benefit from. Yeah, so you're obviously defining your class User here, with very simple inheritance from BaseModel and no decorator. I thought about that at the beginning: this should work for people who haven't been writing Python for the last 10 years, for whom decorators look like strange magic. I think using inheritance is the obvious way to do it. And then obviously we define our fields. The key thing really is that the type hint, int in the case of id, is used to define what type that field is going to be. And then if we do give it a value, as we do with name, that means the field is not required; it has a default value. And obviously we can infer the type there from the default, which is a string. Then the sign-up timestamp is obviously an optional datetime, so it can be None. And critically, here you can either enter None or leave it blank, and it would again be None. And then we have friends, which is
15:00 a more complex type: that's a list of integers. And the cool stuff is, because we're just using Python type hints, we can burrow down into lists of dicts of lists of sets of whatever you like, within reason, and it will all continue to work. And then looking at the external data, again, we see a few things like the coercion we were talking about, right? This external data is just a dictionary that you've probably gotten from an API call, but it could have come from anywhere. It doesn't have to come from there, right? Exactly, anywhere outside. But right now we've got it as far as being a dictionary. So the point here is we're doing a bit of coercion. The trivial case is converting from the string '123' to the number 123. But then, a bit more complex, there's parsing the date and converting that into a datetime object. Right, so in the data that's passed in, you've got a quoted string, 2019-06-01 plus a time. And this is notoriously tricky, because things like JSON don't even support dates; they freak out if you try to send one over. So you've just got this string, but it'll be turned into a date. Yeah, and we do a whole bunch of different things to try to handle all of the sensible date formats. There are obviously limits to how far to go. One of the things Pydantic can do is interpret integers as dates using Unix timestamps. And if they're over some threshold, about two centuries from now, it assumes they're in milliseconds. So it works with Unix milliseconds, which are often used. But it does also lead to confusion when someone puts in 123 as a date and gets 123 seconds after 1970. There's an ongoing debate about exactly what you should try to coerce and when it gets too magic, but for me, there have been a number of times when I've found it incredibly useful that it just works.
So for example, the string format that Postgres uses when you use to_json just works with Pydantic, so you don't even have to think about whether that's come through as a date or as a string until you're worried about the limits of performance. Most of that stuff just works. Yeah. Then one of the things that I think is super interesting is these friend IDs that you're passing over. You said in Pydantic it's a list of integers. In the external data, it's a list of sometimes integers and sometimes strings. But when it gets parsed, it doesn't just look at the immediate fields; it looks at, say, things inside of a list and says, oh, you wanted a list of integers here. This is a list, it has a string in it, but it's a quoted '3', so it's fine. Yeah. And this is where it gets cool, because we can go recursively down into the rabbit hole, and it will continue to validate. One of the tricky things, which I think is what most people want but where our language fails us, is the word validation. Quite often, validation sounds like: I'm checking the input data is in the form that I said. That's kind of not what Pydantic is doing. It's optimistically trying to parse. It doesn't care what the list contains, in a sense, as long as it can find a way to make that into an int. So this wouldn't be a good library to use for, like, unit testing and checking that something is the way it should be, because it's going to accept a million different things. It's going to be as lenient as possible in what it will take in. But that's by design. Yeah. The way to take this external dictionary and get Pydantic to parse it is you just pass it as keyword arguments to the class constructor. So you say User(**dictionary), and that explodes it out to keyword arguments. And that's it; that runs the entire parsing, right? That's super simple. Yeah. And that's, again, by design, to make it just the simplest way of doing it.
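The homepage example walked through above can be sketched roughly as follows. This is a reconstruction from the conversation rather than the exact code on the Pydantic site: the values in external_data are illustrative assumptions, and name is annotated explicitly (instead of relying on inference from the default) so that the sketch should also run on newer Pydantic releases.

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel


class User(BaseModel):
    id: int                               # required: no default value
    name: str = "John Doe"                # not required: has a default
    signup_ts: Optional[datetime] = None  # may be None or left out entirely
    friends: List[int] = []               # every item is coerced to int


# Illustrative external data, e.g. decoded from a JSON API response.
external_data = {
    "id": "123",                      # a string that can be coerced to an int
    "signup_ts": "2019-06-01 12:22",  # a string that is parsed into a datetime
    "friends": [1, 2, "3"],           # a mix of ints and numeric strings
}

# Passing the dictionary as keyword arguments runs parsing and validation.
user = User(**external_data)

print(user.id)         # 123
print(user.signup_ts)  # 2019-06-01 12:22:00
print(user.friends)    # [1, 2, 3]
print(user.name)       # John Doe
```

Note that the result holds real Python types (an int, a datetime, a list of ints), even though the input held strings.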
If you want to do crazy complex stuff, like constructing models without doing validation because you know the data has already been validated, that's all possible. But the simplest interface, just calling __init__ on the class, is designed to work and do the validation. Cool. One thing I think is really neat and not obvious right away is that you can nest these things as well, right? Like I could have a shopping cart, and the shopping cart could have a list of orders, and each order in that list can be a Pydantic model itself. Right, exactly. It's probably an open question how complex we should make this first example. Maybe it's already too complicated; maybe it doesn't demonstrate all the power. But I think it's probably about right. But yeah, you can go recursive. You can even do some crazy things, like the root type of a model doesn't have to be a set of fields; it can be a list itself. So there's a long tail of complex stuff you can do. But you're right that nesting different models is a really powerful way of defining types, because in reality, our models are never nice keys and values of simple types. They always have complex items deeper down. Right, that's really cool. And on top of these, I guess we can add validation. You said it's very optimistic validation, but if the friends list said 1, 2, Jane, well, it's probably going to crash when it tries to convert Jane into an integer and say, no, no, no, this is wrong. And it gives you a really good error message. It says something like: the third item in this list is Jane, and that's not an integer, right? It doesn't just go Well, David
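The failure case described here, a friends list containing something that cannot become an integer, might look like the sketch below. The model is a trimmed-down, hypothetical version of the User class from the conversation, and the exact wording of the printed message depends on the installed Pydantic version.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    friends: List[int] = []


error_locations = []
try:
    # "Jane" cannot be coerced to an int, so parsing fails.
    User(id=1, friends=[1, 2, "Jane"])
except ValidationError as exc:
    # Each error records where in the input the problem was found.
    error_locations = [error["loc"] for error in exc.errors()]
    print(exc)

print(error_locations)  # [('friends', 2)]
```

The location tuple points at index 2 of friends, which is the third item, matching the kind of message described above.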
30:00 mission has been fulfilled exactly how it is. And if it has this time, but Well, sometimes it throws errors, and sometimes it doesn't. And that like just report. I don't know, it seems like you could overly complicated as well. I think that is like a really difficult challenge in tragedy of the commons in like, someone wants a nice feature. Someone else wants that feature, you get 10 people wanting that feature, you feel under overwhelming pressure to implement that feature. But you forget that there's, I forget, you know, there's like 6000 people who started but there's like 10,000 projects that use pydantic. Those people haven't asked for that. Do they want it? Or would they actively prefer that 'Pydantic' was simpler, faster? Smaller?
30:39 Right? Part of the beauty of it is it's so simple, right? I do I get the value, I define the class, I **take the data. And yeah, it's either good or it's crashed, right? If it gets past that line, you should be happy. Yeah. And I think that like, I'm pretty determined to keep that stuff that simple. There are those who want to change who say, initializing the class shouldn't do the validation, then you should call a validate method. And I'm not into that at all. There's stuff, I'm definitely going to keep it simple in this stuff, I'm really happy to add more things. So we were talking about custom types before, I'm really happy to add virtually not virtually any, but like a lot of custom types when someone wants it. Because if you don't need it, it just sits there. And mostly doesn't affect people, right? If you don't specify a column or a field of that type, it doesn't matter. You'll never know it or care. Yeah, yeah. So there's a couple of comments in the live stream, I think maybe we can go ahead and touch on them about sort of future plans. So you did mention that there's going to be a kind of a major upgrade of v2 and Carlos out there as as any plans for 'Pydantic' to give support for 'Pyspark' data frame scheme validation, or, you know, let me ask more broadly, like any of the data science, world integration with like 'Pandas' or other, you know, NumPy, or other things like that, there's been a lot of issues on NumPy. NumPy arrays, and like validating them using using less types without going all the way to, like Python lists, because that can have performance problems. I can't remember because it was a long time ago, but people, whoever was our solution, and like 'Pydantic' is used a lot. Now in data science. If you look at the projects, it's used in by Uber and by Facebook, they're like big machine learning projects, fast Facebook, fast MRI library uses it like it's used a reasonable amount in like Big Data Validation pipelines. 
So I don't know about 'Pyspark'. So I'm not going to be able to give a definitive answer to that. If you create an issue, I'll endeavor to remember to look at it and have a look and give an answer. But you'll start your 'Pyspark' research project. Yeah. Nice. Also from Carlos really, is the what's the timeframe for v2, I, someone joked to me the other day that their release date was originally put down as the end of March 2020. And that didn't get reached? And it's still, the short answer is that like I need to, there are two problems. One is I need to set some time aside to sit there and build quite a lot of code. second problem is the number of 'open prs' and the number of issues, I find it hard sometimes to bring myself to go and work on pydantic when I have time off. Because a lot of it is like the trawl of going through issues and reviewing pull requests. And when I'm not doing my day job of writing code, I want to like write code or something fun and not have to review other people's code because I do that for a job quite a lot. So I've had like I've ever had trouble getting my like, back end and gear to go and like, work on 'Pydantic' because I feel like there's 20 hours of reviewing other people's code before I can do anything fun. And I think one of the solutions to that as I'm just going to start building v2 and ignore some pull requests, and might have to break some eggs to make an omelet. But I think that that's okay. Yeah. Well, and also, in your defense, a lot of things were planned for March 2020. Yeah. And go right. It's very true. I have sat at my desk in my office for a total of about eight hours since then. Yeah. So I haven't been back to the office in London at all. So So yeah, I would hope this year. Yeah, cool.
34:02 Talk Python To Me is partially supported by our training courses. You want to learn Python, but you can't bear to subscribe to yet another service? At Talk Python Training, we hate subscriptions too. That's why our course bundle gives you full access to the entire library of courses for one fair price. That's right, with the course bundle, you save 70% off the full price of our courses, and you own them all forever. That includes courses published at the time of purchase, as well as courses released within about a year of the bundle. So stop subscribing and start learning at talkpython.fm/everything.
34:41 And then, related to that, RiskyChance asks: where should people who want to contribute to Pydantic start? I would help you kick off this conversation by pointing out that you have tagged a bunch of issues as help wanted, and then also maybe reviewing PRs. But what else would you add to that? I think the first thing I would say
35:00 And I know this isn't the most fun thing to do, but it would help if people could help with reviewing discussions and issues, triage-type stuff. If you go onto discussions (we use GitHub Discussions, which maybe people don't even see), these are all questions you can go in and answer if someone has a problem. Lots of them aren't that complicated. I know that's perhaps not what RiskyChance meant in terms of writing code, and that's obviously, for some of us, where the fun lives, but answering these questions would be enormously helpful. You can see some of them are answered, and that's great, but there are others that aren't. And then reviewing pull requests would be the second most useful thing that people could do. And then if there are help wanted issues, just check that we still want it and that it's the right time to do it. And I do love submissions. I noticed today there are 200-and-something people who've contributed to Pydantic. So I do my best to support anyone who comes along, however inexperienced or experienced, building features or fixing bugs. Yeah, fantastic. Another thing I want to talk about is the validating decorator. Well, let's talk about validators first. We touched on this a little bit. One thing you can do is write functions that you decorate with this validator decorator, which says this is the function whose job is to do a deeper check on a field, right? So you can say this is a validator for name, or for username, or for email, or whatever. And those functions are ways in which you can take better control over what a valid value is and stuff like that, right? Yeah, but you can do more. You don't have to just be strict, as in raising an error if it's not how you want it; you can also change the value that's come in.
So you can see in the first case, name must contain space, we check whether the name contains a space as a dummy example, but we also return title, so we capitalize the first letter. So you can also change the value you're going to put in. Coming back to the date case we were hearing about earlier, if you knew your users were going to use some very specific date format, say day of the week as a string, followed by day of the month, followed by year in Roman numerals, you could spot that with a regex and have your own logic to do the validation. And then if it's any other date, pass it through to the normal Pydantic logic, which will carry on and do its normal stuff on strings. Cool. Now, this stuff is pretty advanced. But you can also do simple stuff like set an inner class, which is a Config, and just set things like strip the whitespace off any string, or lowercase all the strings, or stuff like that, right? Yeah. And there's allow mutation, which you've got to there, which is super helpful, where we can stop fields being changed. And we modified extra there, which is something people often want: what do we do with extra fields that we haven't defined on our model? Is it an error? Do we just ignore them? Or do we allow them and just bung them on the class, where we won't have any type hints for them, but they're there if we want them? Yeah, very cool. Okay. So the other thing I wanted to ask you about is really interesting, because part of what I think makes Pydantic really interesting is its deep leveraging of type hints, right? And in Python, type hints are a suggestion. They're things that make our editors light up. They're things that, if you really, really tried (I don't think most people do this), you could run something like mypy against it.
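The validator and Config ideas discussed above can be sketched as follows. This is a minimal sketch, assuming the validator decorator and inner Config class from the Pydantic release current at the time of recording; newer releases may prefer different names and may emit deprecation warnings for these. The model and its single field are hypothetical.

```python
from pydantic import BaseModel, ValidationError, validator


class User(BaseModel):
    name: str

    class Config:
        # Treat fields that are not defined on the model as an error
        # instead of silently ignoring them.
        extra = "forbid"

    @validator("name")
    def name_must_contain_space(cls, value):
        # Validators receive the class (cls), not an instance.
        if " " not in value:
            raise ValueError("must contain a space")
        # A validator can also change the value, not just reject it.
        return value.title()


print(User(name="samuel colvin").name)  # Samuel Colvin

rejected = []
for bad_input in ({"name": "samuel"}, {"name": "samuel colvin", "age": 30}):
    try:
        User(**bad_input)
    except ValidationError as exc:
        # Record which field each input was rejected for.
        rejected.append(exc.errors()[0]["loc"])

print(rejected)
```

The first bad input fails the custom check on name; the second is rejected because age is not a field on the model.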
And it would tell you if it's accurate or not. I think most people just put it there as extra information; you know, maybe PyCharm or VS Code tells you you're doing it right, or it gives you better autocomplete. But under no circumstance, or almost no circumstance, does having a function called add that says (x: int, y: int) only work if you pass integers, right? You could pass strings to it and probably get a concatenated string out of that Python function, because it's not like C++ or something where it compiles down and checks the thing, right? But you also have this validating decorator thing, and it seems to me like this will actually add that runtime check for the types. Is that correct? That's exactly what it's designed to do. It's always been a kind of interest to me, almost a kind of experiment, to see whether this is possible, whether we could have semi-strictly typed logic in Python. I should say, before we go any further, this isn't to be used on every function. It's not like Rust, where doing that validation actually makes it faster. This is going to make calling your function way, way slower, because inside validate_arguments we're going to go off and do a whole bunch of logic to validate every field. But there are situations where it can be really useful, where creating a Pydantic model would be a bit onerous, but where we can just bang on the decorator and get some validation kind of for free. Right, because the classes basically do the same thing as this decorator does, but instead of having a class you have arguments. And under the hood, what validate_arguments is doing is inspecting that function, taking out the arguments, building them into a Pydantic model, then running the input against that internal Pydantic model, and then using the result to call the original function.
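The add example from this exchange can be sketched like this. It assumes the decorator is importable as validate_arguments, as in the release discussed here; the fallback import is an assumption added so the sketch also runs on newer releases, where the equivalent decorator is named validate_call. The function names are hypothetical.

```python
try:
    # Newer Pydantic releases expose the decorator as validate_call.
    from pydantic import validate_call as validate_arguments
except ImportError:
    # The release discussed in this episode calls it validate_arguments.
    from pydantic import validate_arguments

from pydantic import ValidationError


def add_unchecked(x: int, y: int) -> int:
    # Plain Python ignores the type hints at runtime.
    return x + y


@validate_arguments
def add(x: int, y: int) -> int:
    # Arguments are parsed against the type hints before this body runs.
    return x + y


print(add_unchecked("1", "2"))  # 12 (string concatenation)
print(add("1", "2"))            # 3 (both strings coerced to int)

try:
    add("one", 2)
except ValidationError as exc:
    # "one" cannot be coerced to an int, so the call is rejected.
    print(exc)
```

As noted in the conversation, this adds overhead to every call, so it fits best at the outer boundary of a library rather than on every internal function.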
40:00 Yeah, that sounds like more work than just calling the function, for sure. It depends on how much it does, right? Does it cache, that kind of stuff? Does it cache the class that it creates when it decorates a function? Yeah, same as we do in other places. But yes, it's still a lot more work. Pydantic is fast for data validation, but it's data validation, not compilation. Yeah, so maybe this would make sense if I'm writing a data science library, and at the very outer shell I pass in a whole bunch of data, and then it goes off to all sorts of places. Maybe it might make sense to put this on the boundary, the entry point type of thing, but nowhere else. Yeah, exactly, where someone's going to find it much easier to see a Pydantic error saying these fields were wrong, rather than seeing some strange matrix that comes out the wrong shape because something was a string, not an int. Right, or NoneType has no attribute such and such. Yeah, whatever, that standard error. Okay, that's pretty interesting. Let's talk a little bit about speed. You have talked about this a couple of times, but maybe it's worth throwing up a simple example here to put it together. So we've got Pydantic, we've got attrs, we've got valideer, which I've never heard of but is very cool, Marshmallow, and a couple of others like Django REST framework and Cerberus. So it has all of these in relative time against some benchmark code that you have, and it basically gives it as a percentage or a factor of performance, right? Yeah. And the first thing I'll say is that there are lies, damned lies, and benchmarks; you might well get different results. But my impression from what I've seen is that Pydantic is as fast as, if not faster than, the other ways of doing it in Python, short of writing your own custom code in each place to do manual validation, which is a massive pain.
And if you're doing that, you probably want to go write it in a proper compiled language anyway, right? Right. Or maybe just use Cython on some little section, something like that, right? Well, all of Pydantic is compiled with Cython and is about twice as fast. If you install it with pip, you will mostly get the compiled version. There are binaries available for Windows, Mac, and Linux (Windows 64-bit, not 32), and maybe some others, and it will compile on other operating systems. So it's already faster than just calling Python. Wow. I don't know whether validation with Pydantic compiled is faster than raw Python, but it'll be of the same order of magnitude. Yeah, fantastic. Okay, I didn't realize it was compiled with Cython; that's great. That's part of the magic of making it faster. Yeah. So that was David Montague; a year and a half ago he put an enormous amount of effort into it, and that doubled the performance. It's Python compiled with Cython rather than real Cython code, so it's not C speed, but it's faster than just calling Python. Yeah, absolutely. And with Cython taking the type hint information and working so well with it these days, it probably was easier than it used to be, or didn't require as many changes as it might otherwise. I think it's an open question whether Cython is faster with type hints. In places, actually, adding type hints makes it slower, because it does its own checks that a thing is a string when you've said it's a string. But yeah, I think it does use them in places. Yeah, I meant more that you don't have to rewrite it in Cython, to convert Python code to Cython code, where it has its own sort of descriptor language. If you have Python code that's type annotated, it'll take that in and run with it. These days, I think it isn't much faster or better because of the type hints, although someone out there is an expert, and I don't want to speak for them. So I'm not sure.
Alright, another thing I want to touch on is the datamodel-code-generator. Do you want to tell us about this thing? What is it? I haven't used it much. But what we haven't talked about is JSON Schema, which is what Sebastián Ramírez implemented a couple of years ago, when he was first starting out on FastAPI. One of the coolest features of FastAPI and Pydantic is that once you've created your model, you don't just get a model and model validation, you also get a schema generated for your model. And in FastAPI that's automatically turned, with ReDoc, into really smart documentation. So you don't even have to think about documentation most of the time, if it's internal or it's not a widely used API. And if it is widely used, add some docstrings, and you've got yourself amazing API documentation straight from your model. And datamodel-code-generator, as I understand it, is generating those JSON Schema models. Is that right? I think so. It feels to me like it's the reverse of what you described from what Sebastián created, right? Given one of these OpenAPI definitions, it will generate the Pydantic model for you. So if I was going to consume an API, and I'm like, well, I've got to write some Pydantic models to match it, you could run this thing to say, give me a good shot at getting pretty close to what it's going to be. Yeah, yeah, I had it the wrong way round. My instinct is (I haven't used it) that it does 90% of the work for you, and then there's a bit of manual tinkering around the edges to change some of the types, I suspect, but
45:00 Like, yeah, really useful. Yeah. And it supports different frameworks and stuff. I haven't used it either, but it seemed like a cool related thing to quickly get people started if they've got something complex to do with Pydantic. So for example, I built this real-time live weather data service for one of my classes over at weather.talkpython.fm. I built that in FastAPI, and it exchanges Pydantic models. And all you've got to do to see the documentation is go to /docs, and then it gives you the JSON schema. So presumably I could point that thing at this, and it would go back the other way. Exactly, you'd get a fairly complicated Pydantic model prebuilt for me, which I think is pretty excellent. Yeah. Alright, so maybe you disagree, but I think the ReDoc version of the documentation, of the auto docs, is even smarter than that. I don't know if you've got it? Yeah, that one, I think. Oh, yeah, this is a really nice one. I like this one a lot. It even gives you the responses, where it could be 200, or 42, which I did build into there, but I didn't expect it to actually know. That's pretty interesting. Yeah, it's very cool. So they're both there, at either /docs or /redoc. FastAPI will build them both; you can switch one off or change the endpoints. Yeah. And by the way, if you're putting out a FastAPI API and you don't want public documentation, make sure that you set the docs URL and the ReDoc URL to None when you're creating your app or your API instance. That's always on unless you take action, so you'd better be sure. Well, you can do what I've done, which is protect it with authentication, so the front-end developers can use it, but it's not publicly available. So if you're building, like, a React app, it's really useful to have your front-end engineers be able to go and see that stuff and understand what the fields are.
But it's a bit of a weird thing to to make public, even if it's nothing like particularly sensitive. So yeah, you can put in mind authentication. Yeah, very good. All right, you already talked about the 'PyCharm' plugin, but maybe give us a sense for why do we need a 'PyCharm' plugin, I have 'PyCharm'. And if it has the type information, a lot of times it seems like it, it's already good to go. So what do I get from this PyCharm plugin? Like, why should I go put this in? So once you've created your model, if we think about the the example on the on the Index page, again, we would, once we've created our model, accessing '.friends' or '.id', or '.name' will work and 'Pydantic' Well, sorry, PyCharm will correctly give us information about we'll say, Okay, first name exists, like FUBAR name doesn't exist, it's a string, so it makes sense to add it to another string. But when we initialize a model, it doesn't know how, like the in it function, if 'Pydantic' just looks like take all of the things and pass them to some magic function that can do it. It looks like '**kW orgs'. Good luck. Go read the docs. Exactly. And but this is where the 'PyCharm' plugin comes in, because it gives you documentation on the arguments. Okay, so it looks at the fields and their types and says, Well, these are actually keyword arguments to the constructor of the initializer. Yeah, okay. Yeah, got it. It's very cool to completion. And it will also, I don't even know what it does it, I just use it the whole time. But it works. You know those things, but you don't even think about them. But yeah, cool. So it gives you auto completion and type checking, which is cool for the initializer. Right. 
So if you were to try to pass in something wrong, it would tell you. Also, it says it supports refactoring, if you refactor a keyword. One of the really useful things it does: when we talked about validators, which are done by a decorator, they are class methods, very specifically. You might think that they're instance methods and you have access to self; you don't, because they're called before the model itself is initialized. So the first argument to them should be the class, 'cls'. It will automatically give you an error if you put 'self', which is really helpful when you're creating validators, because otherwise, without it, 'PyCharm' assumes it's an instance method and gives you 'self', and then you get yourself into hot water when you access 'self.userID' and it breaks. Oh, interesting. Okay, yeah, that makes a lot of sense, because it's converting and checking all the values, and then it creates the object and assigns the fields, right? Okay. Yeah. So we can access other values during validation from the 'values' keyword argument to the validator, but not via, like, 'self.userID' or whatever. Yeah, cool. And Risky Chance loves that it works with aliases too, which is pretty cool. Oh, yeah, it does. It does lots of cool things. I'm really impressed by it. It's one of the coolest things to come out of 'Pydantic'. Awesome. Yeah, I've installed it, and I'm sure my 'Pydantic' experience is better, but I just don't know what is built in and what is coming from this thing. So yeah. We're also so used to 'PyCharm' just working on so many things that you don't even notice. Like, you only notice when it doesn't work. So yeah, absolutely. So we're getting a little short on time, but I did want to ask you about Python devtools, because you talked about having 'Pydantic' work well with devtools as well. Yeah. What is this? You're also the author of Python devtools. Yeah, yeah. What is this? For me, it's just a debug print command.
50:00 It prints pretty, in color, and it tells me what line it was printed on. And I use it the whole time in development instead of print. Obviously, I wanted it to show me my 'Pydantic' models in a pretty way. So it has integration: there are some hooks in devtools that allow you to customize the way stuff's printed. And I actually know that the author of Rich, slightly frustratingly, has used a different system all over again. But that's also supported by 'Pydantic'. So 'Pydantic' will also print pretty with Rich as well as with devtools. Yeah, cool. Okay. Really nice. Yeah. Rich is a great TUI, terminal user interface, library for Python. Yeah, it's cool. It's different from devtools; I wouldn't say they compete. Devtools does have some other things, some timing tools and the formatting. But for me, it's just the debug print command that Python never had. So what's the 'Pydantic' plugin here, or the connection, rather? So if I debug a model out of devtools, I get a really nice representation? Yeah, exactly. It's not showing it here, because you're in the devtools docs; there are some other docs that give you an example. But it'll give you a nice example of it expanded out rather than, like, squashed into one line. So use it with that.
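Returning to the validator point from a few minutes earlier, here is a rough sketch of a validator that reads another field through 'values' rather than through 'self'. It uses the 'validator' decorator from the 'Pydantic' 1.x line discussed in this episode; newer releases still accept it but deprecate it in favor of 'field_validator'. The 'Signup' model and its field names are hypothetical.

```python
from pydantic import BaseModel, ValidationError, validator


class Signup(BaseModel):
    password: str
    password_repeat: str

    # Validators are class methods: the first argument is cls, not self,
    # because no model instance exists yet while validation is running.
    @validator("password_repeat")
    def passwords_match(cls, v, values):
        # 'values' holds the fields that were validated before this one.
        if "password" in values and v != values["password"]:
            raise ValueError("passwords do not match")
        return v


ok = Signup(password="s3cret", password_repeat="s3cret")

try:
    Signup(password="s3cret", password_repeat="typo")
    error_message = None
except ValidationError as exc:
    error_message = str(exc)
```

The check for "password" in 'values' matters because a field that failed its own validation is left out of 'values'.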
51:09 Got it. Yeah. So you see that here? That's being printed out nicely, instead of, yeah, that. I suppose that kind of demonstrates its usage for me. Yeah, perfect. That looks really good. It's nice to be able to just print out these sorts of things and see them really quickly. What's the just basic string representation of a 'Pydantic' model? Like, for example, if I'm in 'PyCharm' and I hit a breakpoint, or I'm just curious what something is, and I just print it, right? Like, 'PyCharm' will put a little grayed-out string, 'str', representation. Is it that thing? That's the string representation you're looking at right there. Yeah, perfect. So you get a really rich sort of view of it embedded in the editor, or if you print it. Or if you use 'repr', then you get basically that wrapped in 'User'. Right. Okay. So it gives you, as if it were, yeah, you're trying to construct it out of that data? Yeah. Okay, fantastic. Well, you know, we've covered a bunch of things, and I know there's a lot more. I don't recall whether we talked about this while we were recording, or whether we talked about it before, when we were just setting up what we wanted to talk about. But it's worth emphasizing that this is not just a 'FastAPI' validation and data exchange thing. It works really great there, and a lot of the stuff happens there. But if you're using 'Flask', if you're using 'Pyramid', I don't know about Django so much, because of the models and stuff there, but there is someone who can go Django admin. So one of the things we haven't talked about as well is settings management, which 'Pydantic' has some pretty powerful features for. And actually, one of the things that was added in 1.7 or 1.8 was, basically, a system for plugins to do even crazier stuff in settings. So not just loading them from environment variables and from '.env' files, but also from Docker secrets. Now we have an interface to load them from kind of anywhere.
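The difference between the plain string form and the 'repr' form described above can be seen with a small, hypothetical 'User' model; the exact spacing of the output may vary between 'Pydantic' versions.

```python
from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str


user = User(id=1, name="Jane")

# str() gives just the field values; repr() wraps them in the class name,
# so it reads like the call that would construct the model.
as_str = str(user)
as_repr = repr(user)

print(as_str)
print(as_repr)
```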
So you can build your own interface for loading settings from places. But someone's built a Django settings tool with 'Pydantic' to kind of validate your Django settings using 'Pydantic'. But yeah, I think what's cool about 'Pydantic' is, it's not part of a kind of walled garden of tools that all fit together well and have to be used with each other. It fits with 'FastAPI', but it's used in lots of other big projects, or you can just use it in Flask or in Django, or wherever you like. Right? If you're reading JSON files off a disk, it could totally make sense to use it, or you're doing screen scraping, potentially it makes sense, or just calling an API, but you're the client of that API, it could totally make sense to do that. Yeah. So I just want to point out, like, it's super broadly applicable, not just where people see it being really used. And Nick, out there: definitely gonna try this with Django. So awesome. That's cool. Yeah. So let's wrap this up. I know we spoke a little bit about v2 and the timing. Like, what are the major features, what are the highlights that people should look forward to or be excited about? There's a big problem at hand, which is Python 3.10. At the moment, in PEP, I'm going to try and remind myself of the exact number, but, like, in PEP 563, basically, all type hints become strings instead of Python objects. And that's been available in '__future__', right? Is that the lazy evaluation of the annotations, something like that? But it's not even a lazy evaluation. It's a non-evaluation. And it seems like, unless Python themselves, the core team, are willing to move on this and, like, be practical about things, it might be that 'Pydantic' becomes either, like, hard to use or even not useful in 3.10. It sounds really extreme. I'm talking at the Python Language Summit at PyCon US in May, in the, like, language bit where people discuss it. And I'm going to try and put this forward.
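A standard-library sketch of the concern: the quoted annotations below imitate what PEP 563 does to every annotation, so a library that reads them at runtime gets strings and has to resolve them itself. The class names are hypothetical, and this only illustrates the mechanism, not how 'Pydantic' handles it internally.

```python
import typing


class User:
    # Written as strings by hand to mimic annotations under PEP 563.
    id: "int"
    name: "str"


# What a runtime library sees if it reads the annotations directly.
raw = dict(User.__annotations__)

# typing.get_type_hints() evaluates the strings back into real types.
resolved = typing.get_type_hints(User)


def build_holder():
    class Inner:
        pass

    class Holder:
        item: "Inner"  # Inner only exists inside this function

    return Holder


# Resolution looks names up by scope, so a type defined locally inside a
# function cannot be found later from the outside.
try:
    typing.get_type_hints(build_holder())
    local_resolves = True
except NameError:
    local_resolves = False
```

The last case is the kind of breakage being described: once an annotation is only a string, recovering the real type depends on the name still being reachable at the point where it is evaluated.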
But, like, I had a conversation today, just before I came online now, with someone who's raised the PEP that should fix this, but the current response from the core developers is to refuse it. So I'm, like, really worried and frustrated that that might happen, and lots of tools, 'FastAPI', 'Pydantic', 'Typer' and others, are gonna get broken.
55:00 For the sake of principle, effectively: that typing should only be used for static type analysis. So we'll see what happens. And normally with open source, people find a way around. But, like, I think that's really worrying. And I'll create an issue on 'Pydantic' to track this properly, but it's something to be aware of. And it's something that, like, I think those of us who use these libraries need to watch. It's very easy to wait until after something's released and then be frustrated. It's important sometimes to notice before they're released and make a point. Wow. Well, I'm really glad you pointed that out. I had no idea. I mean, I knew there were minor behind-the-scenes changes, from a consumer perspective, of type annotations. But that sounds like there's more going on for libraries like this. There's a PEP that will fix this, which is PEP 649, which I have not yet read, because I only got the email about it two hours ago. But if anyone's looking into it, I will create an issue on 'Pydantic' to talk about this. But something like this needs to happen. Also, Larry emailed me an hour or two hours ago to talk about this. But this is a really big problem that we need to, like, prevent from breaking lots of cool stuff that's happening in Python. All right. Well, I agree. First impression is I absolutely agree, because I do think what you guys are doing, what you're doing with 'Pydantic', what is happening with 'FastAPI' and these types of systems, is a really fantastic direction, really building on top of the type annotation world. And I would hate to see that get squashed. What's incredible about it, just briefly, is that it's used by Microsoft in core bits of Office. It's used by the NSA, by, like, banks, it's used by JP Morgan. But it's also really easy to get started with at the very beginning.
And it's wonderful for me that we can build open source code that can be useful to the biggest organizations in the world and to someone when they're first getting started. Not this idea that it has to be either, like, dense and mainframe and impossible, or, like, Mickey Mouse and not worth using, right? Like, 'FastAPI' and 'Pydantic' seem to be managing to be both. I agree. I think they are. Absolutely. So, well, congratulations on building something amazing. And thank you very much. Hopefully PEP 649 keeps things rolling smoothly; hopefully that gets ironed out. All right. Now, we're pretty much out of time. But before I let you out of here, let me ask the final two questions. So if you're going to write some code, if you're going to work on 'Pydantic', what editor do you use? I use 'PyCharm'. Right on. And the 'Pydantic' plugin, I'm guessing? And the 'Pydantic' plugin. And then if you've got a package on 'PyPI' that you think is interesting, maybe not the most popular, but you're like, oh, I ran across this thing, that's amazing, you should know about it? I should probably not break the rule and talk about my own, but, like, devtools, which I talked about, is incredibly useful to me. And so I would spread the word a bit on that. Other than that, I'd just do a shout out to all of those packages that people don't see, that are the bedrock of everything. So from coverage to 'Starlette', which is the other library that's the basis of 'FastAPI'. Sebastián is great, and I mean no offense to him, but he stands on the shoulders of people who've done lots and lots of other things, and they're really, really powerful. So I would spare a bit of time for them. If you're thinking of sponsoring someone, think about, like, sponsoring Ned, who does coverage, or any of those others, or bits of 'pytest', all the workhorses that aren't particularly headline, but are really, really valuable to all of our, like, daily life writing code. Yeah.
And a comment out there: many of us know about 'Pydantic', right, because of 'FastAPI'. I agree. But you know, 'FastAPI', as you pointed out, absolutely stands on top of 'Starlette', which, you know, there's just this whole chain of things that each one has their own special sauce. Right, yeah. But I should say again, 'FastAPI' is awesome. I didn't use it initially. I'm a contributor to 'aiohttp', which is also really cool. But I've, over the last year, become a complete convert to 'FastAPI'. I use it. It's my, like, go-to tool now. So it's awesome. Yeah. Fantastic. All right, final call to action. People want to check out 'Pydantic', maybe they want to contribute to 'Pydantic', what do you tell them? Go and have a read through the docs. And yeah, go from there. If you can make a tweak to the docs to make them easier to read, if you can answer someone's question, or even create a feature, that would be awesome. All right. And if I'm not there immediately, and I don't reply for weeks, I'm sorry, and I promise to as soon as I can. Fantastic. All right, Samuel, thanks for being on the show. It's been great to learn more deep information about 'Pydantic', because it's so simple to use, it's easy to just skim the surface. Awesome, Michael, thank you very much. Yeah, you bet. Bye. Bye. Bye. This has been another episode of Talk Python to Me. Our guest in this episode was Samuel Colvin. And it's been brought to you by '45Drives' and us over at Talk Python Training. Solve your storage challenges with hardware powered by open source. Check out 45Drives storage servers at talkpython.fm/45drives and skip the vendor lock-in and software licensing fees. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription
01:00:00 in sight. Check it out for yourself at 'training.talkpython.fm'. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /iTunes, the Google Play feed at /play, and the direct RSS feed at /RSS on talkpython.fm. We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/YouTube. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.