
#328: Piccolo: A fast, async ORM for Python (updated) Transcript

Recorded on Thursday, Jul 22, 2021.

00:00 ORMs are one of the main tools to put first-class data access in the hands of non-SQL-loving developers, and even for those who do love SQL, they make them way more productive. When you hear about ORMs in Python, we often hear about either SQLAlchemy or the Django ORM. And we should, they're great. But there are newer ORMs that take better advantage of modern Python. On this episode, you'll meet Daniel Townsend, the creator of Piccolo ORM, a great ORM that is async first but also has a synchronous API. It has a super clean query syntax, and it's easy to learn. This is Talk Python to Me, episode 328, recorded July 22, 2021.

00:52 Welcome to Talk Python to Me, a weekly podcast on Python: the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy, keep up with the show and listen to past episodes at 'talkpython.fm', and follow the show on Twitter via @talkpython. This episode is brought to you by Linode and us over at Talk Python Training, and the transcripts are brought to you by 'Assembly AI'. Please check out what we're offering during our segments. It really helps support the show.

01:21 You want to learn Python, but you can't bear to subscribe to yet another service? At Talk Python Training, we hate subscriptions too. That's why our course bundle gives you full access to the entire library of courses for one fair price. That's right, with the course bundle, you save 70% off the full price of our courses, and you own them all forever. That includes courses published at the time of the purchase, as well as courses released within about a year of the bundle. So stop subscribing and start learning at 'talkpython.fm/everything'. Dan, welcome to Talk Python to Me. Yeah, thanks for having me. I'm a big fan of the show, so it's kind of a little dream to be on it. Oh, how nice. That means a lot. Thank you. It's great to have you here. And you built some really neat software that I'm looking forward to diving into. It's interesting, because we'll talk about your ORM, but there are also so many different areas that open up now that we're kind of past Python 2. There are a lot of folks who are saying, you know what, we can really put Python 2 behind us, and let's just build for the future. That just opens up so many doors, right? Like, oh, type hints. Of course type hints, why not type hints? Well, Python 2 is why not type hints, but not anymore, right? And async and all these other things. It's really great to have frameworks like yours coming along that just go: modern Python, what can we build now? Yeah, I totally agree. I felt like Python 3.6 kind of closed the door on the "should we still be using Python 2" conversation, because by that point it had async, type annotations, data classes (I think with Python 3.7), enums, it just goes on and on. It just felt like the Python community knocked it out of the park with Python 3.6. And I have a theory, completely unsubstantiated, that actually a lot of the progress has to do with f-strings. So many people are like, I just want to easily format this code. What do I need? 
I need Python 3.6? Fine. Well, actually, I want to switch. And it's such a minor feature. So minor, but the pain of going back would be horrible. I couldn't imagine using Python 2 now, actually. I know. Yeah, same. A quick welcome to some folks on the live stream: Chris May, hey, happy to have you here. Paul Everitt, so glad you could stop by. Teddy as well. We'll get some questions in from y'all, I'm sure. Now, before we get into Piccolo and all the cool stuff that I was alluding to there, let's just start with your story. Dan, how did you get into programming and Python? Yeah, so I've been programming for quite a long time now. I might look quite youthful on this camera, but that's just because my webcam's got a filter on it. The "touch up my appearance" option is checked. Yeah, I've actually been programming with Python for about 14 years. Django was kind of one of my first frameworks, and I just really fell in love with it. In university I learned C, and then it's like, oh, this world of Python exists, and it's so much easier than C. So I've used it extensively since then. I've mostly been working for startups and design agencies, and then really fell down the async rabbit hole over the last few years. I really enjoy working with WebSockets and building kind of interactive front ends, and I'm also really excited by the way that Postgres continues to grow and add new features. So those are the things that really excite me in the Python world. Yeah, those things are super exciting. And the web frameworks are coming along to embrace the language features, you know, async and await. We had asyncio in Python 3.4, but it didn't really gain a ton of adoption, because it sort of had this callback style of working, which is not the same as, you know, await a thing and just keep rocking. I totally agree. 
So I think they pretty much took what Twisted had created and ported it over, and, as you say, it was all callbacks. It was still good to have that option in the standard library. But I totally agree, async and await really kind of set it on fire, because asyncio was quite hard to approach at first. With async and await, you don't have to think about how you write code differently. You just put the awaits in the right spots and then, you know, everything kind of flows. I know it's not the same, but how you think about structuring your code is really similar. Yeah, and I think the maturity is there now with the library.

05:00 Using the standard library, and it feels like you can just dive in, and it feels a bit like a no-brainer now to use async. Yeah, I think it is a no-brainer for sure. So it's interesting that you got into Django in the early days, because Django is the only major web framework that says, here's your ORM, here's how you use it. As opposed to Flask, which says, do what you want, or Pyramid, which says, do what you want. You know, you could use an ORM, but you don't have to. But with Django, it's part of the culture and the Zen of it, right? Yeah, I'm a big fan of Django. I think it's kind of a masterpiece, because it has stood the test of time so remarkably well. You could have written an app 12 years ago, and then in an afternoon you could have upgraded it to the latest version. So there's a lot to be said for Django. And I really like the tight integration. People tend to prefer one or the other. There's the Flask route, where you can make all of your choices, but I like really tightly integrated solutions, where you come to a problem and you pretty much know what to pick. Yeah, so that's definitely inspired Piccolo to some extent. I love the diversity of technology and libraries and ways of doing things in the Python space. But when people come from other technologies, it's a challenge to go, oh, here are 20 different ORMs you could use in 7 different web frameworks. And they're like, I don't want seven. I want to know what I should be doing. What should I do? Right, it's a mixed bag. I really like that. But the thing is, if you come from the JavaScript world, then it's a relief, because you've only got 20, not 2000. Oh, my goodness, this one's seven months old. It's, like, ancient. Can we still use it? Yeah, that's a whole different discussion, the JavaScript churn over there. Yeah. But let's focus in on what you built here. So let's talk about Piccolo. You describe it as a fast, async ORM for Python that's easy to learn. 
And I think it really hits on all those points. Tell us about it. Yes. So all those terms are fairly aspirational. The "fast" is because it's built on asyncpg. This is a really fast Postgres adapter that the MagicStack guys built, and the reason it's so fast is that it's written in Cython, so it's compiled. And then there are also a few other features in Piccolo which kind of help it speed-wise, like frozen queries, where it doesn't have to generate the SQL each time; it kind of caches some of it. So that's where the "fast" comes from. "Async": it supports asyncio. That was really one of the core reasons for building it, because I was a Django Channels power user, and I had to build a lot of chat applications in my day job. So for me, async was kind of essential. And "easy to learn" is also aspirational. I've put as much effort as I possibly can into the documentation and stuff like that, but I'll let the readers be the judge of that. Yeah, that's in the eye of the beholder, right? But at the same time, you have done a lot of things that I don't see others doing. The documentation is good. Obviously, that's pretty standard on a lot of the different frameworks, although they can be super intense, right? Like, I love SQLAlchemy, but when I go and read theirs, I'm like, okay, here's a 20-page doc, I'd better pay attention and turn off my music and just not miss something. On the other hand, what you are doing that's pretty unique is you've got a playground. You've got a playground for the admin backend. And oh, by the way, there's an admin backend. You've got a playground for trying out queries and stuff like that, right? That's pretty unique. Yeah, so I was really excited by the playground concept. It's something I've borrowed from the Swift world. Because Apple is trying to make Swift accessible to newcomers, they have this concept of a playground, which is basically a pre-populated chunk of code. 
And then you can just play around with it, as the name implies. So I use IPython to achieve it. A lot of people know IPython from just running it on the command line like a terminal, but you can actually embed it within your Python code. So the way the playground works is it creates a SQLite database, loads an example schema, populates it with data, and then basically launches the IPython shell. So you basically have all of that set up. Because one of the problems with learning an ORM is that the barrier is huge. When you think about it, for a newcomer, you have to set up the database, and you have to understand schemas and migrations, and running them, and populating data. So I just wanted to say, look, I'm going to shortcut all of that. You install Piccolo, you do 'piccolo playground run', and then you've got an example schema. And then you can actually follow along with the documentation. So whenever there's an example query in the Piccolo docs, you can actually run it in the playground, and you should see the results. Yes, I'm quite excited about that. I like it a lot. And you're right that it is challenging. You know, as we build up experience, we get used to it. It's like, oh yeah, you just fire up the database server, and then you connect to it, and then you do this thing. But how do you get the database server installed? What if you have the wrong version? What if it requires authentication? What if there's a firewall? There are just layers and layers of places where people get sheared off: oh my God, this doesn't work for me, right? And so keeping that simple is great. I just want to give a shout-out to SQLite, because what you just said is such a good example. You're probably going to run this against Postgres or something like that. Yeah. But for the little example, you don't need that. You can just use SQLite. 
And it's so nice that it's serverless, in the sense that it's just embedded. It doesn't require a separate process to run or connect to. It already comes with Python, generally speaking. It's really nice. Yeah, I totally agree. And that's the reason for SQLite support within Piccolo, because, as you said, no one's going to run it in production, but it's just that frictionless setup which is so great. Using SQLite asynchronously doesn't really make a lot of sense, because you don't really have any network lag talking to the database. So it's great to have it in there for convenience. Yeah, it is. I suspect there

10:00 may be some people running it in production at, like, an early, early prototype stage. Yes. We want to get something up here, let's just put it up on, you know, some hosting place and check it out real quick, and then we'll go from there. Yeah, one thing that's quite nice is something I used to do in a design agency. We'd use SQLite for the prototype, and then you could also sync it down to your local machine. So you didn't have to do a database dump and reload it, you could literally just rsync the file. Yeah, there are a bunch of conveniences with SQLite, for sure. Yeah, until it gets to be too much data, or you want too much concurrency, or, yeah, multi-machine scale-out and all that sort of stuff. So it's not the final destination. But I do think it's a really interesting and important starting ground there. Yeah, for sure. So I gave a shout-out to two ORMs already, the Django one and SQLAlchemy. And I feel like those are probably the two big hitters of the Python space, although there are many, many more. Why not just use those? Like, why go and create it? Yeah. So when I started it, there were no async options for ORMs at all. It was all very new. And my day job involved async programming all the time, because of chat apps, online games, that kind of stuff. So that was really why. Nothing really served my needs at the time. Right, because until recently SQLAlchemy had no async capabilities. Yeah. And the Django ORM more or less still doesn't, right? That's the final part that they're trying to move to async. Yeah, so the workaround is to run it in a thread, but it's just not as performant as running with asyncio directly. So there was kind of this vacuum for a time with async ORMs. But then it was also about just starting with a blank sheet of paper and thinking, look, we've got all these great tools to work with now in Python 3.6 and onwards, how can I absolutely push it to the limit? 
So the syntax uses a lot of direct references to objects rather than strings. For example, in Django, you do MyTable.objects, and then .filter, and then name equals Dan. The problem with that is you can quite easily make a mistake as the code changes. You might end up referring to a column which no longer exists. Strings can be a little bit fragile. Whereas with Piccolo, it's just object references everywhere. So rather than name, and there's a nice example here, it's Band.name, where Band is the name of the table. Right, so like many of the ORMs it's sort of class-based, OOP-based, right? There's a class that maps to a table in the database, right? Yeah. So that's one distinction I'd like to make quickly as well. There are ORMs and query builders. In Python, there's not really too much of a distinction. But in other languages, like JavaScript, query builders are really popular as a separate idea to ORMs. With query builders, you create a SQL query, you execute it on the database, and then you tend to work more with dictionaries and lists rather than objects. So I'd say that 90% of Piccolo is a query builder, but people are used to ORMs in Python. And ORMs can actually lead you to some quite poor patterns, because the first thing people learn with Django is they'll go MyTable.objects.get, so it returns an object, and then they'll change an attribute of the object, and then they'll call save. But that's actually two database queries, because you have to get it and then you have to update it. Well, and it's not just two database queries, it can be a lot of serialization. Yeah. Like, the slowest part of all of my data stack is, if I were to try to get back 10,000 records at a time, the deserialization of those 10,000 things. That is actually the slowest part of the entire process. 
And here, with what you're talking about, like the band: if the band has tons of information and you just want to change the name, you're pulling all that data back, converting it or whatever, then changing that one thing, and then pushing it back down. Yeah, a lot of the time, what will happen is you'll deserialize it into an object, and then you'll serialize it into JSON anyway, and it's kind of pointless when you think about it. And there's also the problem with objects getting stale. You might pull an object into memory, but then some other user might manipulate that same object. And then when you save it, you're going to overwrite the fields that have been updated in the database. So it's problematic in a way. So it is an ORM, but I'd really encourage people to look at it as a query builder too, because in my own apps, I use the select method a lot. Rather than returning objects, it returns dictionaries, and then it has this optional output method, so you can just literally serialize it straight into JSON. This is what makes it fast, because the query is going through asyncpg, which is super fast. And then it's coming back as a dictionary, straight from asyncpg, and then it's using orjson to stick it straight into a string. So you're skipping all this deserialization nonsense. Yeah, this is super interesting. Now that you point it out, I'm not used to seeing projections in ORMs. What I'm used to seeing is: give me back the classes, in this case a band, right? Do a query like "where the popularity is greater than 100" or whatever, give me a whole bunch of band objects back, and then I'm going to work with them. And sometimes, like, the one I work with most commonly is MongoEngine. 
I know it's not an ORM, it's an ODM, but it's close enough, it has the serialization bit. And you can say, I don't really want the other parts, just give me the name and the website or something, and it won't ship or deserialize those other things. But you just end up with the same objects. They just have None everywhere else, right? Whereas in your ORM you can actually say Band.select, Band.name, Band.url, or something like that, and that will return the dictionary with those two things, right? A list of dictionaries?
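To make the projection idea concrete, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Piccolo itself. The `band` table, its columns, and the sample rows are assumptions made up for illustration. It shows the general technique: select only the columns you need, get plain dictionaries back, and serialize them straight to JSON without building model objects.

```python
import json
import sqlite3

# In-memory database with a hypothetical "band" table (illustrative schema only).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts keyed by column name
conn.execute(
    "CREATE TABLE band (id INTEGER PRIMARY KEY, name TEXT, url TEXT, bio TEXT)"
)
conn.executemany(
    "INSERT INTO band (name, url, bio) VALUES (?, ?, ?)",
    [
        ("Pythonistas", "https://example.com/pythonistas", "a very long bio"),
        ("Rustaceans", "https://example.com/rustaceans", "another long bio"),
    ],
)

# Projection: only name and url cross the wire; the large bio column is never fetched.
rows = [dict(r) for r in conn.execute("SELECT name, url FROM band ORDER BY id")]

# "Values list" style: one column condensed to a flat list of values.
ids = [r["id"] for r in conn.execute("SELECT id FROM band ORDER BY id")]

# Dictionaries go straight to JSON, with no intermediate model objects.
payload = json.dumps(rows)
```

Here `rows` is a list of two-key dictionaries and `ids` is a flat list, which mirrors the two result shapes described in the conversation.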

15:00 List of dictionaries, yeah. And then it also has a feature like one Django has, which is invaluable, called values_list. If you just want one value back, it will condense it down to just a list of values. Right, like, I want all the IDs of the bands that play pop music or something. Yeah. And it's so much more efficient than doing, like in Django, you know, Band.objects.all and then looping through to get the IDs. It's just

15:23 like, if you use Django for ages, you just learn these little tricks. Right. Another one that stood out to me is the ability to do set-based operations. So when I think of ORMs, and just for everyone listening, I adore ORMs. I think they're really empowering for people. I think they take a lot of the modern tooling that we love, like refactoring, and allow you to apply that over to your queries. Because if you wanted to, like, change the casing of Band.name, you could refactor-rename that, and it would affect your queries, because that's still Python code, right? That said, there are places where people are either abusing it, or it's just inappropriate. The place where it gets abused a lot would be the n+1 problem, right? Where you've got a lazy reference to something else, and you don't know that it's going to be a separate query every time you touch one of those. You get a list of objects back and you loop over them. I mean, in your example, you've got band.manager: for b in bands, b.manager, right? That could be 101 queries for what should have been one, right? Yeah, that's a really good point. And even experienced developers get this wrong, because they might use serializers which are calling properties under the hood on the table, which are triggering SQL queries. So this is another design intent behind Piccolo: whenever a query is run, it's very explicit. You're literally calling .run or .run_sync. So there's none of this magic. You can't accidentally create an n+1 query. You might accidentally end up with a coroutine or something. Yeah. So it's a really good point, because I think n+1 is kind of the scourge of developers when it comes to performance. And also, coming back to your point about ORMs: as back-end developers, we can spend hours a day using ORMs. It's kind of one of the main tools in our tool belt. So it's quite nice to start from a blank sheet of paper and think: 
How can I make that experience maybe like slightly better if I can.
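The n+1 pattern described above can be sketched with the standard library's sqlite3 module. This is not Piccolo or Django code; the `band` and `manager` tables and the `run` helper are hypothetical names used only to count how many queries each approach issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE manager (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE band (
        id INTEGER PRIMARY KEY,
        name TEXT,
        manager_id INTEGER REFERENCES manager(id)
    );
    """
)
for i in range(1, 101):
    conn.execute("INSERT INTO manager (id, name) VALUES (?, ?)", (i, f"manager {i}"))
    conn.execute(
        "INSERT INTO band (id, name, manager_id) VALUES (?, ?, ?)",
        (i, f"band {i}", i),
    )

executed = []  # every SQL string sent through run() is recorded here


def run(sql, params=()):
    """Hypothetical helper: execute a query and remember that it happened."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def manager_names_lazy():
    # One query for the bands, then one more query per band: the n+1 pattern.
    bands = run("SELECT id, manager_id FROM band ORDER BY id")
    return [
        run("SELECT name FROM manager WHERE id = ?", (manager_id,))[0][0]
        for _, manager_id in bands
    ]


def manager_names_joined():
    # A single query with a join returns the same data.
    rows = run(
        "SELECT manager.name FROM band "
        "JOIN manager ON manager.id = band.manager_id ORDER BY band.id"
    )
    return [name for (name,) in rows]


lazy_names = manager_names_lazy()
lazy_query_count = len(executed)

executed.clear()
joined_names = manager_names_joined()
joined_query_count = len(executed)
```

With 100 bands, the lazy version issues 101 queries and the joined version issues one, while both return the same list of names.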

17:01 This portion of Talk Python to Me is sponsored by Linode. Visit 'talkpython.fm/linode' to see why Linode has been voted the top infrastructure-as-a-service provider by both G2 and TrustRadius. From their award-winning support, which is offered 24/7/365 to every level of user, to the ease of use and setup, it's clear why developers have been trusting Linode for projects both big and small since 2003. Deploy your entire application stack with Linode's One-Click App Marketplace, or build it all from scratch and manage everything yourself with supported centralized tools like Terraform. Linode offers the best price-to-performance value for all compute instances, including GPUs, as well as block storage, Kubernetes, and their upcoming Bare Metal release. Linode makes cloud computing fast, simple, and affordable, allowing you to focus on your projects, not your infrastructure. Visit 'talkpython.fm/linode' and sign up with your Google account, your GitHub account, or your email address, and you'll get $100 in credit. That's 'talkpython.fm/linode', or just click the link in your podcast player's show notes. And thank them for supporting Talk Python.

18:11 The n+1 problem, I believe, is either some tool doing something behind the scenes that you don't know about, or, often, just a lack of understanding: oh, that actually is a lazy-loaded property, which is going to trigger a query, so I should have put a join in, and then I'd be in a better place. That's a programmer pattern thing that you should pay attention to and work with. The one where I don't know how to fix it is more like the serialization thing. Like, what if I want to go through my database and go to 10,000 records and make some changes to them? So often, it's: do the query, loop over the 10,000 things, make a change, call save. Yeah, maybe it's in one giant transaction where you finally push the changes back, but you're still pulling all the data back. And one of the things I really like about your ORM here is this update section, where you can do set-based operations without pulling the records back. Yeah, so you can do stuff like this example here: await Band.update, setting Band.popularity to 10,000. But then you can also say Band.popularity is Band.popularity + 10, and then in the database, it will just add 10 to all of the numbers. Oh, really. And it's all just magic around, you know, Python magic methods. As a library author, it gives you so much power. It's one of the things I love about Python. When you're building query languages, like ORMs, I think very few languages can really rival Python with its flexibility. Yeah, that's really why a lot of this stuff's possible. It's really neat. And I think the ability to push these changes is great. You're still programming in the ORM classes and the models, but you're not actually pushing a whole bunch of them back and forth to make the changes. You do the set-based operations, like "delete them where" or "make this update to this value where this is true", and then just make that happen in the best way you can as SQL, right? Yeah, exactly. 
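As an illustration of how magic methods can turn an expression like "popularity + 10" into SQL, here is a minimal sketch. It is not Piccolo's implementation; the `Expr` class, the `build_update` helper, and the `band` table are hypothetical, and sqlite3 stands in for the database. The update runs entirely inside the database, so no rows are pulled into Python.

```python
import sqlite3


class Expr:
    """A tiny SQL expression node; operators build SQL text plus bound parameters."""

    def __init__(self, sql, params=()):
        self.sql = sql
        self.params = tuple(params)

    def _combine(self, operator, value):
        # Values are passed as parameters rather than formatted into the SQL string.
        return Expr(f"({self.sql} {operator} ?)", self.params + (value,))

    def __add__(self, value):
        return self._combine("+", value)

    def __gt__(self, value):
        return self._combine(">", value)


def build_update(table, column, new_value, where):
    """Hypothetical helper: assemble a set-based UPDATE from two expressions."""
    sql = f"UPDATE {table} SET {column} = {new_value.sql} WHERE {where.sql}"
    return sql, new_value.params + where.params


popularity = Expr("popularity")
sql, params = build_update("band", "popularity", popularity + 10, popularity > 100)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE band (name TEXT, popularity INTEGER)")
conn.executemany(
    "INSERT INTO band VALUES (?, ?)",
    [("Garage Act", 50), ("Pythonistas", 500), ("Rustaceans", 1000)],
)

# One statement updates every matching row in the database.
updated_rows = conn.execute(sql, params).rowcount
result = dict(conn.execute("SELECT name, popularity FROM band"))
```

The table and column names are interpolated directly here because they are hard-coded; a real query builder would need to validate or quote identifiers.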
I think, actually, on the point around the n+1: I think properties are something that can be a little bit evil, and I really shy away from them in the Piccolo code, because you call a property

20:00 and you think you're getting a value back, but it could be doing any kind of magic. And then once you've defined something as a property, you can't add arguments to it without breaking your API. So yeah, properties are something I've tried to steer away from in general in Piccolo. Yeah, there's a lot of hidden stuff. It's not entirely clear. I think they're super useful, but certainly in a case where I thought I was accessing a field on the class, and what I actually did was make a network call, that distinction is possibly too big of a bridge to just make automatic a lot of the time. Yeah, and there are no async properties in Python as well. So that's kind of one of the reasons why it doesn't use any async properties.

20:37 If they add it, maybe I'll put a comment on the PEP saying don't do it. Yeah, exactly. Another thing that's interesting here is that in all this code, every example you've written is 'await Band.select', or 'await Band.delete', and so on, or update. And then at the end, you say run. This is the explicit part that you're talking about in your API: I know, here's where it's happening. And it probably makes a lot of sense to do that as well, because on the flip side of it, that's where you have to await it anyway, right? Yeah. So what happens is you build up this query, you just chain methods onto it, and at any point you can print out that object, and it'll give you the SQL. And then, until you actually await it, it doesn't run. There's something under the hood that I haven't really publicized: you don't need the .run. If you await it, it will run, as a convenience, because people forget. But it just makes it easier from a documentation perspective to say: when it's async, use run, and when it's synchronous, use run_sync, right? And then if you do run_sync, I've got a bit of magic in there where it tries to figure out if there's already an event loop, and if there is, it runs in there, and otherwise it tries to create an event loop to run it. So you can use Piccolo in an old-school WSGI app if you want to, just synchronously. Yeah, well, let's dive into that. Because that's one of the things that really stood out to me. Many frameworks or APIs or packages tell you that you're going to have to take a fork in the road. You're going to go down the async fork, and you're going to use the async library, like HTTPX, or you're going to go down the other fork, and you're going to use the requests library, which has no async. 
And you're going to go down that path, and you choose, and then you just go. With Piccolo, I guess the default behavior would be async and await, but it has this .run_sync, which will kind of cap where the asynchronous behavior goes, so you could run it in a regular Flask app or Django app or whatever, and not worry about it being async at all, right? Yeah, that's right. And it's actually one of the design challenges for Piccolo: how do you create an API which is both synchronous and asynchronous? There are only really two ways of achieving it: with a method like run or run_sync, or with context managers. With some libraries, you'll create either an async context manager or a synchronous one, and that will then determine whether the underlying query is synchronous or not. But then it adds a little bit more boilerplate if, every time you want a query, you need a context manager to tell it to be async. So the best outcome I could think of was just to have run and run_sync. I think this is great, especially since even if you forget the run, it'll still run async. But there's a way to kind of cap it. So there's something I wanted to talk about that has driven me crazy ever since async and await were introduced, because I don't find it to be true, but I hear it spoken about all the time in the community. It's that async and await are super neat, but they're like viruses, in the sense that as soon as one of your functions way, way at the bottom has to do something async, well, then the thing that calls it has to be async and await it, and the things that call that function now all have to await it. And that percolates all the way to the top of your app. And so, by using any async library, you've turned your entire thing into this async vertical call stack. But your example here shows that it doesn't have to be that way, right? 
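The run / run_sync split described above can be sketched with plain asyncio. This is only an illustration of the pattern, not Piccolo's source code: the `Query` class is hypothetical, the "database call" is simulated, and this simplified `run_sync` assumes no event loop is already running in the current thread (the conversation mentions that Piccolo does more work to handle that case).

```python
import asyncio


class Query:
    """Hypothetical query object that can be awaited or run synchronously."""

    def __init__(self, result):
        self._result = result

    async def run(self):
        # Stand-in for an asynchronous round trip to the database.
        await asyncio.sleep(0)
        return self._result

    def __await__(self):
        # Makes `await query` equivalent to `await query.run()`.
        return self.run().__await__()

    def run_sync(self):
        # Caps the async call stack: blocks until the coroutine has finished.
        # Assumes this thread has no running event loop.
        return asyncio.run(self.run())


async def async_caller():
    first = await Query("awaited directly")
    second = await Query("awaited via run").run()
    return first, second


# A synchronous caller never sees async/await at all.
sync_result = Query([{"name": "Pythonistas"}]).run_sync()

# An asynchronous caller can use either spelling.
async_results = asyncio.run(async_caller())
```

Everything above `run_sync` stays asynchronous, while callers of `run_sync` remain ordinary blocking code, which is the "capping" idea from the discussion.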
That's sort of the naive version: I'm just going to write the code without thinking about it. But if you want to, say, have your data access layer do three things, where it's got to pull some stuff from different places and you want that to be async, it doesn't mean that function has to be async. It could just start its own event loop, do the three things faster than it could without it, and then return the answers, right? You can kind of cap it at any level that you want. And your run_sync is kind of an example of that. You can choose not to have it turn your entire app async. Yeah, you can jump between them. Typically, if people are using async, the argument is: if you need async, the whole app probably should be async, because otherwise, why use async? But you can flip between them quite easily. So if you've got a synchronous app and you want to call some async code, there's asyncio.run, and you can also do stuff like spin up an event loop in a different thread and then send work to that. Yeah, absolutely. It's quite fluid, you can flip between them quite easily. I mean, just one example that comes to mind is: what if I wanted to go web scrape every page at a certain domain? So I've got a function, I give it a domain, and then I want it to return, or store into the database, all the pages. Yeah. Right, it would be perfectly reasonable for that thing to go: okay, let's do a request, figure out what all the pages are, and then just, you know, recursively grab them all asynchronously. You would get a huge boost on getting every page off a site, even if that function blocked. You know what I mean? Because it itself could just go crazy against the server. Maybe it shouldn't, but it could. Yeah, I'm a huge fan of asyncio.gather as well. That's a really beautiful API just for saying: do these 50 things now, please. And

25:00 let me know when they're done. Yeah. And block, right? Just give me a list of answers or errors. Yeah. Also on the live stream, Chris May. Hey, Chris. He says, "I'm so excited to use Piccolo with unsync. I have a workflow that'd be nice to parallelize." And yeah, I think unsync is another really interesting library. It wraps up asyncio plus threading plus multiprocessing, but then gives you a nice way to cap it as well. Because you can be given a task that comes back from there, and if you just ask for the result and it's not done, it'll just block like a regular thing. And it does kind of what you're talking about: it has a background thread with its own event loop, and it just pushes all the work over to there. So, cool library. Yeah, it is, at least for the size, right? It's like 126 lines, and it unifies those three APIs and adds some more stuff. That's pretty big bang for the Python byte. Yeah, that's impressive. I wish Piccolo was. I think it's thousands, tens of thousands of lines by now. Yeah. So another one of the Python 3.6-and-onward type of things that's really cool is type annotations. Yeah, I love type annotations. Part of my day job in the past was using Swift, and Swift is almost like Python's brother or sister. It was very heavily inspired by Python, right? It's like if Python all of a sudden decided it was incredibly strict about typing and type definitions. Yeah, it would be a lot like Swift, right? And compiled. Yeah, it's a combination of type annotations and the tooling to support it. In Python's case, it's VS Code, and in Swift's case, it's Xcode. And it just means that when you're refactoring, you're so much more confident about what's going on. It also provides documentation, because previously people were putting it in a docstring anyway, so why not put it in your function definition, where you can introspect it? And mypy is incredibly powerful. 
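Returning to the asyncio.gather point from earlier in the conversation ("do these things now, let me know when they're done, give me a list of answers or errors"), here is a minimal sketch of a synchronous function that fans work out concurrently and then blocks for the results. The `fetch_page` coroutine is a made-up stand-in that simulates network latency; no real HTTP requests are made.

```python
import asyncio


async def fetch_page(path):
    """Hypothetical page fetch: simulated latency instead of a real request."""
    await asyncio.sleep(0.01)
    if path == "/broken":
        raise ValueError(f"could not fetch {path}")
    return f"<html>{path}</html>"


async def _fetch_all(paths):
    # return_exceptions=True puts raised exceptions into the result list
    # instead of cancelling the whole batch on the first failure.
    return await asyncio.gather(
        *(fetch_page(path) for path in paths),
        return_exceptions=True,
    )


def scrape(paths):
    """Synchronous entry point: runs the fetches concurrently, then blocks."""
    return asyncio.run(_fetch_all(paths))


results = scrape(["/", "/about", "/broken"])
```

`asyncio.gather` preserves the order of its inputs, so `results` lines up with the list of paths, with the failed fetch represented by its exception object.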
I honestly don't think I could have built Piccolo without type annotations, because it makes the code so much more maintainable. The click-to-go-to in VS Code as well, it's just a beautiful usability improvement. And then kind of one of the hidden benefits is it makes tab completion so good. Yeah. So a lot of Python autocompletion uses a library called Jedi under the hood. So when I was building Piccolo, I had a look at the source code to try to figure out how it does its magic. And if you give a type annotation to something, like this is what this function returns, it's a really strong indicator to Jedi that this is what's going to get returned. I don't need to do any magic anymore. Yeah, I would say that half the time, that's why I do it, to make the editor better. So both VS Code and PyCharm, like, take a good look at what the type annotations are. Yeah, right. It'll just say, oh, you're trying to pass a string, and that's really supposed to be an integer. But then also, like you say, tab completion or autocomplete all over the place is fantastic. Yeah, I think there's a distinction as well, where I think if you're building an application, let's say you're building a Django app or a Flask app, you don't need to care quite as much. Like, I personally would still add type annotations. For libraries, I think it's just absolutely essential. Like, I don't think any new Python libraries should be written without it, because you're kind of shortchanging your users in a way. I totally agree. It's worse. Yeah, yeah. In the Flask app, for example, that you've mentioned, you know, I would say, on the boundaries, right, like, here's a data access layer, put the type annotations on those little bits there. And then the rest of the app, usually the editors and the tools will just pick it up and continue to tell you what you're working with.
Yeah, but if you're doing a library, right, you want every function or every class to be kind of standalone and know everything it can. Yeah, definitely. And one more thing about type annotations is it's probably the greatest source of interview questions ever made. Because you can ask people in an interview, what's the difference between, like, a sequence and an iterable? And when you use type annotations, you really start to think about what's going on. And it's a great learning experience too. Like, I want to pass a generator here, but it takes a list of things and it says that won't work. Maybe you just need to relax your type annotations to an iterable, iterator kind of thing. Quick question from Roller out there in the live stream. Hey, Roller. Can I use Piccolo in place of MongoEngine? Or is it just for relational stuff? Yeah, it's just relational. And you can use SQLite locally, but it's mostly Postgres. It was really built to take advantage of Postgres, because Postgres is, like, the fastest growing SQL database in the world, which is remarkable when you think how old it is. And yeah, I think, you know, adding other SQL databases would be quite easy. But adding something like Mongo would be a bit trickier. I wouldn't say it was impossible, but a bit more work. Yeah, I would think so. Yeah, it's certainly not impossible, but like joins and stuff would get tricky. What about SQL injection? I mean, many of us have heard about the Little Bobby Tables XKCD, which is delightful. You know, in a sort of schadenfreude sort of way.
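The sequence versus iterable question mentioned above is easy to see in a few lines. This is only an illustrative sketch with made-up function names (total, middle, squares); it shows why relaxing an annotation from a list or Sequence to Iterable lets a generator through.

```python
from typing import Iterable, Iterator, Sequence


def total(values: Iterable[int]) -> int:
    # Only loops once, so Iterable is enough: lists, tuples, sets and
    # generators all qualify.
    return sum(values)


def middle(values: Sequence[int]) -> int:
    # Needs len() and indexing, which a generator does not provide.
    return values[len(values) // 2]


def squares(n: int) -> Iterator[int]:
    # A generator function; its result is an Iterator, and so also an Iterable.
    for i in range(n):
        yield i * i


if __name__ == "__main__":
    print(total(squares(4)))   # fine: a generator is an Iterable
    print(middle([1, 2, 3]))   # fine: a list is a Sequence
    # middle(squares(3)) would be flagged by a type checker and fails at runtime.
```

A type checker such as mypy reports the bad call before the code runs, which is the learning experience described in the conversation.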

29:17 We all kind of don't relish in somebody else suffering this, but I find that this is actually one of the really nice things about ORMs. Most of the time they scrape off the ability to do SQL injection, because you're not building the SQL. Yeah, definitely. So with your database adapter, so something like asyncpg, or psycopg in the synchronous world, what you want to do is pass it the query string with placeholders for any user submitted values, and then you submit the values separately, like in a list. So it's like a parameterised query, basically. Yeah, yeah. And as long as you do that, you're safe. But then for a library, when people are programmatically creating very complex SQL queries, you need to try and make sure that you've got the right values that match the right placeholders to pass to the adapter.
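The placeholder approach described above can be sketched with the standard library's sqlite3 module, used here purely as a stand-in for asyncpg or psycopg. The table and the user_input value are made up for the example, and placeholder syntax differs between adapters (sqlite3 uses ?, asyncpg uses $1, psycopg uses %s).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE band (name TEXT, popularity INTEGER)")
conn.execute("INSERT INTO band VALUES (?, ?)", ("Pythonistas", 1000))

user_input = "x' OR '1'='1"  # a classic injection attempt

# Unsafe: concatenating the value into the SQL lets the input rewrite the query.
unsafe_sql = "SELECT name FROM band WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the query text only holds a placeholder; the value travels separately.
safe_rows = conn.execute(
    "SELECT name FROM band WHERE name = ?", (user_input,)
).fetchall()

if __name__ == "__main__":
    print("unsafe:", unsafe_rows)
    print("safe:", safe_rows)
```

The concatenated version matches every row because the quote in the input closes the string literal early, while the parameterised version treats the whole input as one value and matches nothing.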

30:00 It is quite challenging. There's some, like, recursive code where it has to. So we use something called query strings internally within Piccolo, so it never concatenates strings for SQL statements, it just uses query strings. And then it compiles them before sending it to the database adapter. And it basically looks through all of the sub query strings it might have, if it's a really complex query, and then it kind of passes it to the adapter. But yeah, it's just one of the complex things about building ORMs for sure. And also one of the most dangerous to get wrong. Yeah, yeah, it absolutely is. There's an untold number of bad things that can happen with SQL injection. And it's so easy, all you have to do is put a single little tick to comment out stuff, a semicolon to finish that statement, and then you can run arbitrary code. And a lot of times some database engines will even let you run subprocess type things, which is even worse. But yeah, you definitely want to avoid it. Yeah, it's a good argument for using ORMs and query builders, because it will make it less likely, I think, for sure. Another thing that I wanted to touch on a little bit here is the actual filtering or projection statement type bits. So I mentioned using MongoEngine before, which I'm a big fan of, and it's basically a Mongo equivalent of the Django ORM. So in that regard, they're all similar, and you do things like, if I wanted to say where the band popularity, or let's say the band name is Pythonistas, right, you would just say name=Pythonistas as a part of the filter. And there's two things that are crummy about that. One is you get no autocomplete that there's a column called name, because it doesn't really know what class. Even though you started out like Band.objects, in the filter part it no longer knows that the name came from the band, right? That's not part of the language.
And then the other one is you're doing an equals, or you've got, like, weird operators in the name, like name__gt for greater than and stuff like that. Whereas with yours, you just write what you would put into an if statement or a loop or something like that. So you would say, like, Band.popularity less than 1000. That's the thing you send in there. Yeah, that's right. So I've been tripped up so many times in the past with Django, where I've had something like name, double underscore, something else, and then a linter can't really understand that's wrong. Yeah, not while you're coding, only at runtime, when you've got a 500 error. So the idea here is a linter would be able to pick up these problems. Exactly, because so much of the other pieces are just using **kwargs. And then they figure out how to generate a query out of, like, looking for special keywords in the key names, and then turning those into columns. Also the refactoring thing, right, like the linters. If I want to do a refactoring to rename popularity, it's not going to check popularity__gt as a keyword argument. It has no idea that was related. Yeah, definitely. So the way it's implemented, the double equals and all these operators, is one of the amazing things about Python: you can just overload, like, fundamental things about the language. So you can overload what addition means. When someone first tells you that, it sounds like the most mental thing in the world, because why would you want one plus one to equal five? But then it turns out when you're building an ORM, it's golden. And this is one reason why I find Python just so compelling over and over again, because as a library author, you can do this stuff, you can get closer to more of a DSL than a normal programming language. Yeah, absolutely. Yeah. So is this done with descriptors, or what's the magic for the less than? There's a dunder method, like double underscore lt, and you can override that.
And then what happens is when you call that method, it returns a where object. And then you can also do, in brackets, Band.popularity less than 1000, and then an & sign as well, and then popularity greater than 500. So you can combine them with AND and OR statements. So the where statements in Piccolo can get, like, really powerful. So you just have to teach the where clauses how to "and", and then structure it in a way that Python will let it kind of go through, right? Yeah. Or you can do dot where for some stuff, and then another dot where statement. If you've got multiple where statements, it's an AND. But yeah, it's just all Python magic, which is one of the reasons I love Python. Yeah, speaking of overridden things, the thing that I think is the most insane, but I'm starting to love, but took a while to get used to, is the way the Path object works for defining paths.
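The operator overloading described above can be sketched in a few lines. This is a toy illustration only, not Piccolo's actual implementation: the Where and Column classes, the ? placeholder style and the generated SQL text are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Where:
    sql: str
    params: List[Any]

    def __and__(self, other: "Where") -> "Where":
        # (a) & (b) builds a combined clause and keeps the values separate.
        return Where(f"({self.sql} AND {other.sql})", self.params + other.params)

    def __or__(self, other: "Where") -> "Where":
        return Where(f"({self.sql} OR {other.sql})", self.params + other.params)


class Column:
    def __init__(self, name: str) -> None:
        self.name = name

    def _compare(self, op: str, value: Any) -> Where:
        # The value goes into params, never into the SQL text itself.
        return Where(f"{self.name} {op} ?", [value])

    def __lt__(self, value: Any) -> Where:
        return self._compare("<", value)

    def __gt__(self, value: Any) -> Where:
        return self._compare(">", value)

    def __eq__(self, value: Any) -> Where:  # type: ignore[override]
        return self._compare("=", value)


class Band:
    name = Column("name")
    popularity = Column("popularity")


# Brackets are required because & binds more tightly than < and >.
clause = (Band.popularity < 1000) & (Band.popularity > 500)

if __name__ == "__main__":
    print(clause.sql, clause.params)
```

Because Band.popularity is a real attribute, editors can autocomplete it and a rename refactoring can find it, which is the advantage over keyword names like popularity__gt.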

33:59 The / often means, like, directory separation on the POSIX systems, and you know, it's close enough. You could actually put / in your strings in Python on Windows, and it'll still be like, okay, fine, backslash is what you meant. So they overrode the divide operator in code to allow you to concatenate strings and paths together. And that's just crazy. Yeah, the first time I saw it, I was very confused. But when you understand that, it's okay. But yeah, totally. Yeah, I've gotten okay with it as well. And I start to use it and I really like it now, but it's like, I don't know if I can get behind this, this is a bridge too far. That's division. What are we doing here? No, it's cool. Talking of magic, Piccolo was originally using metaclasses a lot. And there's something that got added, I think in Python 3.7 or 3.6, where they actually changed metaclasses slightly. So there's now, like, a dunder magic method called __init_subclass__. And Piccolo uses this a lot. And it's actually an amazing hidden feature of Python, where you can now add keyword arguments to a class definition. So if you had the class Foo, open brackets, inherits from Bar, comma, and then you can start adding keyword

35:00 arguments to the class to customize its creation. And that's kind of like a new layer of magic that's been added to Python recently. And Piccolo uses it extensively. But I don't see many other libraries using it yet, because it is probably not so well known. But yeah, just kind of spread that bit of magic, so hopefully people can use it too. Nice. Yeah, that sounds awesome. I can certainly see how, like, trying to create the classes, like the Band class or whatever you say that it's going to be, would definitely use that. So one of the things that you say is awesome about Piccolo is the batteries included? Yeah. So let's talk about some of the batteries. So yeah, the main battery by far is the admin, because when I started, I was working for a design agency. And admins are incredibly important for design agencies, because you want to put something in front of a customer that they like the look of, and they're comfortable using. So this is a huge part of the effort that's gone into Piccolo. And so this is a sister project called Piccolo Admin. And what happens is, it's an ASGI app, so I can maybe go into more detail about it later on. But all you do is you give it a list of Piccolo tables, and then it uses Pydantic, so Pydantic is a serialization library, and it basically creates a Pydantic model from the Piccolo table. And then Pydantic models have this really useful method for JSON schema, and it creates a JSON schema for the Pydantic model, right? Because the Pydantic classes know this field is an int, this one is an optional datetime, and so on. Yeah, so Piccolo has really good Pydantic support, but it's in a sister repo called Piccolo API, and so that creates the Pydantic model. And it also has something called PiccoloCRUD. So you give it a Piccolo table, and it creates another ASGI app, which has got all of the CRUD operations for your database. So you can programmatically create a huge API just by giving it a few tables.
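The class keyword arguments mentioned above come from the __init_subclass__ hook. The following is a minimal sketch of the mechanism, not Piccolo's code: the Table base class and the tablename keyword are illustrative assumptions chosen to resemble how an ORM might use it.

```python
class Table:
    tablename: str = ""

    def __init_subclass__(cls, tablename: str = "", **kwargs) -> None:
        # Runs once for every subclass, at class definition time. Keyword
        # arguments written in the class statement arrive here.
        super().__init_subclass__(**kwargs)
        cls.tablename = tablename or cls.__name__.lower()


class Band(Table, tablename="my_band"):
    pass


class Venue(Table):
    pass


if __name__ == "__main__":
    print(Band.tablename, Venue.tablename)
```

No metaclass is needed for this kind of per-class customization, which is why the hook is a lighter alternative for many library authors.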
And then the front end is written in Vue.js and is completely decoupled from the back end. It's just all via API. I'm a huge fan of Vue.js, because it's very natural for Python developers who are used to template syntax in Django and Flask. If you looked at the Vue templates, you'd be like, this looks very familiar. Yeah, that's nice. Yeah, I think it's super close to Chameleon, because of the attribute driven behavior as well. Yeah, yeah. But then to make a working admin requires so much work, because you've also got the security side. So Piccolo API has a bunch of really useful ASGI middleware, and has, like, session authentication, CSRF protection and rate limiting as well, because you don't want people to spam the login. So just to get a fairly simple admin, it's like an iceberg to do it properly. So yeah, a lot of effort's gone into the admin, but I'm really proud of it. And this is really what excites me more than anything about the future. Because as we add support for PostGIS and stuff like that, being able to create really interesting widgets around data. So yeah, how can I design a rectangle field for PostGIS, or a location field? Oh, I could see some really cool stuff that are sort of template extensions, you know, like, let's just pick Jinja, for example. If you had one of these objects, you could pass it over, and it knew, for example, here's a datetime, you could just say, make a calendar picker here. And it just, you know, loads, it has the JavaScript included, and instead of just putting the text, it gives you, like, a nice little AJAX widget. Or this list goes on a map in the map widget, and off it goes. Yeah. So basically our admin, it just turns Pydantic models into UI. And it's actually quite interesting. I'd love to get to the point where, for a business app, you just use Piccolo Admin, you don't have to build UI, you just say, here are my tables.
And then, you know, the truth is, so often there's a lot of these little internal apps that people build that are just, like, forms over data. I just need to see the details, click on one, edit it, create a new one and delete one. And, like, that's the app I need for this table. Do you build that? Right? Yeah, for sure, there's a lot of it. Because I think with the approach to Piccolo, you have a lot of Python libraries, and they kind of start from the outside in. So they start from the URL layer, and then the views and, like, the middleware, and then over time they add the data layer and the security. But then with Piccolo, it's kind of from the inside out, like I started from the data layer, and then have the admin and some middleware. So it's quite a nice companion to have with ASGI. So just kind of pick the framework you like, and then Piccolo kind of gives you the data layer and the admin. That's kind of it. Yeah, let's talk about the ASGI stuff a little bit, because you did mention that there is some interesting support for those things. And yeah, it's got, like, to some degree native FastAPI, Flask and even BlackSheep, which is an interesting one, support for those frameworks, right? Yeah. So I'm a huge fan of ASGI, because I was a Django Channels power user, and Andrew Godwin created ASGI out of Django Channels. It's really, like, a beautiful thing. If you look, so Starlette was the one that really built the foundation. So this is an async library, also kind of like a framework, where you can build an app with Starlette or you can use it to build other frameworks. What's amazing about ASGI is every component in an ASGI framework is ASGI too. ASGI is basically the spec where it's a function that accepts a scope, a send and a receive. And then if you look at the internals of Starlette, everything's ASGI, like the middleware is ASGI, the endpoints are ASGI, like the whole thing, and it's super composable.
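The composability described above follows from how small the ASGI interface is. Here is a rough sketch using only the standard library, with no real framework or server involved: hello_app, not_found, mount and call are hypothetical names, and call is a tiny harness standing in for an ASGI server. Note that in the ASGI specification the callable's arguments are ordered scope, receive, send.

```python
import asyncio


async def hello_app(scope, receive, send):
    # An ASGI application is an async callable taking scope, receive and send.
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"text/plain")]})
    await send({"type": "http.response.body",
                "body": b"hello from " + scope["path"].encode()})


async def not_found(scope, receive, send):
    await send({"type": "http.response.start", "status": 404, "headers": []})
    await send({"type": "http.response.body", "body": b"not found"})


def mount(routes, default):
    # Returns a new ASGI app that forwards to whichever sub-app owns the
    # path prefix. Real routers also adjust root_path; this sketch does not.
    async def router(scope, receive, send):
        for prefix, app in routes.items():
            if scope["path"].startswith(prefix):
                await app(scope, receive, send)
                return
        await default(scope, receive, send)
    return router


async def call(app, path):
    # Hypothetical test harness: collects the messages an app sends.
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": path}, receive, send)
    return sent


app = mount({"/admin": hello_app}, default=not_found)

if __name__ == "__main__":
    print(asyncio.run(call(app, "/admin/")))
```

Since the router is itself just another ASGI callable, it could in turn be mounted inside a larger app, which is the "turtles all the way down" property mentioned later in the conversation.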
So you can say I've got an ASGI app, and you can mount other ASGI apps within it. So this is what I love about ASGI as a spec. So you can say, take a FastAPI app or a Starlette app and include Piccolo Admin. Same with BlackSheep, right? You could say slash catalog is actually handled by this other app written in FastAPI, whereas the main app is written in Flask or something like that, and just kind of click them together in that cool way. And maybe Quart would have to

40:00 Either one, but so yeah, I really love that. It's quite exciting that you could kind of build an app from multiple frameworks and be like, well, this part of the app will be better served by FastAPI. But this bit, I just need Starlette, or I want BlackSheep, or whichever ASGI framework people can dream up. So I think it's really exciting for the Python community, the ASGI spec. Yeah. And we did talk a little bit about the challenges and the, like, cascading effect of async and await. But if you're already running a framework that has async view methods, there's just nothing to it, right? You just write your code and you just await the bits you've got to await in the view method. And the server and the framework handle the event loops and all that business. Yeah, I think as well, what kind of happened is asyncio came out, and it doesn't directly affect speed, it's more about throughput. But it's like the Python community took it as a challenge to build faster frameworks. And so a lot of them have really quite fast internals, and they do feel quite cutting edge. Yeah, like uvloop and stuff like that. They're like, how can we do this, but have the minimal overhead of adding this? I mean, people do talk about, okay, async and await will make your code go faster. Well, it won't make CPU code go faster. But so often what we're doing, especially in web apps, is waiting. I want to wait on a database and then wait on the network. And then I want to wait on the database again. And then I'm going to wait on an API, and then I'm going to send back three lines of JSON, right? Like 99.9% of that is just waiting on something else. And when you're using async and await, you can just do other things while you're doing that 99% waiting. Yeah, so if you have to do a database query, it takes a few milliseconds. But then if you use the timeit module in Python, and you see how long basic operations take, they're more like microseconds.
So it's like orders of magnitude difference between how long a database query takes and basic Python stuff. But this is why having stuff like uvloop is really important, because if you had a really slow event loop, it kind of wouldn't make much difference. So it helps that the event loop's fast as well. And, like, in a lot of the projects I did in the past, the throughput was really important, because with some apps, you won't have a lot of traffic, and then all of a sudden, you'll get 1000 users. So say you're doing, like, live events, and you get 1000 people at once. In that situation, throughput is incredibly important, right? I mean, look at how the whole healthcare rollout in 2008 went, right? I just can't help but think there must have been more awaits available to those frameworks and those web apps. It just kept crashing and stuff was timing out. And I'm sure it's just like, well, we're just gonna wait on this other slow government API. And we're gonna do it for a lot of people, and it's just gonna overwhelm it, right? And it's crazy. It just feels natural. Because, like, in the web server world, you had Apache, and a lot of people moved to nginx. That's very similar. It's, like, event loop driven. And we've kind of seen how beneficial nginx was. So it just makes perfect sense to build your back end in the same way. Yeah, absolutely. Absolutely. All right, a couple more things we've got some time to talk about. Over here, you've got a Django comparison page, which I guess also could be slightly a MongoEngine comparison page, because like I said, they're basically the same thing without the nesting. So if somebody is familiar with Django, and they're like, I would like to consider using this for my framework, or for part of my code, or whatever. It's like one of these, but they already know how to do stuff in Django. You have, like, well, here's how you would create an object and save it in the different frameworks.
Here's how you would update an object and make changes and so on. And you can just go through one at a time and just sort of compare the different pieces, right? Yeah. It's quite heavily inspired by Django. But then I think the Django ORM, it's more Pythonic. So rather than using where, it uses filter. Whereas with Piccolo, it's meant to be very, very close to SQL. The theory is, if you know SQL, it'll be super easy to learn. And when you do need to drop down to a SQL query, there's no, like, mismatch, you're just like, well, I'm always sort of working in a SQL mindset. But there are a lot of similarities with Django still. Like, I think people can pick it up quite quickly. Yeah, I agree. Like instead of objects.values_list, you have a select projection, or instead of filter, you have a where, but it's honestly not a huge mental jump to make. Yeah. And Piccolo also takes, like, huge inspiration from Django migrations, which I think is kind of like the gold standard of migrations in any language. So a lot of effort's gone into that. Yeah, that's another one of the batteries that you were kind of touching on, right, the migrations. But yeah, migrations are incredibly hard to do right. I think the Django way is, I can only imagine, I don't even want to try to imagine writing that, because it seems really hard. Yeah. So the way Django does it is it looks at your tables, and then it creates a migration file. It then adds up the migration files to build a picture of what the schema looks like. And then that's how it creates subsequent migrations, by doing a diff, right? That's why we've got to go from this level to that level. So here's the five migrations to basically run in order. Yeah. And then you've got to do code generation as well. So with Piccolo, it has to actually create a Python file. And that's harder than it seems, actually writing a Python file.
But if you look at the Piccolo migrations, they're actually really quite beautiful Python code. And there's a little trick I use internally: I run, like, the Black formatter on it before I write the auto generated code out. That's cool. So your generated code is PEP 8 and all the goodness. Yes. So if you look at it, you're like, that's quite nice. That's clever, actually. Yeah, that's really clever. Yeah, it's cool. Yeah, yeah. I feel like other frameworks, like for example, the Cookiecutter stuff, you know, you're just generating code files like crazy. I feel like you could apply that same technique: after we inject all the user-entered values, let's just do a quick format on them and then drop them. Yeah, that makes sense, because otherwise you'll run your linters on your project.

45:00 And they'll fail because your migrations aren't correctly formatted. So yeah, that's cool. A quick question from Teddy out in the live stream says, I don't use ORMs too much in my day to day, what are good use cases besides web apps for them? And where does Piccolo perform better? So, some questions. So I think that data science is obviously a big bit. So another reason for building Piccolo is data science is so much on the ascendancy in the Python world, and people are just, you know, still dealing with databases on a day to day basis. So you can use it in a script if you like. And there's maybe a couple of examples in the docs where, you know, you might be scraping some data from a website, and then you just need to stick it into Postgres. So that would be another good example for using Piccolo. Yeah, that's a good example. And then where it performs better, it's really where you need the async. Or you might still want a web app component. Even if it's just, like, a data science script, you still might want an admin screen to view that data. Yeah, I think "performs better" you could maybe break into two pieces. Where does Piccolo perform better? And I think the async stuff is really important there, like you say. And then where does an ORM perform better? You know, when you talk about performance and getting stuff done, sometimes it's how fast does the code run? But sometimes it's how fast do I get the final thing built, right? And I think ORMs, even if they're not always the most efficient way, sometimes they're really efficient, but not always. But you know, they could help you safely get to the other side, especially if you don't know SQL super well. Yeah, they hold your hand a little bit. And yeah, I created this while working at a design agency. And speed's really important in a design agency, not really execution speed in terms of SQL queries, but in terms of scaffolding up and being productive.
So Piccolo has something called piccolo asgi new, and if you use that command, it will basically scaffold your web app. So I support FastAPI, Starlette, BlackSheep. That's really cool. So yeah, it's easy to kind of create your starter code and your starter structure from that, right? Yeah, it's a bit like Django where, you know, you create a project on the command line. But with Piccolo, you get an option of different ASGI frameworks. And over time, I'd like to add way more, because there's many more exciting ASGI frameworks, like Quart, Sanic. Django itself is actually an ASGI app, so it could support Django. Django has come along. So if people were out there listening, and their favorite framework wasn't one of those three, PRs are accepted, I guess, and they could integrate, you know, their favorite, Sanic or whatever they're after? Yeah, definitely. Like, any feedback's really appreciated. So the community has definitely helped me a lot with Piccolo, even just as much as trying it and giving feedback. Pull requests are also really valued. Even if you just want to raise an issue to say, well, how did you do this? Like, you know, that's still welcome. Yeah, awesome. Well, maybe that's a good place to talk about where things are going in the future, and kind of wrap up our conversation a bit. Yeah, so I feel like I'll never be bored with Piccolo, because Postgres is continuously developing and adding new features. And I almost feel like Postgres is almost like an operating system in a way, like the amount it does is kind of insane.
It even has, like, a pub/sub built in, you can do, like, LISTEN/NOTIFY. Well, I'd like to do PostGIS support. TimescaleDB is a really up and coming extension as well for Postgres, for time series data. But then a lot of the stuff I'm excited about is on the admin side. So as I mentioned before, Piccolo Admin effectively turns Pydantic models into UI. So the next thing I want to add is you can basically give it arbitrary Pydantic models, and it will render them as forms in the admin. So if, for example, you want to send an email, you'll just create a send email model, give it to Piccolo Admin, and it'll generate a form. That's stuff I'm really excited about as well, just to increase the utility of Piccolo Admin. So a back end developer could build a functional app for a business without actually writing any UI code. That's kind of the dream, to build, like, a really, really great admin. Yeah, these self serve, like, once you create the app and hand it off, how far can people go before they have to hire your design agency again, or something like that? The more that they can just run with it, the better, I suspect. Yeah, it's such a huge benefit from Django, like, having that admin. So I just want to kind of see what I can do with the latest technologies to build a really great one. Yeah. What's the story with Django and Piccolo, is there a reasonable way to click them together? Or is it really not too much so far? I think you can. I haven't really tried it much. But it's very configurable, Piccolo. So you know, none of the names, they deliberately don't clash with Django. So Django has a settings.py, Piccolo has piccolo_conf.py. And then Django has a migrations folder, but Piccolo has piccolo_migrations. So there's no clash there. So in theory, they would work together. There's no, like, compatibility layer between the admins, so you'd have, like, two separate admins. But I'd like to add support for Django, as it is an ASGI app.
And it's the originator of the ASGI standard. And I still think Django is one of the great kind of masterpieces in the framework world, that it's lasted so long, and it's still such a rock solid choice. I would like to see what I can do there. Yeah, the closer those could be, I think, right? Like this having the genesis and so many similar ideas to Django, it seems like they should be somehow working together, which is great. Yeah, that'd be cool. All right. Well, Dan, I think that might be about it for the time that we've got. Let me ask you the final two questions that I always ask. If you're going to write some code, you're going to work on Piccolo, what editor do you use? I think I caught a hint of it earlier, but go ahead. Yeah, VS Code all the way. I was a huge Sublime Text and TextMate user, and thought I'd try out this VS Code, see what all the hype's about, and after five minutes, I was never going back. I just think it's such a great gift to the world from Microsoft. It just

50:00 gets better and better as well. Love VS Code. All right, and then a notable PyPI package you want to give a shout out to? So I'm gonna cheat and pick two. So Pydantic, because I think it's such a nice serialization library. And I think it could almost be in the standard library, it feels so Pythonic and natural. And then Starlette, because I think it's just a beautiful, like, foundation for the ASGI world. And I'd really encourage people to look at the code to see the power of ASGI, how it is this, like, turtles all the way down, everything's ASGI. It's quite interesting. So those would be my two shoutouts. Yeah, very cool. You know, FastAPI is so popular now. But FastAPI is kind of like an opinionated view on top of Starlette, to a large degree. Yeah.

50:38 Yeah. Well, I mean, it takes the two things you said, Pydantic and Starlette, and puts them together in, like, a nice way, which I think is pretty neat. Yeah, it's got great taste. Yeah, for sure. I'd just like to say one thing really quick. And just thanks to everyone who's contributed to Piccolo, because there's been people who've been contributing for several years by this point, and have put a lot of work in. So yeah, just a shout out to anyone in the Piccolo community. Yeah, and you know, final call to action. If people are interested in using this, it's good to go, it's ready for production web apps and all that kind of stuff. I didn't really want to promote it before it was ready, and I use it in production, have done for years, and I am quite conservative about pushing stuff out there unless I think it's, you know, solid. It's got hundreds of unit tests, you know, it's solid. I'm not saying there's not some edge case I haven't discovered yet in some version of, you know, Postgres or something, but I use it in production every single day. Well, congrats on building a really cool Pythonic ORM. I really like the way that you put things together. And yeah, looks great. And a lot of nice, modern Python features, and people should definitely check it out. Cool. Yeah. Thanks a lot, Michael. Yeah, you bet. See you later. This has been another episode of 'Talk Python to Me'. Our guest on this episode was Daniel Townsend. And it's been brought to you by Linode and us over at Talk Python Training, and the transcripts are brought to you by Assembly AI. Simplify your infrastructure and cut your cloud bills in half with Linode's Linux virtual machines. Develop, deploy and scale your modern applications faster and easier. Visit 'talkpython.fm/linode' and click the Create free account button to get started. Transcripts for this and all of our episodes are brought to you by Assembly AI.
Do you need a great automatic speech-to-text API? Get human-level accuracy in just a few lines of code. Visit 'talkpython.fm/assemblyai'. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at 'training.talkpython.fm'. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play and the direct RSS feed at /rss on talkpython.fm. We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at 'talkpython.fm/youtube'. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
