Monitor performance issues & errors in your code

#349: Meet Beanie: A MongoDB ODM + Pydantic Transcript

Recorded on Thursday, Nov 18, 2021.

00:00 This podcast episode that you're listening to right now was delivered to you in part by MongoDB and Python powering our web apps and production processes. But if you're using PyMongo, the native driver from MongoDB, to talk to the server, then you might be doing it wrong. Basing your app on a foundation of exchanging raw dictionaries is a castle built on sand. And by the way, see the joke at the end of the show about that. You should be using an ODM, an Object Document Mapper. This time we're talking about Beanie, which is one of the exciting new MongoDB ODMs, which is based on Pydantic and is async-native. Join me as I discuss this project with its creator, Roman Wright. This is Talk Python to Me, episode 349, recorded November 18, 2021.

01:01 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy, and keep up with the show and listen to past episodes at 'talkpython.fm', and follow the show on Twitter via @talkpython.

01:16 This episode is brought to you by SENTRY and US over at Talk Python Training. Please check out what we're both offering during our segments. It really helps support the show. Hey, folks, it's great to have you listening. As always, a quick bit of news before we talk with Roman.

01:30 I've been looking for a way to explore ideas for teaching Python, get some feedback, and then bring what I've learned back to the course content. So I'm kicking off a new initiative over on YouTube. It's called Python Shorts, and the goal is to teach you one thing that's interesting, useful, and actionable about Python. I have three videos out so far, the first 'Parsing data with Pydantic', the second 'Counting the number of times an item appears in a list with counter', and third, 'Do you even need loops in Python?' I also published a fourth video about using the 'Stream Deck', that little button device that a lot of gamers use, to make Python developers more productive. Check out all four of those videos on my personal YouTube channel, not the Talk Python one; links are in the show notes. Now let's talk Beanie.

02:16 Roman, welcome to Talk Python to me.

02:19 Hey, I'm happy to be here.

02:20 Yes, it's great to have you here.

02:23 We get to talk about one of my favorite topics, MongoDB. I'm so excited.

02:27 Yeah, my favorite topic, obviously.

02:29 So, yeah, you definitely put a lot of time into it. We're going to talk about your ODM. People often hear about ORMs, Object Relational Mappers, but traditionally MongoDB and the other document databases and NoSQL databases haven't modeled things as relationships. It's more about documents.

02:47 So it's the D instead of the R: ODM. Beanie, which is going to be super fun. It brings together so many cool topics, and even relationships. So maybe if you really want it, I guess you could put the R back in there as of recently.

03:01 Anyway, super fun topic on deck for us. Before we get to that, let's just start with your story. How did you get into programming and Python?

03:08 I started when I was a student, more than ten years ago now, but I started not with Python. It was 2008, I think. 2007, probably. I started with Flash, and nobody knew that Flash would die soon.

03:23 Flash was such a big thing when it was new. I remember people were just completely getting amazing consulting jobs and building websites with Flash. I'm like, what is this? I'm not sure I want to learn this, but do I have to learn this? I hope not. But yeah. Anyway, it did kind of get killed by HTML5. And, you know, I think, honestly, maybe it got killed by Steve Jobs, really. At least earlier than it would have been otherwise.

03:46 Yeah.

03:47 But at that time I didn't know that. And ActionScript 2 and 3.

03:53 And then I decided to move to back end, and I chose Django, not Python but Django, because Django was super fancy at the time. And I was a Django developer, not a Python developer, but a Django developer, probably because I knew a few tips and tricks about Django tools: models, views, templating, and so on. And then somehow I learned Python as well. And now I'm a more or less good Python developer.

04:31 Yeah, fantastic. I think I just realized, as you were speaking, we might have to tell people what Flash is. I feel like it's just one of these iconic things out of the tech industry, but it might be like talking about AltaVista at some point. The kids, they won't know what Flash is. They don't know. There was this battle getting this thing onto people's computers, and there were always, like, viruses being found in it, but it could always do this magical stuff. How interesting.

04:57 Yeah.

04:58 How old we are.

05:00 Exactly. The other thing that is interesting is you said you chose Django, you didn't choose Python.

05:09 Yeah.

05:11 Now people are often choosing Python because of the data science and computational story. But before 2012, there was not this massive influx of data science people. And I think that the big influx was people becoming Django developers and going, well, I guess I'll learn Python. Kind of like people becoming Ruby on Rails developers: I don't know Ruby, but I want to do Ruby on Rails, so I guess I've got to learn Ruby. And Django is kind of our version of that. Right, right.

05:36 Yeah. Then I think Flask appeared after a few years and everyone started to do microservices with Flask, and after that, the game started.

05:48 That's right.

05:50 Oh, my goodness. Out in the live stream we're getting some high fives from Pradvan on Django. And he says, Django the right way. Fantastic.

06:00 Awesome. Yeah. I think Django has been massively important for Python. Very cool.

06:07 Yeah.

06:07 How about now? What are you doing day to day?

06:09 So I'm a principal Python developer now, and this is interesting, actually. I changed my job three months ago, and I didn't look for a new job. But one day my current manager just texted me on Twitter and said, hey, I saw your library, Beanie, and it's interesting. We need something like this in our project, and probably you would like to participate and integrate Beanie into the project and develop Beanie during work time. Cool.

06:41 Yeah, sure.

06:42 That's amazing. I mean, on one hand, it would just be cool to have a fun job doing cool MongoDB stuff. Right? On the other, it's like, oh my gosh, I get to use my library and build up my library in a real production environment. That's awesome, right?

06:57 Yeah, I decided the same day, I think.

07:03 Let me think about it.

07:04 Yes.

07:04 Okay.

07:07 I quit my current job. Awesome. Well, Congratulations. That's really cool. Yeah.

07:11 Thank you.

07:12 That only means good things for Beanie, right? It only means more time and energy on it, I would say.

07:16 Yeah. All this, the current release and the future releases, is possible only because I can work on it a little bit during work time, not only on weekends.

07:26 Yeah, absolutely. It's easy to justify, like the Library needs this feature to work right for us. So let me add that feature to the library as just part of the Sprint or whatever. Right.

07:36 And also, probably even more important, when I work with Beanie in production, I can also see what it needs, which features.

07:47 It's so true because there are just these little edge cases. They don't show up under even a complicated little example you build for yourself.

07:56 You've got to put it into production and live with it. Some of that stuff might be migrations, right? Like I never need these migrations. Oh, wait, we have the zero downtime promise we kind of need migrations now or something, right?

08:09 Right, that's correct.

08:11 Yeah. Cool. Well, that's great to hear. Congratulations. Now, I wanted to start our chat off not talking about Beanie precisely, but like a little lower in the tech stack here. Let's just talk about MongoDB for a little bit.

08:28 I'm a huge fan of MongoDB, as I'm sure many of the listeners know out there. I've been running Talk Python Training and those things on top of MongoDB for quite some time. In the very early days, some of that was SQLAlchemy stuff, but then I switched over to Mongo and whatnot. So I'm a huge fan of it. I definitely think there's a lot of value, and there's a lot of these architectures where people talk about, oh, we have a Redis middle-tier cache because we've got to have our website fast. You know what, we get ten millisecond response times and there's no cache. It's just talking to the database, because everything is structured the right way. Anyway, I'm kind of going on too much. But what I wanted to start with was, I want to hear your thoughts on why build on top of Mongo. So many people in the Python space build on Postgres, which is fine. It's a good database, and it's just a completely different modeling story. So why are you interested in Mongo?

09:26 First of all, Mongo is super flexible by design, and I really like to do prototypes. So when I just come up with a new idea for a new project, a home project, etc., it's quite simple to work with MongoDB, instead of Postgres where you have to painfully change the schema of your data. But with Mongo you can just do what you want to.

09:53 That's been my experience as well. I remember almost every release of my code would involve some migration on the SQLAlchemy version, and I think I've done one thing you would consider a migration in like five years on MongoDB. Whereas everything else is like, I'm going to add this equivalent to a table, I'm going to add a field to this document, but it just goes in and it adapts. It's fantastic. It's like plastic instead of something brittle.

10:20 Even in the indexes. Yeah, that's why it's great for me.

10:23 Yeah. It's really easy for prototyping. Right?

10:25 Instead of trying to keep the database in sync or whatever, you just work on your models and magic happens.

10:31 Yeah, you're right. And then for sure, for production, you have to understand your profiling, your needs, and then you can move to Postgres or you can move to something like a time series database, ClickHouse for example. But in most cases Mongo is enough. And for some cases Mongo is the best choice because of flexibility and because of many cool things like indexes and search.

10:57 Yeah, absolutely. I think indexes. I don't know, this is maybe getting ahead of ourselves, but I think indexes are just so underappreciated in databases. I know a lot of people out there make sure their queries have the right indexes and stuff in it. But there's also so many websites I visit that take 3 seconds to load a page and like, there's no way there's an index on this query, there's just no way. I don't know what it's doing, but somebody has just not even thought about it. And if it was a weird little oh, here's the reporting page. Fine. But it's like the home page or something. You know what I mean? How are they not making this faster even in Postgres story, right.

11:37 I feel like there's one thing to have a database that does something, there's another to tune it to do the right thing, regardless of whether it's relational or no SQL.

11:47 Totally correct.

11:48 Yeah. You have a lot of experience with databases. I mean, you must have that feeling as well. You go to some website, you're like, what is it doing? Why is this thing spinning what could it possibly be doing here?

11:58 Yeah, probably somebody is going to get data.

12:01 Exactly. You're thinking through the idea. Okay. Does it just not have an index, or is it an N+1 problem with some ORM?

12:10 Why am I waiting here? What mistake have they made?

12:14 Fantastic. All right, so the next thing I want to sort of touch on is this tweet from Scott Stolzman sent us out yesterday. I don't think he knew that we were coming up with this conversation, actually.

12:26 So there was this programming humor joke. It says, I guess one of them is like a kitchen, the other is an office, but it doesn't really matter. The kitchen is super organized and it says MySQL, everything's little buckets and put away nice. And then there's a desk area that it looks like a hoarder house or like an earthquake hit and destroyed this area. And it says Mongo DB.

12:55 And it's actually true.

12:58 It is.

12:59 It can be true. And the reason I bring this up is Scott said, you know, I know a guy who made this course that saved me from this chaos with MongoEngine, because it can happen. It can happen in about 15 seconds without a strong plan. And so out of the box, the way MongoDB, the dev folks there, suggest, or at least provide, let's put it that way, the tools they provide for you to work with MongoDB are dictionaries. Like, you can give us dictionaries and put them somewhere, and then you get dictionaries back in Python. Dictionaries are just whatever, right? There's zero structure, there's zero discoverability, there's zero type safety. Sometimes it's a string that looks like a number, other times it's a number. Good luck, those don't match in a query. It gets horrible. And so for me, I feel like what you need to do when you're working with databases that have less structure in the thing itself, like with MySQL, or I'll say Postgres: Postgres says the table looks like this, this column is that size of a string, this is a number, and that's it. The structure is in the database, whereas in these document databases the structure is in the code.

14:16 And so you should have some kind of code that helps you not end up in a situation like this, right? Yeah.

14:21 Honestly, Postgres also came up with JSONB fields now. Yeah.

14:26 So maybe they fall into that bottom bucket, like in a small little area.

14:30 Yeah.

14:32 This portion of Talk Python to Me is brought to you by Sentry. How would you like to remove a little stress from your life? Do you worry that users may be encountering errors, slowdowns or crashes with your app right now? Would you even know it until they sent you that support email? How much better would it be to have the error or performance details immediately sent to you, including the call stack and values of local variables and the active user recorded in the report? With Sentry, this is not only possible, it's simple. In fact, we use Sentry on all the Talk Python web properties. We've actually fixed a bug triggered by a user and had the upgrade ready to roll out as we got the support email. That was a great email to write back: hey, we already saw your error and have already rolled out the fix. Imagine their surprise. Surprise and delight your users. Create your Sentry account at 'talkpython.fm/sentry', and if you sign up with the code 'talkpython', all one word, it's good for two free months of Sentry's business plan, which will give you up to 20 times as many monthly events as well as other features. Create better software, delight your users, and support the podcast. Visit 'talkpython.fm/sentry' and use the coupon code 'talkpython'.

15:48 It's also one of those things where schemas are about structure.

15:53 So the thing I think saves you from this, Scott mentioned MongoEngine, which is pretty good. But I think the thing that saves you from this is these ODMs, where you have a lot more structure in your classes and your Python layer, right?

16:07 Yeah, correct.

16:09 And also, Beanie is an ODM based on Pydantic. Pydantic is a Python library for data parsing and validation and stuff.

16:17 Yeah. Let me read the little introduction bit here, because I think there's so much in this first sentence. So Beanie is an asynchronous Python object-document mapper, ODM, for MongoDB.

16:30 So ODM, we talked about Mongo, we talked about asynchronous, and also, I didn't finish the sentence, it's based on Motor, which comes from MongoDB, and on Pydantic. So it's also asynchronous, right, which is pretty awesome, as in async and await. And so often the models that we build for the databases are their own special thing. Then you've got to build maybe an API, and you might use Pydantic or something like that. But Pydantic has been really coming on strong as a super cool way to build object trees and object graphs and stuff. And so Pydantic is a perfect thing to say, well, let's just use that. Everyone already knows how to use that, and things like APIs can already exchange those on the wire.

17:19 And that's why I chose this.

17:23 It's just such a neat combination of bringing the Async and Await stuff together along with Pydantic and saying let's see if we can use those ideas for basically for the ODM. So there's other ones. Scott mentioned Mongo engine. That's actually what I'm using right now for my MongoDB stuff in Python.

17:42 MongoEngine is pretty good. I feel like it's kind of already where it's going to be. There's not a ton of excitement in terms of new features and pushing stuff forward. For example, to my knowledge, there's no async and await stuff happening in there.

17:59 I think it's synchronous. Possibly there's something that's happened that's changed up, but the last time I looked, it was synchronous.

18:05 As of a few weeks ago, it was still synchronous.

18:07 Yeah.

18:07 Okay.

18:08 That's much more recent than I've looked and what are some of the other ones? I'm trying to think of some of the other MongoDB ODMs out there.

18:15 I know in other languages it's probably like Mongoose in JavaScript, ActiveRecord stuff for Ruby.

18:24 Yeah. So a lot of these systems were based on the Django ORM model. So for example, MongoEngine is basically the Django ORM but adapted for documents and Mongo, right? The terminology and everything is quite similar. Being based on Pydantic, yours is a little bit different. I feel like there's a lot of interesting choices you've made. One, to be based on Pydantic and how that works. But two, we'll get into the API and stuff, but when you look at the API, I feel like you've chosen to be very near MongoDB's native query syntax or query language. So for example, instead of doing a select, you would do a find or find one, or the updates. And a set operator is one of the things you can do, like set a value. Was that conscious, like I'm going to try to be really close to MongoDB, or what was the thinking there?

19:18 When I started to work on Beanie, it was not Beanie, it was just a side project. Because in the very beginning I started to play with FastAPI. It was a very modern framework that time. Now it's still modern and great.

19:36 But at that time it was quite new, around a few years ago. And it is asynchronous also and uses Pydantic under the hood. And there was no async ODM for Python and MongoDB.

19:49 I had the same experience. I wanted to use MongoDB with some FastAPI stuff. I'm like, there's not a great library that I can pick here. So I guess I'll just use SQLAlchemy or something. Right.

20:00 And what I did, I just started with Pydantic, I've got Motor.

20:10 So yeah, this is an engine offered by MongoDB, and it is an asynchronous engine.

20:16 Yeah. Let's talk about Motor a little bit because I suspect that most people who work with MongoDB work with Pymongo.

20:25 When I opened the conversation, I said the tools they give you are just here's a dictionary to put in. And then I get Dictionaries back out. I was exactly thinking of the Pymongo library. Right. And so what's this Motor thing? This is also from MongoDB.

20:41 This is also from MongoDB, right. And it mirrors each method and function from PyMongo, but converted, let's say, into an asynchronous method or function. So it uses mostly the same syntax as PyMongo itself.

21:01 Okay. So it's a lot like PyMongo. It says Motor supports nearly every method PyMongo does, but Motor methods that do network IO are coroutines. So async and await type of things, right? Yeah.
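
For listeners who haven't used Motor, a minimal sketch of what that looks like, with made-up database and collection names, is roughly:

```python
# Minimal Motor sketch: the same shape as PyMongo, but network calls are awaited.
# The connection string, database, and collection names are illustrative.
import asyncio

from motor.motor_asyncio import AsyncIOMotorClient


async def main():
    client = AsyncIOMotorClient("mongodb://localhost:27017")
    products = client.store.products

    # Plain dictionaries in and out, just like PyMongo, except the
    # I/O methods are coroutines that you await.
    await products.insert_one({"name": "Tony's", "price": 5.95})
    cheap = await products.find_one({"price": {"$lt": 10}})
    print(cheap)


asyncio.run(main())
```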

21:18 And what I did, I just combined together Pydantic and Motor, without any query builder stuff and fancy stuff. Just querying using dictionaries, the same dictionaries as Motor and PyMongo use, for models and nothing more. A small library. It was not Beanie at that time, it was an internal project just to play with FastAPI. And then after a few months of working, I decided probably I can make it open source.

21:51 I came up with the name Beanie, and that's why I'm following the MongoDB naming and not select, not join, etc. Just find, find many, find one, update, and so on. Because I started by using Motor functions directly inside of the Pydantic stuff, and only after that I started to add the Beanie stuff.

22:16 More fancy stuff.

22:19 Yeah. So it was a pretty close match to like, how do I take motor and just make it send and receive Pydantic instead of sending receiving straight dictionaries, right.

22:28 Yeah. So the first project was just a parser from Motor to Pydantic and back.

22:34 Cool. Well, I think it's really neat. And however you got there, I think it's really nice that you have the API that matches that, because then I can go to the Mongo DB documentation or I can find some other example that somebody has on. Well, here's how you do it with PyMongo and you're like, well, that looks really close to the same thing over here. So we can talk about that. Now I want to dig into some of the other features and stuff there, for example, data and schema, migrations support and whatnot. That's pretty cool.

23:05 Yeah.

23:05 But let's talk about modeling data here. That's the first letter in ODM, the object bit.

23:14 So what does it look like? I know. Well, maybe not everyone listening knows what it looks like to model something with Pydantic. So maybe we could give us a Pydantic example and then we could talk about how to turn that into something that can be stored in MongoDB.

23:27 Yeah. Pydantic has BaseModel, the main class of Pydantic, and you inherit everything from BaseModel. And it looks like Python's data classes, but it also supports validation and parsing, which data classes don't.

23:43 Yeah, the conversions and stuff. It's really cool. That's one of the things I think Pydantic is so good about.

23:49 Yeah.

23:49 You look at their example, the way you define these classes is you have a class and you just, at a sort of class level, not instance level, say variable colon type. So in this example, you've got a Category, say name: str, description: str. And this just means this thing has two fields, they're both strings. But the Pydantic example on their website has got some kind of model where it's got multiple fields and one of them is a list of things that are supposed to be numbers. And if you pass it data, and that list happens to be a list of strings that can be parsed to numbers, it will just convert it straight to numbers as part of loading. It's really nice, right?

24:34 Yeah, it's a great feature.

24:36 Yeah. This exchange across, especially across either files or API boundaries, right. Somebody writes some code and they send you some data. Well, if they didn't really use the native data type, but it could be turned into it, then that's really nice.

24:51 And you also can add your own parser, like your own validation step, and then it will convert for you from string to number to date.

25:03 So, yeah. And Beanie uses the same scheme. Document is the main class of Beanie, and it's inherited from Pydantic's BaseModel and inherits all the methods, all the aspects of Pydantic's BaseModel. So it can use the same validation stuff, the same parsing stuff, etc.
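
To make that concrete, here is a small illustrative sketch of the Pydantic side of this, using Pydantic v1-style syntax; the models and fields are invented for the example:

```python
# Illustrative Pydantic (v1-style) models showing type coercion and a validator.
from typing import List

from pydantic import BaseModel, validator


class Category(BaseModel):
    name: str
    description: str


class Order(BaseModel):
    quantities: List[int]  # strings that look like ints are coerced while loading
    note: str

    @validator("note")
    def note_must_not_be_blank(cls, value: str) -> str:
        if not value.strip():
            raise ValueError("note must not be empty")
        return value


# "7" and "3" are converted straight to ints as part of parsing.
order = Order(quantities=["7", "3"], note="gift wrap, please")
print(order.quantities)  # [7, 3]
```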

25:24 Right. So whatever people know and think about Pydantic, that's what the modeling looks like here. I guess the one difference is, when you model the top-level documents that are going to be stored and queried in MongoDB, you don't derive from BaseModel, right?

25:43 From something else.

25:44 Yeah, from Document.

25:46 Right. A document, it comes from Beanie. But that document itself is derived from base model ultimately. Right. So even though you got to do this little more specialized class, it's still a Pydantic model in its behavior, right?

25:59 Yeah, correct. That's why you can use it as a response model, for example, in FastAPI, if you're familiar with it, because it uses Pydantic base models and submodels.

26:11 Yeah, if people haven't seen that example, one of the things you can do with FastAPI, that's super neat, as you can go and say as part of returning the values out of the API, one of the things you can do in the decorator is you can just say the response model is some Pydantic class. And if you just add that one line and that class happens to be a Pydantic model, you get all sorts of live documentation and API definition stuff and there's even code that will consume that. So I built this little weather thing in FastAPI for one of my classes over at 'weather.talkpython.fm'. And you just go to docs, it pulls up here's all the APIs you can call and here's the return value with exactly what the schema is and all of that just from that one line of code. So what you're saying is you could do that with these database models. You could just say I'm going to return this directly from my API back to you. If you just say response model equals your data entity model, you get all this for free, right?
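
A rough sketch of that FastAPI pattern, with a hypothetical route and a minimal Beanie document (and assuming init_beanie has been called at startup), looks something like this:

```python
# Hypothetical sketch: returning a Beanie document directly via response_model.
# Assumes init_beanie(...) is called in a startup hook before requests arrive.
from beanie import Document
from fastapi import FastAPI


class Product(Document):
    name: str
    price: float


app = FastAPI()


@app.get("/products/{name}", response_model=Product)
async def get_product(name: str):
    # Because Product is ultimately a Pydantic model, this one annotation gives
    # FastAPI the response schema for validation and the live docs page.
    return await Product.find_one(Product.name == name)
```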

27:17 Yeah. It's also validated based on this model. If you, for example, using some external source of data and they changed schema and you received it and responded back. And if the format is not correct, it just will raise an error and you can gracefully handle this.

27:35 Nice.

27:35 Yeah.

27:36 The error that it returns is also meaningful. Like the third entry in this list cannot be parsed to an integer rather than 400 Invalid data. Good luck.

27:48 Okay, so we've modeled these objects in here. And I guess one more thing to throw in while we're talking about modeling. In your example, right at the bottom of the GitHub page, you have a category which has the name and description, and you have a product that has its own name and description and then a price and the product has a category. And here you just say colon category, right. And that would make, I guess would make this in this case an embedded document inside of the product document, right?

28:18 Yeah, totally correct. It will just create embedded documents.

28:21 Yeah. So the way that you model this is why Pydantic is a really good match, because Pydantic allows you to compose these Pydantic models in this hierarchy. You can have a list of other Pydantic models within a Pydantic model, which is exactly the same modeling you get with document databases like Mongo, right?

28:37 Yeah.

28:38 So you don't have to do anything. It keeps modeling the things I wanted to model. Cool.
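
The shape of the README-style example being described is roughly this; treat it as a sketch rather than the exact file:

```python
# Sketch of the modeling being described: Category is a plain Pydantic model
# embedded inside the Product document; only Product gets its own collection.
from beanie import Document
from pydantic import BaseModel


class Category(BaseModel):
    name: str
    description: str


class Product(Document):
    name: str
    description: str
    price: float
    category: Category  # stored as an embedded sub-document inside Product
```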

28:43 In other database systems like SQLAlchemy or something, you would be able to say this field is nullable or not or it's required or something like that. So how do you say that in this model. If it's nullable?

29:00 Yeah, you can just use the Python typing Optional class.

29:05 You mark it as Optional[str] versus just str, right?

29:14 Yeah.

29:14 And I guess for a default value you just set it equal to its default value, right?

29:18 You're right. But if it's optional, it's explicitly marked here. But Pydantic allows you to use Optional without a default value, and the default would be None in that case.

29:30 I'm thinking like false or none or something like that. In this case, you probably wouldn't make it optional. You would just give it a default value, right?

29:37 Yeah, correct.

29:38 Now, one of the things that I can do and say Mongo engine is I can say the default is a function because maybe this is incredibly common in my world is I want to know when something was created. When did this user created an account? When was this purchase made? When did this person watch this video or whatever? And so almost all of my models have some kind of created date type of thing. And the default value is datetime.datetime.now, without parentheses.

30:08 I want you to call the now function when you do an insert. How would I model that in Beanie?

30:13 Again, you can use Pydantic stuff here. So I really like Pydantic, they did half of the work.

30:19 You can use the Field class, equals Field, and inside that you can use the default_factory parameter, where you can just provide the function you want.

30:29 Okay, that's right. So instead of setting it to, like, you could say it's a string, but for its value you say Field with a default factory or something, you set it to be one of these things that Pydantic knows about.

30:42 I didn't get you, sorry.

30:42 Is that thing that I'm setting it to equal to? Does that come out of Pydantic? Like that's not a Beanie thing.

30:47 This is Pydantic stuff, yeah, default_factory stuff. But there is an interesting Beanie feature about this I can show you as a code example here. For example, if you want to have not a created_at field but an updated_at, so each time you update, you want to update the time. And for that case default_factory will not work, because it's already got a value.

31:12 Yeah.

31:12 It's only on create, and Beanie allows you to use event-based actions for this. You can just create a method on the class and mark it with the before_event decorator. And inside of this, you can just write your logic, self.updated_at field equals current time, and it will work for the events you choose, like Insert, Replace.

31:42 And yes, this is one of the new features. We're going to talk about some of the new things, but one of them are these event actions. So you can say before the insert event or before the save event happens or after the thing has been replaced or any of those types of things. You can sort of put a decorator and say run this function when that happens on this type or in this collection effectively, right?

32:06 Yes.

32:06 Very cool. All right. We're going to dive back into that because that's good. And so that would be actually a pretty good way, wouldn't it? Just do an event on insert, and when insert happens, set my created date to be datetime.now.

32:18 Yeah. Okay, good.
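
Pulling those ideas together, a sketch of a created_at default via default_factory plus an updated_at refreshed by an event action might look like this; the field names are illustrative, and the exact decorator imports may vary slightly between Beanie versions:

```python
# Illustrative sketch: an optional field, a simple default, a default_factory
# timestamp, and an event action that refreshes updated_at before writes.
from datetime import datetime
from typing import Optional

from beanie import Document, Insert, Replace, before_event
from pydantic import Field


class User(Document):
    email: str
    nickname: Optional[str] = None  # nullable field
    is_active: bool = True          # plain default value
    created_at: datetime = Field(default_factory=datetime.utcnow)
    updated_at: datetime = Field(default_factory=datetime.utcnow)

    @before_event([Insert, Replace])
    def refresh_updated_at(self):
        # Runs right before the chosen write events, so updates get a new time.
        self.updated_at = datetime.utcnow()
```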

32:20 And then I guess the other part that's interesting now is doing queries and inserts on this. So you would create your objects just exactly the same as you would with Pydantic, right? Just class name, key value, key value, key value.

32:33 Like that, or even parse object, like Category.parse_obj with a dictionary of values, if you want to parse it.

32:44 Okay. Or if it's coming out of an API or something like that. Right. And then you would say here's where it gets interesting. You say await object.insert or await class.find one. Right. Or await set some value there.

32:59 Yes. Because this is an asynchronous library. So that's why you have to use await here, based on the asynchronous nature of the library.
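
For reference, a rough sketch of that create, insert, and find flow, assuming the Product and Category models from the earlier sketch and that init_beanie has already been called:

```python
# Sketch: constructing, inserting, and finding documents (inside an async function).
async def demo():
    chocolate = Category(name="Chocolate", description="Roasted cacao goodness")

    # Construct like any Pydantic model, or parse a dictionary (e.g. API input).
    bar = Product(name="Tony's", description="Nice bar", price=5.95, category=chocolate)
    same_shape = Product.parse_obj(
        {"name": "Tony's", "description": "Nice bar", "price": 5.95,
         "category": chocolate.dict()}
    )

    await bar.insert()  # bar.id is populated once this call returns
    found = await Product.find_one(Product.name == "Tony's")
    return found, same_shape
```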

33:07 Sure. And that's the whole point, right? It's that it's built around that. I think there's ways in which you could use it in synchronous situations. You could always create your own event loop and just run the function and just block, right, that way. Or use something like unsync, which maybe we'll touch on a little bit later.

33:25 But if you're doing something like Flask or FastAPI, where the functions themselves, the things being called by the framework, the framework is already handling it. It's basically no work, right? You just make an async method and then you just await things and you unlock the scalability right there.

33:45 Yeah.

33:46 I think the modern Python world is pretty async everywhere already. I think framework development now, most of them are asynchronous.

33:57 Yeah, exactly. I think with Flask there's limited support for async, and then if you want full async, you have to use Quart for the moment. But maybe stuff is happening. I know Django is working on an async story as well. I think the big holdup for full-on async in Django has actually been the Django ORM, so this would fix that. Although, does it make any sense to use Beanie or another ODM or something like that with Django?

34:25 It's so tied into its ORM itself, right?

34:28 I think for Django it's a little bit tricky. Probably things changed. But Django works with relational databases here, with SQL databases, and also the Django model stuff.

34:44 It's possible. Definitely possible.

34:49 Yeah.

34:49 It's just if you're fighting against the system, then you maybe should just choose a different framework rather than try to fight the way that it works.

34:58 Right.

34:59 Like if you're going to choose something like Django, that gives you a lot of opinionated workflow, but a lot of benefit. If you stay in that workflow, then just, I'd say follow that. But the other frameworks are pretty wide open, right? You could easily use this with Flask. You could use this with pretty much anything.

35:16 It's better if it supports Async, right? There's not a synchronous version, is there?

35:21 Yeah, there is no synchronous version of this, because it uses Motor inside, as we said. I'm thinking about Mongo support without Motor. In that case it would be just synchronous.

35:33 I would think that would be great. Honestly, I'm very excited that this is Async first. I think that's really good. But let me give you an example. So for my website, I would love to be able to make all the view methods be Async. That would give it a little bit more scalability.

35:50 It's pretty quick like I said, but it would still be way more scalable if it could do more while it's waiting on the server. But at the same time I have all these little scripts that I run and like here's how I want to go and just show me all of the podcast episodes who is sponsoring them to make sure that if I had to reorder things, I don't mess up some sponsorship detail or show me all the people who have signed up for this class this month and whatnot those little scripts. It would be nice if I could just say these little things are going to be synchronous because it's the easier programming model. I don't have to deal with this loop stuff, but then keep the website using the same models in the Async version. What do you think of that?

36:37 I do agree totally. But I have limited time. That's why.

36:42 Yes, of course you do. And I guess that pull requests are accepted or contributions are accepted if there's meaningful. Good work, right?

36:51 Yeah, sure.

36:52 Yeah. Cool. I'm not suggesting that it's like a super shortcoming. Right. It's not that hard to create an async method and just call asyncio.run on it. But having this ability to say, this situation is really a synchronous one, I don't need to go through the hoops to make that happen.

37:09 Even I had this situation in the past and I had to create a loop inside of synchronous function.

37:16 Yeah, exactly.

37:18 I think that's worth touching on a little bit, because people say that async and await is like a virus or something. Like once one part of your code base gets async, it sort of expands upward so that anything that might touch that function itself has to become async, and then its callers have to become async, and so on. In the most naive, straightforward way, that's totally true. But it's not true if you don't want it to be, right? Like halfway through that function, that call stack, you could say, on this part, I'm going to create an asyncio event loop and just run it and just block.

37:54 Anyone who calls that function doesn't have to know it's async io.

37:57 Right.

37:58 You would sort of stop that async propagation. It sounds like that's what you're talking about. Like creating a loop and running it inside of a synchronous function.

38:05 Yeah, but it looks really super ugly.

38:10 You even don't have any chance to avoid this.
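
What that looks like in practice is roughly this: a synchronous wrapper that runs the coroutine on its own loop and blocks, so callers above it never see the async:

```python
# Stopping the "async propagation": a synchronous wrapper that runs the
# coroutine on its own event loop and simply blocks until it finishes.
import asyncio


async def fetch_report() -> dict:
    await asyncio.sleep(0.1)  # stand-in for real async I/O, e.g. a database query
    return {"status": "ok"}


def fetch_report_sync() -> dict:
    # Callers of this function never need to know asyncio is involved.
    return asyncio.run(fetch_report())


print(fetch_report_sync())
```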

38:18 Yeah, it is a little bit weird if you haven't seen it. Okay, so it's time for my requisite required shout-out to the 'Unsync' library, which I think is just so neat in the way that it simplifies async in Python. We were talking about this just a little bit before I hit record, but Python's async has basically two things that are frustrations, that make the kind of stuff we're talking about a little bit harder. One, wouldn't it be nice if you could just call an async function and it just runs? Like, I want to write this to the log, but let's do that asynchronously and just go write to the log. I don't want to hear from you again. Go put it in the log. I'm going to keep on working, right? You can't do that with Python's async. You've got to put it in a loop and make sure the loop is running, so this fire-and-forget model doesn't work. And the other is you can't block. You can't call .result() to make it block. If it's not done, it's going to throw an exception, right? So this unsync thing lets you put an @unsync decorator on an async method, and then if you need to stop the async propagation, you just call .result() and it'll block and then give you the answer when it's done. And there's all sorts of cool integration with threads plus asyncio plus multiprocessing.

39:31 I think this fixes a lot of those little weird edge cases.

39:34 I think I will try this after this podcast, and we'll create a page in the documentation. So if you need it, yeah.

39:42 Try it out and see if it's a good idea. It might not be a good fit, but I think it is actually. I think it would be.

39:47 So if it does not work, I will not.

39:50 So basically the way it works is, when it starts up, it creates a background thread whose only job is to run an asyncio event loop. And then when you do stuff, when you call functions on this, instead of running on the main thread, it just runs on that background thread. And when you block, it just waits for that background thread to finish its work and stuff like that. So that's sort of the trick around it. But I'm a super big fan of unsync. I think it does a lot of good for these situations that we're talking about. Like, okay, almost all the time and definitely in production I want it async, but every now and then I want to just stop.
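
Here is a small sketch of that unsync pattern: kick off the async work, keep going, and block on .result() only where the answer is actually needed:

```python
# Minimal unsync sketch: the decorated coroutine runs on unsync's background
# event-loop thread, and .result() blocks the caller until it is finished.
import asyncio

from unsync import unsync


@unsync
async def write_log(message: str) -> str:
    await asyncio.sleep(0.1)  # stand-in for async I/O
    return f"logged: {message}"


task = write_log("hello")  # fire and forget; it runs on the background loop
# ... keep doing other work on the main thread ...
print(task.result())       # block only at the point where the value is needed
```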

40:26 Talk Python to Me is partially supported by our training courses. We have a new course over at Talk Python: HTMX + Flask: Modern Python Web Apps, Hold the JavaScript. HTMX is one of the hottest properties in web development today, and for good reason. You might even remember all the stuff we talked about with Carson Gross back on episode 321. HTMX, along with the libraries and techniques we introduce in our new course, will have you writing the best Python web apps you've ever written: clean, fast, and interactive, all without that front-end overhead. If you're a Python web developer that has wanted to build more dynamic, interactive apps, but don't want to or can't write a significant portion of your app in rich front-end JavaScript frameworks, you'll absolutely love HTMX.

41:10 Check it out over at 'talkpython.fm/htmx' or just click the link in your podcast player's show notes.

41:18 So let's talk about some of the other features. Back to this example here. There's one thing I wanted to highlight that I think was really neat that I saw.

41:27 So you've got the standard. I've created an object that goes in the database and I call insert and I wait. That puts it in the database, right?

41:36 Yeah.

41:36 So I don't see a return value here. Does that actually set the ID on this thing that's being inserted after that function call?

41:42 It updates it.

41:45 Yeah. So ID is good to go after that, right? Yeah.

41:49 Okay, cool.

41:50 And then the other one, you've got your await find one. The filter syntax is the first thing I wanted to talk about, which I think is really nice.

41:58 Thank you.

42:02 Does it have to be one of these indexed ones? Or can you do these queries of the style on any of these? Could I do a category or a description equals something or name equals something? Yeah, sure.

42:13 You can do it with everything and you can do it even in the same line and find one price less than ten, and name equals, I don't know your name.

42:27 Nice. So the way that you would say, I want to find the product, or all the products, that have a price less than ten, is, in this case the product is a class with a price, you'll just say Product.price less than ten, right? Just like you would in an if statement or a while loop or something like that. Yeah. This is really nice, because the alternative is something like what you have in MongoEngine, where what you would say is price__lt=10, right? You separate the operators on the field with double underscores, and lt means less than, and then you set it equal to the value you want it to be less than. And that is entirely not natural. It's not horrible, you can get used to it, but it sure isn't the same as price less than ten. Right. That is really nice.

43:17 At the very beginning, when I said that I was a Django developer, not a Python developer, it was about this, because I knew how to do this stuff in Django. But it's not Python syntax, honestly, it's Django syntax, which most of us know.

43:32 Exactly. So you can do these natural queries and like less than, greater than, equal, not equal to and so on.

43:38 Yeah.

43:39 If it was None, the thing is, 'is None' is the most natural way, but you would just say equals equals None. Is that how you would test this?

43:47 Yeah. You cannot here. So it's not supported to use 'is'. That's fine.

43:53 It's better than price equals ten, or equals None, just as an assignment. That's even weirder. So that's cool.
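
Concretely, those Pythonic filters look something like this, assuming the Product model from the earlier sketch:

```python
# Query sketch, inside an async function, using the Product model from earlier.
async def queries():
    cheap = await Product.find_one(Product.price < 10)

    # Conditions combine in one call, and comparisons read like plain Python.
    named = await Product.find_one(Product.price < 10, Product.name == "Tony's")

    # == None / != None are used on purpose: `is` can't be overloaded in Python,
    # so it isn't supported for building these filters.
    priced = await Product.find(Product.price != None).to_list()
    return cheap, named, priced
```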

44:03 The other thing that I thought was neat is so often in these ORMS, and it is worse in the Mongo story because each record that comes back represents more of the data. Right. In this case you've got a product and the product has a category, whereas those might be two separate tables in a relational database. So the problem is I get one of these objects back from the database, I make a small change to it. Like I want to change the name and then I call save and it's going to completely write everything back to the database.

44:36 You can overwrite everything, which can be a big problem. There's a couple of solutions you have for that. One is you have the in-place update operators like set. And I'm guessing, do you have like increment and decrement and add to set and those kinds of things?

44:52 Yeah. Literally everything from which MongoDB supports.

44:56 Yeah, right. So in this case you can say product dot set and then product name is Gold bar. Right. Rather than what it was before it was Tony's or something like that. Right.

45:08 Yeah.

45:08 And that'll do a MongoDB dollar set operation, which is an atomic operation. So somebody else could be updating, say the category at the same time, sort of transactionally safe. And so this way you're both way more efficient and it's also safer that you're not possibly overriding other changes.

45:28 Yeah. And also, in the current version, it's possible for Beanie to track all the changes of the object. And instead of set, you can call save changes, and it will call set inside for all the changes which happened to this object.

45:48 Yeah. This was the other way that I was hinting at, and it's super cool. Where is this save changes? There we go. So on all of these documents, you can optionally have an inner class called Settings, and then you can do things like use_state_management = True. And you don't have to figure out how to write those set operations or increment operations or whatever. You can just make changes and call save changes, and off it goes. Right.

46:14 Yeah. It's quite simple.

46:17 Yeah. This is really cool. I like that about this. That it gives you that option to sort of use the most natural way of making very small changes to the data. Right. Because so often ORMs and ODM's, give me the object back and make a change to it, put the whole thing back wherever it came from.
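
A sketch of those two update styles, an explicit atomic set versus state management plus save_changes, might look like this; the inner class name and options may differ slightly between Beanie versions:

```python
# Sketch of in-place updates. Settings/use_state_management is the option the
# conversation refers to; its exact location can differ between Beanie versions.
from beanie import Document


class Product(Document):
    name: str
    price: float

    class Settings:
        use_state_management = True  # let Beanie track which fields changed


async def rename(product: Product):
    # Option 1: an explicit, atomic $set on just this field.
    await product.set({Product.name: "Gold bar"})

    # Option 2: mutate the object, then persist only what actually changed.
    product.price = 7.5
    await product.save_changes()
```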

46:35 Yeah, I agree. And also, if you don't want to fetch object at all and you want to set something to object, you can use update query here, right?

46:44 Yeah.

46:45 And you will not even fetch object into your application.

46:49 That's a really good point, because so often with ORMs and ODMs the set-based behaviors are super hard to do. Let's suppose I've got 100,000 users. I want to go and set some field to a default value that didn't previously exist in the database. Or I want to compute something, a computed field that wasn't previously there. So I've got to go to each one and make a change or something. Even if it's always the same value, you still would need, in the ORM story, to do a query, get the 100,000 records back, loop over them, set the one little value on the class, and call save. But what you're saying is you could just do, like, Product, or in my case Users, update, value equals what you want it to be, and just update all of them. Right. Update many you might have to call, or something like that.

47:40 You can even find something and then dot update, and it will update only those.

47:45 Oh, really? So you could do like a find all then a dot update. It wouldn't actually pull them back from the database.

47:51 It will not fetch them there.

47:52 Oh, my goodness. Okay.

47:55 Very magical. That's awesome, actually.
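
That set-based update looks roughly like this, assuming the Product model with an embedded Category from the earlier sketch; the exact import path for the Set operator may vary by Beanie version:

```python
# Sketch: update many documents in one round trip, without fetching them.
# The operator import path below follows Beanie's docs of the time and may
# differ slightly in newer releases.
from beanie.odm.operators.update.general import Set


async def put_chocolate_on_sale():
    await Product.find(Product.category.name == "Chocolate").update(
        Set({Product.price: 1.99})
    )
    # No Product objects are loaded into the application; MongoDB applies the
    # $set server-side to every matching document.
```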

47:58 Then again, the other things you have on here that are really just simple as like you can do a find and then just a to list on it.

48:05 I don't want to loop over it or whatever, just give me the list back. That's also nice. Yeah. Let's see, how are we doing on time? We're getting a little short on time. Let's talk through a little bit of, there's a really nice tutorial here that starts out with defining documents, setting up the code, which is pretty much just standard MongoDB, right? Like you have to create one of these clients. But what you're really creating is just a Motor client. So I'm guessing you can send as complex of a MongoDB configuration as you need to, and it doesn't affect Beanie.

48:36 Yeah. And also now it's not documented here because I'm lazy. But Anthony Shaw, you know him.

48:44 I think Anthony Shaw makes common appearances here. Yes.

48:48 Suggested that I add an option so you can pass a connection string instead of all this stuff with the database, just init Beanie with the connection string, and it will work nicely. But I have to add documentation about this.

49:02 Sure. So in the documentation you create a motor client and then you pass the client over to Beanie, you just create the client first. Right. But if you just call init beanie with the right connection string, it'll do that behind the scenes for you. Yeah. Thanks, Anthony. That's really good.

49:20 But if you're working with like sharded clusters and replica sets and all that kind of stuff that is like on the outer edge of these use cases that should be supported. Right. It's just behind the scenes, you don't have to know about it. The other thing that's interesting is when you initialize it, you pass it all the document classes like product or user or whatever. Right. As a list.

49:43 Yeah. Because, under the hood, the document must know which database it's attached to. Because for some use cases you can use different databases in the same application, and in that case you have to init Beanie for the different databases with different sets of models. So you have to pass the models there. In that case, yeah.

50:04 I do that in mine. I have multiple databases, like logging and analytics and all that kind of stuff goes to one database that gets managed and backed up less frequently, because it's like gigs and gigs of data. But if you lost it, the only person who would care in the world is me; I lost my history of stuff. Right. As opposed to the things that the website needs to run, or user accounts or whatever, those need to be backed up frequently and treated really specially. So I actually have those as two separate databases based on classes. So I guess what you're saying here is you can also call init Beanie multiple times with different databases and different lists of documents.

50:46 Yeah. This will work nice.
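
Initialization, as described, looks roughly like this, with an explicit Motor client and separate init calls per database; the database names and the PageView model are invented for the example, and the connection-string shortcut is shown as a comment since it may depend on your Beanie version:

```python
# Sketch of initialization. Database names are illustrative; Product is assumed
# from the earlier sketch, and PageView is a made-up analytics document.
import asyncio

from beanie import Document, init_beanie
from motor.motor_asyncio import AsyncIOMotorClient


class PageView(Document):
    path: str


async def startup():
    client = AsyncIOMotorClient("mongodb://localhost:27017")

    # Core data registered against one database...
    await init_beanie(database=client.store, document_models=[Product])

    # ...and analytics-style models registered against another.
    await init_beanie(database=client.analytics, document_models=[PageView])

    # Newer Beanie versions also accept a connection string directly, e.g.:
    # await init_beanie(connection_string="mongodb://localhost:27017/store",
    #                   document_models=[Product])


asyncio.run(startup())
```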

50:49 Yes, that's really cool. Can you give it like a star type of thing? Like everything in this folder in this module or this sub package?

50:58 Not yet, unfortunately.

51:01 But it's nice feature. I think it's like a feature request.

51:05 Okay, sounds like a feature. So if you could give it something like all my models live in this sub package of my project or in this folder. There you go. That might be nice because one of the things that happens to me often is I'll add a view to some part of my site and I'll forget to register it somewhere. Like, why is it I got to go and make sure the thing can see this file?

51:32 It will raise an exception at least, then, when you call an endpoint.

51:39 Nice.

51:39 If you try to fetch any documents without initialization, you'll get an informative error.

51:43 Yeah. Cool. So let's see, let's talk indexes. I started our conversation with my utter disbelief that there are websites that take 5 seconds to load. And I'm like, I know they don't have more data than I have.

51:56 I just know they've done something wrong. There's no way it's the data. So indexes are critical, right? What is the index story? How do you create them over here?

52:06 It's interesting story about indexes. Honestly. Like, I published my first version of Beanie and one guy texted me probably it's possible to add indexes there. I don't see if it's supported or not.

52:22 And in a few days I added them.

52:24 Yeah, that's right. We covered Beanie when it first came out on Python Bytes. I was like, this is awesome, but where are the indexes? I'm a bit of a stickler for those. That's awesome. So yeah, the way you do it is, when you define a class, instead of saying the type is, say, a str or an int, you would say it's an Indexed int, and that just creates the index. And it looks like in Mongo you have all these parameters and control.

52:50 Is it a sparse index? Is it a uniqueness constraint as well? Is it ascending? Is it descending and whatnot? And so you can pass additional information like that it's a text index or something like that, right?

53:03 Yeah, it supports all the parameters and uniqueness.

53:05 This is super important, right? Like, your email on your user account had better be unique, otherwise a reset password is going to get really weird.

53:14 You support multi field indexes, which is something that's pretty common, like a composite index. If I'm going to do a query where the product is in this category and it's on sale, you want to have the index take both of those into account to be super fast. Right. So you have support for that?

53:30 Yeah, it's also supported. But it's not that neat, let's say it's not that beautiful. But it's supported.

53:37 Yeah.

53:39 The payoff is worth it. It's also in this class called collection. Right. So it's kind of in its own special inner class of the model, in which case a lot of the IDE's have a little Chevron. You can just collapse that thing and not look at it anymore.
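
Index declarations, as discussed, look roughly like this: a single-field Indexed type plus a composite index declared in the inner class, which was called Collection in Beanie releases around this time (newer versions fold it into Settings):

```python
# Sketch of index declarations; the inner class was called Collection in Beanie
# releases around this time (newer versions use Settings for this).
import pymongo
from beanie import Document, Indexed


class Product(Document):
    name: str
    price: Indexed(float)            # plain single-field index
    sku: Indexed(str, unique=True)   # index with a uniqueness constraint
    category_name: str
    is_on_sale: bool = False

    class Collection:
        name = "products"
        indexes = [
            # Composite index so "category + on sale" queries stay fast.
            [("category_name", pymongo.ASCENDING), ("is_on_sale", pymongo.ASCENDING)],
        ]
```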

53:54 So it's easy to hide the complexity, I guess, there. Yeah. Cool. All right. What else? Aggregation. It sounds like, we'll have to get to it pretty quick, when we talk about relationships and stuff, you said that this is super efficient because it's using the aggregation framework. So MongoDB has two ways to query stuff, right? There's the straight query style, and then it has something that's honestly harder to use but more flexible, called aggregations. And so your library supports aggregation queries as well, right?

54:29 Yeah. And also, as with the updates before, it supports find queries together with aggregations. For example, there are some presets of aggregations, average here, and you can use this average together with find queries and you will see the result. And also, for sure, you can pass a list pipeline, in MongoDB terms, you can pass a pipeline of your aggregation steps there, and it will work.

54:55 Yeah. It's not super easy to write if you haven't done it before, but yes, it will work.

55:00 Yeah.

55:01 And also, the thing is, with aggregations you have to set up what the schema of the results would be. Because with find queries everybody knows it would be the same schema as the original document, but with aggregations it can be any schema of results.

55:16 Right. Because the whole point of aggregation and other people might know something similar with Map reduce is I want to take, say, a collection of sales and I want to get a result of show me the sales by country and the total sales for that country. Right. So you're not going to get a list of sales back. You're going to get a list of things that has a country and a total sales, right?

55:39 Yeah, for sure. It's optional. And you can not pass an output model, and in that case it will return dictionaries. But it's not that fancy, so it would be better to pass one.

55:50 Yeah.
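
A sketch of an aggregation with a projection model for the result shape, assuming the Product model from earlier; the pipeline and field names are illustrative:

```python
# Sketch: aggregate with a custom result schema via projection_model.
from pydantic import BaseModel, Field


class TotalByCategory(BaseModel):
    category: str = Field(None, alias="_id")  # the $group key arrives as _id
    total: float


async def totals():
    return await Product.find(Product.price > 0).aggregate(
        [{"$group": {"_id": "$category.name", "total": {"$sum": "$price"}}}],
        projection_model=TotalByCategory,
    ).to_list()
```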

55:50 This is super cool. I love this projection model idea. Mario in the audience says, speaking of passing the documents to Beanie's init, I created a model loader as a utility function that pulls dot-separated paths and then passes them to document models. Works really well. Great.

56:10 There you go.

56:13 Right on. So let's talk about relationships, because I started out talking about how you don't use the R, you use the D, because you model documents, not relationships. And yet Beanie supports relationships.

56:26 I'm super excited about this.

56:28 Yeah. Tell us about this.

56:29 Yeah. So it took around three months to come up how to do Relations.

56:35 MongoDB doesn't support relations, but relations are a very popular feature in ORMs and ODMs, and I had to implement it finally. And I did. For now, it supports a limited version of relations: only top-level fields are supported, and only two kinds of relations, a direct relation and a list of relations.

57:02 Right. One to one or a one to many?

57:05 I guess one-to-many, yes. And so the syntax is Pythonic, I'd say. It is a generic class, Link, and inside of square brackets you pass your document.

57:19 Right. So maybe, where you would normally say an Optional[int], here you would say, like, Link[int], and that might not make sense exactly, but that type of thing would be the relationship, right? It's like the same syntax as Optional, basically.

57:31 Yeah. It's a little bit tricky with the black magic under the hood.

57:37 As long as I don't have to know about it. Thanks for creating the black magic. So you could say, here your model is, there's a Door and a House, and then the door is of type Link of Door. And then you have another one, you have windows, where the house has many windows, and you would say the windows is a List of Link of Window, which is a little bit intense on the nesting there, but it's not bad, right? It's a list of relationships. Yeah.

58:04 Yeah. And for sure it's possible, and I think later it will be implemented, I will shorten this List of Link to something else.

58:15 Links? Don't do that.

58:18 Although kind of awesome.

58:21 I think it would be less discoverable.

58:25 Yeah. And it works. You can insert data inside of this linked documents to linked collections and you can fetch data from linked collections. Yeah.

58:37 And you can even have it cascade things. So for example, you could create a house object and say windows is this list of window objects, and then you would say house save. And if you pass the link rule that says write, then it cascades the changes; it will also go and insert all those windows and associate them, right?

58:58 Yeah, correct. And I didn't use the cascade term, because it's not a SQL database and it's a little bit different; in relational databases it would be a little different.
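
A sketch of the house, door, and window modeling and the write rule being described; the class names follow the conversation, and the exact WriteRules import may depend on your Beanie version:

```python
# Sketch of linked documents and cascading writes, assuming init_beanie has
# registered these models.
from typing import List

from beanie import Document, Link, WriteRules


class Door(Document):
    height: float = 2.0


class Window(Document):
    width: float = 1.0


class House(Document):
    name: str
    door: Link[Door]             # direct (one-to-one style) link
    windows: List[Link[Window]]  # list of links (one-to-many style)


async def build():
    house = House(name="Cottage", door=Door(), windows=[Window(), Window()])
    # With the write rule set to WRITE, the linked door and windows are
    # inserted too, and stored as references from the house document.
    await house.insert(link_rule=WriteRules.WRITE)
    return house
```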

59:10 How does this look in the database itself? So if I go to MongoDB and I pull up the house, what is in its windows? Is that a list of the IDs of the window objects or what is that?

59:23 In MongoDB, there is a special data type called DBRef.

59:28 Okay.

59:28 It's, under the hood, a binary data type, but it's a combination of the ID of the document, the name of the collection, and the name of the database. So it's a pointer.

59:40 Oh, interesting. Okay.

59:43 In the document, what I end up with is a list of those things.

59:48 And you will see a few collections like here, house, window and door, three collections with separate objects.

59:57 Okay, cool. You can also tell it that you don't want to propagate those changes as you save the house, which is interesting. Let's talk about prefetch. So I told you, when I see those websites that are just dragging super slow, I go through my thing. Did they forget the index?

01:00:16 Are they doing some terrible seven-way join, probably without indexes? Or the third thing: is it an N+1 problem, where they get one thing but then they've got to go back and back and back because they're touching this related field? You could potentially run into that problem here as well. So you have this prefetch idea, which is kind of like a joined load or something like that, right?

01:00:37 Yeah. It uses a $lookup aggregation, in MongoDB terms. It's not just a find query, but an aggregation. And it avoids this N+1 problem, as you said.

01:00:49 Right. So you just say fetch_links=True, and that will just go get the doors, the windows, everything in your house example, right?

01:00:57 Yeah. And I like the speed of this. It's much faster than doing it one by one, especially for a list of objects.
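Here is a minimal sketch of that pre-fetching query, reusing the House model above inside an async context; the filter value is just illustrative.

```python
# fetch_links=True resolves the linked Door and Window documents in the same
# round trip, via a $lookup aggregation under the hood.
house = await House.find_one(House.name == "Seaside", fetch_links=True)

print(house.door.height)
for window in house.windows:
    print(window.width, window.length)
```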

01:01:05 Yeah, absolutely. So you also have the ability to say fetch all links retroactively, if you go, I should have done this join, but I didn't. That might sound silly, like, why not just always do the join? Well, for one, it's probably slower than not doing it, I would guess.

01:01:23 And two, this happens to me all the time. Like, for example, on the courses website, I want to be able to get the courses, but the courses have chapter information and other stuff inside of them, and then those have links effectively over to say all the details about each chapter, like the lectures and videos and all that on, say, the page that lists the courses, I do not want those things. But on the course details page where it shows you, like, here's all the stuff in the course and how long it is, I definitely do want those things. So in my data access layer, I have a thing that says, should you get all the data or just the top level data, basically. And this would be exactly the code you'd write. Like, well, if you want all the data, you say fetch all links on it, right?

01:02:09 Yeah, correct.
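The retroactive version looks roughly like this; fetch_all_links and fetch_link are the methods I believe Beanie exposes for this, so treat the exact names as assumptions and check the docs for your version.

```python
# Fetch without resolving links first (the lightweight "course list" case)...
house = await House.find_one(House.name == "Seaside")

# ...then, when the detail page actually needs the related data:
await house.fetch_all_links()

# Or resolve just a single linked field:
await house.fetch_link(House.door)
```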

01:02:11 Is there a way to do that on a set like this is on one record? Is there a way to say I got 20 houses back fetch all of their links, or do I have to do it? Is that 20 calls?

01:02:22 So it would be 20 calls, unfortunately.

01:02:24 That's okay. I think in my example it's also 20 calls, or however many; the setup is exactly the same.

01:02:32 But I will improve it, I hope.

01:02:34 Yeah. This is worth pointing out. This is a brand new feature. Right. You announced this as one of the new features just two days ago on Monday.

01:02:43 Yeah.

01:02:43 Yeah. And for the people listening, we're recording on Wednesday morning. So, yeah, this is like your first pass. But I really like how this works with the relationships and the query and whatnot.

01:02:55 It would be nice, I think, if you could have some way to globally configure it, to say: in general, if I call save, the rule for writing the relationships is to do nothing, or to always write them, or something, and then only have to override it occasionally.

01:03:10 Yeah. Sounds like a default link rule.

01:03:12 Yeah, exactly. That'd be pretty cool. So let's see a few other things we could talk about. We talked a little bit about the event based actions, but you want to just kind of talk about them directly because it's like a quick how do I add my default value? What's the overall picture with these event actions?

01:03:31 Yeah, somehow a lot of people wanted this.

01:03:39 I didn't know about this pattern before I implemented it. Something like this is implemented in the Active Record pattern of Ruby on Rails.

01:03:48 So finally I was inspired by this. I like this one. Inspired by, not stolen, but inspired by. For now it supports only four types of events: Insert, Replace, SaveChanges, and ValidateOnSave. For each one it creates two events, one before it is called and one after it is called, and based on these events, the actions already registered on the document will be called. It also supports synchronous and asynchronous methods for this, and you can do a lot of stuff with it.

01:04:30 Yeah, that's cool. You can put the decorator on just an async version or a non-async version, and Beanie is able to just call it correctly, right? Yeah, that's cool. I'm guessing if you're not doing any awaits, it would be better if it was not asynchronous. But it doesn't matter that much, right?
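A hedged sketch of those event-based actions; the decorator and event names follow the four events just mentioned, though exact imports can differ between Beanie versions.

```python
from datetime import datetime
from typing import Optional

from beanie import Document, Insert, Replace, after_event, before_event


class Article(Document):
    title: str
    created_at: Optional[datetime] = None

    @before_event(Insert)
    def set_created_at(self):
        # Synchronous action, runs right before the document is inserted.
        self.created_at = datetime.utcnow()

    @after_event(Insert, Replace)
    async def notify(self):
        # Asynchronous action, awaited by Beanie after the write completes.
        print(f"{self.title} was written to the database")
```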

01:04:46 It's about one and a half times as fast if it's not asynchronous. Yeah.

01:04:51 It's critical. Yes.

01:04:52 If it's doing something where it's waiting on something else, then maybe it should be. Okay. Another feature that just came out is cache.

01:04:59 Yeah, cache.

01:05:01 It's also an interesting feature. Yeah, it's a cache. I don't know if I have to explain what a cache is: it's when you save data somewhere locally and use a copy of the data for some time.

01:05:16 Yeah. What you're talking about is not using MongoDB as a caching back end, but caching the queries that would run through Beanie to not hit the database again if it knows the answer.

01:05:27 Yeah. And somehow it's a really important feature, even for my project. For example, if you have to validate stuff with a user, and you already asked for a user in this application with this ID, but you don't know the place where you did it, and you don't want to pass this object through the whole pipeline because probably it will not be used at the end of the pipeline. But the user is already cached in Beanie, and if you just ask again at the end of the pipeline about the same user data, it will be there. And bigger find queries work the same way.

01:06:05 Yes, the more complicated the query, the better. This makes a lot of sense, I think, for data that doesn't change much. Like if you've got a bookstore, you might have categories and books. And maybe the books change often, the reviews of the books change often. The books that are in different categories change. But the categories themselves very rarely change. Right. So that could be something like the category query could just be like, you know what this is cached.

01:06:29 Yeah, definitely. So for now, it supports only a local cache, basically Python dictionaries. But I plan to add other cache backends, like Redis or something.

01:06:41 Yeah, that's cool. Or possibly, it might even make sense to use MongoDB itself as a cache.

01:06:47 Yeah. Because the local one is in memory.

01:06:50 It would be weird to kind of store the same thing back into it. But at the same time, if you've got a complicated query, what you're storing is these are the three things that came back from running that query against possibly hundreds of millions of records. Right. So in that case, it might make sense to just put it back in Mongo. So it's just a straight table scan read.

01:07:09 Something like a Lambda architecture inside of the single database.

01:07:12 Yeah. So I wanted to ask you, your example says, okay, what we're going to do is say Sample.find(Sample.num > 10).to_list(). And then if you call it again with the caching on, then you get the same thing.

01:07:26 It looks at the actual query, right? So if I said num greater than eleven, that would be a separate result in a separate cache entry, right?

01:07:34 Yeah, definitely. Cool.

01:07:36 The other thing I guess that's worth noting is you can set an expiration date on the cache.

01:07:41 I want this to live for ten minutes or whatever to get the answer back.
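A hedged sketch of what the caching configuration looks like, assuming the Settings attribute names I recall from the docs (use_cache, cache_expiration_time, cache_capacity) and an async context; they may differ slightly by version.

```python
from datetime import timedelta

from beanie import Document


class Sample(Document):
    num: int

    class Settings:
        use_cache = True
        cache_expiration_time = timedelta(minutes=10)  # results live for ten minutes
        cache_capacity = 100                           # local, in-process cache, bounded size


# The cache is keyed by the query itself, so these two are cached separately:
over_ten = await Sample.find(Sample.num > 10).to_list()
over_eleven = await Sample.find(Sample.num > 11).to_list()
```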

01:07:47 Yeah.

01:07:47 Cool. All right, let's talk about revisions. And then I'm going to propose one more idea that I think I could build out of revisions and events. So what are revisions?

01:07:59 So it's not my idea either; again, one user asked me for this. And I really like the users of Beanie, because I don't have that many ideas myself.

01:08:11 What is this?

01:08:12 Sometimes you have to protect a document inside the database from changes. Sometimes you have an old version of the data in your back end and you do some updates, and if you update your document with this old data, you will lose data updated by another mechanism.

01:08:34 Let me give people some examples, because I think understanding the context is really important.

01:08:39 It could even be the same function, basically, right? I could have a function that does complicated things. I could say, get me my user object for the current user who wants to make some changes. I could do a bunch of work, call a function and pass, say, the user ID over. Maybe that gets the user back, makes some changes, saves it to the database. And then I go on to the end, I make some more changes to my in-memory version, not realizing what happened, and then call save, and it overwrites whatever that intermediate function wrote; whatever was there is gone now. You know what I mean? So you would want to know: the thing that I got back, if I'm going to replace it in the database, is it still the same thing, or has somebody somewhere behind the scenes changed it?

01:09:25 Often I think people imagine something very complicated, where some other process did some other thing, but it could just be some other part of your code that you didn't realize called save after a query.

01:09:35 Yeah. And for this case, I'm using a revision ID, a special token which is generated each time data is saved into the database. And when you save again, it will check if this ID is the same or if the document has already been updated. If it has been updated, it raises an error, like, you have old data in memory. But if it's the same, it will allow you to save.
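Roughly, the pattern looks like this in an async context; the use_revision flag and the RevisionIdWasChanged exception are my reading of the Beanie docs, so treat the exact names as assumptions.

```python
from beanie import Document
from beanie.exceptions import RevisionIdWasChanged


class Account(Document):
    owner: str
    balance: int = 0

    class Settings:
        use_revision = True  # store and check a revision token on every save


account = await Account.find_one(Account.owner == "michael")
# ...meanwhile, some other code path fetches and saves the same document...
account.balance += 100
try:
    await account.save()
except RevisionIdWasChanged:
    # Our in-memory copy is stale: re-fetch and retry instead of clobbering it.
    ...
```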

01:09:59 That's really cool. I think it's a great feature because the alternative of this pattern is to do a blocking transaction. Right. And that's also potentially possible. I think MongoDB does have transactions now, but I haven't used them. I don't really have a use for that.

01:10:17 But the alternative model and databases is to say we're going to do a transaction that blocks and anyone else who tries to do a database thing whatsoever, they just wait until we're done. And that way there's no chance of them seeing it in this intermediate state. A lot of the scalable systems don't end up doing that even in relational databases, because this blocking model can really kill the concurrency.

01:10:42 So they end up doing optimistic concurrency with these types of revisions anyway.

01:10:48 So I think it's a really cool pattern. I love it.

01:10:50 Yeah, it is great.

01:10:53 It's not my idea, but it's great.

01:10:54 It's really good. It's also, I guess, worth discussing set and increment and those types of things. So if I say I want to add a category to a product, I could do, like, add to set on that thing and just put this thing in its category list.

01:11:13 Will that also increment the revision?

01:11:15 Only set, I think. Yeah. Because if you use the internal methods of MongoDB like that, it will not understand that it needs to update the other field.
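For reference, those in-place updates look something like this; I'm assuming the operator classes live in beanie.operators and that documents expose an update() method, so check the docs for your version. As just noted, these server-side operators bypass the revision bump.

```python
from typing import List

from beanie import Document
from beanie.operators import AddToSet, Inc


class Product(Document):
    name: str
    price: float = 0.0
    categories: List[str] = []


product = await Product.find_one(Product.name == "Dark Chocolate")

# Push a value into the list field on the server, without replacing the document.
await product.update(AddToSet({Product.categories: "Gifts"}))

# Increment a numeric field atomically.
await product.update(Inc({Product.price: 1}))
```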

01:11:25 Got it. Okay. Interesting. But yeah, that's a good feature. I like it. This is a big release.

01:11:30 It's a huge one.

01:11:32 Cool. Well, we've been talking for a long time.

01:11:35 As you can tell, I'm very excited about it.

01:11:38 What's coming next?

01:11:40 So next, I have a big plan. I really like Pydantic. I'm a huge fan of Pydantic, you can see it.

01:11:48 But for some cases, Pydantic is too heavy. And I don't know yet how I will implement it, but I want to add support for native Python dataclasses, either here or in a separate, smaller project, like Beanie dataclasses. I don't know yet. But anyway, I have planned to add dataclasses without Pydantic: just the fancy parsing stuff without all that great validation stuff.

01:12:18 So on one end you're just getting dictionaries back, good luck. On the other end you're getting Pydantic. And somewhere in the middle, you're getting classes back, but they're not necessarily as precise as, say, Pydantic.

01:12:33 Yeah, there are cases, and people hit these cases, where you have a lot of huge JSON as a document, and when you parse it on fetching, it takes a lot of time.

01:12:48 Yeah.

01:12:49 So it takes a lot of time just for parsing and then for encoding back to a dictionary to store it.

01:12:56 Yeah.

01:12:57 It's not Pydantic's fault, definitely. Because again, it's great.

01:13:01 But I need to avoid this step somehow. And probably I will use data classes for this.

01:13:11 Sure. They look very similar. Another scenario where I find that Pydantic is not a good fit is where I might be getting bad data, but I don't want it just to be an exception. I want to be able to get all the bad data and say, here's the three errors that you made passing me this data, and I'm not going to accept it. But here's what you gave me.

01:13:32 If you're doing, like, form exchange, right, like from an HTML form, what you need to do is put the old value back in and say, that value right there, that's wrong.

01:13:42 But with Pydantic, if you get the value from the form and you try to read it, it just goes, no, it's wrong. And you're like, wait, but I need to give it back to them. Just don't run away. Come back. Where'd you go?

01:13:54 There are situations like that where you need to keep that exchange going, but you've just got to do the validation yourself. Anyway, there are certainly times where Pydantic, as cool as it is, is not the right fit for the situation.

01:14:08 Yeah, I totally agree. Nice.

01:14:12 Thanks. All right, well, this is a very cool project. As you hinted at earlier, I've seen it from the beginning, at least from when you open sourced it, and it's really come a long way. It's super compelling. It looks like something that I could possibly use on my next project. Like many people out there, I'm sure, I'm constantly resisting the urge to go, you know, I should rewrite that. I should rewrite that in FastAPI, or I should rewrite that in this.

01:14:42 I should rewrite my MongoEngine stuff with Beanie. But maybe one day I'll break down and do it. It'd be fun. Yeah.

01:14:49 Thank you.

01:14:51 Yeah. Very cool. So nice work on this project. You're looking for contributors and PRs? Would you be happy to have people make contributions?

01:15:00 I've made a bunch of issues now; a lot of people have found things that don't fit their different use cases. So yes, it would be great to have other contributors here. I already have 15 contributors. Nice.

01:15:18 Yeah, I saw there's quite a few people in the contributors sidebar there. Yeah, that's awesome.

01:15:25 Yeah.

01:15:26 But I need more, especially for documentation, because I'm really much better in Python than in English, and the documentation is my big pain point.

01:15:40 Okay, cool. Well, definitely neat project. So thank you for building it. Now before you get out of here, you have to answer the final two questions.

01:15:49 If you're going to write some Python code, what editor are you using these days? What did you use to create Beanie with, for example?

01:15:55 I'm using PyCharm. JetBrains also gave me a free license for Beanie through their open source support program, I think.

01:16:04 Fantastic.

01:16:05 It's a really great tool.

01:16:07 I'm expecting the new one. They have a new IDE, I don't remember the name... Fleet.

01:16:16 Yeah, I was about to mention Fleet. Fleet is interesting. Fleet is like JetBrains' response to VS Code.

01:16:24 I'm pretty excited about it. I really love PyCharm.

01:16:29 Nobody's going to be surprised that I say that. But if I've got just one file, I'm going to open that in VS Code, because all the JetBrains IDE tools expect a project and they're going to create all these things. I'm like, I just want to look at the files. Just the files, please. Not too much. And Fleet is kind of like that, where you can later turn on some of the IDE features, right? It looks pretty cool. Are you going to try it out?

01:16:53 Not yet. I asked for access to Fleet, but... so JetBrains, if you hear me, please.

01:16:58 Yes, exactly. JetBrains. I'm already on the early access list too, waiting on emails, maybe. I haven't checked this morning. Maybe it's there.

01:17:07 Fantastic. Yeah.

01:17:09 All right. So PyCharm, maybe Fleet in the future. Potentially cool. And then notable PyPI package. I mean, we've talked about a bunch of libraries already.

01:17:18 Yeah. So I really like one.

01:17:21 It's not related to Pydantic or FastAPI, but it's a great package called Yarl, Y-A-R-L.

01:17:31 It's like a path library, but for URLs, and it's great.

01:17:36 You can combine strings with this URL stuff together and you can parse it.

01:17:42 That's cool. Yeah. So you can pass it a URL as a string, just like whatever you would expect. But you can say give me the scheme which is like http, https, the host, the path, the query string.

01:17:54 Yeah.

01:17:55 I really like how they use the divide operator. You can probably see it at the bottom of your page now, like URL / "foo" / "bar".

01:18:04 Oh, interesting.

01:18:05 It's pathlib style.

01:18:07 Like pathlib. Yeah, it's like pathlib for URLs, and it's super great.
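A quick sketch of yarl in action; the URL and query values here are just illustrative.

```python
from yarl import URL

# Build a URL with the "/" path operator and "%" for query parameters.
url = URL("https://talkpython.fm") / "episodes" / "show" % {"page": "2"}

print(url)               # https://talkpython.fm/episodes/show?page=2
print(url.scheme)        # https
print(url.host)          # talkpython.fm
print(url.path)          # /episodes/show
print(url.query_string)  # page=2
```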

01:18:12 This is totally new to me. Awesome. Good recommendation.

01:18:16 Yeah. I use it in each project now.

01:18:19 I don't know how to live without this.

01:18:21 All right. Well, I'm going to check it out for sure. One quick final follow up from the audience here. Mario says thank you for this project. I'm about to launch my FastAPI Beanie blog soon. Couldn't have done it without it.

01:18:32 Thank you.

01:18:33 Yeah. And he's already on it. He says, pathlib for URLs. Indeed, it is pathlib for URLs.

01:18:40 Cool.

01:18:40 That's a great one. All right. Final call to action. People want to get started with Beanie. What do you say?

01:18:45 I would say, have fun.

01:18:49 Awesome. I would recommend that people go and check out if you go to the documentation, there's a tutorial that walks you through this pretty well on the left. It just starts by defining a document and then initialization and so on.

01:19:03 Yes. We try to make it as simple to understand as possible.

01:19:08 Fantastic. All right. Roman, thank you for being here.

01:19:11 Thank you very much for having me.

01:19:14 This has been another episode of Talk Python to me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show.

01:19:22 Take some stress out of your life. Get notified immediately about errors and performance issues in your web or mobile applications with Sentry. Just visit 'talkpython.fm/sentry' and get started for free, and be sure to use the promo code 'talkpython', all one word. When you level up your Python, we have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at 'training.talkpython.fm'. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

01:20:11 We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at 'talkpython.fm/youtube'. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
