
#80: TinyDB: A tiny document db written in Python Transcript

Recorded on Thursday, Oct 13, 2016.

00:00 NoSQL and document databases like MongoDB have made building fast, scalable software that is easy to evolve and maintain much easier for a broad class of applications. Embeddable, file-based databases like SQLite have made shipping an application that requires a database a no-brainer. The database just runs in process, so there's no setup or maintenance. Yet when you try to intersect these two excellent capabilities, you'll find the options are very limited. There just aren't many embeddable document databases. If you're a Python developer and you want a native Python solution, the options are much slimmer still. That's why I'm excited to introduce you to Markus Siemens and TinyDB. TinyDB is a 100% pure Python, embeddable, pip-installable document DB for Python. This is Talk Python To Me, Episode 80, recorded October 13, 2016.

00:53 in many senses of the word because I make these applications vows and use these verbs to make this music I constructed to think when I'm coding another software design, in both cases, it's about design patterns, anyone can get the job done. It's the execution that matters.

01:14 Welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython. This episode is brought to you by Hired. Thank them for supporting the show on Twitter, where they're @hired_hq. Thanks for listening to my podcast. I've heard from so many of you that the insight into the industry you gain each week is very important to you. I want to take this moment to tell you about my online courses, for those of you who want to go deeper and convert your enthusiasm into working knowledge. At Talk Python, we currently have three courses available: Python Jumpstart by Building 10 Apps, for those of you who are just getting into Python; Write Pythonic Code Like a Seasoned Developer, covering over 50 hard-won coding tips; and Python for Entrepreneurs, available for early access already, covering web development, design, and everything else you need for an online Python-based business. See these courses and more at training.talkpython.fm to hone your Python skills, no matter your experience level. Now, I hope you enjoy this week's interview. Markus,

02:29 welcome to Talk Python. Thanks for having me.

02:31 Yeah, it's really great. I was talking to Austin, I think it was show 63 possibly, about mutation testing. And he said, "I'm doing this really cool work with mutation testing. Oh, and for the database, I'm using this cool embedded document-oriented database called TinyDB." And I was like, wow, an embedded document database, how awesome, and it's 100% in Python, so even cooler. Since then I've wanted to have you on the show to talk about it. So welcome. Yeah, thank you. Yeah, it's going to be fun to talk about this database. It's tiny, from what I can tell. Yeah. But before we get into that, of course, let's talk about your story. How did you get into programming and Python?

03:12 When I was about 10 years old, I kind of got into programming because my dad bought me a book called C++ for Kids. I was 10 years old, so I didn't really understand much. I just copied the examples and made sure they compiled, and that was basically it. Later I had the books Python for Kids and Java for Kids, and when I was around 14 or 15, I started really doing some programming in Java on my own. So that's when I actually got started with programming.

03:49 Cool. And what were some of the first apps you built with Java? What kind of work was that?

03:53 My first application was a hangman implementation in Java. I had some problems with how I structured the code, because I had to make everything public static in Java so it compiled. When something didn't work, I asked my dad for help, and he was horrified when he saw the code, because everything was public static. He helped me sort out how to pass an object around and so on. So yeah, I got some help from him, and that's how I first got into programming with Java. A couple of years later, I applied to university to study electrical engineering, and before that, you had to do an eight-week internship. I did six of these weeks at a small company which manufactures servers and, I suppose, sells them with support contracts to enterprises. They asked me to create an application so they could configure the servers via the web, and they asked me to do that project in Python. So that's how I got started with Python and wrote my first serious Python code. That's cool. And what was your impression? Were

05:09 you like, "Oh yeah, Python's awesome"? Or were you like, "Whoa, where did public static void go?"

05:16 Yeah, I think it was way easier than I expected. I didn't really have any experience with Python beforehand, and after one or two weeks on that project I was already quite confident that I could do the whole project without big problems.

05:39 Yeah, that's great. Very nice. I find a number of people get into Python that way. They're working on some project, and their coworkers or boss or somebody will come along and say, "Hey, can you do this thing in Python?" They're like, "Python? Well, I don't know anything about it, but let me give it a try." And it turns out to be really easy. So that's really great.

05:58 Yeah, it is definitely very easy to start. And I would even say it's one of the best programming languages to learn programming. Yeah,

06:08 I certainly think it is as well. In the United States, they used to teach Java as the primary language for computer science 101, the first course you take, and in recent years it has switched to be primarily Python. So that's great.

06:23 At my university, people have to start with C and C++, and most people have quite a lot of problems with that. I guess it would be an easier route to go with Python first. But I don't know, it's their choice. Yeah, of course, I

06:39 think there's something to understanding a C-oriented language, where you actually work with pointers and directly with the memory and stuff, but I'm not sure you should start there. You know, not as your first exposure. Yeah. Just for comparison, my first computer science 101 course in college was Lisp, so that was really different. All right, so that's how you got started. What do you do day to day?

07:06 University. I'm still studying, and next week my lectures start again. Besides that, I do some freelancing work, mainly web development. Nice. And is that web development in Python? No, it isn't. Right now I'm doing some front-end development, which is mainly JavaScript, and sometimes I wish it were Python, but I can't always choose.

07:29 That's right. Sometimes, if there's good work to be done, you've just got to go do it. All right, so the database that you created, called TinyDB, is a document-oriented database, and that falls into the realm of NoSQL databases. Just for everyone out there listening: I realize many of you know, but maybe some of you don't totally know the difference between, say, relational databases and NoSQL databases. Relational ones I think we're all pretty familiar with: there's a bunch of tables, there are foreign key relationships between them, we try to normalize our data, minimize duplication, and sort of set it up for querying from any particular angle, right? You have a wide range of queries you can write. But in the NoSQL world, there's a different set of trade-offs that those databases are making, right? The document databases don't have as flexible a set of queries you can write, but if you build the documents right, you can really optimize them for exactly the use case you're looking for. So your data is more hierarchical and more directly models how you might actually want to work with objects and relationships in your code rather than in storage. There are also other ones, like key-value stores and even graph databases, although in my mind those are not NoSQL databases. Yeah, that's a different debate for a different time. Yeah, it is. But yours, TinyDB, falls into this document database category. So why don't you tell us: what is TinyDB?

08:53 Yeah, TinyDB is basically a NoSQL database. Actually, I didn't plan to write a NoSQL database; I just planned to write something that is easily usable. I happened to write TinyDB for another project, which was in total, I think, about 300 lines. And I didn't really feel like setting up an SQLite database for it, because if I had to change anything about the data structure, I would have to write migrations and so on. So I just wrote a really simple database with a bunch of dictionary objects that contained the data, and then added a query language for it. And it happened to become quite popular.

09:47 Yeah, that's really cool. And it is quite popular. It's got over 1000 stars on GitHub, which is very cool. Yeah, you must be proud of that. That's great.

09:55 Yeah. What was interesting is that I didn't really do any marketing or promotion at all. I just put it on GitHub, uploaded some documentation, and left it as it was. After a couple of months, I got the first issues, and about a year later, I got the first pull requests without any involvement on my side, which was a really great example of how open source works.

10:22 Yeah, that's a great validation that people see what you've made and they actually want to contribute back. You're like, "Wait, what? Where did this person come from? They are working on my project, how great." Yeah, that's nice. You made a really interesting comment when you described why you didn't want to use MySQL or SQLite, whichever one you were referring to, and that's that you don't want to deal with migrations and all the headaches of evolving your schema. I think one of the really powerful things about document databases, which you don't really appreciate until you try it and have lived with an application over time, is the ability to easily evolve the schema, right? If you want to add a field, you add a field to either your dictionary or class in your application, and that just becomes part of your schema. It's usually much easier, I think, to maintain these applications that are based on document databases. That's my experience anyway. Yeah,

11:19 it was my experience, too, with the project I was working on originally, because I didn't really know what the data would look like. So I needed something that would allow me to change the data as I needed, or as I found out I needed another field, so it wouldn't be hard to edit.

11:37 Yeah, absolutely. And I think it's really cool that you have this as an embedded database. Because I would say probably the most popular choice for these types of databases would be MongoDB. But then you've got to deal with another server. And you know, configuring that thing and the connections and just you know, sometimes you just want a small embedded thing that's just part of your app.

11:59 Yeah. What I really like about SQLite is that it's embedded, so you don't need any server and you don't need any maintenance. So basically, TinyDB combines the best of SQLite, being embedded, and MongoDB, being a document database.

12:19 Yeah, that's really cool. So you said it's written 100% in Python. How many lines of code is that?

12:25 Yeah, it's about 1,200 lines of code, of which approximately 40% are documentation. And in addition to that, about 1,000 lines of test code. That's awesome.

12:39 Yeah, and you have 100% test coverage as well on your project, which is really nice. When you've got something that's actually managing your data, you want to be pretty sure that it's working right. Okay, so if it's 100% in Python, how do I get it into my application? Can I pip install it?

12:57 Yeah, definitely. Installing it should be very easy: just run pip install tinydb. Then you just have to open a Python REPL to get started, import tinydb, and basically you are ready to go.

13:14 Yeah, that's really nice. And so there's no server to start; if your app is running, the database is running, right? Yeah, yeah. Very cool. So you had some reasons why you should use TinyDB and some reasons why you might not. So what use cases are really good for considering TinyDB? I think it's

13:37 just like the project I was working on originally. I think it works best for small projects, where you don't really have, or don't really want to use, a bigger database. So small applications; I think some guy used it for a password manager or something. So really, projects where having a MySQL database or something like that would just add too much complexity.

14:04 Yeah, certainly, if you're going to deploy an application or library, you don't want to have the getting-started steps say, "Now you need to set up this server, and you need to run this script to build the schema, and then you need to set the connection string here, and then you can get going," right? So to me, it feels pretty similar, not an exact fit, but similar to when you might use SQLite.

14:31 Yeah, it's similar in that you can really easily get started, and you don't have to manage a server or something like that.

14:40 Sure. So because it is a pretty small database, it doesn't try to do everything, right? It's not like MongoDB rewritten in an embedded Python variation. It really does do less. So there are some things that it doesn't do, and maybe if you really depend upon them, you should pick something else. What are those cases, just so people can decide if it fits for them?

15:04 I think the main disadvantage of TinyDB shows up when you're using multiple threads. In that case, there is an extension for TinyDB you can use. If you have multiple processes, then you are really out of luck, because there is no type of locking for the file you're writing to, so you would probably run into issues with that. It might also turn out to be a performance bottleneck if you have hundreds of thousands of entries in your database and you search them, because TinyDB doesn't have any kind of index, so it has to scan the entire list every time you do a search. So you probably shouldn't do really involved data processing with it. Okay. Yeah.
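To make the point about missing indexes concrete, here is a minimal standard-library sketch. It is a toy, not TinyDB's actual code, and the names `documents`, `scan_search`, and `build_index` are made up for illustration. Without an index, every search has to test each stored dictionary; an index maps a value straight to the matching records.

```python
# Toy illustration only; this is not how TinyDB is implemented internally.
documents = [{"id": i, "city": "Berlin" if i % 1000 == 0 else "Other"}
             for i in range(100_000)]


def scan_search(docs, field, value):
    """Check every document: the cost grows linearly with the document count."""
    return [doc for doc in docs if doc.get(field) == value]


def build_index(docs, field):
    """Group documents by field value once, so later lookups skip the full scan."""
    index = {}
    for doc in docs:
        index.setdefault(doc.get(field), []).append(doc)
    return index


scanned = scan_search(documents, "city", "Berlin")   # touches all 100,000 documents
city_index = build_index(documents, "city")          # one-time cost, extra memory
indexed = city_index.get("Berlin", [])               # single dictionary lookup
```

Both approaches return the same records; the index trades memory and bookkeeping on every write for faster reads, which is the kind of complexity a deliberately small database can choose to leave out.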

15:51 Some of those are definitely serious, and some of them might not be; it depends on your use case, right? Now, I don't want to talk about it yet, we'll get to it, but there's a bunch of extensions, and I think there are some that come along and solve some of those performance problems, which is interesting to some degree. They also open the possibility for an enterprising listener out there to come and add something like multiprocess or multithread support, things like that, right? So we'll get to the extensibility.

16:24 There are a number of extensions now, and I hope it's quite easy to add a new one. So there are a couple of things that TinyDB doesn't support out of the box, but if you want to, you can write an extension, you can write your own storage mechanism, and so on. Yeah.

16:42 Yeah. Great. Yeah, we'll talk about that a little bit. Awesome. So when I think of competitors, you have a few listed, and we've talked about a couple of them. And "competitors", like we said, is not exactly the right word, because it's all open source, but, you know, choices you might choose from. On one hand, there's MongoDB. If you're doing document databases and you're doing really large data, production apps, maybe you want to scale out, maybe that is the right choice. But there are also some other pure Python databases that I had not heard of that I thought were pretty interesting. So do you want to tell us about the two that you listed on your site? What are their names? Before

17:21 starting with TinyDB, I first did some research on whether there already was some kind of embedded document database I could use that fit my use case. And there wasn't really. I found two projects which are somewhat similar, but not really. For example, there's a project called CodernityDB, which is also pure Python and NoSQL. But as far as I know, it's much more complex: you have the ability to use indexes, you have HTTP support. So it might be a great fit for some projects, but in my case, it was just too much code. I think it's about 7,000 lines of code, which would be seven times the size of my project. So

18:09 yeah, and I think that's without unit tests, so it's even bigger. Yeah, yeah. And the other one was buzhug.

18:17 Buzhug is also quite interesting, because it's optimized for speed. But the reason why I didn't use buzhug was that it didn't have the kind of query language I wanted to use. So while it is really fast, and it's also pure Python, it just didn't have the kind of API I imagined using. So I chose to write my own project. Also, it's about twice the size of TinyDB without tests.

18:53 Yeah, definitely. So CodernityDB, that one is a key-value store, so in some ways maybe not quite as capable as the document databases. It depends. They do have some additional indexes and queries you can write, but yeah, maybe not quite as flexible. And then buzhug is relational, so you have, again, all the trade-offs with managing the schema and evolving it and migrations. So yeah, to me, it feels like SQLite is kind of the biggest competitor to you guys.

19:24 Yeah.

19:24 Yeah. Cool. I don't necessarily want to talk about code exactly in an audio format, because that doesn't usually go so well. But just give us a sense of what the API is like. If I wanted to get started with TinyDB, create a database, and insert some records, what does that look like?

19:46 Yeah, so you start with importing tinydb. There are two main classes you will probably use, which are the TinyDB class and the Query class. So you basically create a new instance of TinyDB, and you already have your database; you don't need any further setup code or anything. Then you can just call db.insert, that is, take the instance and call the insert function, pass some Python dictionary, and you have already inserted your first object into the database.

20:23 Yeah, very nice. So basically you can insert, either directly or into a table, more or less Python dictionaries, right?

20:31 Yeah. And basically, you have all the basic functions: create, read, update, and delete. You can do some interesting searching, search for objects with specific parameters; you can do some quite involved searching with the API provided. I think, besides the database itself, the query language is the second biggest part, the second most code in the whole project. If you have a list, you can search for an item in that list with specified properties, and so on. So if you want to, you can really go crazy with searching. But still, the basic API is insert, delete, and then, of course, querying and update.

21:21 Yeah, nice. And to me, the query language looks a little bit like SQLAlchemy, in that you write it sort of in the Python language, and it's translated to the query that goes down to the system, right?

21:32 I actually found out about the SQLAlchemy project, I think, just after I wrote TinyDB. So it happened to have a very similar query language. Yeah, I didn't take it straight from there, but it is a very natural way of doing it. So

21:49 basically, the fact that you use the native Python sort of equality and things like that, and then it translates it over, is really nice. The exchange of data, if you just go against your API directly, is really in dictionaries, extended a little bit with your primary key being on there as a separate type of attribute and so on. But more or less it's dictionaries. Is there an object data mapper layer, kind of like Mongo has MongoEngine or MongoKit, where you work with higher-level stuff? In your example, you had a person. Can I create a person class, and maybe put rules or columns or structure on that, and then use that to query with?

22:32 Yeah, in a way, you can already use the query language of TinyDB as an ORM. You can create a new instance of the Query object and call it User, and then you can write something like User.name == the name you're looking for. And you already have a query that you can use to search the database. So you can do that, or use that as some kind of ad hoc modeling.
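How an expression like `User.name == "John"` can turn into something searchable comes down to Python's operator overloading. The following is a toy sketch of that general idea, not TinyDB's real Query class; the `Field` class and its methods are hypothetical names used only for illustration.

```python
import operator


class Field:
    """Toy stand-in for a query object; not TinyDB's real Query class."""

    def __init__(self, name=None):
        self._name = name

    def __getattr__(self, name):
        # Only called for attributes that do not exist, e.g. User.name
        return Field(name)

    def _test(self, op, value):
        name = self._name
        # The "query" is just a function that accepts or rejects a document.
        return lambda doc: name in doc and op(doc[name], value)

    def __eq__(self, value):
        return self._test(operator.eq, value)

    def __gt__(self, value):
        return self._test(operator.gt, value)


User = Field()
people = [{"name": "John", "age": 22}, {"name": "Jane", "age": 30}]

is_john = User.name == "John"   # builds a predicate instead of returning True/False
older = User.age > 25

johns = [p for p in people if is_john(p)]
older_names = [p["name"] for p in people if older(p)]
```

The comparison never runs against real data at the time it is written; it produces a reusable test that a search can apply to every stored dictionary.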

23:04 Okay. Yeah. Nice. And do you support Python 3, Python 2? What versions of Python can you use with it?

23:11 I still try to support as many versions of Python as possible. I'm not sure about Python 2.6, but upwards from Python 2.7, I think I support every version. Also, because TinyDB doesn't have any external dependencies, even on other Python packages, it was quite easy for me to write code which works in both Python 2 and 3. So if you're using Python 3 or Python 2, either way, you're good to go.

23:43 Yeah, that's really excellent. I'm very happy to see the Python 3 support in your project and in other projects. I had the BeeWare guys on last time, and they're like, "It's all Python 3." So that's great. The other thing: if it's pure Python, the very next question I had was, well, if it supports Python 3, what's the story with PyPy, the JIT-compiled version of the interpreter?

24:07 I haven't really used it with PyPy myself. But for every commit I make, I still run continuous integration, which tests all the Python 2 and 3 versions and also tests that it works with PyPy. Myself, I didn't really have a use case where I would need to use PyPy, so I don't know if it's faster or slower than you would expect. Okay. It definitely works.

24:34 It definitely works. Okay, that's really cool. And so maybe it's a lot faster, actually. We'll get to some stuff that talks a little bit about speedups you can do, and, you know, maybe that just takes PyPy out of the loop in terms of being necessarily important. But anytime you have something like this that's pretty central to how your application performs, and you can run it on PyPy, that's pretty interesting to see.

25:10 This portion of Talk Python To Me is brought to you by Hired. Hired is the platform for top Python developer jobs. Create your profile and instantly get access to 3,500 companies who will work to compete for you. Take it from one of Hired's users who recently got a job and said, "I had my first offer on Thursday after going live on Monday, and I ended up getting eight offers in total. I've worked with recruiters in the past, but they've always been pretty hit and miss. I tried LinkedIn, but I found Hired to be the best. I really liked knowing the salary up front, and privacy was also a huge seller for me." Sounds awesome, doesn't it? Well, wait until you hear about the signing bonus. Everyone who accepts a job from Hired gets a $1,000 signing bonus. And as Talk Python listeners, it gets way sweeter. Use the link hired.com/talkpythontome and Hired will double the signing bonus to $2,000. Opportunity's knocking. Visit hired.com/talkpythontome and answer the door.

26:06 Speaking of that, one thing that I saw that was cool, and this is something I learned today, I hadn't really paid any attention to it before: as you said, one of the speedups you can do is to install a package called ujson. Yeah. You can get TinyDB to run a lot faster, because, as you can imagine, it's all about JSON parsing, bidirectional JSON parsing, right? And so what's ujson, and why is it faster?

26:32 ujson is a really cool project. I think the reason why JSON parsing matters so much is that, due to the way TinyDB works, every time you interact with the database, you have to read the database file from the file system. So you really need a fast JSON parser. The JSON parser that comes with Python is really powerful; you can extend it in a few different ways. But if you really need performance, this kind of extensibility really only gets in the way, because the whole parsing is probably implemented in Python, and then you lose a lot of speed. So ujson is one project I found which also implements JSON parsing, but it's written all in C, so it's really fast. You do have a trade-off, in that it's not as extensible. But if you don't need to override some internals of the JSON parser, it's really good to use ujson to get more performance out of it.

27:50 Yeah, that's really cool. And you can just pip install that. And you said that TinyDB will look for its existence and prefer it if it's there. But if it's not there, it's fine. It'll just use, you know, the standard library json, right? Yeah.
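This "use it if it's there" behavior is commonly implemented with an optional import. The sketch below shows that general pattern; it is an assumption about the approach, not a copy of TinyDB's source, and the sample record is made up.

```python
try:
    import ujson as json  # optional C-accelerated JSON library, if installed
except ImportError:
    import json  # standard-library fallback, always available

record = {"name": "John", "tags": ["a", "b"]}

# The rest of the code uses the same dumps/loads calls either way.
encoded = json.dumps(record)
decoded = json.loads(encoded)
```

Because both modules expose `dumps` and `loads`, the calling code does not need to know which one was picked up.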

28:04 So if you install it, it works like a drop-in: you just install it, TinyDB auto-detects it, and you get the speedup instantly. Yeah,

28:16 nice. Let's talk about extensibility a little bit. There were a couple of things I asked about where the answer was, "Well, I'm not sure it's really in there, maybe." But you've written a pretty cool framework for extending TinyDB. Some of the things you can extend: you could write a custom storage engine, you can write custom middleware, custom table classes. And then there's a bunch of projects that people have written that you can plug into TinyDB to make it nicer, faster, etc. Right? Yeah.

28:50 Extensibility was something I had from the start. Actually, I started with having support for multiple storages, so you could swap out the JSON storage and use a memory-only storage. Then I decided to provide even more extensibility, so you can write a middleware, which modifies the way TinyDB works with any kind of storage. And there are also quite a few extensions out there you can use to modify how TinyDB works in total.

29:24 So there's a bunch of extensions, and one of them is tinyrecord. What's the story with tinyrecord?

29:30 tinyrecord was basically written because TinyDB by itself didn't have any real support for multithreading. With tinyrecord, you basically have some sort of transactions, and it also uses locking, so you can use TinyDB from multiple threads. You have an object where you can add a bunch of modifications you want to apply. So basically, it's transaction support for TinyDB, which is quite handy if you need multithreading. It depends on your use case, of course.

30:05 Yeah, of course. But if you need it, that is really cool. These are all separate projects on GitHub, by the way. Another one that somebody wrote is called TinyMongo.

30:16 Yeah, TinyMongo.

30:17 So does that match the MongoDB API or something, but then actually store it through TinyDB? Or what's the story with that?

30:27 Yeah, TinyMongo, as far as I know, allows you to use an API which matches the MongoDB API but instead uses TinyDB. I think one day I just got an issue on GitHub where someone said, "Hey, I have this extension for TinyDB, so it can replace MongoDB." And I included it in the list of extensions.

30:53 Yeah, that's really cool. So it does look like it more or less just replaces the PyMongo API. You've got insert and find one, and it even has the dollar operators, like $set and so on. Yeah. So one of the things that comes to mind for me is this: TinyDB stores its files, at least by default, in a simple JSON document, and there's no setup for it. So if you were doing unit testing, and you needed something better than just simple mocking and stubbing out your test data against MongoDB, it might be cool to switch to this and use it as a way to have a slightly better test back end for a real system or something like that. If you don't want to

31:42 start up a separate server just for testing, that might be a really cool solution for that problem.

31:48 Yeah, nice. So then there's a couple of others. One is tinydb-serialization. I noticed when I was playing around with TinyDB that I, for example, put a datetime in as part of my record. And of course, people may or may not know, you can't just tell the json module, "Go save this thing," if it has a raw datetime in it. It just doesn't support it. And that's what happened when I tried to save my thing in TinyDB. So does this address that kind of problem and other stuff?

32:18 Yeah, I even think that storing datetime objects was the reason why the extension came into existence. Because for TinyDB to store other objects on top of the json module, to really support it, you have to modify it yourself. So the extension allows you to register your own serialization code, so you can store all kinds of objects that you want to store, and it allows you to specify how exactly they will be stored.
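The underlying limitation can be reproduced with nothing but the standard library. The sketch below is not the tinydb-serialization API; it only shows why a raw datetime fails in the `json` module and how custom serialization code can decide what gets stored. The function name `encode_extra` and the sample record are made up.

```python
import json
from datetime import datetime

record = {"event": "signup", "when": datetime(2016, 10, 13, 9, 30)}

# The standard json module rejects datetime objects out of the box.
try:
    json.dumps(record)
    failed = False
except TypeError:
    failed = True


def encode_extra(obj):
    """Hypothetical hook: store datetimes as ISO 8601 strings."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


# json.dumps calls the hook for every object it cannot handle itself.
encoded = json.dumps(record, default=encode_extra)

# Reading the value back requires the matching decoding step.
decoded = json.loads(encoded)
restored = datetime.strptime(decoded["when"], "%Y-%m-%dT%H:%M:%S")
```

The same two-way idea, one function to encode a type and one to decode it, is what registering your own serialization code amounts to.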

32:54 I see. So I can basically tell it, "Hey, if you see a datetime, serialize it like this. And if you see some other type that you don't necessarily know about, store it like that." Okay, yeah. Very nice. What about tinydb-smartcache?

33:09 is also a really interesting project. I think it was created after a pull request. Actually, yeah, there was an addition to TinyDB because, as I said before, searching can get really slow if you have a lot of objects. So what TinyDB does is that, as long as you don't modify the database, it stores the results of your search queries, so it doesn't have to redo the work if you didn't change anything at all. That's really handy in some cases. So if you do a query, and you give it some parameters, and it comes back with 20 records, the smart cache will say, "If I see this query again, just give them the 20 records." More or less. Actually, TinyDB itself will already do that caching for you. What smart cache does is take it one step further: if you run a search query, it stores the result, and then if you do some updates on the database, it doesn't throw away the search results, but instead updates the cache with the new result. So if you have a query which matches some elements, and you insert a new element which also matches that query, it will go directly into the cache, so it doesn't have to redo the whole search, which TinyDB itself would have to do. But of course, it's also a trade-off, because it uses more memory, since it has to store more results, and there's some overhead on every insert and update, because it has to check every query to see if it matches and update the cache in place. But in some cases it can be a really handy thing to use. If you have a lot of updates and deletes in your code, and you don't want to reprocess all the entries in your database for each search, that might be an extension you want to use.
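The difference between the two caching strategies can be sketched with a toy store. This is only an illustration of the idea, not the code of TinyDB or tinydb-smartcache; the `CachedStore` class, its `scans` counter, and the sample data are all hypothetical.

```python
class CachedStore:
    """Toy document store with a query cache; not TinyDB's real implementation."""

    def __init__(self, smart=False):
        self.docs = []
        self.cache = {}      # query name -> (predicate, cached results)
        self.smart = smart
        self.scans = 0       # counts full scans, to make the saving visible

    def search(self, name, predicate):
        if name in self.cache:
            return self.cache[name][1]
        self.scans += 1
        results = [doc for doc in self.docs if predicate(doc)]
        self.cache[name] = (predicate, results)
        return results

    def insert(self, doc):
        self.docs.append(doc)
        if self.smart:
            # Smart variant: test the new document against every cached query.
            for predicate, results in self.cache.values():
                if predicate(doc):
                    results.append(doc)
        else:
            # Simple variant: any write throws the cached results away.
            self.cache.clear()


def adults(doc):
    return doc["age"] >= 18


simple, smart = CachedStore(), CachedStore(smart=True)
for store in (simple, smart):
    store.insert({"name": "John", "age": 22})
    store.search("adults", adults)
    store.insert({"name": "Jane", "age": 30})
    store.search("adults", adults)
```

Both stores return the same results, but the simple one scans twice while the smart one scans once and pays instead with a check on every insert.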

35:16 Okay, yeah, that sounds really cool. Very nice. So let's also talk a little bit about the extensibility. You said there are three basic places where we could extend TinyDB, and one of them is writing a custom storage engine. So by default, your data is more or less stored in a single JSON file. And that's more or less the connection string when you create the database, right? You say, here's the file.

35:45 Yeah, you just pass the file name, and you use it.

35:49 Yeah. Nice. Again, very SQLite-like. But then you said you can create alternate storage engines. And this got me thinking about some ideas. But there are some that you talked about already, right?

36:02 Yeah, there is, for example, the memory storage, which just stores the data in memory. The first storage I wrote was even for YAML, so you had these YAML files. I later switched to JSON because it was possible, and because YAML can get really complex in some cases. If you get really fancy, you could even do some kind of HTTP stuff. I don't think there is an extension for that yet, but that would be something really interesting to try.

36:36 Yeah, absolutely. And so when I saw this, I was thinking, well, okay, maybe there are certain things you can do to get around some of the shortcomings that you talked about, like it being maybe not so much for multiple processes, stuff like that. So for example, MongoDB recently switched from storing binary JSON in a variety of files to something called WiredTiger, which is apparently much faster. And they support multiple processes from there through certain types of locking and so on. I was thinking, hmm, could you take one of these storage engines and somehow plug it into TinyDB?

37:16 Yeah, that should be possible. Writing a custom storage is really easy, because you just need to create a class with a constructor, which takes the parameters you need to create the storage, and then one method for reading and one method for writing the data to the storage. So you could do some locking there. I think that would be an interesting thing to do, locking on the storage level. So it basically has to wait until the lock on the database file is released before it returns the data. That would be interesting.
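
The shape described here, a constructor plus one method for reading and one for writing, with locking done at the storage level, might look roughly like the sketch below. It is self-contained and does not subclass anything from TinyDB; `LockedJSONStorage` is a hypothetical name, and the in-process `threading.Lock` is an assumption chosen to keep the example short.

```python
import json
import os
import tempfile
import threading


class LockedJSONStorage:
    """Sketch of a storage: a constructor, a read method, a write method."""

    def __init__(self, path):
        # The constructor takes whatever is needed to set up the storage.
        self._path = path
        self._lock = threading.Lock()

    def read(self):
        # Callers wait here until any in-progress write has finished.
        with self._lock:
            if not os.path.exists(self._path) or os.path.getsize(self._path) == 0:
                return None  # nothing stored yet
            with open(self._path, "r", encoding="utf-8") as handle:
                return json.load(handle)

    def write(self, data):
        with self._lock:
            with open(self._path, "w", encoding="utf-8") as handle:
                json.dump(data, handle)


path = os.path.join(tempfile.mkdtemp(), "db.json")
storage = LockedJSONStorage(path)
empty = storage.read()
storage.write({"_default": {"1": {"name": "Ann"}}})
loaded = storage.read()
```

A `threading.Lock` only coordinates threads inside one process. Supporting several processes, as discussed above, would need an operating-system-level file lock instead, which this sketch does not attempt.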

37:55 That would be interesting, right? I mean, it definitely seems like you could unlock some potential there as well. Okay, cool. So custom storage engines, and you could even do crazy stuff, right? Like, you could say, I'd like to use an S3 bucket as my storage location, or those types of things, right? Whatever you're looking at,

38:13 as long as you can read it, and write it, it would be possible to use it as a storage.

38:19 Interesting. Okay. The other thing you talked about was custom table classes.

38:24 What's that? Yeah, custom table classes mainly came into existence to modify the way the database implementation works. It's basically for the cases where you would otherwise have to override things or use a subclass. So with custom table classes, what you basically do is have a subclass of TinyDB, and then you can really do a lot of modifications. For example, the smart cache extension uses a custom table class to intercept every insert, update, delete and so on, to add its own logic behind it.

39:04 Okay, could you use something like a custom table class to say, this field is required? Or this one must match some kind of constraint, like it must be an email address or something like that? And then not let it insert if you tried to save the wrong one?

39:20 Yeah, that's definitely an interesting idea. That would be a really interesting extension for TinyDB, validation, because you can intercept all the inserts and updates. And then you could check if it matches, and raise an exception if it's not valid.
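
As a rough sketch of that validation idea, the class below intercepts every insert, checks required fields and per-field constraints, and raises an exception instead of saving an invalid record. Everything here (`ValidatingTable`, `ValidationError`, `looks_like_email`) is hypothetical and written in plain Python; it is not an existing TinyDB extension or TinyDB's table API.

```python
class ValidationError(ValueError):
    """Raised when a record fails validation and must not be stored."""


def looks_like_email(value):
    # Deliberately naive check, just enough for the illustration.
    return isinstance(value, str) and "@" in value and "." in value.split("@")[-1]


class ValidatingTable:
    """Toy table that validates records before inserting them."""

    def __init__(self, required=(), validators=None):
        self._rows = []
        self._required = tuple(required)
        self._validators = dict(validators or {})  # field name -> check function

    def _check(self, row):
        for field in self._required:
            if field not in row:
                raise ValidationError(f"missing required field: {field}")
        for field, is_valid in self._validators.items():
            if field in row and not is_valid(row[field]):
                raise ValidationError(f"invalid value for field: {field}")

    def insert(self, row):
        self._check(row)              # intercept the insert before storing
        self._rows.append(dict(row))
        return len(self._rows)        # simple 1-based record id

    def all(self):
        return list(self._rows)


users = ValidatingTable(required=["email"], validators={"email": looks_like_email})
new_id = users.insert({"email": "ann@example.com"})

try:
    users.insert({"email": "not-an-address"})
    rejected = False
except ValidationError:
    rejected = True
```

The same `_check` call could be placed in an update method, so that both inserts and updates are covered.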

39:38 Yeah. Yeah, that's really nice. The other way I guess, that we haven't really spoken about yet is middleware. So what kind of things can you do with regard to extensibility and middleware?

39:49 Yeah, middlewares are also a way to do modifications, or to modify the way TinyDB works. Basically, a middleware acts as a layer between TinyDB and the storage. So you can do some interesting things over there. For example, there is, in TinyDB itself, a caching middleware, which provides a cache, so it doesn't have to hit the file system on every read and write, only every couple of reads and writes. So you don't have that overhead. And yeah, so you can do some modifications there, different ones from a custom storage. I think the main reason middlewares are interesting is that you can use multiple of them at the same time, and you can use them with any kind of storage behind them, and do some modifications over there.
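
The layering described here can be sketched as a wrapper that exposes the same read and write methods as the storage it wraps, and only passes writes through every few calls. This is a simplified illustration, not TinyDB's own caching middleware: `CountingStorage`, `WriteBufferMiddleware`, and the `flush_every` parameter are all hypothetical names.

```python
class CountingStorage:
    """Stand-in for a slow storage; counts how often it is really written."""

    def __init__(self):
        self.data = None
        self.write_calls = 0

    def read(self):
        return self.data

    def write(self, data):
        self.write_calls += 1
        self.data = data


class WriteBufferMiddleware:
    """Layer between the database and a storage that buffers writes."""

    def __init__(self, storage, flush_every=3):
        self.storage = storage        # any object with read() and write()
        self.flush_every = flush_every
        self._cache = None
        self._pending = 0             # writes not yet passed to the storage

    def read(self):
        if self._cache is None:
            self._cache = self.storage.read()
        return self._cache

    def write(self, data):
        self._cache = data
        self._pending += 1
        if self._pending >= self.flush_every:
            self.flush()

    def flush(self):
        if self._pending:
            self.storage.write(self._cache)
            self._pending = 0


backend = CountingStorage()
buffered = WriteBufferMiddleware(backend, flush_every=3)
buffered.write({"count": 1})
buffered.write({"count": 2})
calls_before = backend.write_calls   # still buffered, storage untouched
buffered.write({"count": 3})
calls_after = backend.write_calls    # third write triggered a flush
```

Because the wrapper has the same two methods as the storage, one wrapper can wrap another, which is what makes it possible to stack several of them over any storage. A real implementation would also need to flush on close so buffered writes are not lost.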

40:40 Yeah, for example. Yeah. Okay. Very cool. So what's the future of TinyDB? Have you got anything you're adding, or anything like that?

40:49 I don't really have any big plans for TinyDB. I think, right now, I'm quite happy with how the project works and how the API is working. There might be some renaming of methods, so the intent of the method becomes clearer. But apart from that, right now, I don't really have any big plans to change how it works, because, I mean, the project is quite small. So there is not much I would want to add into the TinyDB core itself. But of course, there's still room for a lot of extensions, too, right?

41:28 Yeah, that sounds great. Are you looking for people to create extensions, or suggest changes, or basically open source contributors?

41:36 Yeah, I think writing extensions for TinyDB can lead to some really interesting projects, as we've already talked about with a couple of extensions. And apart from that, I'm always glad when people improve the documentation, because I'm not a native English speaker, so there might be a few rough places in the documentation. So if people think that the wording might need some improvement, I'm definitely happy to accept that pull request. And also, if you write an extension, just open an issue on the project on GitHub, and I will add it to the list.

42:16 Okay, excellent. Well, I think it's a very cool project. I would love to see a nice, robust, embedded document database, so thanks for taking us a little closer to that world. Yeah. Awesome. All right. So before I let you go, I have two quick questions for you, as always. So first of all, and you're welcome to name your own stuff if you want, what's your favorite PyPI package? I just noticed we now have over 90,000 PyPI packages. And so you must have had some exposure to a few that are pretty interesting.

42:44 Yeah. I think the package I would recommend is the one we already talked about, which is ujson, because in many cases it can improve performance by a lot. If you don't need to customize the way the JSON parser of Python works, then using ujson can be really cool.

43:06 Yeah, that's great. And so many people are processing JSON in one way or another today, so it's very broadly applicable. A good choice. I'll throw in TinyDB for you as well. So pip install tinydb. And how about your editor? If you're going to write some Python code, what do you open up?

43:22 It depends on the project, really. For small projects, I probably use Sublime Text. But if it's more than two or three files, I probably fire up PyCharm and use it. Okay. Also because, yeah, there's an interesting development in Python with gradual typing, type hints. Yeah. And PyCharm, as far as I know, already supports them to some extent. So it's really handy for that as well.

43:52 Yeah, that's really cool. Like if I'm trying to understand some code, and it's like, oh, these three things that are coming in are these types. If you tell it, it'll give you a lot more assistance in trying to understand something, if you're jumping into something you don't totally know the API for. That's cool. All right. Yeah. Awesome. PyCharm, it's a good one. So is Sublime Text. All right, final call to action. People should get out there, right? Write some extensions. What do you think?

44:16 Yeah, definitely. Write some extensions and open an issue on GitHub. I'll add it to the list, and people will use your extension. Yeah. And also, feel free to discuss the API and discuss the documentation. I'm always happy to hear feedback. And also, if you find some difficulties, or even just difficulties with how to get started with TinyDB, I think that's worth an issue on GitHub, so other people can have an easier way to get started. If you run into problems, just open an issue on GitHub, and we can discuss things over there.

44:59 All right, that sounds great. So Marcus, thanks so much for being on the show. Yeah.

45:04 Thank you for having me. Yep. Bye.

45:07 This has been another episode of Talk Python To Me. Today's guest has been Marcus Siemens, and this episode has been sponsored by Hired. Thank you for supporting the show. Hired wants to help you find your next big thing. Visit hired.com/talkpythontome to get five or more offers with salary and equity presented right up front, and a special listener signing bonus of $2,000. Are you or a colleague trying to learn Python? Have you tried books and videos that just left you bored by covering topics point by point? Well, check out my online course, Python Jumpstart by Building 10 Apps, at talkpython.fm/course to experience a more engaging way to learn Python. And if you're looking for something a little more advanced, try my Write Pythonic Code course at talkpython.fm/pythonic. You can find the links from this episode at talkpython.fm/episodes/show/80. Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. Our theme music is Developers, Developers, Developers by Cory Smith, who goes by Smixx. Cory just recently started selling his tracks on iTunes, so I recommend you check it out at talkpython.fm/music. You can browse the tracks he has for sale on iTunes and listen to the full-length version of the theme song. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Smixx, let's get out of here.

46:37 Dealing with my boys. There's no going back and having been sleeping. I've been using lots of rest got the mic back
