WEBVTT

00:00:00.001 --> 00:00:09.820
NoSQL and document databases like MongoDB have made building fast, scalable software that is easy to evolve and maintain much easier for a broad class of applications.

00:00:09.820 --> 00:00:16.500
Embeddable, file-based databases like SQLite have made shipping an application that requires a database a no-brainer.

00:00:16.500 --> 00:00:20.040
The database just runs in process, so there's no setup or maintenance.

00:00:20.040 --> 00:00:25.040
Yet, when you try to intersect these two excellent capabilities, you'll find the options are very limited.

00:00:25.040 --> 00:00:28.100
There just aren't many embeddable document databases.

00:00:28.380 --> 00:00:33.600
If you're a Python developer and you want a native Python solution, the options are much slimmer still.

00:00:33.600 --> 00:00:37.120
That's why I'm excited to introduce you to Markus Siemens and TinyDB.

00:00:37.120 --> 00:00:43.000
TinyDB is a 100% pure Python, embeddable, pip-installable document DB for Python.

00:00:43.000 --> 00:00:48.440
This is Talk Python To Me, episode 80, recorded October 13, 2016.

00:00:56.760 --> 00:01:18.220
Welcome to Talk Python To Me, a weekly podcast on Python.

00:01:18.500 --> 00:01:42.120
Thanks for listening to my podcast.

00:01:42.320 --> 00:01:47.880
I've heard from so many of you that the insight into the industry you gain from each week is very important to you.

00:01:47.880 --> 00:01:55.320
I want to take this moment and tell you about my online courses for those of you who want to go deeper and convert your enthusiasm to working knowledge.

00:01:55.320 --> 00:01:58.760
At Talk Python, we currently have three courses available.

00:01:58.760 --> 00:02:03.440
Python Jump Start by Building 10 Apps, for those of you who are just getting into Python.

00:02:04.000 --> 00:02:08.760
Write Pythonic Code Like a Seasoned Developer, covering over 50 hard-won coding tips.

00:02:08.760 --> 00:02:18.140
And Python for Entrepreneurs, available for early access already, covering web development, design, and everything else you need for an online Python-based business.

00:02:18.140 --> 00:02:24.820
See these courses and more at training.talkpython.fm to hone your Python skills, no matter your experience level.

00:02:25.300 --> 00:02:27.200
Now, I hope you enjoy this week's interview.

00:02:27.200 --> 00:02:30.540
Markus, welcome to Talk Python.

00:02:30.540 --> 00:02:31.520
Thanks for having me.

00:02:31.520 --> 00:02:32.660
Yeah, it's really great.

00:02:32.660 --> 00:02:39.700
I was talking to Austin, I think it was show 63, possibly, about mutation testing.

00:02:39.860 --> 00:02:42.140
And he said, I'm doing this really cool work with mutation testing.

00:02:42.140 --> 00:02:49.180
Oh, and for the database, I'm using this cool, embedded, document-oriented database called TinyDB.

00:02:49.180 --> 00:02:52.800
And since then, I was like, wow, an embedded document database?

00:02:52.800 --> 00:02:53.300
How awesome.

00:02:53.300 --> 00:02:54.920
And it's 100% in Python.

00:02:54.920 --> 00:02:56.400
So even cooler.

00:02:56.400 --> 00:02:59.000
And since then, I've wanted to have you on the show and talk about it.

00:02:59.000 --> 00:02:59.700
So welcome.

00:02:59.700 --> 00:03:00.640
Yeah, thank you.

00:03:00.640 --> 00:03:03.900
Yeah, it's going to be fun to talk about this database.

00:03:03.900 --> 00:03:05.980
It's tiny, from what I can tell.

00:03:05.980 --> 00:03:06.340
It's great.

00:03:06.340 --> 00:03:06.840
Yeah, it is.

00:03:07.720 --> 00:03:11.020
But before we get into that, of course, let's talk about your story.

00:03:11.020 --> 00:03:12.460
How did you get into programming in Python?

00:03:12.460 --> 00:03:21.320
When I was about 10 years old, I kind of got into programming because my dad bought me a book called C++ for Kids.

00:03:21.320 --> 00:03:23.720
I was 10 years old.

00:03:23.720 --> 00:03:25.240
I didn't really understand much.

00:03:25.240 --> 00:03:29.140
I just copied the examples and made sure it compiled.

00:03:29.140 --> 00:03:31.120
And that was basically it.

00:03:31.120 --> 00:03:36.480
Later I had the books Python for Kids and Java for Kids,

00:03:37.060 --> 00:03:44.140
and then at around 14 or 15, I started really doing some programming in Java on my own.

00:03:44.140 --> 00:03:49.260
So that's when I actually got really started with programming.

00:03:49.260 --> 00:03:49.980
Cool.

00:03:49.980 --> 00:03:52.220
And what were some of the first apps you built with Java?

00:03:52.220 --> 00:03:53.320
What kind of work was that?

00:03:53.720 --> 00:03:57.740
My first application was a Hangman implementation in Java.

00:03:58.740 --> 00:04:05.980
So I had some problems with how I structured the code because I had to make everything public static in Java.

00:04:05.980 --> 00:04:06.980
So it compiled.

00:04:06.980 --> 00:04:10.860
And then something didn't work and I asked my dad for help.

00:04:11.720 --> 00:04:16.900
And he was horrified when he saw the code because everything was public static.

00:04:16.900 --> 00:04:23.180
And he then helped me out with sorting out where to pass an object and so on.

00:04:23.180 --> 00:04:25.720
So, yeah, I got some help from him.

00:04:26.800 --> 00:04:30.880
And, well, that's how I first got into working with Java.

00:04:31.400 --> 00:04:38.000
And a couple of years later, I applied to university to study electrical engineering.

00:04:38.660 --> 00:04:41.740
And before that, you had to do an eight-week internship.

00:04:41.740 --> 00:04:51.980
I spent six of those weeks at a small company that mainly sells servers with support contracts to enterprises.

00:04:52.480 --> 00:04:58.640
And they asked me to create an application so they can configure the servers via the web.

00:04:58.640 --> 00:05:01.820
And they asked me to do the project in Python.

00:05:01.820 --> 00:05:08.040
So that's how I got started with Python and wrote my first serious Python code.

00:05:08.040 --> 00:05:08.640
That's cool.

00:05:08.640 --> 00:05:09.600
And what was your impression?

00:05:09.600 --> 00:05:11.340
Were you like, oh, yeah, Python's awesome.

00:05:11.340 --> 00:05:13.680
Or were you like, whoa, where'd public static void go?

00:05:13.680 --> 00:05:15.320
Where'd public static go?

00:05:15.320 --> 00:05:16.580
Yeah.

00:05:16.580 --> 00:05:20.340
It was, I think it was way easier than I expected.

00:05:20.840 --> 00:05:26.680
Because after like one or two weeks, I was already quite confident that I could do the whole project.

00:05:26.680 --> 00:05:30.860
So I didn't really have any experience beforehand with Python.

00:05:30.860 --> 00:05:39.380
And with that project, after one or two weeks, I was really confident that I could do the whole project without big problems.

00:05:39.380 --> 00:05:40.400
Yeah, that's great.

00:05:40.400 --> 00:05:41.100
Very nice.

00:05:41.100 --> 00:05:44.920
I find a number of people get into Python that way.

00:05:44.920 --> 00:05:50.820
They're working on some project and people come along, their co-workers or boss or somebody

00:05:50.820 --> 00:05:53.260
will say, hey, can you do this thing in Python?

00:05:53.260 --> 00:05:54.040
They're like, Python?

00:05:54.040 --> 00:05:55.940
Well, I don't know anything about it, but let me give it a try.

00:05:55.940 --> 00:05:57.440
It turns out to be really easy.

00:05:57.440 --> 00:05:58.580
So that's really great.

00:05:58.580 --> 00:05:59.300
Yeah, it is.

00:05:59.300 --> 00:05:59.820
Definitely.

00:05:59.820 --> 00:06:01.020
It's very easy to start.

00:06:01.140 --> 00:06:07.680
And I would say it's one of the best programming languages to learn programming.

00:06:07.680 --> 00:06:10.460
Yeah, I certainly think it is as well.

00:06:10.460 --> 00:06:16.680
In the United States, they used to teach Java as the primary language for computer science 101.

00:06:16.680 --> 00:06:19.120
That's the first programming course you take.

00:06:19.120 --> 00:06:22.620
And in recent years, it's switched to primarily Python.

00:06:22.880 --> 00:06:23.620
So that's great.

00:06:23.620 --> 00:06:27.700
At my university, people have to start with C and C++.

00:06:27.700 --> 00:06:31.920
And most people have quite a lot of problems with that.

00:06:31.920 --> 00:06:36.180
And I guess it would be kind of an easier route to go with Python first.

00:06:36.180 --> 00:06:37.240
But I don't know.

00:06:37.240 --> 00:06:38.960
It's their choice.

00:06:38.960 --> 00:06:39.840
Yeah, of course.

00:06:39.840 --> 00:06:45.800
I think there's something to understanding a C-oriented language where you actually work

00:06:45.800 --> 00:06:48.180
with pointers and directly with the memory and stuff.

00:06:48.180 --> 00:06:49.920
But I'm not sure you should start there.

00:06:49.920 --> 00:06:50.520
You know what I mean?

00:06:50.520 --> 00:06:52.240
Yeah, it's your very first exposure.

00:06:52.240 --> 00:06:59.060
Just for comparison, my first computer science 101 course in college was Lisp.

00:06:59.060 --> 00:07:02.100
So that was really different.

00:07:02.100 --> 00:07:03.320
All right.

00:07:03.320 --> 00:07:04.980
So that's how you got started.

00:07:04.980 --> 00:07:05.980
What do you do day to day?

00:07:05.980 --> 00:07:07.960
University, I'm still studying.

00:07:07.960 --> 00:07:11.560
And next week, my lectures start again.

00:07:11.560 --> 00:07:15.520
Till then, I do some freelancing work, mainly web development.

00:07:15.520 --> 00:07:16.320
Nice.

00:07:16.320 --> 00:07:17.640
And is that web development in Python?

00:07:17.640 --> 00:07:18.460
No, it isn't.

00:07:18.460 --> 00:07:19.880
It's mainly JavaScript.

00:07:20.320 --> 00:07:24.280
Right now, I'm doing some front-end development, which is mainly JavaScript.

00:07:24.280 --> 00:07:29.020
And sometimes I wish it were Python, but I can't always choose.

00:07:29.020 --> 00:07:30.300
That's right.

00:07:30.300 --> 00:07:33.260
Sometimes if there's good work to be done, you just got to go do it.

00:07:33.260 --> 00:07:34.540
All right.

00:07:34.540 --> 00:07:40.100
So the database that you created called TinyDB is a document-oriented database.

00:07:40.100 --> 00:07:42.460
And that falls into the realm of NoSQL databases.

00:07:42.460 --> 00:07:46.800
And so just for everyone out there listening, I realize many of you know, but maybe some

00:07:46.800 --> 00:07:51.240
of you don't really totally know the difference between, say, relational databases and NoSQL

00:07:51.240 --> 00:07:51.780
databases.

00:07:51.780 --> 00:07:55.100
So relational ones, I think we're all pretty familiar with.

00:07:55.100 --> 00:07:55.920
There's a bunch of tables.

00:07:55.920 --> 00:07:57.880
There's foreign key relationships between them.

00:07:57.980 --> 00:08:03.700
We try to normalize our data, minimize duplication, and sort of set it up for querying from any

00:08:03.700 --> 00:08:04.860
particular angle, right?

00:08:04.860 --> 00:08:07.560
You have a wide range of queries you can write.

00:08:07.680 --> 00:08:12.500
But in the NoSQL world, there's a different set of trade-offs that those databases are making,

00:08:12.500 --> 00:08:12.760
right?

00:08:12.760 --> 00:08:20.160
Like the document databases don't have as much of a flexible set of queries you can write.

00:08:20.160 --> 00:08:24.740
But if you build the documents right, you can really optimize them for exactly the use case

00:08:24.740 --> 00:08:26.920
that you're looking for, right?

00:08:26.920 --> 00:08:32.140
So your data is more hierarchical and more directly models like how you might actually want to work

00:08:32.140 --> 00:08:35.840
with objects and relationships in your code rather than in storage.

00:08:35.840 --> 00:08:41.900
So there's also other ones like key value stores and even graph databases, although in my mind,

00:08:41.900 --> 00:08:43.160
those are not NoSQL databases.

00:08:43.160 --> 00:08:46.600
That's a different debate for a different time.

00:08:46.600 --> 00:08:47.360
Yeah, it is.

00:08:47.360 --> 00:08:51.120
Yeah, but so yours, TinyDB, falls into this document database.

00:08:51.120 --> 00:08:53.560
So why don't you tell us, what is TinyDB?

00:08:53.560 --> 00:08:59.020
Yeah, TinyDB is, like you said, basically a NoSQL database.

00:08:59.020 --> 00:09:03.600
And actually, I didn't plan to do a NoSQL database.

00:09:03.600 --> 00:09:09.880
I just planned to do something that is easily usable.

00:09:09.880 --> 00:09:16.900
I happened to write TinyDB for another project, which was in total, I think, like 300 lines.

00:09:16.900 --> 00:09:24.860
And I didn't really feel like setting up a MySQL or SQLite database for it, because if I had to change

00:09:24.860 --> 00:09:30.020
anything about the data structure, I would have to write migrations and so on.

00:09:30.020 --> 00:09:40.440
So I just wrote a really simple database with a bunch of dictionary objects, which contain the data,

00:09:40.440 --> 00:09:43.360
and then added some query language for it.

00:09:43.360 --> 00:09:46.920
And it happened to become quite popular.

00:09:47.300 --> 00:09:48.000
Yeah, that's really cool.

00:09:48.000 --> 00:09:49.100
And it is quite popular.

00:09:49.100 --> 00:09:52.860
It's got over a thousand stars on GitHub, which is very cool.

00:09:52.860 --> 00:09:54.020
Yeah.

00:09:54.020 --> 00:09:55.160
You must be proud of that.

00:09:55.160 --> 00:09:55.600
That's great.

00:09:55.940 --> 00:09:56.100
Yeah.

00:09:56.100 --> 00:09:56.180
Yeah.

00:09:56.180 --> 00:10:02.220
What was interesting is that I didn't really do any marketing or promotion at all.

00:10:02.220 --> 00:10:07.840
I just put it on GitHub, uploaded some documentation, and just left it as it was.

00:10:07.840 --> 00:10:12.440
And after a couple of months, I got the first issues.

00:10:12.440 --> 00:10:18.220
And like a year later, I got the first pull requests without any involvement on my side,

00:10:18.480 --> 00:10:22.600
which was a really great example of how open source works.

00:10:22.600 --> 00:10:27.140
Yeah, that's a great validation that people see what you've made, and they actually want to contribute back.

00:10:27.140 --> 00:10:28.000
You're like, wait, what?

00:10:28.000 --> 00:10:28.540
Yeah.

00:10:28.540 --> 00:10:30.380
Where did this person come from?

00:10:30.380 --> 00:10:31.820
They are working on my project.

00:10:31.820 --> 00:10:32.340
How great.

00:10:32.340 --> 00:10:33.940
Yeah, that's nice.

00:10:33.940 --> 00:10:38.900
You made a really interesting comment when you described why you didn't want to use MySQL or SQLite,

00:10:38.900 --> 00:10:40.520
whichever one you were referring to.

00:10:40.520 --> 00:10:45.320
And that's, you don't want to deal with migrations and all the headaches of evolving your schema.

00:10:45.740 --> 00:10:54.380
And I think one of the really, really powerful things about document databases that you don't really appreciate until you try it

00:10:54.380 --> 00:11:01.900
and you've sort of lived with an application over time is the ability to easily evolve the schema, right?

00:11:01.900 --> 00:11:08.420
Like if you want to add a field, you add a field to either your dictionary or class in your application,

00:11:08.420 --> 00:11:10.480
and that just becomes part of your schema.

00:11:10.860 --> 00:11:17.020
It's usually much easier, I think, to maintain these applications that are based on document databases.

00:11:17.020 --> 00:11:18.120
That's my experience anyway.
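
NOTE
A minimal sketch of the schema flexibility described here, assuming documents are plain dictionaries stored in a JSON file. The file name, field names, and people are made up for illustration and this is not TinyDB code.
```python
import json
import os
import tempfile
# Hypothetical documents: the second one gains an "email" field, with no migration step.
documents = [
    {"name": "Ada"},
    {"name": "Grace", "email": "grace@example.com"},
]
# Round-trip through a JSON file, the way a file-based document store might.
path = os.path.join(tempfile.mkdtemp(), "people.json")
with open(path, "w") as f:
    json.dump(documents, f)
with open(path) as f:
    loaded = json.load(f)
# Older documents simply lack the new field, so read it with a default.
emails = [doc.get("email", "unknown") for doc in loaded]
print(emails)
```
The trade-off is that the application, not the database, has to cope with documents that predate the new field.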

00:11:18.120 --> 00:11:18.920
Yeah.

00:11:18.920 --> 00:11:26.040
It was my experience too with the project I was working on originally because I didn't really know what the data would look like.

00:11:26.040 --> 00:11:37.060
So I needed something that would allow me to change the data as I need or as I find out I need another field so it wouldn't be hard to edit.

00:11:37.380 --> 00:11:38.060
Yeah, absolutely.

00:11:38.060 --> 00:11:48.640
And I think it's really cool that you have this as an embedded database because I would say probably the most popular choice for these types of databases would be MongoDB.

00:11:48.640 --> 00:11:54.140
But then you've got to deal with another server and, you know, configuring that thing and the connections.

00:11:54.140 --> 00:11:59.160
And just, you know, sometimes you just want a small embedded thing that's just part of your app.

00:11:59.440 --> 00:11:59.600
Yeah.

00:11:59.600 --> 00:11:59.600
Yeah.

00:11:59.600 --> 00:12:07.720
What I really like about SQLite is that it's embedded so you don't need any server, you don't need any maintenance or anything.

00:12:07.720 --> 00:12:19.600
So basically TinyDB combines the best of SQLite being embedded and MongoDB being a document database.

00:12:19.940 --> 00:12:20.600
Yeah, that's really cool.

00:12:20.600 --> 00:12:23.760
So you said it's written 100% in Python.

00:12:23.760 --> 00:12:25.400
How many lines of code is that?

00:12:25.400 --> 00:12:32.900
Yeah, it's about 1,200 lines of code of which 40% approximately are documentation.

00:12:32.900 --> 00:12:37.560
And in addition to that, like 1,000 lines of test code.

00:12:37.560 --> 00:12:38.480
That is awesome.

00:12:38.480 --> 00:12:45.620
Yeah, and you have 100% test coverage as well on your project, which is really nice when you've got something that's actually managing your data.

00:12:45.620 --> 00:12:48.440
You want to be really pretty sure that it's working right, yeah?

00:12:48.440 --> 00:12:48.720
Definitely.

00:12:48.720 --> 00:12:49.920
Definitely.

00:12:49.920 --> 00:12:56.180
Okay, so if it's 100% in Python, how do I get it into my application?

00:12:56.180 --> 00:12:57.860
Like, can I pip install it?

00:12:57.860 --> 00:12:58.640
Yeah, definitely.

00:12:58.640 --> 00:13:01.440
Installing should be very easy.

00:13:01.440 --> 00:13:14.020
Just run pip install tinydb, and then you just have to open a Python REPL to get started, import tinydb, and basically you are ready to go.

00:13:14.020 --> 00:13:15.780
Yeah, that's really nice.

00:13:15.780 --> 00:13:20.300
And so there's no server to start, like if your app is running, the database is running, right?

00:13:20.300 --> 00:13:21.720
Yeah.

00:13:21.720 --> 00:13:22.860
Yeah, yeah, very cool.

00:13:22.860 --> 00:13:30.880
So you had some reasons why you should use TinyDB and some reasons maybe why you might not.

00:13:30.880 --> 00:13:35.980
So what use cases are really good for considering TinyDB?

00:13:35.980 --> 00:13:39.920
I think it's just like the project I was working on originally.

00:13:39.920 --> 00:13:48.260
I think it works best for small projects where you don't really have or you don't really want to use a bigger database.

00:13:48.260 --> 00:13:56.100
So a small web application, I think some guy used it for a password manager or something.

00:13:56.480 --> 00:14:04.200
So really projects where having a MySQL database or something like that would just add too much complexity.

00:14:04.200 --> 00:14:13.120
Yeah, certainly if you're going to deploy an application or library and you don't want to have the getting started steps, say,

00:14:13.120 --> 00:14:18.900
now you need to set up this server and you need to run this script to build this schema,

00:14:18.900 --> 00:14:24.040
and then you need to set the connection string here, and then you can get going, right?

00:14:24.160 --> 00:14:31.580
So to me it feels pretty similar, not an exact fit, but similar to, say, SQLite, when you might use SQLite.

00:14:31.580 --> 00:14:32.060
Yeah.

00:14:32.060 --> 00:14:40.540
It's similar in that you can really easily get started and you don't have to manage a server or something like that.

00:14:40.820 --> 00:14:41.060
Sure.

00:14:41.060 --> 00:14:45.800
So because it is a pretty small database, it doesn't try to do everything, right?

00:14:45.800 --> 00:14:51.180
It's not like MongoDB rewritten in an embedded Python variation.

00:14:51.180 --> 00:14:52.440
It really does do less.

00:14:52.440 --> 00:14:59.380
So there are some things it doesn't do, and maybe if you really depend upon them, you should not use it.

00:14:59.380 --> 00:15:00.660
You should pick something else.

00:15:00.660 --> 00:15:01.480
What are those cases?

00:15:01.480 --> 00:15:04.540
Just so we know people can decide if it fits for them.

00:15:04.920 --> 00:15:11.200
I think the main disadvantage of TinyDB is when you're using multiple threads.

00:15:11.200 --> 00:15:15.300
In that case, there is an extension for TinyDB you can use.

00:15:15.300 --> 00:15:25.160
If you have multiple processes, then you are really out of luck because there is no kind of locking on the file you're writing to.

00:15:25.160 --> 00:15:28.280
So you would probably run into issues with that.

00:15:28.280 --> 00:15:33.160
And also, it might turn out to be a bottleneck in performance.

00:15:33.600 --> 00:15:45.540
If you have hundreds of thousands of entries in your database and you search them, because TinyDB doesn't have any kind of index, it will have to scan the entire list every time you do a search.

00:15:45.540 --> 00:15:51.160
So you probably shouldn't do really involved data processing with it.
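
NOTE
A rough sketch of why a search without an index gets slower as the data grows. It uses a plain Python list of dictionaries rather than TinyDB itself, and the data set and the scan helper are hypothetical.
```python
# Hypothetical data set: 1,000 small documents held in a list.
documents = [{"id": i, "city": "Berlin" if i % 100 == 0 else "Other"} for i in range(1000)]
def scan(docs, predicate):
    """Check every document, which is what a store without an index has to do."""
    return [doc for doc in docs if predicate(doc)]
# Without an index, every search touches all 1,000 documents.
berlin = scan(documents, lambda doc: doc["city"] == "Berlin")
# For contrast, an index is a precomputed lookup table from value to documents.
index = {}
for doc in documents:
    index.setdefault(doc["city"], []).append(doc)
assert index["Berlin"] == berlin
print(len(berlin))
```
The scan costs time proportional to the number of documents on every search, while the lookup table costs extra memory and has to be kept up to date on every write.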

00:15:51.160 --> 00:15:51.480
Okay.

00:15:51.480 --> 00:15:51.840
Yeah.

00:15:51.840 --> 00:15:54.840
And that definitely, some of those are serious.

00:15:54.840 --> 00:15:55.940
Some of them might not be.

00:15:55.940 --> 00:15:57.300
It depends on your use case, right?

00:15:57.300 --> 00:15:59.180
Now, I don't want to talk about it yet.

00:15:59.180 --> 00:15:59.780
We'll get to it.

00:15:59.780 --> 00:16:01.260
But there's a bunch of extensions.

00:16:01.260 --> 00:16:10.660
And I think there are some that will come and solve some of those performance problems, which are interesting to some degree.

00:16:10.660 --> 00:16:22.560
And also, they open the possibility for a listener, an enterprising listener out there, to come and add something like multi-process or multi-thread support, things like that, right?

00:16:22.640 --> 00:16:24.120
So we'll get to the extensibility.

00:16:24.120 --> 00:16:26.920
There are a number of extensions now.

00:16:26.920 --> 00:16:29.800
I hope it's quite easy to add a new one.

00:16:30.380 --> 00:16:42.400
So there are a couple of things TinyDB doesn't support out of the box, but if you want to, you can write an extension, you can write your own storage mechanism, and so on.

00:16:42.400 --> 00:16:42.620
Yeah.

00:16:42.620 --> 00:16:43.080
Yeah.

00:16:43.080 --> 00:16:43.360
Great.

00:16:43.360 --> 00:16:43.520
Yeah.

00:16:43.520 --> 00:16:44.680
We'll talk about that in a little bit.

00:16:44.680 --> 00:16:45.320
Awesome.

00:16:45.820 --> 00:16:52.220
So when I think of competitors, you have a few listed, and we've talked about a couple of them.

00:16:52.220 --> 00:16:59.000
Competitors, like we said, is not exactly the right word because it's all open source, but choices you might choose from.

00:16:59.000 --> 00:17:09.720
On one hand there's MongoDB: if you're doing document databases for really large production apps and you want to scale out, maybe that is the right choice.

00:17:10.340 --> 00:17:15.760
But there are also some other pure Python databases that I had not heard of that I thought were pretty interesting.

00:17:15.760 --> 00:17:19.740
So do you want to tell us about the two that you listed on your site?

00:17:19.740 --> 00:17:20.680
What are their names?

00:17:20.680 --> 00:17:31.720
Before starting with TinyDB, I first did some research into whether there already was some kind of embedded Python database I could use that fit my use case.

00:17:32.520 --> 00:17:38.340
And there wasn't really. I found two projects which are somewhat similar, but not quite.

00:17:38.340 --> 00:17:46.260
For example, there's a project called CodernityDB, which is also pure Python and NoSQL.

00:17:46.260 --> 00:17:49.240
But as far as I know, it's much more complex.

00:17:49.240 --> 00:17:52.480
You have the ability to use indexes.

00:17:52.480 --> 00:17:54.460
You have HTTP support.

00:17:54.940 --> 00:17:57.540
So it might be a great fit for a project.

00:17:57.540 --> 00:18:01.860
But in my case, it was just too much code.

00:18:01.860 --> 00:18:07.900
I think it's like 7,000 lines of code, which would be seven times the size of my project.

00:18:07.900 --> 00:18:10.280
So, yeah.

00:18:10.280 --> 00:18:13.280
And I think that's without unit tests, the size of the code.

00:18:13.280 --> 00:18:14.320
So it's even bigger, yeah?

00:18:14.320 --> 00:18:14.720
Yeah.

00:18:14.720 --> 00:18:16.840
And the other one was Buzhug.

00:18:16.840 --> 00:18:20.600
Buzhug is also quite interesting because it's optimized for speed.

00:18:21.060 --> 00:18:29.920
But the reason why I didn't use Buzhug was that it didn't have the kind of query language I wanted to use.

00:18:29.920 --> 00:18:43.180
So while it is really fast and it's also pure Python, it just didn't have the kind of API I had in mind.

00:18:43.720 --> 00:18:47.340
So I chose to build my own project.

00:18:47.340 --> 00:18:53.300
And also, it's like twice the size of TinyDB without tests.

00:18:53.300 --> 00:18:54.000
Yeah, definitely.

00:18:54.000 --> 00:18:59.340
So CodernityDB, that one is a key value store.

00:18:59.340 --> 00:19:04.160
So in some ways, maybe not quite as capable as the document databases.

00:19:04.160 --> 00:19:05.020
It depends.

00:19:05.020 --> 00:19:08.560
They do have some additional indexes and queries you can write.

00:19:08.880 --> 00:19:10.960
But maybe not quite as flexible.

00:19:10.960 --> 00:19:12.700
And then Buzhug is relational.

00:19:12.700 --> 00:19:17.580
So you have, again, all the trade-offs with managing the schema and evolving it in migrations.

00:19:17.580 --> 00:19:24.280
So, yeah, to me, it feels like SQLite is kind of the biggest competitor to you guys.

00:19:24.280 --> 00:19:24.800
Yeah.

00:19:24.800 --> 00:19:26.160
Yeah, cool.

00:19:26.160 --> 00:19:34.420
So can you maybe, I don't necessarily want to talk about code exactly in audio format because that doesn't usually go so well.

00:19:34.420 --> 00:19:37.900
But just give us a sense of, like, what is the API like?

00:19:37.900 --> 00:19:44.800
Like, if I wanted to create, you know, get started with TinyDB, I wanted to create a database and insert some records.

00:19:44.800 --> 00:19:46.080
Like, what does that look like?

00:19:46.080 --> 00:19:46.620
Yeah.

00:19:46.620 --> 00:19:49.800
So you start with importing TinyDB.

00:19:50.540 --> 00:19:59.420
And there are two main classes you will probably use, which is the TinyDB class and the query class.

00:19:59.420 --> 00:20:03.400
So you basically create a new instance of TinyDB.

00:20:03.400 --> 00:20:06.980
And you already have your database.

00:20:06.980 --> 00:20:09.900
You don't need any further setup code or anything.

00:20:10.480 --> 00:20:23.240
And then you can just take the instance and call db.insert, the insert function, pass some Python dictionary, and you've already inserted your first object into the database.

00:20:23.240 --> 00:20:24.340
Yeah, very nice.

00:20:24.340 --> 00:20:31.140
And so basically, you can insert more or less plain Python dictionaries, either directly or into a table, right?

00:20:31.140 --> 00:20:31.920
Yeah.

00:20:32.240 --> 00:20:36.660
And basically, you have all the basic functions, create, read, update, and delete.

00:20:36.660 --> 00:20:39.900
You can do some interesting searching.

00:20:39.900 --> 00:20:43.720
Search for objects with specific parameters.

00:20:43.720 --> 00:20:48.440
You can do some quite involved searching with the API I provided.

00:20:48.820 --> 00:21:01.500
But yeah, I think besides the database itself, the query language is like the second biggest or has the second most code in the whole project.

00:21:01.500 --> 00:21:08.600
If you have a list, you can search for an item in that list with specified properties and so on.

00:21:09.220 --> 00:21:12.120
So if you want to, you can really go crazy with searching.

00:21:12.120 --> 00:21:21.020
But still, the basic API is insert, delete, and then, of course, querying and update.

00:21:21.020 --> 00:21:21.700
Yeah, nice.

00:21:21.700 --> 00:21:32.560
And to me, the query language looks a little bit like SQLAlchemy in that you do it sort of in a Python language and it's translated to the query that goes down to the system, right?

00:21:32.560 --> 00:21:36.700
I actually found out about the SQLAlchemy project.

00:21:36.700 --> 00:21:39.000
I think I only discovered it

00:21:39.200 --> 00:21:45.580
after I wrote TinyDB, so it just happened to have a very similar query language.

00:21:45.580 --> 00:21:47.040
Yeah, you didn't take it straight from there.

00:21:47.040 --> 00:21:49.580
But it is a very natural way of doing it.

00:21:49.580 --> 00:21:57.340
So basically, the fact that you use the native Python sort of equality and things like that and then it translates it over is really nice.
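
NOTE
One way a library can turn native Python equality into a query is operator overloading. The toy below is only a sketch of that general technique. The Field class is hypothetical and is not TinyDB's or SQLAlchemy's actual implementation.
```python
class Field:
    """Hypothetical stand-in for a query object; not TinyDB's actual implementation."""
    def __init__(self, name=None):
        self._name = name
    def __getattr__(self, name):
        # Only called for attributes that are not found normally, e.g. User.name.
        return Field(name)
    def __eq__(self, value):
        # Instead of returning True or False, return a test to run on each document.
        field_name = self._name
        return lambda doc: doc.get(field_name) == value
User = Field()
is_john = User.name == "John"
documents = [{"name": "John", "age": 22}, {"name": "Jane", "age": 30}]
matches = [doc for doc in documents if is_john(doc)]
print(matches)
```
Because the comparison returns a reusable test instead of a boolean, the same expression can be built once and then applied to every stored document.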

00:21:57.340 --> 00:22:04.300
The exchange of data, if you just go against your API directly, is really in dictionaries.

00:22:04.720 --> 00:22:11.100
And sort of extended a little bit with your primary key being on there as a separate type of attribute and so on.

00:22:11.100 --> 00:22:12.480
But more or less as dictionaries.

00:22:12.480 --> 00:22:21.480
Is there an object data mapper layer, kind of like how MongoDB has MongoEngine or MongoKit, where you work in higher level stuff?

00:22:21.480 --> 00:22:25.740
Like I would write, say, like, here you have, in your example, you had a person.

00:22:25.740 --> 00:22:32.600
Like, could I create a person class and maybe put rules or columns or structure on that and then use that to query with?

00:22:32.860 --> 00:22:39.520
Yeah, in a way, you can already use the query language of TinyDB as an ORM.

00:22:39.520 --> 00:22:45.520
So you can create a new instance of the query object and call it user.

00:22:45.520 --> 00:22:52.840
And then you can write something like user.name equals the name you're looking for.

00:22:52.840 --> 00:22:58.220
And you already have a query, which you can use to search the database.

00:22:58.220 --> 00:23:03.940
So you can do that or use that as some kind of ad hoc modeling.

00:23:04.520 --> 00:23:05.800
Okay. Yeah, nice.

00:23:05.800 --> 00:23:08.800
And do you support Python 3?

00:23:08.800 --> 00:23:10.020
Python 2?

00:23:10.020 --> 00:23:11.540
What versions of Python do you support?

00:23:11.540 --> 00:23:16.940
I tried or I still try to support as many versions of Python as possible.

00:23:16.940 --> 00:23:18.720
I'm not sure about Python 2.6.

00:23:18.720 --> 00:23:23.620
But upwards from Python 2.7, I think I support every version.

00:23:23.620 --> 00:23:29.860
Also because TinyDB doesn't have any external dependencies even on other Python packages.

00:23:30.720 --> 00:23:37.300
So it's quite easy or it was quite easy for me to write code which works in both Python 2 and 3.

00:23:37.300 --> 00:23:43.000
So if you're using Python 3 or Python 2, either way, you're good to go.

00:23:43.000 --> 00:23:44.360
Yeah, that's really excellent.

00:23:44.360 --> 00:23:50.400
So I'm very happy to see the increased Python 3 support for your project, other projects.

00:23:50.400 --> 00:23:55.200
I had the BeeWare guys on last time and they're like, it's all Python 3.

00:23:55.200 --> 00:23:55.900
So that's great.

00:23:56.220 --> 00:24:07.420
The other thing, if it's pure Python, the very next question I had was, well, if it supports Python 3, what's the story with PyPy, the JIT-compiled version of the interpreter?

00:24:07.900 --> 00:24:21.000
I didn't myself really use it with PyPy, but I still run every commit I do with continuous integration, which tests all the Python 2 and 3 versions and also tests that it works with PyPy.

00:24:21.000 --> 00:24:26.620
But myself, I didn't really have a use case where I would need to use PyPy.

00:24:27.120 --> 00:24:31.820
So I don't know if it's faster or slower than you would expect.

00:24:31.820 --> 00:24:32.780
Okay.

00:24:32.780 --> 00:24:34.640
But it definitely works.

00:24:34.640 --> 00:24:35.480
It definitely works.

00:24:35.480 --> 00:24:35.740
Okay.

00:24:35.740 --> 00:24:36.600
That's really cool.

00:24:36.600 --> 00:24:38.260
And so maybe it's a lot faster.

00:24:38.260 --> 00:24:42.800
Actually, we'll get to some stuff that talks a little bit about speedups that you can do.

00:24:42.800 --> 00:24:48.320
And maybe that just takes PyPy out of the loop in terms of being necessarily important.

00:24:48.320 --> 00:24:57.920
But anytime you have something like this that's pretty central to how your application performs and you can do it on PyPy, that's pretty interesting to see.

00:25:10.080 --> 00:25:12.720
This portion of Talk Python To Me is brought to you by Hired.

00:25:12.720 --> 00:25:15.740
Hired is the platform for top Python developer jobs.

00:25:15.740 --> 00:25:20.560
Create your profile and instantly get access to 3,500 companies who will work to compete with you.

00:25:20.560 --> 00:25:23.420
Take it from one of Hired's users who recently got a job and said,

00:25:23.420 --> 00:25:28.720
I had my first offer on Thursday after going live on Monday, and I ended up getting eight offers in total.

00:25:28.720 --> 00:25:32.180
I've worked with recruiters in the past, but they've always been pretty hit and miss.

00:25:32.180 --> 00:25:35.020
I tried LinkedIn, but I found Hired to be the best.

00:25:35.020 --> 00:25:37.100
I really liked knowing the salary up front.

00:25:37.100 --> 00:25:39.460
Privacy was also a huge seller for me.

00:25:40.160 --> 00:25:41.160
Sounds awesome, doesn't it?

00:25:41.160 --> 00:25:43.140
Well, wait until you hear about the signing bonus.

00:25:43.140 --> 00:25:46.560
Everyone who accepts a job from Hired gets a $1,000 signing bonus.

00:25:46.560 --> 00:25:49.240
And as Talk Python listeners, it gets way sweeter.

00:25:49.240 --> 00:25:54.480
Use the link Hired.com slash Talk Python To Me and Hired will double the signing bonus to $2,000.

00:25:54.480 --> 00:25:56.280
Opportunity is knocking.

00:25:56.280 --> 00:26:00.040
Visit Hired.com slash Talk Python To Me and answer the door.

00:26:07.040 --> 00:26:11.640
Speaking of that, one thing that I saw that was cool, and this is something I learned today,

00:26:11.640 --> 00:26:16.300
I hadn't really paid any attention to it before, is you said one of the speed-ups you can do is,

00:26:16.300 --> 00:26:23.260
if you install a package called uJSON, you can get TinyDB to run a lot faster.

00:26:23.260 --> 00:26:28.940
Because as you can imagine, it's all about JSON parsing, multi-directional JSON parsing, right?

00:26:28.940 --> 00:26:32.040
And so what's uJSON and why is it faster?

00:26:32.560 --> 00:26:34.180
uJSON is a really cool project.

00:26:34.180 --> 00:26:43.780
I think the main or the reason why JSON parsing matters so much is that basically due to the way TinyDB works,

00:26:43.780 --> 00:26:53.660
every time you do, every time you interact with the database, you have to read the database file from the file system.

00:26:54.080 --> 00:26:57.800
So you really need a fast JSON parser.

00:26:57.800 --> 00:27:02.320
And the JSON parser that comes with Python is really powerful.

00:27:02.320 --> 00:27:05.660
You can extend it in a few different ways.

00:27:05.660 --> 00:27:13.340
But if you really need performance, this kind of extension really only gets in the way,

00:27:13.340 --> 00:27:18.020
because you probably have to implement the whole parsing in Python,

00:27:18.720 --> 00:27:21.260
and then you lose a lot of speed.

00:27:21.260 --> 00:27:27.040
So uJSON is one project I found, which also implements Python,

00:27:27.040 --> 00:27:29.320
I mean, which also implements JSON parsing,

00:27:29.320 --> 00:27:32.480
but it does it all in C.

00:27:32.480 --> 00:27:34.680
So it's really fast.

00:27:34.800 --> 00:27:37.860
You do have a trade-off that it's not as extensible.

00:27:37.860 --> 00:27:45.260
But if you don't need to override some internals of the JSON parser,

00:27:45.260 --> 00:27:50.840
then it's really good to use uJSON to get more performance out of it.

00:27:50.840 --> 00:27:52.200
Yeah, that's really cool.

00:27:52.200 --> 00:27:53.580
And you can just pip install that.

00:27:53.580 --> 00:27:59.260
And you said that TinyDB will look for its existence and prefer it if it's there.

00:27:59.260 --> 00:28:00.440
But if it's not there, it's fine.

00:28:00.440 --> 00:28:03.480
It'll just use the standard library JSON, right?

00:28:03.840 --> 00:28:08.220
Yeah, so if you install it, it works like a drop-in.

00:28:08.220 --> 00:28:12.960
You just install it, and Python and TinyDB auto-detects it,

00:28:12.960 --> 00:28:16.140
and you get the speedup instantly.
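
The "use it if it's installed, otherwise fall back" behavior described here is commonly written as an optional import. The sketch below shows that general pattern with the alias json_impl, which is a name chosen for this example; it is not claimed to be TinyDB's exact detection code.

```python
try:
    # Third-party C implementation; only used if it happens to be installed.
    import ujson as json_impl
except ImportError:
    # Standard library fallback, always available.
    import json as json_impl

record = {"name": "John", "tags": ["a", "b"]}

# Both modules expose dumps() and loads(), so the calling code stays the same.
encoded = json_impl.dumps(record)
decoded = json_impl.loads(encoded)
```

Because both modules share the basic dumps and loads calls, the rest of the program does not need to know which one was picked.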

00:28:16.140 --> 00:28:17.160
Yeah, nice.

00:28:17.160 --> 00:28:19.560
Let's talk about extensibility a little bit.

00:28:19.560 --> 00:28:23.080
So there was a couple of things I asked about,

00:28:23.080 --> 00:28:25.520
and you're like, well, I'm not sure it's really in there, maybe.

00:28:25.520 --> 00:28:30.840
But you've written a pretty cool framework for extending TinyDB.

00:28:31.860 --> 00:28:36.600
And so some of the things you can extend are like you could write a custom storage engine.

00:28:36.600 --> 00:28:40.100
You can write custom middleware, custom table classes.

00:28:40.100 --> 00:28:49.080
And then there's a bunch of projects that people have written that then you can plug into TinyDB to make it nicer, faster, etc.

00:28:49.320 --> 00:28:58.820
Yeah, extensibility was something I first started with having support for multiple storages.

00:28:58.820 --> 00:29:03.780
So you could swap out the JSON storage and use memory-only storage.

00:29:03.780 --> 00:29:09.820
I decided to provide even more extensibility so you can write a middleware,

00:29:09.820 --> 00:29:15.100
which modifies the way TinyDB works with any kind of storage.

00:29:15.100 --> 00:29:24.040
And also there are quite a few extensions out there you can use to modify how TinyDB works overall.

00:29:24.040 --> 00:29:28.780
So there's a bunch of extensions, and one of them is TinyRecord.

00:29:28.780 --> 00:29:30.160
What's the story of TinyRecord?

00:29:30.160 --> 00:29:39.360
TinyRecord was basically written because TinyDB itself didn't have any real support for multi-threading.

00:29:39.920 --> 00:29:46.660
So with TinyRecord, you basically have some sort of transactions, and it also uses locking.

00:29:46.660 --> 00:29:49.660
So you can use TinyDB from multiple threads.

00:29:49.660 --> 00:29:57.060
And then you have an object where you can add a bunch of modifications you want to use.

00:29:57.060 --> 00:30:03.240
And yeah, basically it's transaction support for TinyDB, which is quite handy if you need multi-threading.

00:30:03.240 --> 00:30:05.200
It depends on your use case, of course.
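
The combination described here, an object that collects modifications and a lock that guards applying them, can be sketched with the standard library alone. The Transaction class and its insert method below are hypothetical names for illustration and are not TinyRecord's API.

```python
import threading


class Transaction:
    """Queue changes, then apply them together while holding a lock."""

    _lock = threading.Lock()  # shared by every transaction in this process

    def __init__(self, records):
        self.records = records  # the shared list of documents
        self.pending = []       # changes queued until the block exits

    def insert(self, doc):
        self.pending.append(doc)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: only one thread at a time may modify the records.
            with self._lock:
                self.records.extend(self.pending)
        # On an exception the queued changes are simply dropped.
        self.pending.clear()
        return False


records = []


def worker(n):
    with Transaction(records) as tr:
        tr.insert({"worker": n, "step": 1})
        tr.insert({"worker": n, "step": 2})


threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker's two records end up next to each other, because a whole batch is applied under the lock rather than one record at a time.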

00:30:05.560 --> 00:30:05.960
Yeah, of course.

00:30:05.960 --> 00:30:07.740
But if you need it, that is really cool.

00:30:07.740 --> 00:30:13.240
So another one that someone wrote, these are all separate projects on GitHub, by the way.

00:30:13.240 --> 00:30:16.060
Another one that somebody wrote is called TinyMongo.

00:30:16.060 --> 00:30:17.320
Yeah, TinyMongo.

00:30:17.320 --> 00:30:25.360
So does that like match the MongoDB API or something, but then actually store it as through TinyDB?

00:30:25.360 --> 00:30:26.680
Or what's the story with that?

00:30:26.680 --> 00:30:35.400
Yeah, TinyMongo, as far as I know, allows you to use an interface or an API which matches the MongoDB

00:30:35.400 --> 00:30:38.200
API, but instead uses TinyDB.

00:30:38.200 --> 00:30:45.800
I think one day I just got an issue on GitHub where someone said, hey, I have this extension

00:30:45.800 --> 00:30:48.980
for TinyDB so it can replace MongoDB.

00:30:48.980 --> 00:30:52.900
And I included it into the list of extensions.

00:30:52.900 --> 00:30:54.200
Yeah, that's really cool.

00:30:54.400 --> 00:30:59.500
So it does look like it more or less just replaces the PyMongo API.

00:30:59.500 --> 00:31:05.140
So you've got like insert and find one and it even has like the dollar operators.

00:31:05.140 --> 00:31:07.660
So like dollar set and so on.

00:31:07.660 --> 00:31:08.460
Yeah.

00:31:08.460 --> 00:31:15.260
So one of the things that comes to mind for this for me is because TinyDB stores its files,

00:31:15.260 --> 00:31:21.080
at least by default, in a simple JSON document and there's no setup for it.

00:31:21.080 --> 00:31:26.240
If you were doing unit testing and you needed something better than just simple mocking

00:31:26.240 --> 00:31:33.600
and stubbing out your test data against MongoDB, it might be cool to switch it to this and use

00:31:33.600 --> 00:31:39.740
this as a way to kind of have a slightly better test backend for a real Mongo system or something

00:31:39.740 --> 00:31:40.220
like that.

00:31:40.220 --> 00:31:46.720
If you don't want to start up a separate server just for testing, that might be a really cool

00:31:46.720 --> 00:31:48.140
solution for that problem.

00:31:48.140 --> 00:31:49.120
Yeah, nice.

00:31:49.120 --> 00:31:50.940
So then there's a couple of others.

00:31:50.940 --> 00:31:53.100
One is TinyDB serialization.

00:31:53.800 --> 00:31:59.880
And I noticed that when I was playing around with TinyDB that I, for example, put like a

00:31:59.880 --> 00:32:02.560
datetime in as part of my record.

00:32:02.560 --> 00:32:07.160
And of course, you can't, people may or may not know, you can't just go to the JSON module

00:32:07.160 --> 00:32:10.500
and go save this thing if it has a raw datetime in it.

00:32:10.500 --> 00:32:11.760
It just, it doesn't support it.

00:32:11.760 --> 00:32:16.280
And that's what happened when I tried to save my thing in TinyDB.

00:32:16.280 --> 00:32:18.680
So does this address that kind of problem and other stuff?

00:32:18.680 --> 00:32:19.280
Yeah.

00:32:19.440 --> 00:32:25.900
I even think that storing datetime objects was the reason why the extension came into

00:32:25.900 --> 00:32:26.420
existence.

00:32:26.420 --> 00:32:35.820
Because for TinyDB to store other objects than dicts or for the JSON module really to support

00:32:35.820 --> 00:32:39.280
it, you have to modify it yourself.

00:32:40.280 --> 00:32:45.900
So the extension allows you to register your own serialization code.

00:32:45.900 --> 00:32:50.320
So you can store all kinds of objects that you want to store.

00:32:50.320 --> 00:32:54.100
And it allows you to specify how exactly it will be stored.

00:32:54.100 --> 00:32:54.640
I see.

00:32:54.640 --> 00:32:58.700
So I can, I can basically tell it like, hey, if you see a datetime, serialize it like this.

00:32:58.700 --> 00:33:03.240
And if you see some other type that it doesn't necessarily know about, store it like that.
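
The underlying problem and the "if you see a datetime, serialize it like this" idea can be shown with the standard library json module on its own. The encode and decode helpers and the "__datetime__" marker key are assumptions made for this sketch; they are not the serialization extension's API.

```python
import json
from datetime import datetime


def encode(obj):
    # json.dumps calls this only for objects it cannot serialize itself.
    if isinstance(obj, datetime):
        return {"__datetime__": obj.isoformat()}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def decode(dct):
    # json.loads calls this for every decoded JSON object (dict).
    if "__datetime__" in dct:
        return datetime.fromisoformat(dct["__datetime__"])
    return dct


record = {"name": "John", "created": datetime(2016, 10, 13, 9, 30)}

# Without a custom hook, the standard library refuses a raw datetime.
try:
    json.dumps(record)
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

text = json.dumps(record, default=encode)
restored = json.loads(text, object_hook=decode)
```

Registering one encode and decode pair per type is the general shape of what is described in the conversation: you decide how each unsupported type is written and how it is read back.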

00:33:03.240 --> 00:33:04.060
Okay.

00:33:04.060 --> 00:33:04.340
Yeah.

00:33:04.800 --> 00:33:05.080
Yeah.

00:33:05.080 --> 00:33:05.680
Very nice.

00:33:05.680 --> 00:33:08.640
What about TinyDB SmartCache?

00:33:08.640 --> 00:33:10.660
It's also a really interesting project.

00:33:10.660 --> 00:33:16.780
I think it was created after a pull request or actually, yeah.

00:33:16.780 --> 00:33:24.620
So there was an addition to TinyDB because as I said before, searching can get really slow

00:33:24.620 --> 00:33:26.680
if you have a lot of objects.

00:33:27.260 --> 00:33:34.000
So what TinyDB does is that as long as you don't modify the database, it stores the results

00:33:34.000 --> 00:33:35.300
of your search queries.

00:33:35.300 --> 00:33:40.140
So it doesn't have to redo the work if you didn't change anything at all.

00:33:40.140 --> 00:33:43.440
So really handy in some cases.

00:33:43.440 --> 00:33:49.520
So if you do like a query and you give it some parameters and it comes back with 20 records,

00:33:49.520 --> 00:33:54.700
the SmartCache will say, if I see this query again, just give them those 20 records and more

00:33:54.700 --> 00:33:55.000
or less.

00:33:55.000 --> 00:33:58.400
Actually, TinyDB itself will already do that caching for you.

00:33:58.400 --> 00:34:06.980
So what SmartCache does is basically take it one step further: if you run a search query,

00:34:06.980 --> 00:34:09.180
it stores the result.

00:34:09.180 --> 00:34:15.240
And then if you do some updates on the database, it doesn't throw away the search results, but

00:34:15.240 --> 00:34:21.280
instead replaces them or updates the cache with the new results.

00:34:21.280 --> 00:34:27.460
So if you have a query which matches some elements and if you insert a new element which also

00:34:27.460 --> 00:34:30.880
matches that query, it will go directly to the cache.

00:34:30.880 --> 00:34:37.760
So it doesn't have to redo the whole searching, which TinyDB itself would have to do.

00:34:38.440 --> 00:34:47.220
But of course, it's also a trade-off because it uses more memory, because it has to store more results and also do more.

00:34:47.220 --> 00:34:57.000
There is some overhead on every insert and update because it has to check every query if it matches and update the cache in place.

00:34:57.840 --> 00:35:13.400
Yeah, but in some cases it can really be a really handy thing to use if you have a lot of updates and deletes in your code and you don't want to redo or to process all the entries in your database for every search.

00:35:13.400 --> 00:35:16.780
That might be an extension you want to use.
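
The trade-off described here, keeping cached results current on every insert instead of throwing them away, can be sketched as follows. CachedTable and is_adult are hypothetical names for this illustration and do not reflect how the SmartCache extension is actually written.

```python
class CachedTable:
    """Toy table that keeps cached query results up to date on insert."""

    def __init__(self):
        self.docs = []
        self.cache = {}   # predicate function -> list of matching documents
        self.scans = 0    # how many full scans of self.docs were needed

    def search(self, predicate):
        if predicate not in self.cache:
            # First time this query is seen: scan everything once.
            self.scans += 1
            self.cache[predicate] = [d for d in self.docs if predicate(d)]
        return list(self.cache[predicate])

    def insert(self, doc):
        self.docs.append(doc)
        # The overhead mentioned in the conversation: every cached query
        # is checked against the new document and updated in place.
        for predicate, results in self.cache.items():
            if predicate(doc):
                results.append(doc)


def is_adult(doc):
    return doc["age"] >= 18


table = CachedTable()
table.insert({"name": "John", "age": 30})
first = table.search(is_adult)

table.insert({"name": "Jane", "age": 41})   # matches the cached query
table.insert({"name": "Tim", "age": 9})     # does not match
second = table.search(is_adult)
```

The second search is answered from the cache without another full scan, at the cost of extra memory and extra work on each insert.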

00:35:16.780 --> 00:35:18.580
Okay, yeah, that sounds really cool.

00:35:18.580 --> 00:35:20.140
Very nice.

00:35:20.140 --> 00:35:23.880
So let's also talk a little bit about the extensibility.

00:35:23.880 --> 00:35:33.680
So you said there's three basic places that we could extend TinyDB, and one of them is writing a custom storage engine.

00:35:33.680 --> 00:35:40.360
So by default, it's just a single, your data is more or less stored in a single JSON file.

00:35:40.760 --> 00:35:44.260
And that's more or less the connection string when you create the database, right?

00:35:44.260 --> 00:35:45.100
You say, here's the file.

00:35:45.100 --> 00:35:49.520
Yeah, you just pass the file name and it uses it.

00:35:49.520 --> 00:35:50.340
Yeah, nice.

00:35:50.340 --> 00:35:53.040
Again, very SQLite-like.

00:35:53.040 --> 00:35:58.020
But then you said you can create alternate storage engines.

00:35:58.020 --> 00:35:59.780
And this got me thinking about some ideas.

00:35:59.780 --> 00:36:02.140
But there are some that you talked about already, right?

00:36:02.140 --> 00:36:09.700
Yeah, for example, there is a storage which uses or just stores the data in memory.

00:36:10.140 --> 00:36:14.080
The first storage I wrote even was for YAML.

00:36:14.080 --> 00:36:23.020
So you had these YAML files. I later switched to JSON because it was faster, because YAML can get really complex in some cases.

00:36:23.020 --> 00:36:29.080
If you get really fancy, you could even do some kind of HTTP stuff.

00:36:29.080 --> 00:36:33.020
But I don't think that there is an extension for that yet.

00:36:33.020 --> 00:36:36.260
But that would be something really interesting to try.

00:36:36.260 --> 00:36:37.260
Yeah, absolutely.

00:36:37.580 --> 00:36:52.160
And so when I saw this, I was thinking, well, okay, maybe there are certain things you can do to work around some of the shortcomings that you had talked about when you're like, maybe not so much from multiple processes, stuff like that.

00:36:52.160 --> 00:37:03.760
So, for example, MongoDB recently switched from storing binary JSON in a variety of files to something called WiredTiger, which is apparently much faster.

00:37:03.760 --> 00:37:09.060
And they support multiple processes from there through certain types of locking and so on.

00:37:09.100 --> 00:37:15.960
I was thinking, hmm, could you take one of these storage engines and somehow, you know, with some work, plug it into TinyDB?

00:37:15.960 --> 00:37:16.960
Yeah, that should be possible.

00:37:17.960 --> 00:37:29.400
Writing a custom storage is really easy because you just need to create a class with a constructor, which takes the parameters you need to create the storage.

00:37:29.400 --> 00:37:35.400
And then one method for reading and one method for writing the data to the storage.

00:37:35.880 --> 00:37:38.260
So you could do some locking there.

00:37:38.260 --> 00:37:44.260
I think that would be an interesting thing to do, locking on storage level.

00:37:44.260 --> 00:37:53.640
So basically, then it has to wait until the database file itself is locked or unlocked to return the data.

00:37:53.640 --> 00:37:55.280
That would be interesting.

00:37:55.660 --> 00:37:56.440
It would be interesting, right?

00:37:56.440 --> 00:38:01.160
And so, I mean, it definitely seems like you could unlock some potential there as well.

00:38:01.160 --> 00:38:02.940
Okay, cool.

00:38:02.940 --> 00:38:04.240
So custom storage engines.

00:38:04.240 --> 00:38:06.220
And you could even do crazy stuff, right?

00:38:06.220 --> 00:38:12.380
Like you could say, I'd like to use an S3 bucket as my storage location or those types of things, right?

00:38:12.380 --> 00:38:13.660
Whatever you're looking at.

00:38:13.660 --> 00:38:19.440
As long as you can read it and write it, it should be possible to use it as a storage.
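
The shape described here, a class with a constructor plus one method for reading and one for writing, might look like the sketch below. JSONFileStorage is a standalone illustration using only the standard library; a real TinyDB storage would also hook into the library's own storage base class, which this sketch leaves out.

```python
import json
import os
import tempfile


class JSONFileStorage:
    """Minimal read/write storage backed by one JSON file (illustrative only)."""

    def __init__(self, path):
        # The constructor takes whatever the storage needs; here, a file path.
        self.path = path

    def read(self):
        # Return the whole database state, or None if nothing was stored yet.
        if not os.path.exists(self.path):
            return None
        with open(self.path, "r", encoding="utf-8") as handle:
            return json.load(handle)

    def write(self, data):
        # Replace the stored state with the given data.
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(data, handle)


with tempfile.TemporaryDirectory() as folder:
    storage = JSONFileStorage(os.path.join(folder, "db.json"))
    before = storage.read()
    storage.write({"_default": {"1": {"name": "John"}}})
    after = storage.read()
```

Swapping the file operations for any other place you can read from and write to (memory, a remote bucket) would keep the same two-method shape.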

00:38:19.440 --> 00:38:20.220
Interesting.

00:38:20.220 --> 00:38:20.680
Okay.

00:38:20.680 --> 00:38:23.440
The other thing you talked about was custom table classes.

00:38:23.440 --> 00:38:24.700
What's that?

00:38:24.940 --> 00:38:32.740
Yeah, custom table classes mainly came into existence to modify the way the database implementation works.

00:38:32.740 --> 00:38:39.900
So basically, in other cases you would have to override or use a subclass.

00:38:39.900 --> 00:38:47.180
And so with custom table classes, you basically what you do is have a subclass of TinyDB.

00:38:47.180 --> 00:38:51.180
And then you can really do a lot of modifications.

00:38:51.180 --> 00:39:04.520
For example, the smart cache extension uses a custom table class to provide or to intercept every insert, update, and delete, and so on to add its own logic behind it.

00:39:04.520 --> 00:39:05.000
Okay.

00:39:05.000 --> 00:39:10.640
Could you use something like a custom table class to say, like, this field is required?

00:39:10.640 --> 00:39:16.920
Or this one must match like some kind of constraint, like it must be an email address or something like that?

00:39:16.920 --> 00:39:20.560
And then like not let it insert if you tried to save the wrong one?

00:39:20.800 --> 00:39:22.680
Yeah, that's definitely an interesting idea.

00:39:22.680 --> 00:39:22.720
Yeah.

00:39:22.720 --> 00:39:33.560
That would be a really interesting extension for TinyDB, validation, because you can intercept all the inserts and updates.

00:39:33.560 --> 00:39:38.480
And then you could check if it matches and raise an exception if it's not valid.
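
The validation idea floated here, intercepting inserts and raising an exception when a document is not valid, could be sketched like this. ValidatedTable, ValidationError and looks_like_email are hypothetical names; no such extension is claimed to exist, and the email check is deliberately naive.

```python
class ValidationError(ValueError):
    """Raised when a document breaks one of the table's rules."""


class ValidatedTable:
    """Toy table that checks every document before storing it."""

    def __init__(self, required, checks=None):
        self.required = set(required)   # field names that must be present
        self.checks = checks or {}      # field name -> function returning bool
        self.docs = []

    def insert(self, doc):
        missing = self.required - set(doc)
        if missing:
            raise ValidationError(f"missing fields: {sorted(missing)}")
        for field, check in self.checks.items():
            if field in doc and not check(doc[field]):
                raise ValidationError(f"invalid value for {field!r}")
        self.docs.append(doc)


def looks_like_email(value):
    # Very rough test: one "@" part followed by a domain containing a dot.
    return isinstance(value, str) and "@" in value and "." in value.split("@")[-1]


table = ValidatedTable(required=["name", "email"],
                       checks={"email": looks_like_email})
table.insert({"name": "John", "email": "john@example.com"})

try:
    table.insert({"name": "Jane", "email": "not-an-email"})
    rejected = False
except ValidationError:
    rejected = True
```

The invalid document never reaches the stored list, which is the "don't let it insert" behavior asked about in the question.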

00:39:38.820 --> 00:39:38.940
Yeah.

00:39:38.940 --> 00:39:40.560
Yeah, that's really nice.

00:39:40.560 --> 00:39:44.840
The other way, I guess, that we haven't really spoken about yet is middleware.

00:39:44.840 --> 00:39:49.200
So what kind of things can you do with regard to extensibility and middleware?

00:39:49.200 --> 00:39:49.780
Yeah.

00:39:49.780 --> 00:39:55.280
Middlewares are the third way to do modifications or to modify the way TinyDB works.

00:39:55.280 --> 00:40:00.420
Basically, it acts as a layer between TinyDB and the storage you pass to it.

00:40:00.880 --> 00:40:04.040
So you can do some interesting things over there.

00:40:04.040 --> 00:40:14.820
For example, in TinyDB, there is a caching middleware, which provides a way so it doesn't have to hit the file system on every read and write.

00:40:14.820 --> 00:40:17.580
Only every couple of reads and writes.

00:40:17.580 --> 00:40:20.500
So you don't have that overhead.

00:40:20.500 --> 00:40:27.060
And yeah, so you can do some modifications there, different ones from a custom storage.

00:40:27.060 --> 00:40:34.460
I think the main way in which middlewares are interesting is that you can use multiple of them at the same time.

00:40:34.460 --> 00:40:40.680
You can use it with any kind of storage behind it and do some modifications over there.
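
The layering described here, a middleware sitting between the database and whatever storage is behind it and only touching the storage every few writes, can be sketched as below. WriteCache, CountingStorage and the flush_every parameter are names invented for this example; they are not TinyDB's caching middleware.

```python
class CountingStorage:
    """Stand-in storage that counts how often it is really written to."""

    def __init__(self):
        self.data = None
        self.writes = 0

    def read(self):
        return self.data

    def write(self, data):
        self.data = data
        self.writes += 1


class WriteCache:
    """Wraps any object with read()/write() and batches the writes."""

    def __init__(self, storage, flush_every=3):
        self.storage = storage
        self.flush_every = flush_every
        self.cache = None
        self.pending = 0   # writes not yet passed on to the storage

    def read(self):
        # Serve reads from memory once the data has been loaded.
        if self.cache is None:
            self.cache = self.storage.read()
        return self.cache

    def write(self, data):
        self.cache = data
        self.pending += 1
        if self.pending >= self.flush_every:
            self.flush()

    def flush(self):
        # Push the latest state down to the wrapped storage, if needed.
        if self.pending:
            self.storage.write(self.cache)
            self.pending = 0


backend = CountingStorage()
db = WriteCache(backend, flush_every=3)
for n in range(7):
    db.write({"count": n})

writes_before_flush = backend.writes
db.flush()
```

Because the wrapper only relies on read and write, it works with any storage behind it, and several such wrappers could be stacked.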

00:40:40.680 --> 00:40:41.380
Yeah.

00:40:41.380 --> 00:40:41.980
For example.

00:40:41.980 --> 00:40:42.520
Yeah.

00:40:42.520 --> 00:40:43.020
Okay.

00:40:43.020 --> 00:40:44.140
Very cool.

00:40:44.140 --> 00:40:46.120
So what's the future of TinyDB?

00:40:46.120 --> 00:40:49.560
Have you got anything you're adding or anything like that?

00:40:49.560 --> 00:40:52.320
I don't really have any big plans for TinyDB.

00:40:52.320 --> 00:40:59.560
I think right now I'm quite happy with how the project works and how the API is looking like.

00:40:59.560 --> 00:41:03.600
There might be some renaming of methods.

00:41:03.600 --> 00:41:07.280
So the intent of the method becomes clearer.

00:41:07.280 --> 00:41:13.740
But apart from that, right now I don't have really any big plans to change how it works.

00:41:13.740 --> 00:41:16.820
Because, I mean, the project itself is quite small.

00:41:16.820 --> 00:41:23.440
So there's not much I would want to add into the TinyDB core itself.

00:41:23.440 --> 00:41:28.920
But of course, there's still room for a lot of extensions to write.

00:41:28.920 --> 00:41:30.020
Yeah, that sounds great.

00:41:30.020 --> 00:41:36.420
Are you looking for people to create extensions or suggest changes or basically open source contributors?

00:41:36.420 --> 00:41:43.420
Yeah, I think writing extensions for TinyDB can lead to some really interesting projects.

00:41:43.420 --> 00:41:46.420
As we already talked about a couple of extensions.

00:41:46.420 --> 00:41:51.220
And apart from that, I'm always glad if people help improve the documentation.

00:41:51.220 --> 00:41:55.000
Because I'm not a native English speaker.

00:41:55.000 --> 00:41:59.340
So there might be a few rough places in the documentation.

00:41:59.340 --> 00:42:07.300
So if people think that the wording might need some improvement, I'm definitely happy to accept that pull request.

00:42:07.300 --> 00:42:12.280
And also, if you write an extension, just open an issue on the project on GitHub.

00:42:12.280 --> 00:42:14.140
And I will add it to the list.

00:42:14.140 --> 00:42:15.060
Okay, excellent.

00:42:15.060 --> 00:42:17.680
Well, I think it's a very cool project.

00:42:17.680 --> 00:42:21.860
I would love to see a nice, robust, embedded document database.

00:42:21.860 --> 00:42:25.420
So thanks for taking us a little closer to that world.

00:42:25.420 --> 00:42:26.080
Yeah.

00:42:26.080 --> 00:42:26.500
Awesome.

00:42:26.500 --> 00:42:30.980
All right, so before I let you go, I have two quick questions for you, as always.

00:42:30.980 --> 00:42:34.480
So first of all, you're welcome to name your own stuff if you want.

00:42:34.480 --> 00:42:36.780
What's your favorite PyPI package?

00:42:36.780 --> 00:42:40.440
I just noticed we now have over 90,000 PyPI packages.

00:42:40.440 --> 00:42:44.040
And so you must have some exposure to a few that are pretty interesting.

00:42:44.040 --> 00:42:51.800
Yeah, I think the package I would recommend is the one we already talked about, which is uJSON.

00:42:52.060 --> 00:42:56.160
Because in many cases, it can improve performance by a lot.

00:42:56.160 --> 00:43:06.320
Without, if you don't need to customize the way the JSON parser of Python works, then using uJSON can be really cool.

00:43:06.700 --> 00:43:07.400
Yeah, that's great.

00:43:07.400 --> 00:43:11.420
And so many people are processing JSON in one way or another these days.

00:43:11.420 --> 00:43:13.080
So it's very broadly applicable.

00:43:13.080 --> 00:43:14.060
A good choice.

00:43:14.060 --> 00:43:17.340
I'll throw in TinyDB for you as well.

00:43:17.340 --> 00:43:18.540
So pip install TinyDB.

00:43:18.540 --> 00:43:20.700
And how about your editor?

00:43:20.700 --> 00:43:22.500
If you're going to write some Python code, what do you open up?

00:43:22.760 --> 00:43:24.560
It depends on the project, really.

00:43:24.560 --> 00:43:27.960
For small projects, I probably use Sublime Text.

00:43:27.960 --> 00:43:35.300
But if it's more than two or three files, I probably fire up PyCharm and use it.

00:43:35.300 --> 00:43:35.620
Okay.

00:43:35.620 --> 00:43:40.540
Also because, yeah, that's an interesting development in Python with gradual typing.

00:43:40.540 --> 00:43:41.420
With type hints.

00:43:41.420 --> 00:43:41.900
Yeah.

00:43:41.900 --> 00:43:48.780
And PyCharm, as far as I know, already supports them to some extent.

00:43:49.340 --> 00:43:52.440
So it's really handy for that as well.

00:43:52.440 --> 00:43:53.400
Yeah, that's really cool.

00:43:53.400 --> 00:43:59.760
If I'm trying to understand some code and I'm like, oh, these three things that are coming in are these types.

00:43:59.760 --> 00:44:04.280
If you tell it, it'll give you a lot more assistance trying to understand something.

00:44:04.280 --> 00:44:07.560
If you're jumping into something you don't totally know the API for, that's cool.

00:44:07.560 --> 00:44:08.520
All right.

00:44:08.520 --> 00:44:08.700
Yeah.

00:44:08.700 --> 00:44:09.060
Awesome.

00:44:09.060 --> 00:44:10.200
PyCharm is a good one.

00:44:10.200 --> 00:44:11.000
So Sublime Text.

00:44:11.000 --> 00:44:11.820
All right.

00:44:11.820 --> 00:44:12.580
Final call to action.

00:44:12.580 --> 00:44:14.260
People should get out there.

00:44:14.260 --> 00:44:15.320
Write some extensions.

00:44:15.320 --> 00:44:15.960
What do you think?

00:44:15.960 --> 00:44:17.060
Yeah, definitely.

00:44:17.060 --> 00:44:18.280
Write some extensions.

00:44:19.480 --> 00:44:20.480
Yeah.

00:44:20.480 --> 00:44:21.480
And if you're looking at the API, discuss the documentation.

00:44:21.480 --> 00:44:33.560
I'm always happy to hear feedback.

00:44:33.560 --> 00:44:45.520
And also if you find some difficulties or even just difficulties to get how to get started with TinyDB, I think that's worthy of an issue on GitHub.

00:44:45.520 --> 00:44:53.980
So other people can improve if you or other people can have an easier way to get started if you run into problems.

00:44:53.980 --> 00:44:58.960
Yeah, just open an issue on GitHub and we can discuss things over there.

00:44:58.960 --> 00:44:59.760
All right.

00:44:59.760 --> 00:45:00.860
That sounds great.

00:45:00.860 --> 00:45:03.840
So, Marcus, thanks so much for being on the show.

00:45:03.840 --> 00:45:04.240
Yeah.

00:45:04.240 --> 00:45:05.100
Thank you for having me.

00:45:05.100 --> 00:45:05.380
Yeah.

00:45:05.380 --> 00:45:05.600
Bye.

00:45:06.520 --> 00:45:10.200
This has been another episode of Talk Python To Me.

00:45:10.200 --> 00:45:12.760
Today's guest has been Marcus Siemens.

00:45:12.760 --> 00:45:14.980
And this episode has been sponsored by Hired.

00:45:14.980 --> 00:45:16.640
Thank you for supporting the show.

00:45:16.640 --> 00:45:19.500
Hired wants to help you find your next big thing.

00:45:19.500 --> 00:45:27.980
Visit Hired.com slash Talk Python To Me to get five or more offers with salary and equity presented right up front and a special listener signing bonus of $2,000.

00:45:28.800 --> 00:45:30.780
Are you or a colleague trying to learn Python?

00:45:30.780 --> 00:45:35.480
Have you tried books and videos that just left you bored by covering topics point by point?

00:45:35.480 --> 00:45:44.100
Well, check out my online course, Python Jump Start by Building 10 Apps, at talkpython.fm/course to experience a more engaging way to learn Python.

00:45:44.660 --> 00:45:51.380
And if you're looking for something a little more advanced, try my Write Pythonic code course at talkpython.fm/pythonic.

00:45:51.380 --> 00:45:57.860
You can find the links from this episode at talkpython.fm/episodes/show/80.

00:45:57.860 --> 00:46:00.080
Be sure to subscribe to the show.

00:46:00.080 --> 00:46:02.280
Open your favorite podcatcher and search for Python.

00:46:02.280 --> 00:46:03.520
We should be right at the top.

00:46:03.520 --> 00:46:12.820
You can also find the iTunes feed at /itunes, Google Play feed at /play, and direct RSS feed at /rss on talkpython.fm.

00:46:13.220 --> 00:46:17.920
Our theme music is Developers, Developers, Developers by Corey Smith, who goes by Smix.

00:46:17.920 --> 00:46:24.620
Corey just recently started selling his tracks on iTunes, so I recommend you check it out at talkpython.fm/music.

00:46:24.620 --> 00:46:29.960
You can browse his tracks he has for sale on iTunes and listen to the full-length version of the theme song.

00:46:29.960 --> 00:46:32.060
This is your host, Michael Kennedy.

00:46:32.060 --> 00:46:33.340
Thanks so much for listening.

00:46:33.340 --> 00:46:34.520
I really appreciate it.

00:46:34.520 --> 00:46:36.660
Smix, let's get out of here.


