#420: Database Consistency & Isolation for Python Devs Transcript

Recorded on Wednesday, Jun 7, 2023.

00:00 When you use a SQL database like Postgres, you have to understand the subtleties of isolation

00:05 levels from read committed to serializable. And distributed databases such as MongoDB offer a

00:11 range of consistency levels from eventually consistent to linearizable and many options

00:17 in between. Plus, it's easy enough to confuse isolation with consistency. To break it all down

00:23 for us, we have A. Jesse Jiryu Davis from MongoDB back on the podcast. This is Talk Python to Me,

00:29 episode 420, recorded June 7th, 2023.

00:33 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy. Follow

00:52 me on Mastodon, where I'm @mkennedy and follow the podcast using @talkpython, both on fosstodon.org.

00:59 Be careful with impersonating accounts on other instances. There are many. Keep up with the show

01:03 and listen to over seven years of past episodes at talkpython.fm. We've started streaming most of

01:10 our episodes live on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube

01:15 to get notified about upcoming shows and be part of that episode. This episode is sponsored by

01:22 Sentry. Don't let those errors go unnoticed. Use Sentry. Get started today at talkpython.fm slash

01:28 Sentry. And it's brought to you by InfluxDB. InfluxDB is a database purpose built for handling

01:34 time series data at a massive scale for real-time analytics. Try it for free at talkpython.fm

01:40 slash InfluxDB. Hey, Jesse. Hey, Michael. Great to have you here on the show. Welcome back to Talk Python to Me.

01:47 Thanks a lot. Yeah. It has been a little while since you were on the show. You were the second

01:53 guest ever. How about that? How cool is that? That's really cool. I knew it was a while ago. I

01:59 think it was 2015. And we were talking about Python and MongoDB, which is a natural subject, but I

02:06 didn't know it was so early in your career. Yeah, it was. You really helped launch the podcast. So

02:10 thanks for that. And then you also did a really popular episode about writing an excellent programming

02:16 blog. And we talked about fun things like design patterns of writing for technical writing. That was

02:23 really well received. So yeah, excellent to have you back. We're going to make it a little bit

02:28 more modern than just, you know, five, six years ago, whatever that was. Cool. So what have you been

02:33 up to? Give us maybe a quick intro for people who don't know you and catch up on what you've been up

02:38 to since then. You and I met because when I joined MongoDB around 2011, I was the Python evangelist,

02:47 which is still my favorite job title of all time. I'm still at MongoDB. And I've been doing various

02:55 sorts of engineering the whole time. I switched over to doing C and C++ and moved from doing Python

03:04 client library work. I had been working on PyMongo to working on the core MongoDB server and helped

03:12 develop the first version of serverless MongoDB, which is pay as you go. And now I'm a researcher

03:21 with MongoDB Labs, which is our tiny little research organization. And I'm looking at new products,

03:30 cutting edge techniques that we might want to adopt at MongoDB. All of those things sound super awesome.

03:36 Maybe take them in order if I remember them right. So you worked on PyMongo, which is if people use

03:42 MongoDB at all, they basically either use PyMongo or Motor, right? And did you also work on Motor? You did,

03:49 right? Yeah, I invented Motor and came up with the cute name. At the time, Tornado was one of the major

04:00 asynchronous Python servers. Very ahead of its time in that sense. That's way before async and await and

04:06 asyncio and all those things. Yeah, that's right. It was extremely influential. And I wanted PyMongo to

04:14 work well with Tornado. So I came up with this very complicated way to sort of asynchronize PyMongo

04:21 and make it work with Tornado. And I combined Mongo with Tornado to come up with Motor.

04:29 And it's still maintained, but not by me. And it's a good choice. If you want that asynchronous API,

04:37 it now supports async and await. And it now works with asyncio as well as Tornado. So if you have

04:44 some reason for wanting to do async Python already, and then you want to connect to MongoDB without fear

04:49 of blocking your application, then Motor is the driver of choice.

04:55 Yeah, absolutely. And much like if you're using, if you're doing synchronous MongoDB stuff in Python,

05:00 chances are using PyMongo. If you're doing async stuff, you're probably using Motor. For example,

05:04 the website you're looking at here is backed by MongoDB. That is Talk Python. And it uses Beanie,

05:11 which does basically Pydantic and async and await plus MongoDB. But the way you work with it is you

05:19 create a Motor connection, a Motor client, and hand it off to the underlying framework. So really,

05:23 it's still using your code. That's cool.

05:26 That's neat.

05:27 Yeah. Pretty neat. So I imagine those are two really different worlds, building client libraries to talk

05:35 to some semi-black box type of system, like I send requests over the API to Mongo and it does its thing

05:42 and I get a response to switching and being inside that box. Maybe contrast those two worlds,

05:49 because I think they're probably pretty different.

05:51 Yeah. The problem spaces are practically disjoint. When I was working on MongoDB drivers,

06:01 I had a great deal of concern for making the API usable by application developers. A lot of my time

06:08 was spent figuring out how to make a consistent experience for people who were using MongoDB from

06:16 Python and also from JavaScript or C or PHP. These are completely different kinds of languages,

06:23 but you need as much consistency as possible while still respecting the style of the language itself.

06:30 And then of course, since these are semantically versioned libraries, almost every decision you make is permanent.

06:37 On the server side, on the other hand, I was mainly concerned with how to implement algorithms that solved tricky problems.

06:46 And so we could change our minds every few years with sort of upgrade downgrade logic. It's complicated, but it's not permanent in the way that an API is. A lot of the problems that it was handling on the server side, first of all, I was working with C++ in a half million line code base.

07:04 So that was a great deal more complexity than I'd ever confronted before.

07:10 And it's probably really super polished, and every little change probably has, you know, many, many knock-on effects that you've got to carefully think about. And you're like, ah, do we really need to check for that? Would this ever happen? Or can we reorder those bytes? Probably it's fine.

07:27 There's a lot of hidden complexity that people who've been working on the server code base longer than I kept pointing out to me. Like, no, you can't just change this data structure. You have to take the following six locks before you can even think about touching that. Then interactions among the servers in a replica set or a sharded cluster are literally exponentially complex.

07:52 Yeah. Like n factorial type of thing.

07:54 Right.

07:54 Okay.

07:56 Okay. And C++ versus Python. That's a pretty big, pretty big distinction there.

08:02 Yeah. I had coded C++ right out of college. I thought I was going to be a 3D graphics guy working for Pixar, which never happened, but I had known C++ once upon a time. But C++ in the nineties is a completely different language from modern C++. So I had a lot of catching up to do. On the other hand, I really enjoyed the fact that you can

08:26 make things run fast. And I hope that this is not offensive.

08:32 In Python, you can make algorithms more efficient, but you can't really make your code run all that fast. And I found it really enjoyable to write C++ and have things finish in microseconds.

08:48 Right. And you're about as fast as it gets, unless you're going to go do assembler.

08:53 And maybe not.

08:54 Maybe you can make it a compiler.

08:55 And probably not. Certainly not for me.

08:56 Yeah, exactly.

08:57 Yeah. And who would want to write assembler, right? Especially. Oh my gosh.

09:02 Yeah. I think the story with Python performance is interesting. A lot of times it's plenty fast for what people need to do. But if you're building a server, like a high-end database server, you know, those microseconds count. And, you know, that's a different world, right? It's a different trade-off. Trading somewhat developer speed for performance and code speed.

09:22 Although I do think we're getting, you know, some proper attention on Python speed in the last couple of years and will for the next couple as well. With, you know, the faster CPython initiative and 3.10, 3.11, 3.12, all that stuff. But it's still, even the fast versions of those are not, you know, C++ type of speed.

09:42 Yeah. It's a fundamentally different way of executing code and they're never really going to overlap.

09:49 Maybe someday we'll get fully compiled Python. Who knows what the future holds, but until then. As long as it's interpreted, probably not. All right. The third thing you mentioned, which sounds interesting as well, is MongoDB Labs.

10:02 Can you give us an example of some of the things that have come out of there or some of the types of problems you're researching, anything like that? How secret is the lab? Is it like the Skunk Works at Lockheed Martin? Or can you talk a little about it?

10:15 I can certainly talk about it. It's small and fairly new, a couple of years old, about a dozen people working on a number of things. One of them is streaming data processing, which I think we'll be able to announce quite a bit more.

10:31 Is this like a high speed time series data? Like I'm hooked up to some kind of pipe to the NASDAQ or something like this? Or what's an example?

10:40 Right. Where you've got a source of data events, not necessarily stored in any database anywhere, but coming in as a continuous stream of events.

10:52 And you want to connect stream processors in some sort of kind of network of pipes and nodes and eventually drop the results into MongoDB or another data store or send it off to another service.

11:08 MongoDB Labs has been a place where we can incubate some of those ideas and make them available in our developer data platform.

11:20 Sounds like a really cool, cool place to work. Just sort of playing with ideas and got the time and space to do that, right?

11:26 Yeah. Labs has also been a place where we incubate cryptography ideas, like queryable encryption, where MongoDB doesn't know the contents of your data, but can nevertheless answer queries about it.

11:41 I personally have been working on improving the debugging experience for people who are writing complex aggregation pipelines.

11:51 And then what I'm working on right now is predictive scaling for Atlas.

11:57 The idea is that a lot of customers have really regular weekly business cycles or daily business cycles.

12:04 Right.

12:04 Like you might have Monday through Friday, traffic gradually increases from around 9:00 or 10:00 AM, and then it drops off at the end of the day.

12:13 And then you have a huge spike at midnight when all of your nightly analytics queries go off and then the weekend is quiet.

12:18 We should be able to detect those patterns and automatically scale you up and down so that you have the capacity you need just before you need it.

12:30 And then you don't pay for it at a time when you predictably do not need it.

12:34 Actual deployment of that idea? I have no idea how far off that is, but that's what's nice about labs is that we are working on things that the larger company doesn't have scheduled yet.
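
Editor's sketch: the predictive scaling idea described above can be illustrated with a deliberately naive model that averages past load for each hour of the week and picks a capacity tier ahead of time. This is not how Atlas does it (the transcript does not say); the function names, the tier sizes, and the sample history are all assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean

HOURS_PER_WEEK = 24 * 7


def hourly_profile(samples):
    """Average the observed load for each hour-of-week slot.

    samples: iterable of (hour_index, load), where hour_index counts hours
    from some Monday at 00:00 (an assumption of this sketch).
    """
    buckets = defaultdict(list)
    for hour_index, load in samples:
        buckets[hour_index % HOURS_PER_WEEK].append(load)
    return {slot: mean(loads) for slot, loads in buckets.items()}


def predict_load(profile, hour_index, default=0.0):
    """Predict load for a future hour from the same slot in past weeks."""
    return profile.get(hour_index % HOURS_PER_WEEK, default)


def choose_capacity(predicted_load, tiers):
    """Pick the smallest tier that covers the prediction, else the largest."""
    for capacity in sorted(tiers):
        if capacity >= predicted_load:
            return capacity
    return max(tiers)


# Two weeks of hypothetical history: Monday 09:00 is busy, Monday 03:00 is quiet.
history = [(9, 80.0), (9 + 168, 100.0), (3, 10.0), (3 + 168, 14.0)]
profile = hourly_profile(history)

next_monday_9am = 9 + 2 * HOURS_PER_WEEK
forecast = predict_load(profile, next_monday_9am)
tier = choose_capacity(forecast, tiers=[16, 64, 128])
```

A real system would also need to handle trends, holidays, and prediction errors; the point here is only that a regular weekly cycle makes the next week's load guessable before it arrives.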

12:47 You can experiment, right? That's, I mean, it's a lab.

12:51 Exactly.

12:52 This portion of Talk Python to Me is brought to you by Sentry.

12:56 Sentry. You know that Sentry captures the errors that would otherwise go unnoticed.

13:01 Of course, they have incredible support for basically any Python framework.

13:05 They have direct integrations with Flask, Django, FastAPI, and even things like AWS Lambda and Celery.

13:12 But did you know they also have native integrations with mobile app frameworks?

13:17 Whether you're building an Android or iOS app or both, you can gain complete visibility into your application's correctness,

13:24 both on the mobile side and server side.

13:27 We just completely rewrote Talk Python's mobile apps for taking our courses.

13:32 And we massively benefited from having Sentry integration right from the start.

13:36 We used Flutter for our native mobile framework.

13:39 And with Sentry, it was literally just two lines of code to start capturing errors as soon as they happen.

13:45 Of course, we don't love errors, but we do love making our users happy.

13:49 Solving problems as soon as possible with Sentry on the mobile Flutter code and the Python server side code together

13:56 made understanding error reports a breeze.

13:58 So whether you're building Python server side apps or mobile apps or both,

14:04 give Sentry a try to get a complete view of your app's correctness.

14:08 Thank you to Sentry for sponsoring the show and helping us ship more reliable mobile apps to all of you.

14:14 Tell people about Atlas a little bit.

14:18 You talked about your predictive scaling in Atlas.

14:21 I imagine not everyone knows what that is.

14:23 Sure. Atlas is MongoDB's cloud services.

14:27 And we originally launched it as a database as a service.

14:33 So you decide how you want to deploy your database, a replica set or a sharded cluster,

14:39 which cloud providers you want to use.

14:42 We allow you to use multiple cloud providers and what size of server you need.

14:52 And so we would manage backups and administration and that sort of thing, upgrades and so on.

14:59 More recently, we've announced Atlas serverless, which is pay as you go.

15:04 So you no longer have to worry about how your database is deployed or what instance size you need. You don't

15:09 need to do capacity planning.

15:11 We just auto scale you and bill you for what you used.

15:15 We've also got a few other services, which I'm not an expert in, but we've got Atlas data federation,

15:21 which helps you move data among different services, both MongoDB services and other ones.

15:26 And those are kind of the highlights as far as I'm aware.

15:31 All right. I'll put a link to the GitHub organization for MongoDB Labs up there.

15:36 There's some cool looking repos and also some funny names like Cobra to Snooty.

15:41 We don't have to be professional over here. It's nice.

15:44 Exactly. You can just have fun with it.

15:47 Yeah. Excellent.

15:48 All right. Well, let's talk a little bit about databases in the broad sense,

15:53 and then we can dive into the core ideas that I invited you here, which I guess is worth pointing

15:59 out. The reason I knew about this and reached out to you was you gave a talk at PyCon 2023.

16:07 Maybe tell us a bit about that experience before we jump into the databases.

16:11 It's really nice to have PyCon back after COVID. I went to PyCon at Salt Lake City last year and this year.

16:18 And last year, I spoke about modern concurrency patterns in Python.

16:24 And this year, I talked about consistency and isolation.

16:28 And I also learned a lot after being a C++ programmer and going to PyCon and being not sure what I was

16:34 doing at PyCon anymore. This year, I came as a researcher. And so, the areas of my interests are

16:43 essentially everything. And so, I went to a bunch of talks and learned a bunch of stuff and renewed my

16:50 love for being at PyCon.

16:52 Oh, that's excellent.

16:53 I wanted to talk about consistency and isolation at PyCon because these are fundamental database

17:01 concepts. They are some of the hardest to learn that I've ever encountered in computer science.

17:08 And I kind of think that most of the approaches are bad. You can read the fundamental papers and you

17:16 probably should, but reading the original papers is a very hard way to learn something. And they tend to be

17:23 too concise and too abstract and not very well digested. And then maybe if I'd taken that

17:29 databases elective in college and read the textbook, I would be in better shape, but I didn't. And then I

17:37 joined a database company and I had to learn it on the job. So, I came up with a few ways of thinking about

17:45 learning about it, which took me a few years. So, I wanted to come to PyCon and share those and

17:51 hopefully accelerate other people's learning.

17:54 Yeah, I imagine everything about a conference like that is really different if you come with a

17:59 researcher's mindset. You go hit all the expo booths and you're like, "All right, I need your ideas.

18:05 Tell me all about this with a special focus." Right?

18:10 So, all right.

18:14 So, we could start with just, you know, not every database is the same.

18:18 I think long ago, people, when they said database, they just meant relational database.

18:24 Mm-hmm.

18:25 Right? And nowadays, there's more variety. And by nowadays, I'm thinking the last 15 years,

18:31 right? It's not just today.

18:34 Yeah.

18:34 Let's just maybe get a, just a quick high-level landscape view of the different kinds of databases. So,

18:40 relational, that's probably what most people are using.

18:44 Mm-hmm.

18:45 Yeah. Maybe just give us a rundown of your thoughts on how this categorization goes. Taxonomy, I guess.

18:52 Yeah. Relational databases, which are made of tables of rows and columns,

18:58 and you almost always query them with SQL, which is standardized, although every database has its own

19:07 extensions, came to really dominate in the '90s. And they are great for a lot of reasons. And I think

19:17 everybody should know how to use them. But, you know, around the time that I joined MongoDB in

19:22 2011, something was happening, which is that the scale of data came to the point where you needed

19:29 specialized approaches. And then another thing that was going on was that, sort of funny enough,

19:35 object-oriented programming and relational databases had both come to dominate at the same time,

19:40 and they worked very, very poorly with each other.

19:43 Yeah, that is funny, but that did happen.

19:44 Yeah.

19:45 Because relational, I mean, a relation is sort of the opposite of an object. Object-oriented

19:51 programs are quite hierarchical and fairly flexible and relations are not hierarchical and extremely

19:58 inflexible. So, we call that the impedance mismatch. I don't know if that's actually a helpful term.

20:03 It's just, it's bad.

20:05 The object-relational impedance mismatch, if people are familiar with that term. I haven't thought of

20:09 that for a while, but yeah, that's, you know, that was a big concern often.

20:13 NoSQL was kind of a movement, and it encompassed a number of solutions to these problems.

20:23 One of them is that NoSQL databases tend to be distributed. So, before I came to MongoDB, I was

20:29 working with an Oracle database. And as load increased, we just needed to buy a bigger and

20:34 bigger single box until, you know, we had like a million dollar refrigerator-sized thing,

20:40 which must never ever go down.

20:41 Many of the NoSQL databases, including MongoDB, are distributed. So, you can use a large number of

20:50 smaller machines, which is much more economical.

20:52 Much more cloud-friendly.

20:54 Much more cloud-friendly. It's not only a good way to scale your CPU and RAM, but it's also a good

21:01 way to ensure reliability and geo distribution. So, we took advantage of all that. We kind of built

21:07 in the distributed nature of it quite early. And a lot of the other NoSQL databases did as well.

21:14 We're also a document database, where the data format is a lot like JSON, and it's easily convertible to JSON.

21:24 And that means that it's very familiar for people who write Python or JavaScript or anything like that.

21:30 We store things that are a lot like dictionaries and lists.

21:33 And they're...

21:35 Very API-friendly, right? Like you're exchanging JSON. So, you're 95% of the way there,

21:41 just on your API data exchange often.

21:43 Yeah, if you want to expose a MongoDB collection as something to a REST API, you can do that in a

21:48 couple of lines because JSON and MongoDB data are so similar.
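
Editor's sketch: to show why documents map so directly onto an API payload, here is a document written as plain Python dicts and lists and serialized with the standard `json` module. The `episode` document and the `to_api_response` helper are made up for illustration. Real MongoDB documents can also hold BSON types such as ObjectId and datetime that plain `json.dumps` does not accept, so this only covers the JSON-compatible case.

```python
import json

# A MongoDB-style document is essentially nested dicts and lists.
episode = {
    "number": 420,
    "title": "Database Consistency & Isolation for Python Devs",
    "topics": ["isolation", "consistency"],
    "recorded": {"year": 2023, "month": 6, "day": 7},
}


def to_api_response(document):
    """Serialize a JSON-compatible document as an API response body."""
    return json.dumps(document, sort_keys=True)


body = to_api_response(episode)

# The client gets back the same nested structure it would have stored.
round_tripped = json.loads(body)
```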

21:52 All right. So, document databases are probably the best known NoSQL, but we also have

21:56 key value stores and column-oriented databases. Still trying to grok them.

22:02 Yeah. Key value stores are just like big dicts in the sky.

22:08 And in exchange for that simplicity, they're usually extremely fast and extremely robust.

22:13 Memcached, for example. And column-oriented databases, I'm still wrapping my head around.

22:19 So, I'm not going to talk about that all that much. Yeah.

22:22 But it's my impression that they're good for giant analytics jobs where you tend to need to do a huge aggregation, like find the sum or the mean

22:31 of a very large amount of consistent data.

22:36 Maybe Pandas is a good mental model or NumPy or something like that, where you say,

22:42 I'm going to apply this operation to this whole column and you don't want it on a per user basis.

22:46 You want to say, I want all the latency times as a thing, right? And that's a more natural thing to ask

22:54 for instead of projecting out the latency across all of them, that kind of thing.
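
Editor's sketch: the difference between a row layout and a column layout can be shown with plain Python containers. The sample data and the `to_columns` helper are hypothetical; real column stores add compression and on-disk layouts that this ignores.

```python
# Row-oriented: one record per user request.
rows = [
    {"user": "ana", "latency_ms": 120, "status": 200},
    {"user": "ben", "latency_ms": 80, "status": 200},
    {"user": "cho", "latency_ms": 100, "status": 500},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: touch every record and project out one field.
row_mean = sum(record["latency_ms"] for record in rows) / len(rows)

# Column layout: the latencies are already one contiguous list.
latencies = columns["latency_ms"]
column_mean = sum(latencies) / len(latencies)
```

Both computations give the same answer; the column layout simply has the values for one field sitting together, which is what makes whole-column aggregations cheap.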

23:00 Graph databases, I think, seem to be going strong still.

23:04 And this is an area where MongoDB has kind of left it to our competitors for the moment.

23:11 But graph databases are great at representing nodes that are connected by edges. So, for example,

23:17 a social network of people with friendships or a network of servers that are connected by ethernet

23:24 cables. And you want to do queries like, how closely connected is this to that?

23:29 Maybe even modeling like hierarchies within a large corporate organization or something.

23:35 Right.

23:35 This person reports to that person and then, you know, those kinds of things.
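
Editor's sketch: a query like "how closely connected is this to that?" is a shortest-path question. Below is a minimal breadth-first search over an adjacency dict; the `friends` graph and the function name are invented for the example, and a graph database would answer this with its own query language and indexes.

```python
from collections import deque


def degrees_of_separation(graph, start, goal):
    """Return the number of edges on a shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return distance + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


# Hypothetical friendships: nodes are people, edges are mutual friendships.
friends = {
    "ana": {"ben"},
    "ben": {"ana", "cho"},
    "cho": {"ben", "dev"},
    "dev": {"cho"},
    "eli": set(),
}

ana_to_dev = degrees_of_separation(friends, "ana", "dev")
ana_to_eli = degrees_of_separation(friends, "ana", "eli")
```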

23:39 Yeah. Okay. So I think that sets the stage for a lot of what we're talking about. You know,

23:44 what ones for, in terms of consistency and isolation, this applies to relational document,

23:51 probably not key value stores. I don't know about graph databases at all in terms of these,

23:56 which ones of these are kind of relevant to the main topic here?

23:59 Actually, it's interesting. So first of all, let's separate out that isolation is a way of hiding concurrency.

24:07 So even on a single machine, a database can run concurrent operations. So you could have two

24:14 transactions that are going on at once and their operations may be in some way interleaved. And if the

24:21 database allows concurrency like this on a single machine, then that can reveal phenomena that you

24:31 would not observe if concurrency were not allowed. And these phenomena are called anomalies. These terms,

24:39 phenomena and anomalies, go all the way back, maybe to 1970s database theory. There's nothing

24:46 specific to relational or non-relational data to the SQL language or any other query language. So long as

24:54 the database allows concurrency, then anomalies are possible. And so the database may choose to provide

25:03 isolation levels to you. There are four isolation levels that people have probably heard of that are

25:09 in the SQL standard. And so obviously they've got that connection to the relational model and the SQL language.

25:17 But in my PyCon talk, I was just showing a key value store.

25:20 Yeah, that's right. You actually were kind of more or less writing the code for the different implementations.

25:27 You were demonstrating the code for the different implementations of how you might do a key value store.

25:33 And so what you did was you said, let's imagine we just have a giant dictionary in memory.

25:38 And that was our database, at least for a table or a collection. Right. And then like, how do we

25:42 model these isolation levels in Python code? Right. Yeah, right. Exactly. So there was no SQL involved.

25:49 The data was just a dict, but I was showing how you could use Python locks to provide each of the four

25:58 well-known isolation levels and allow concurrency. All right. So I interrupted you a tiny bit there.

26:03 What are the four isolation levels? At least the SQL standard ones. Right. So there's read uncommitted,

26:10 which is anything goes. YOLO. YOLO. Every operation that one transaction does is immediately visible to all

26:18 the others or may be visible to all the others, even if the transaction doesn't, hasn't committed yet,

26:24 even if it aborts later. Right. And so the concurrency is nakedly displayed to all of the clients.
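
Editor's sketch: read uncommitted can be illustrated with a dict that every transaction writes into directly. The class names are hypothetical and this is not the code from the PyCon talk; it only shows a dirty read, where one transaction sees a value that is later rolled back.

```python
_MISSING = object()  # marks "key did not exist before this transaction"


class ReadUncommittedStore:
    def __init__(self, initial=None):
        self.data = dict(initial or {})

    def begin(self):
        return ReadUncommittedTxn(self)


class ReadUncommittedTxn:
    def __init__(self, store):
        self.store = store
        self.undo = {}  # key -> value to restore if we abort

    def read(self, key):
        # Reads go straight to shared data, committed or not.
        return self.store.data.get(key)

    def write(self, key, value):
        if key not in self.undo:
            self.undo[key] = self.store.data.get(key, _MISSING)
        # The write is visible to everyone immediately.
        self.store.data[key] = value

    def commit(self):
        self.undo.clear()

    def abort(self):
        for key, old in self.undo.items():
            if old is _MISSING:
                self.store.data.pop(key, None)
            else:
                self.store.data[key] = old
        self.undo.clear()


store = ReadUncommittedStore({"balance": 100})
t1 = store.begin()
t2 = store.begin()

t1.write("balance", 0)             # t1 has not committed
dirty_read = t2.read("balance")    # t2 sees t1's uncommitted write
t1.abort()                         # t1 rolls back
after_abort = t2.read("balance")   # the value t2 saw never really existed
```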

26:32 Basically, you don't see this in practice. Read committed is what people are much more accustomed to,

26:38 where your transaction as it's going along may see the data change as other transactions commit,

26:46 but then its own writes are only visible to other transactions all in one instant at the moment that

26:54 your transaction commits. But of course, you could read the same value multiple times in a row and get

27:01 different answers because other transactions are allowed to commit and modify it as you go.
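
Editor's sketch: read committed can be modeled by buffering each transaction's writes privately and publishing them all at once under a lock at commit time. Class names are hypothetical. The demo shows both halves of the description above: uncommitted writes stay hidden, but a repeated read can return a different answer once another transaction commits.

```python
import threading


class ReadCommittedStore:
    def __init__(self, initial=None):
        self._data = dict(initial or {})
        self._lock = threading.Lock()

    def begin(self):
        return ReadCommittedTxn(self)


class ReadCommittedTxn:
    def __init__(self, store):
        self._store = store
        self._writes = {}  # private until commit

    def read(self, key):
        if key in self._writes:  # a transaction sees its own writes
            return self._writes[key]
        with self._store._lock:  # otherwise, the latest committed value
            return self._store._data.get(key)

    def write(self, key, value):
        self._writes[key] = value

    def commit(self):
        # All writes become visible in one instant.
        with self._store._lock:
            self._store._data.update(self._writes)
        self._writes = {}


store = ReadCommittedStore({"x": 1})
t1 = store.begin()
t2 = store.begin()

first_read = t1.read("x")
t2.write("x", 2)
read_before_commit = t1.read("x")  # t2's write is still private
t2.commit()
read_after_commit = t1.read("x")   # same key, different answer
```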

27:05 Right. I imagine that that's a pretty common level. The read uncommitted is just chaos, right?

27:11 It's like multi-threading without locks, basically. Yeah.

27:15 Without any locking mechanism or protection. While that would be the fastest, it's probably too risky.

27:22 But read committed, how common do you think that is?

27:25 Read committed is quite prominent. It's the default for a number of the SQL databases. I don't remember

27:30 which exactly, but people can live with it pretty happily. And to be honest, MongoDB's default is read

27:37 uncommitted. If you write to the primary, other clients can see those writes immediately. There's

27:45 no transaction by default. After all, you have to opt into transactions on MongoDB.

27:50 There are atomic changes you can make. Like you can use $set and $push, those kinds of operators,

27:57 on a single document. But as soon as you start talking to two documents in the same collection or

28:03 cross-collection, then this is what you're talking about. There's no transaction, right?

28:07 Yeah, that's exactly right. And the document model does make it much more practical to do things without

28:11 transactions because you can keep related data all together in a single document and update it in one

28:18 statement. Yeah. Whereas the relational model tends to kind of spray your data around and make it much more

28:23 difficult to maintain whatever application invariants you want because you have to modify multiple

28:31 rows at the same time. Yeah, multiple rows, multiple tables. Like there could be a many-to-many relationship that you're

28:37 adding to or taking away from, right? Just to update part of some statement.

28:43 Yeah, right.

28:44 This portion of Talk Python to Me is brought to you by Influx Data, the makers of InfluxDB.

28:51 InfluxDB is a database purpose-built for handling time series data at a massive scale for real-time analytics.

28:59 Developers can ingest, store, and analyze all types of time series data, metrics, events, and traces in a single platform.

29:06 So, dear listener, let me ask you a question. How would boundless cardinality and lightning-fast

29:11 SQL queries impact the way that you develop real-time applications?

29:14 InfluxDB processes large time series datasets and provides low-latency SQL queries, making it the go-to choice

29:22 for developers building real-time applications and seeking crucial insights. For developer efficiency,

29:28 InfluxDB helps you create IoT analytics and cloud applications using time-stamped data rapidly and at scale.

29:35 It's designed to ingest billions of data points in real-time with unlimited cardinality.

29:40 InfluxDB streamlines building once and deploying across various products and environments from the edge,

29:47 on-premise and to the cloud. Try it for free at talkpython.fm/influxDB.

29:53 The link is in your podcast player show notes. Thanks to Influx Data for supporting the show.

30:01 The next one is serializable, which takes a while to wrap your head around. But with serializable isolation, there is a total order of operations that every client sees that is as if each transaction ran one at a time.

30:22 And each one read at one moment of time and then committed at one moment of time with no other transactions' operations interleaved.

30:32 So it's hard to explain. It's also extremely intuitive. It's almost what you would assume.

30:37 It's kind of like assume that there was no concurrency. This is what it would look like.
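
Editor's sketch: the crudest way to get the "as if one at a time" behavior is to actually run transactions one at a time behind a single lock. The `SerialStore` class and `deposit` helper are hypothetical; real databases use finer-grained locking or optimistic schemes to allow more concurrency while preserving the same guarantee.

```python
import threading


class SerialStore:
    def __init__(self, initial=None):
        self._data = dict(initial or {})
        self._lock = threading.Lock()

    def run(self, txn_fn):
        """Run txn_fn(working_copy) with no other transaction interleaved."""
        with self._lock:
            working = dict(self._data)
            result = txn_fn(working)  # if this raises, nothing is applied
            self._data = working      # commit: publish all changes at once
            return result

    def snapshot(self):
        with self._lock:
            return dict(self._data)


def deposit(amount):
    def txn(data):
        # Read-modify-write that would lose updates if interleaved.
        data["balance"] = data["balance"] + amount
        return data["balance"]
    return txn


store = SerialStore({"balance": 0})
threads = [
    threading.Thread(target=store.run, args=(deposit(1),)) for _ in range(50)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

final_balance = store.snapshot()["balance"]
```

Whatever order the threads happen to run in, the result matches some one-at-a-time ordering, which is all serializable promises.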

30:42 The pedantic detail here is that the order that the transactions appear to occur might not be exactly the order that you did them in for complicated implementation reasons.

30:54 So if you have multiple different databases talking to each other, it might not be good enough for you because they might end up choosing different orders of operations.

31:05 So you can see, you can see anomalies there. But basically, serializable is the highest isolation level that you're likely to need in a relational database.

31:14 I feel like that might be the default for some of the relational databases. Not 100% sure.

31:20 I'm not 100% sure either. There's also a compromise, less than serializable, called repeatable read, which means that any single piece of data that you've read, you'll continue to see the same value for it.

31:34 So you'll see that in the same way.

31:38 And the way it's usually actually implemented is a slightly stronger level called snapshot isolation, where you just get a copy of the data at the point in time when you first started your transaction, approximately.

31:51 And you just read as if you are always reading from that version of the data until you commit.

31:56 So snapshot isolation is what people usually actually mean by repeatable read, and a lot of databases provide it.

32:05 And it's the default for MongoDB. When you start a transaction in MongoDB, you read from a version of the data for the rest of that transaction.
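
Editor's sketch: snapshot isolation can be modeled by copying the data when a transaction begins and reading only from that copy. The classes below are hypothetical and far simpler than a real storage engine, which keeps multiple versions rather than copying everything. The sketch also rejects a commit if another transaction has already committed a write to the same key, a common rule often called first committer wins.

```python
import copy
import threading


class WriteConflict(Exception):
    pass


class SnapshotStore:
    def __init__(self, initial=None):
        self._data = dict(initial or {})
        self._last_commit = {}   # key -> commit number that last wrote it
        self._commit_count = 0
        self._lock = threading.Lock()

    def begin(self):
        return SnapshotTxn(self)


class SnapshotTxn:
    def __init__(self, store):
        self._store = store
        self._writes = {}
        with store._lock:
            # Freeze a private view of the data as of transaction start.
            self._snapshot = copy.deepcopy(store._data)
            self._start = store._commit_count

    def read(self, key):
        if key in self._writes:
            return self._writes[key]
        return self._snapshot.get(key)

    def write(self, key, value):
        self._writes[key] = value

    def commit(self):
        store = self._store
        with store._lock:
            for key in self._writes:
                if store._last_commit.get(key, 0) > self._start:
                    raise WriteConflict(key)
            store._commit_count += 1
            for key, value in self._writes.items():
                store._data[key] = value
                store._last_commit[key] = store._commit_count


store = SnapshotStore({"x": 1})
t1 = store.begin()
t2 = store.begin()

t2.write("x", 2)
t2.commit()

still_old = t1.read("x")           # t1 keeps reading its snapshot
fresh = store.begin().read("x")    # a new transaction sees the commit

t1.write("x", 3)
try:
    t1.commit()
    conflict = False
except WriteConflict:
    conflict = True                # t2 committed "x" after t1 started
```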

32:14 Okay, interesting. Yeah, you did say that MongoDB typically doesn't have transactions as its kind of recommended default way. You know, look at a lot of the tutorials and stuff. People are just making updates and so on. But this

32:26 didn't come out originally with MongoDB. At some point, what version did you all add transactions? Like actual transactions?

32:35 I wish I could remember.

32:37 Yeah, I don't remember either.

32:38 But I feel like, oh, there we go. How about that?

32:40 Oh, 4.2.

32:41 Jumping around for, I think version four-ish.

32:44 Okay.

32:45 Which we're on version six now, but that was quite a while ago.

32:47 Yeah.

32:47 You've got to pick to go do those transactions in Mongo, for example.

32:51 Right.

32:51 Whereas that's also true, say for Postgres or Microsoft SQL Server, right?

32:58 You've got to, you've got to actually do the transactional stuff in whatever code you're using to talk to that as well.

33:04 So it's just more, more visible in a lot of the tutorials and examples for those libraries, I think.

33:11 MongoDB in general has taken a more kind of show our guts to you approach where we,

33:17 early on the distributed nature of MongoDB, which was much more visible to our users than other databases

33:24 and transactions. Now we kind of show you the details a bit more than other databases do,

33:34 but that actually allows you to write more reliable code. The interesting thing about an SQL interface is

33:42 that you can start a transaction, do a bunch of writes, and then send the commit message. And if you

33:49 get some sort of error, like a network error, perhaps you got disconnected or the server crashed, you don't

33:58 know. And you don't know whether your transaction committed or not. And if you reconnect,

34:02 you don't know whether you will see the data that you committed or not. This is somewhat difficult to handle.

34:07 But that's what you're trying to do. And most people aren't aware of it. The MongoDB transaction API makes you think about this. The drivers essentially have you pass a function in, which executes the transaction code, and it's automatically retried if the commit fails. And so we give you the mechanism to ensure reliable transaction commits.
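The pass-a-function-in pattern can be sketched without a database. In PyMongo the real entry point is the `with_transaction` method on a client session; the code below is not that implementation, only a simplified illustration. `run_transaction`, `CommitOutcomeUnknown`, and `flaky_commit` are hypothetical names, and the sketch assumes the callback is safe to run more than once.

```python
class CommitOutcomeUnknown(Exception):
    """Hypothetical error: the commit was sent but no reply came back."""


def run_transaction(callback, commit, max_attempts=3):
    """Run callback, then commit; retry both if the commit fails.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        pending = callback()
        try:
            commit(pending)
            return attempt
        except CommitOutcomeUnknown:
            if attempt == max_attempts:
                raise


committed = []
failures = iter([True, False])  # first commit attempt fails, second succeeds


def transfer():
    # The transaction body: describe the writes to apply.
    return {"from": "a", "to": "b", "amount": 10}


def flaky_commit(pending):
    # In this toy, a failed commit never applies the writes.
    if next(failures):
        raise CommitOutcomeUnknown("network error during commit")
    committed.append(pending)


attempts = run_transaction(transfer, flaky_commit)
```

Because the caller hands over a function rather than a sequence of statements, the retry loop can live in the library instead of in every application.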

34:29 I see.

34:31 You can either be sure it happened or you have the mechanism to run it again.

34:35 Exactly.

34:36 Yeah.

34:36 Yeah. Okay. Interesting. Let's look at two aspects here. One of the things is you spoke about anomalies and what might go wrong. The read uncommitted, I think people can probably conceptualize that pretty well. Right? It's just as multi-step transactions are happening, other ones are potentially running. And you could read something. Either that transaction could roll back after you've carried on, or you could have just, you know, had some

35:03 Speaker 1: The equivalent of a race condition. Right?

35:06 Right.

35:06 Right. Right. So, is there a, there's a term for that kind of anomaly, right? Each of these anomalies have terms I've learned.

35:12 And you can definitely memorize them. And if you go to Jepsen.io, there's a lovely diagram of the relationship among all of the consistency and isolation levels.

35:30 Speaker 1: G-Y-P-S-U-M?

35:33 That is J-E-P-S-E-N.io.

35:37 Ah.

35:38 Yeah. This is a researcher named Kyle Kingsbury, his website about consistency and isolation and testing. This is the best place, I think, to go learn about this stuff.

35:51 Yeah. There's a lot of cool visualizations and stuff here.

35:53 Yeah. For those watching on the live stream, we've got a tree diagram up that shows all of the isolation levels, all of the consistency levels that are commonly used, and how they relate to each other in terms of how, let's say, the serializable isolation is strictly stronger than read committed.

36:15 Speaker 1: Mm-hmm.

36:16 Speaker 1: Every anomaly that is prohibited by read committed is also prohibited by serializable.

36:22 Speaker 1: Got it. Maybe help us understand the read committed anomalies versus, like the read uncommitted one is, like, it's full of them. But maybe help us understand some of the things that can go wrong in the safer ones. Like trying to decide between read committed and serializable, for example.

36:37 Speaker 1: Right. This was also my approach when I first started learning.

36:43 Speaker 1: Was to try to memorize these things.

36:45 Speaker 1: Hmm.

36:46 Speaker 1: The reason why I'm not answering your question is that I don't think that this is the right approach.

36:49 Speaker 1: What would you suggest?

36:50 Speaker 1: I think the right approach is to step back and ask, why do anomalies exist?

36:58 Speaker 1: And the answer, as I've come to understand it, is that databases want to permit more concurrency so that you can get higher throughput with multiple transactions at once.

37:11 But you've got to sacrifice something for that. What you've got to sacrifice is isolation. Some of these anomalies have to appear.

37:17 Speaker 1: And so why is that? Why do you have to make that trade-off?

37:20 Speaker 1: Well, the short answer is that databases prevent anomalies by in some way locking pieces of data to prevent one transaction from modifying or even reading it if another transaction has modified or read it.

37:35 And so the more things you lock, the less concurrency is permitted.
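A tiny lock table shows the trade-off. This is a minimal sketch with hypothetical names (`LockTable`, `try_lock`, `release_all`), assuming one exclusive lock per key and no distinction between read and write locks, which real databases do make.

```python
class LockTable:
    """Toy exclusive locks: at most one transaction may hold each key."""

    def __init__(self):
        self.owners = {}  # key -> id of the transaction holding the lock

    def try_lock(self, txn, key):
        # Take the lock if it is free; report whether txn now holds it.
        owner = self.owners.setdefault(key, txn)
        return owner == txn

    def release_all(self, txn):
        # Called when txn commits or aborts.
        self.owners = {k: o for k, o in self.owners.items() if o != txn}


locks = LockTable()
a_got_x = locks.try_lock("A", "x")        # A touches x first
b_got_x = locks.try_lock("B", "x")        # B must wait: x is locked by A
b_got_y = locks.try_lock("B", "y")        # different key, so no waiting
locks.release_all("A")
b_got_x_later = locks.try_lock("B", "x")  # free again once A has finished
```

Transaction B is only held up on the key that A touched. The more keys each isolation level requires a transaction to lock, the more often other transactions end up in B's position.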

37:38 Speaker 1: And if you want to understand that in detail for each of the isolation levels and each of the anomalies, you could watch the video of my PyCon talk, which...

37:49 Speaker 1: Yeah.

37:50 Speaker 1: ... goes through like 20 or 30 line long Python implementations of databases that provide each of these isolation levels.

37:58 So you can see why they need different amounts of locking. You can see why they permit different levels of concurrency.

38:06 And you could start to get a feel for what amount of concurrency each of them permits and also what sort of anomalies each of them permits.

38:17 Speaker 1: And with that in mind, for me at least, thinking about how these things are actually implemented gave me a much... made it much easier for me to then memorize.

38:26 Speaker 1: Sure.

38:27 Speaker 1: Read committed permits phantom reads. Why is that? What does that mean? Now I understand that because I understand how an implementation might work.

38:37 Speaker 1: I think that makes a lot of sense. And that really is the relationship that people should understand, right? That there's an inverse relationship between...

38:46 Speaker 1: ... sort of the data consistency and the lack of these anomalies and how much you can handle scale and concurrency.

38:55 Speaker 1: Right.

38:56 Speaker 1: Right.

38:56 Speaker 1: Right. The stricter, the more consistent the data is, the less scale that you get.

39:00 Speaker 1: Right.

39:01 Speaker 1: And so where... where do you live? Can you get to points where the database actually like locks up and kind of a deadlock situation?

39:05 Speaker 1: And I show one of those in my PyCon talk. The serializable level of isolation is particularly prone to this. And I do this all with Star Trek memes. So I show an example where Sulu and Uhura are fighting over who gets to check out the shuttle to go to the surface for shore leave.

39:24 Speaker 1: And Sulu checks if Uhura has the shuttle and he sees no. But then Uhura starts a transaction and checks if Sulu has the shuttle. The answer is also no. So each of them tries to then check out the shuttle. But since they have each locked the row that the other needs to modify, they then

39:53 Speaker 1: deadlock. This then gets into fancy old database theory of deadlock detection and resolution, which is the subject of many textbooks. But essentially you're probably going to block for some period of time. And then a deadlock detector will come along and abort one of the transactions to allow the other to continue.
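One textbook way to detect this is to look for a cycle in a "wait-for" graph. The sketch below is an illustration of that idea, not the detector any particular database uses; `find_deadlock` is a hypothetical helper, and it assumes each blocked transaction waits on exactly one other transaction.

```python
def find_deadlock(waits_for):
    """Return the transactions that form a cycle, or None if there is none.

    waits_for maps each blocked transaction to the transaction that holds
    the lock it needs.
    """
    for start in waits_for:
        seen = [start]
        current = waits_for.get(start)
        while current is not None and current not in seen:
            seen.append(current)
            current = waits_for.get(current)
        if current is not None:
            # current is already in seen, so the path looped back on itself.
            return seen[seen.index(current):]
    return None


# Sulu waits for a lock Uhura holds, and Uhura waits for one Sulu holds.
waits_for = {"sulu": "uhura", "uhura": "sulu"}
cycle = find_deadlock(waits_for)

# Abort one member of the cycle; its locks are released, so every edge
# that involves the victim disappears.
victim = cycle[0]
remaining = {t: h for t, h in waits_for.items() if victim not in (t, h)}
still_deadlocked = find_deadlock(remaining)
```

Which transaction to abort is a policy choice; picking the first member of the cycle is arbitrary here.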

40:17 Speaker 1: Right. Making your code more complex, harder to work with. Right.

40:21 Speaker 1: And now your code needs to be able to handle this situation. If you are aborted due to deadlock, should you retry that transaction or not? That's now something that the developer needs to make a decision about.

40:33 Speaker 1: Okay. So all of this that we've discussed so far has to do with one or more database servers that just... the requirement is that it allows concurrent queries and updates and all that. On the other side, we might have...

40:50 Speaker 1: Some kind of distributed topology with...

40:54 Speaker 1: In MongoDB case, we have both replication and sharding, which may be worth touching on those things. But in a lot of scale out situations, you know, you have some sort of data spread around some kind of distributed database or even you'll see like geo replicated databases. You know, I want to have my data replicated in Asia and the US so that we can run our server side code near...

41:08 Speaker 1: Data for those different users, right? Right.

41:14 Speaker 1: Right.

41:14 Speaker 1: That's the consistency, not the isolation side of your talk, right?

41:18 Speaker 1: That's right. So to review, isolation is a response to anomalies caused by concurrency on one machine. And then...

41:24 Speaker 1: Consistency is a response to anomalies that are due to replication in a distributed database. So the distributed database, you always write to one of the nodes or you read from one of the nodes.

41:42 Speaker 1: For any particular operation. And there are different rules about, is there only one leader that can take writes or can any node take writes? Is there... Can you only read from the leader? Can you read from any node? Can you only read from some of the nodes?

42:07 Speaker 1: But no matter what database you're using, you always read or write to some nodes and then they replicate writes to other nodes. And so there's always a lag because that replication takes time. The most obvious example is if you write to the leader and then immediately read from the follower, the data that you just wrote may or may not be there yet.
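The leader-then-follower example can be simulated in a few lines. This is a toy model with hypothetical names (`Node`, `ToyReplicaSet`), assuming a single leader and replication that only happens when `replicate` is called explicitly.

```python
class Node:
    def __init__(self):
        self.data = {}
        self.applied = 0  # how many log entries this node has applied


class ToyReplicaSet:
    def __init__(self, follower_count=2):
        self.log = []  # ordered list of (key, value) writes
        self.leader = Node()
        self.followers = [Node() for _ in range(follower_count)]

    def write(self, key, value):
        # Writes go to the leader and are recorded in the shared log.
        self.log.append((key, value))
        self.leader.data[key] = value
        self.leader.applied = len(self.log)

    def replicate(self, follower):
        # Apply every log entry this follower has not seen yet.
        for key, value in self.log[follower.applied:]:
            follower.data[key] = value
        follower.applied = len(self.log)


rs = ToyReplicaSet()
rs.write("name", "Jesse")

follower = rs.followers[0]
stale = follower.data.get("name")  # replication has not run yet
rs.replicate(follower)
fresh = follower.data.get("name")  # the write has now arrived
```

The window between `write` and `replicate` is the replication lag, and any read that lands inside it can miss the write.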

42:32 Speaker 1: So that's the source of inconsistencies. And those inconsistencies are called anomalies as well. And then there are multiple isolation levels, sorry, consistency levels, which allow or prevent various of those. This is not standardized, by the way. So here you're going to see different terms. And then even more upsetting,

43:00 Speaker 1: you'll see some of the same terms, but they mean different things depending on which database documentation you're using or which paper you're reading. So I talked about three levels in my talk, eventual consistency, causal consistency, and linearizability, which I think is kind of a good sample. But this area of computer science is a lot less paved than isolation is.

43:28 Speaker 1: Yeah, I agree with that. And it also seems to me like it really matters for the particular database server that you're using, what its flavor of distributed means.

43:38 Speaker 1: Yeah.

43:38 Speaker 1: Right? Like my understanding from MongoDB is replication is largely about reliability, failover, uptime, but the possibility of reading from a replica for some read scaling.

43:50 Speaker 1: Exactly.

43:50 Speaker 1: Whereas you might, you might have another one kind of with the example that I talked about with like, we want our data located in multiple geographies.

43:56 Speaker 1: And all of them are kind of the local database for those, those areas. And so the types of issues you run into, as well as the words you use, they probably vary somewhat, right? Because you're kind of solving different problems.

44:10 Speaker 1: Mm hmm.

44:11 Speaker 1: Well, we both know the MongoDB one pretty well. So maybe give us the story on like a replica set, which is, I'll let you talk about it. Yeah.

44:18 Speaker 1: What's the motivation of a replica set and what are the challenges and different modes there?

44:25 Speaker 1: 90%, 95% of people who use MongoDB deploy it as a three node replica set. And 90 to 95% of the time, as you said, their goal is failover.

44:46 Speaker 1: That if the primary goes down, they want a very recent hot copy of their data available in a secondary, which will be promoted to primary as quickly as possible.

44:54 Speaker 1: And if you're writing to and reading from the primary, somewhat surprisingly, you can still see anomalies because there could be a failover in between the time that you wrote to the primary and the time that you do that read.

45:08 Speaker 1: You might be reading from a different member, which didn't get the copy. And so we've changed the default in the last few years, I think, to make every write wait to be replicated to a majority of the members.

45:26 Speaker 1: So that, and then we've got a protocol that ensures that whoever becomes primary after a failover, therefore is guaranteed to have that data.
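One intuition for why majority acknowledgement helps, offered as an assumption about majority-based protocols in general rather than a description of MongoDB's actual election protocol: any two majorities of the same replica set share at least one member. So if a write reached a majority and a new primary needs support from a majority, at least one supporter has the write. The check below enumerates this for a three-member set.

```python
from itertools import combinations

members = ["a", "b", "c"]
majority = len(members) // 2 + 1  # 2 out of 3

# Every possible group of members that forms a majority.
majorities = [set(group) for group in combinations(members, majority)]

# Any two majorities have at least one member in common.
overlap_always = all(x & y for x in majorities for y in majorities)
```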

45:34 Speaker 1: So with that setup, you're pretty well protected from anomalies. I'm sure that pedants and stress testers have found exceptions to this. I'm not going to make any promises.

45:46 Speaker 1: Yeah. Maybe if your write to the primary is the thing that takes it down potentially, you know, something really, really instantaneous almost.

45:54 Speaker 1: There are always edge cases.

45:56 Speaker 1: Yeah. Potentially. Yeah. Okay.

45:58 Speaker 1: The real inconsistencies that you start to see is if you do secondary reads. So if you read from a follower, that can be useful because you're shifting load from the primary or maybe the follower is located at a lower latency location on earth to you, but it's always going to have some degree of lag compared to the primary. So by default, you're going to get what we call

46:24 eventual consistency. Any write that occurs on the primary that gets acknowledged by a majority of the members will eventually be replicated to all of the members. So all of the members will pass through the same series of states as the primary does.

46:42 Speaker 1: Okay.

46:42 Speaker 1: At least the same set of states that the primary got majority acknowledgement of to be extremely technical. But if you read from the primary and then a secondary and then a different secondary, you'll feel like you're doing a lot of

46:54 jumping around in time because you'll always be reading a different version of the data and some of those versions will be older than the last version that you saw.

47:02 Speaker 1: In MongoDB, in order to make that happen, you've got to pass extra flags opting into this read from secondary thing, right? And you're in the driver.

47:12 Speaker 1: Yes, that's right. We call that read preference and there are a bunch of options, but the default is just to read from the primary and not see a lot of inconsistencies.

47:20 Speaker 1: So this is the eventual consistency issue. What are some of the other consistencies that you talked about?

47:28 Speaker 1: There's causal consistency, which I think is quite nice. You can get it in MongoDB by using the sessions API, and causal consistency ensures that every write that you do and everything affected by that write, you will be able to read its consequences.

47:48 Speaker 1: And here again, I think talking about the implementation really makes things a lot clearer than talking about like the abstract mathematical definition.

47:58 Speaker 1: Of course, yeah.

47:58 Speaker 1: So here's the way MongoDB does it. You connect a client, you do an update. The primary applies the update, sends it to, waits for a majority of members to acknowledge it. And then the primary also increments a counter. So let's say that counter now has the value four.

48:16 Speaker 1: And so it replies to the client and says your update succeeded and the counter value is now four. Now you can read from a secondary. You can say, I want to read some value, but don't reply until your counter is at least four. Now, in the background,

48:34 Speaker 1: secondaries are replicating from the primary. And they're also replicating the primary's counter value. And so only when they get to the number four or past do they reply to your query. And they're also guaranteed at that point to have applied that update that you just sent to the primary.
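The counter mechanism described here can be sketched as a toy simulation. It is not MongoDB's implementation: the class and method names are hypothetical, majority acknowledgement is left out, and where a real secondary would block while replication runs in the background, this toy secondary pulls entries itself until it has caught up.

```python
class Primary:
    def __init__(self):
        self.oplog = []  # ordered list of (counter, key, value)
        self.counter = 0

    def update(self, key, value):
        self.counter += 1
        self.oplog.append((self.counter, key, value))
        return self.counter  # sent back to the client with the acknowledgement


class Secondary:
    def __init__(self):
        self.data = {}
        self.counter = 0  # counter value of the last entry applied

    def replicate_one(self, primary):
        # Apply the next oplog entry, if any, and adopt its counter value.
        if self.counter < len(primary.oplog):
            counter, key, value = primary.oplog[self.counter]
            self.data[key] = value
            self.counter = counter

    def read(self, key, after_counter, primary):
        if after_counter > primary.counter:
            raise ValueError("counter value was never issued by the primary")
        # Do not answer until this member has caught up to the client's counter.
        while self.counter < after_counter:
            self.replicate_one(primary)
        return self.data.get(key)


primary = Primary()
secondary = Secondary()
for n in range(3):
    primary.update("visits", n)
token = primary.update("email", "new@example.com")  # fourth update

value = secondary.read("email", after_counter=token, primary=primary)
```

The client carries `token` from its write to its read, which is the role the session plays in the description above.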

48:52 Speaker 1: Yeah, that's a really cool solution. You know, the one of the problems, if you're allowed to read from secondaries is imagine you're going to create a new account, let's say on a website. Go in there and say, here's my information.

49:02 Speaker 1: Yes, my password has a lowercase and uppercase and a special number and you know, whatever, right? Say create it inserts it into the primary response redirects back to the server and says, great, you're on your account page. Let me just pull back from the database who you are to show your details on the page.

49:21 Speaker 1: And, you know, that's basically instant, right, down to ping time to the server. And you could potentially end up in a situation where you've just created an account, but then you hit a replica that has yet to receive that. So what you're saying is if we use this concept of sessions, we'll get some kind of point in time marker that we're going to wait until. MongoDB

49:48 Speaker 1: behind the scenes will basically block and say, we're still waiting on that answer. Hold on for who it is until that replica makes that point in time or further.

49:57 Speaker 1: Exactly.

49:58 Speaker 1: Okay. That's excellent. Question from the audience from Marwan says, how do you keep the counter store consistent? If you need to replicate it across regions?

50:06 Speaker 1: The counter at any given moment in time will have different values on different replicas, but that's actually its purpose. It represents how caught up each of the members is on their shared sequence of operations. Once you have replicated a given operation, you've also updated your counter value to the counter value that the primary had when it did that operation.

50:24 Speaker 1: And so you're now consistent, but the primary may be ahead of that as well. Like there's never any absolute truth. There's only a sequence of operations and your position in it.

50:43 Speaker 1: I think another important piece of information here is that you're writing to the leader. The leader always knows what its point in time number is and it can increment that.

50:55 Speaker 1: That's right.

50:55 Speaker 1: Right. And so that thing's always going to be consistent and auto incrementing forward. It's just a matter of how caught up are the replicas, right?

51:01 Speaker 1: Exactly. And if you want the MongoDB terms for these, that sequence of operations is the op log and that counter is the op time.

51:13 Speaker 1: Yeah. And that's basically that op log. That's the thing that gets pushed to the replicas and is copied as it goes. Yeah, exactly.

51:21 Speaker 1: Which brings us a little bit back full circle to your talking about like these time series stream data. It's kind of like replicating across these clusters.

51:29 Speaker 1: Yeah, that's right. The op log is the original streaming data at MongoDB and people have done all sorts of hacks on top of it. And we're making that kind of mechanism more and more general.

51:42 Speaker 1: Interesting. So you can do all sorts of different things with streams of operations.

51:46 Speaker 1: What should we throw in here before we call it?

51:48 Speaker 1: We can mention the final consistency level, which is called linearizability. And it's pretty easy to understand. If you do an operation and then

51:59 Speaker 1: You try to read the results of what you just did from any member, you are guaranteed to see that result. It's pretty much the strictest level of consistency, but it's also quite expensive and slow. So MongoDB does provide this, but it requires a lot of machination behind the scenes. So don't use it unless you need to. But if you do need it for something, like if you're updating a user's password, where you want

52:28 Speaker 1: to make sure that every attempt to read that password will always get the freshest copy, then linearizability is the consistency level to use.

52:38 Speaker 1: Excellent. Yeah, they all sound pretty straightforward, but the consequences of choosing these different levels and then what that means for how you write code around those systems is pretty complex. And this whole conversation has also made me appreciate how much databases serve as the actual

52:57 Speaker 1: concurrency coordinators of modern applications.

53:01 Speaker 1: That is a great point.

53:01 Speaker 1: Yeah.

53:02 Speaker 1: I mean, you can write web apps or APIs or queues and just kind of almost forget that concurrency is happening and you just talk to the database and how come, you know, how can you forget that? Because it falls upon the database to like keep this stuff hanging together.

53:16 Speaker 1: Yep.

53:17 Speaker 1: Cool.

53:18 Speaker 1: Yeah.

53:19 Speaker 1: In the show notes or wherever we should drop a link to a blog post that I wrote, which has a link to lots of papers and other places where you can learn more because this is a very hard topic.

53:30 Speaker 1: To learn, especially from a podcast or a conference talk, you need to read multiple times, maybe make flashcards. But I think this way of talking about things where we think of, okay, isolation is a way of hiding the consequences of concurrency and consistency is a way of hiding the consequences of replication. That was a useful breakthrough for me. And so I hope it's useful for other people too.

53:56 Speaker 1: Yeah, I'm sure it will be. And I'll definitely link this article in the show notes along with the consistency diagrams and all the other things.

54:05 Speaker 1: Great.

54:06 Speaker 1: Yeah, cool. All right, Jesse, thank you for being here. It's been really great to have you back on the show.

54:10 Speaker 1: Thanks a lot, Michael.

54:11 Speaker 1: This has been another episode of Talk Python to Me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show.

54:20 Speaker 1: Take some stress out of your life. Get notified immediately about errors and performance issues in your web or mobile applications with Sentry. Just visit talkpython.fm/sentry and get started for free. And be sure to use the promo code talkpython, all one word.

54:37 Speaker 1: Influx data encourages you to try InfluxDB. InfluxDB is a database purpose built for handling time series data at a massive scale for real time analytics. Try it for free at talkpython.fm/influxdb.

54:50 Speaker 1: Want to level up your Python? We have one of the largest catalogs of Python video courses over at talkpython. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm.

55:07 Speaker 1: Be sure to subscribe to the show. Open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss

55:19 Speaker 1: on talkpython.fm. We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.

55:40 Speaker 1: I'm out.
