#240: A guided tour of the CPython source code Transcript
00:00 Michael Kennedy: You might use Python every day, but how much do you know about what happens under the covers, down at the C level? When you type something like variable equals open square bracket, close square bracket to create an empty list, what are the bytecodes that accomplish this? How about the class backing the list itself? All of these details live at the C layer of CPython. On this episode, you'll meet Anthony Shaw. He and I take a guided tour of the CPython source code. After listening to this episode, you won't have to guess what's happening. You can git clone the CPython source code and see for yourself. This is Talk Python to Me, Episode 240, recorded Wednesday, October 30th, 2019. Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython. This episode is sponsored by Linode and the University of San Francisco. Please check out what they're offering during their segments, it really helps support the show. Anthony, welcome back to Talk Python to Me.
01:20 Anthony Shaw: Hey, Michael, it's great to be back.
01:21 Michael Kennedy: It's great to have you back. To say you've been on the show before is a bit of an understatement. You were on episode 132, contributing to open source, 155, practical steps for moving to Python 3, 168, 10 Python security holes and how to plug them, 180, what's new in Python 3.7, and then 214, diving into 3.8. So I think you might be one of the most prolific guests here on the show, which is awesome, I love having you on here.
01:47 Anthony Shaw: Yeah, thanks for having me back, it's good to be on the show. Yeah, I can't believe that this is the sixth time I've been on here.
01:54 Michael Kennedy: Yeah, I had to actually use the search engine on the website to figure out how many times you've been here. So yeah, this is going to be a good one because we're going to dive into something that everybody uses but it's, there's so many dark corners for most folks who are not core developers, right. Okay, we all know there's CPython, there's probably other Pythons. You maybe have heard of those, like PyPy and whatnot. But can you open up the code, where do you get it? Where are the important parts, right? It's a huge project, but there's certain parts you really should pay attention to and others are details, right.
02:26 Anthony Shaw: Yeah, absolutely, I mean, I think of it as my car. If I open up the bonnet of my car, I know where the engine is, I know where to put the oil. And that's pretty much it, I don't know what half of the other components do. So yeah, I feel like it's like that with CPython sometimes, you know how to use it, but in terms of how it actually works, it's a bit of a mystery.
02:46 Michael Kennedy: That's a good analogy for sure. Now, before we dig into this, maybe just tell people what it is that you do day to day so they get a sense of where you're coming from, in the open source space and in your day to day job. If they want your whole story, they can go back to episode 132 and they'll get started and whatnot, but give us the quick summary there.
03:04 Anthony Shaw: Yeah, sure, I work for a company called NTT and I run talent transformation for them, so I look after skills and development of employees for NTT globally. That's my day job, and then I'm also a Python enthusiast and get involved with various open source projects as well. So Apache projects, as well as some of my own personal projects.
03:25 Michael Kennedy: Yeah, awesome. Like Wily.
03:27 Anthony Shaw: Like Wily, yeah, and I've been playing a lot with pytest and Azure Pipelines recently as well.
03:32 Michael Kennedy: Right on. So we're going to talk about this whole CPython source code story, and you've been touching on this in several ways, you've been writing some articles, but then you decided to just write a book and disguise it as an article. And you called it Your Guide to the CPython Source Code over at Real Python, and it's excellent, we'll link over to it. But we're going to cover a bunch of the ideas that you touched on in there, because this is a really good exploration. But what got you started in digging into the source code in the first place?
04:04 Anthony Shaw: Some of it was curiosity. A few years ago, I wrote an article on how to add an operator to the Python syntax. So how to add a ++, so like an in place increment operator, which Guido is famously against for good reasons, but it was more of an exploration, like how would you actually add that to the syntax and recompile Python? Which was really really interesting to dig into. And also, I found that, if you want to contribute to CPython, the documentation, there's a site called the Dev Guide. Which is great telling you the process for raising pull requests, what the branch strategy is. But if you were to join a new software team, you would expect that, in the first few weeks, one of the senior developers would sit you down and walk you through the code and explain how everything works. But that documentation is missing. So I wanted to write something that filled that gap so that, if people wanted to get into working on CPython, contributing to it or making tweaks, enhancements, or customizations, then it's something that really takes them through in depth the whole source code and how it works and what each component does.
05:10 Michael Kennedy: It definitely does that, I feel like these large projects often have a bunch of special steps to get started, to get your machine configured and whatnot. They can be intimidating, but using the article, I was able to get the code, get it up and running, and be playing with Python 3.9 super quick. It was just, I don't know, most of the time was waiting on the compiler, actually.
05:30 Anthony Shaw: Yeah, there's ways of making it faster, but it's a big piece of code to compile so it takes a while.
05:35 Michael Kennedy: Yeah, I definitely ramped up the number of cores getting used there, but it still takes a while. It's quite cool. Alright, so before I guess we maybe dive into the source code itself, let's maybe talk a little bit higher level. Some of Python is Python, which is cool and meta, and some of Python is C code, which is maybe surprising to some folks who are new to Python and how it executes internally, and maybe there's even some other code in there as well. I haven't seen any inline assembly, but you never know, right. What's the breakdown there or how would you categorize that?
06:10 Anthony Shaw: It's about 70% Python and then the rest is C code. So there's about 350,000 lines of C code, which is a lot of C code, but over 800,000 lines of Python, which includes I guess the test suites as well. On top of that, actually, there's documentation, there's over 220,000 lines of documentation. So the documentation itself is a huge amount of work.
06:35 Michael Kennedy: So the restructured text is actually one of the main languages.
06:39 Anthony Shaw: Restructured text is one of the main languages, yeah, absolutely, 230,000 lines of restructured text.
06:44 Michael Kennedy: Would it be safe to say that most of the standard library is written in Python, but not all of it, but almost all of the core interpreter and compiler is written in C? Is that a good representation?
06:56 Anthony Shaw: The core types are written in C, the compiler is written in C, most of the core engine and the runtime is written in C. In terms of the standard library, anything which doesn't need to patch into any of the operating system APIs, like the networking or any hardware or anything, is written in Python, otherwise, it's written in C.
07:17 Michael Kennedy: Some languages are written entirely in that language itself, right. Like Go, for example. Then there's other ones like Python where it's some Python, some C. But we also have things like PyPy, which is more Python, is it 100% Python, I'm not sure, there might be some little tiny shim to get it started. But why is it in C?
07:37 Anthony Shaw: If you're making a new programming language, to write the compiler, you need a programming language to write the compiler in, so it's difficult if you're starting a new language from scratch. Go is actually a good example, because the Go compiler is now written in Go, but it wasn't originally. Once they got Go a bit more mature, then they basically rewrote the compiler in Go. But you still need an actual interpreter and a compiler to be able to do that. So CPython is written in C largely because they needed something to start off with. This was written a while ago, C is still a very popular language. And also, Python has a lot of integrations into the operating system components. And most operating system APIs are in C. So for Windows, Linux, and MacOS, if you want to talk to the sound card, if you want to talk to the screen, if you want to open a socket on the network, then you're going to be talking to C APIs, so the ability to do all that stuff seamlessly in Python means that, at some point, it needs a C layer to integrate into the kernel.
08:43 Michael Kennedy: Right, to call the Win32 API or down into Linux or MacOS, their native APIs, right?
08:49 Anthony Shaw: Yeah, exactly.
08:50 Michael Kennedy: Yeah, cool. So this is a huge project, as my joke about your book hinted at. So when you look at it like, how did you get started? There's got to be a bunch of stuff you decided not to cover, some stuff you did. You do have your mission of here's the missing dev guide, sit down with a senior developer, but how did you decide to get started on this or what goes in and what stays out?
09:14 Anthony Shaw: Yeah, so the approach I took was not to go file by file, but instead follow a trace from typing Python at the command line with some code all the way to it being executed and then back up again. So it takes, the article takes you through what happens when you run Python and then basically steps through each layer deeper and deeper and deeper into the code, and then explains at each point what's happening. And then I've added diagrams and stuff like that to show you, so it's almost like a traceback, if you were to add a custom traceback. But actually doing tracebacks in Python is really hard, not in Python, but in CPython.
09:57 Michael Kennedy: Yeah, most of the code that you'd be trying to look at would actually be in C, not in Python, right?
10:01 Anthony Shaw: Yeah, exactly, and I've ended up writing some tools to help me put the article together and also do some debugging to pick this apart.
10:09 Michael Kennedy: Sure, well, what's your background in C? How prepared were you for this journey and how easy was it I guess is what I'm getting at.
10:16 Anthony Shaw: I thought I understood C, but then diving deeper and deeper into this code, I really had my head scratching a few times. There's a lot of macros in the CPython source as well. So anyone who's worked quite a bit with C code might be surprised at the sheer volume of macros. A macro is basically a way of having the preprocessor replace a piece of text with another piece of text before the code gets compiled. And there's a lot of these in CPython, so basically, they're micro-optimizations to the code, but it does make it quite tricky to read and understand.
10:53 Michael Kennedy: Yeah, I can imagine. When I was looking through it, I used to do, for a handful of years, professional C++ development. And I could read it, but I was thinking, I'm really glad I'm writing Python these days because wow, I know what this means, it's a lot of work. A lot of work to write C.
11:10 Anthony Shaw: Yeah, and also making changes to the code. So in the article, it encourages you to not just understand how it works, but also to make little tweaks and changes and add your own custom statements and maybe interfere or look at the tracing and stuff like that. And as part of this, I ended up writing a few pull requests into CPython and doing a couple of bug fixes and things like that.
11:34 Michael Kennedy: That's awesome, what were they for?
11:35 Anthony Shaw: They were really minor ones, just stuff that I discovered when I was digging in. There's a couple that still need to be merged as well, I'm still working on one for Windows support for changing the parser generator. So if you want to add custom syntax to Python from Windows, then getting that support in. And also, I worked on one which was rejected, but it was an interesting experiment to do with list comprehensions. So if you basically do a list comprehension over a list, so typically, you'd use list comprehensions for things like filtering a list into another list. But when you run a list comprehension, it, first of all, initializes an empty list. And what I realized is that, if you initialized that list to a larger size or you predicted the size of the list, then it's a lot more efficient. So it ends up being about 10% faster.
12:27 Michael Kennedy: If there's no if part in the comprehension and you know you're doing a comprehension over a list, the size should be exactly the same as the original, right, so you should just preallocate that.
12:37 Anthony Shaw: Yeah, so it was an experiment to see if that was possible, which it was, but it was a bit hacky. And it did make a difference in terms of performance. I think it worked out being about 8 or 10% faster on list comprehensions, but it added too much complexity, so it was rejected, but I think it's an ongoing experiment that we need to look into.
12:56 Michael Kennedy: That's a non-trivial difference you made by doing that, I mean, I understand the complexity thing, but 8% is a lot these days on a 30 year old polished piece of software.
13:06 Anthony Shaw: Yeah, that is, if you're doing a list comprehension over a list of a fixed size. But all of the benchmarking tools in CPython use the range function, which doesn't have a fixed size. So basically, the benchmark suite didn't think there was much difference because the benchmark suite heavily uses range, but in practical applications, you wouldn't use range a great deal.
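To see that empty-list setup for yourself, the dis module will show it; a minimal sketch, assuming CPython 3.7 or later, where dis recursively disassembles the nested comprehension code object (exact opcode names vary a little between versions):

```python
import dis

def filter_evens(items):
    return [x for x in items if x % 2 == 0]

# The comprehension compiles to its own <listcomp> code object. Its
# disassembly starts with BUILD_LIST 0 (the empty list Anthony mentions),
# then a FOR_ITER loop that conditionally runs LIST_APPEND per element.
dis.dis(filter_evens)
```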
13:28 Michael Kennedy: How interesting, okay, that's super cool, I love it. Nice, Alright, well, let's start at the beginning. I'm interested in the CPython source code and I want to play around with it. How do I get it? What is it, in Subversion or something these days?
13:39 Anthony Shaw: Of course, that is really easy. Yeah, so it's all moved to GitHub, it's easy to find. Github.com/python/cpython. And you can download that as a zip file. You can download that using a Git client, or you can use your IDE to pull it for you.
13:55 Michael Kennedy: It's so cool that it's over on GitHub these days. It's really nice to have it modernized. I think it encourages people to participate more in discussion. They were talking about moving the issues there as well, but I don't know if they were moved yet. That'll be cool when that happens.
14:10 Anthony Shaw: There's a PEP that Mariatta has put together, and it's proposing moving to GitHub issues from the bug tracker that they have at the moment. I don't think that's been decided on yet.
14:21 Michael Kennedy: It wouldn't surprise me to see it happen, but I guess the question is whether all the historical issues somehow get migrated over, and I could see challenges there. This portion of Talk Python to Me is brought to you by Linode. Are you looking for hosting that's fast, simple, and incredibly affordable? Well, look past that bookstore and check out Linode at talkpython.fm/linode, that's L-I-N-O-D-E. Plans start at just $5 a month for a dedicated server with a gig of RAM. They have 10 data centers across the globe, so no matter where you are or where your users are, there's a data center for you. Whether you want to run a Python web app, host a private Git server or just a file server, you'll get native SSDs on all the machines, a newly upgraded 200 gigabit network, 24/7 friendly support, even on holidays, and a seven day money back guarantee. Need a little help with your infrastructure? They even offer professional services to help you with architecture, migrations, and more. Do you want a dedicated server for free for the next four months? Just visit talkpython.fm/linode. Alright, so we've got the code, we've git cloned it or however we're going to get it off of GitHub. And then you get this project structure with 13 or 14 top level folders. So maybe we could talk through just some of the major sections, because I think the folder structure there is a pretty good partitioning to start to understand, where do I go explore?
15:48 Anthony Shaw: Yeah, once you downloaded the source code, the first folder is called Doc, which is where the documentation is, that's the 230,000 lines of restructured text.
15:57 Michael Kennedy: Wow.
15:58 Anthony Shaw: Yeah, and if you want to start off by understanding some of the APIs as well, the documentation is a good place to go. There's also a folder called Grammar, which is for the computer readable language definitions. So what is in the Python syntax, what makes an if statement an if statement? Could you type if else colon? That wouldn't make sense, and why not? So the computer understands, I guess, how the language is structured. There's an Include folder for the C header files. Which is also good to understand the API in a bit more detail. There's the Lib directory for basically the standard library modules, all the ones that are written in Python.
16:39 Michael Kennedy: Yeah, I think the Include one is pretty good, because you can just see the function definitions, you don't have to jump through all the implementation and the macros and the #ifdef stuff, you can just say these are the things I can call over here, you can get a higher level view.
16:53 Anthony Shaw: This is why, if you're going to jump into this, you want to pick a pretty decent IDE, because there's a lot of code in here and using a plain text editor is going to make it extremely difficult to navigate.
17:05 Michael Kennedy: I agree.
17:06 Anthony Shaw: So yeah, I'd recommend picking a decent IDE to start off with.
17:09 Michael Kennedy: For me, when I was playing with this, I used VS Code on the whole top level project. And it just said hey, you should have the C/C++ extension installed, so sure, do that. It was pretty good from there, it also installed the restructured text extension, it was adapting. Maybe, if I was on Windows, I might use Visual Studio proper. 'Cause it's got a Visual Studio solution in there, which is cool for Windows developers.
17:31 Anthony Shaw: Yeah, in the article, actually, I take you through this, so Visual Studio 2019 came out whilst I was writing it. So it's been updated to explain how to use the Community edition, which is the free version. So it's different to VS Code, Visual Studio is a full-blown IDE. It's designed for C, C++, and C# development. And there's an explanation in here about how to use that to both compile CPython from source as well as do debugging and stuff like that.
18:02 Michael Kennedy: Yeah, you've got some really cool stuff, how you have the REPL running in a debugging instance embedded in Visual Studio or something like that, right?
18:09 Anthony Shaw: Yeah, it's pretty cool, actually, I was really impressed, definitely for Windows users, I'd say if you want to make changes and not just explore, then I'd pick Visual Studio rather than VS Code because you're going to get much richer debugging. And Visual Studio 2019 as well is going to be able to compile for you, so if you just use VS Code, then you're going to need to run the MSBuild script files, which are located in the PCbuild directory. But it is actually a lot easier to use Visual Studio rather than running it all in the command line.
18:42 Michael Kennedy: Yeah, that makes sense, and I totally derailed your summary of these things over here. So we were talking about the Lib folder is full of all the part of the standard library that is written in Python, the CSV module or whatever, but then there's a bunch more, yeah?
18:56 Anthony Shaw: Yeah, so there's a folder for MacOS support files. There's a couple of other miscellaneous directories that you shouldn't need to worry about. There's a folder called Modules, which is the standard library modules that are written in C. So the standard library modules are split between the Lib and the Modules folders, so whether they're written in Python or C, they're in two different places.
19:16 Michael Kennedy: Right, so object, where you have a class and it derives from object. There's an object.c file in Modules, right, that part's pure C.
19:24 Anthony Shaw: Actually, they're in the Objects folder. So there's, yeah, so there's a folder called Objects, which has got the core types and the object model. So what is a number type, what is a string, what is an array, that sort of thing.
19:41 Michael Kennedy: Yeah, I had it totally wrong. So maybe GC or something like that.
19:44 Anthony Shaw: Yeah. Then there's the parser, which is basically the thing that actually parses the source code into something that can be interpreted. Then there are some directories for Windows users, so there's PC and PCbuild. PCbuild has the current build scripts, PC has got some legacy support files for older versions of Windows. There's a Programs directory, which is the source code for either the Python.exe or the Python binary that you end up with on your machine. And there's a Python folder, which is confusing, but it has the interpreter source code, so it interprets the code through to execution. And then there's a folder called Tools, which has got some tools and scripts and stuff like that for either building or extending Python.
20:30 Michael Kennedy: Super cool. And yeah, there's some of these that you want to really dive into, and others are just support files. When I was looking around in the Lib folder, I was blown away at some of the stuff that's in there, I'm like, alright, well, what in here is actually implemented in Python, what's the code look like? Those are all interesting things. And then I came across some stuff that surprised me, obviously, you would expect that these files would have comments, documentation that describes what they do, right?
20:59 Anthony Shaw: Absolutely.
21:00 Michael Kennedy: Yeah, but then I saw that a lot of them, not a lot, some of them, non-trivial number of them, actually have ASCII diagrams that describe, say, workflows or relationships in the comments, it's pretty wild, like the Lib/concurrent/futures/process.py has a great long in-process, out-of-process workflow diagram in the help doc.
21:23 Anthony Shaw: Oh wow, okay.
21:24 Michael Kennedy: That's pretty funny, right? And then also the JSON module has cool ASCII art documentation, so anyway, I thought those were, those were surprising to me, really, there's diagrams in here, how cool.
21:36 Anthony Shaw: Yeah, I think if anyone wants to have a go at contributing to CPython, a really easy place to start is in the Lib folder, is all the standard library modules that are written in Python, they're easy to read. Most of them are not too complicated. Have a look through some of those, because there's stuff in there which is legacy syntax which needs updating. There are bugs in there which have been reported in the bug tracker but never fixed. As well as, if you compare what's in the code to what's in the documentation, you'll pretty quickly find gaps, functions which either don't have any documentation or the documentation is wrong or the argument list is not up to date. So if you want to have a go and contribute and you're looking for something simple to get started with, then I'd say pick some of the, probably some of the more obscure standard library modules and see what needs fixing in those.
22:29 Michael Kennedy: Yeah, that's a good idea, and I honestly don't know how much of those have these issues, but it seems like a pretty straightforward way, certainly contributing to the Lib folder or the Docs folder seems much more approachable to me than to the Objects or the Modules, 'cause down there, that's where the C code lives, right?
22:47 Anthony Shaw: Yeah, before you get stuck into the core runtime, then I think it's a good idea to have a look at some of the high level Python code first.
22:54 Michael Kennedy: Probably a good idea as well. So I thought I'd just pull out a couple of files from each of these major sections, these folders that you talked about that were just interesting. So over in the Lib module, we have things like CSV, the whole CSV implementation is over there and it's, it's not that much, I don't remember exactly how long it was but it's a couple hundred lines, maybe 500 lines. And you can just go read it and play there, right?
23:20 Anthony Shaw: Yeah, and you can make changes, you can put your own debug statements in, you can see how things are working. But it's pretty straightforward, if you've used the DictReader before, which is a really handy way of doing CSV parsing, yeah, it's easy to understand how that works, it's written in clear Python.
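For anyone who hasn't used it, this is roughly what the calling side looks like; a small sketch with a hypothetical people.csv that has a header row:

```python
import csv

# Hypothetical file with a header row like: name,language
with open("people.csv", newline="") as f:
    reader = csv.DictReader(f)  # DictReader itself is pure Python in Lib/csv.py
    for row in reader:
        # Each row comes back as a dict keyed by the header names.
        print(row["name"], row["language"])
```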
23:37 Michael Kennedy: Super cool. Yeah, so people can poke around there, over in, let's say Objects. That's the one I got wrong, actually. Over in Objects, this is where object.c is defined, right, and this thing is way more complicated than I expected. Obviously, it's the base class for everything, so it's going to be doing a lot. It doesn't do as much as I thought it would, but there's a lot of code involved in there, isn't there?
23:59 Anthony Shaw: The core object types are actually quite complicated. That was something that surprised me when I was going through this deep dive, is that I thought that there wasn't really a big difference between objects that I defined and objects that were built in, like the int type and the string type. I thought that they were more or less the same but actually, everything sits on top of the core types. So the core types are all declared in C, effectively. And they have C functions. And everything that you put on top of that goes into a dictionary and is pure Python.
24:30 Michael Kennedy: Right, and it seems like there's a lot of memory management stuff that's happening down in there as well. So allocation and finalization, and it seems like that's the main purpose of what the plain object is about.
24:42 Anthony Shaw: Yeah, I think the two ones that are really interesting to look into is the list object and the dictionary object. So the beautiful thing about the list object is that you never have to worry about writing linked lists or doing list allocation, like you would in many other languages. You can just add items to a list and it just magically makes it the right size. In the article, we actually talk about the growth pattern and how it reallocates the size automatically and how that works. But yeah, it's pretty cool to see how that is behind the scenes.
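You can actually watch that growth pattern from plain Python; a rough sketch, where the exact sizes and step points depend on the CPython version and platform:

```python
import sys

items = []
last_size = sys.getsizeof(items)
print(0, last_size)

# Append one element at a time and print whenever the allocated size jumps.
# CPython over-allocates, so the size changes in steps rather than on
# every append; the step sizes are an implementation detail.
for i in range(64):
    items.append(i)
    size = sys.getsizeof(items)
    if size != last_size:
        print(len(items), size)
        last_size = size
```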
25:12 Michael Kennedy: Yeah, to really appreciate what it's doing for you down there, that's awesome. I love that, it's not just here's an array and now you get to figure out all the complicated details of using it dynamically, right. It's, no, it's beautiful, it's a list. There's one, so if you open this whole project up in Visual Studio Code, it has this cool extension, built-in or something I installed, that shows a little gray highlight when you're on a line of who's done what, I think GitLens maybe is the thing I installed. But I was just poking around and it shows who has contributed or checked in this file. So Pablo Galindo had just worked on object.c 22 days ago. But then as I arrowed down, different parts would light up with people doing different things. Line three, it says Guido van Rossum, 29 years ago, initial revision.
26:03 Anthony Shaw: Some of the code in the CPython source is quite old and hasn't needed to change. It just worked the first time. And it hasn't needed to be updated or there's nothing that's been changed there. It's really interesting, 'cause when you dive through the code, I guess you can see that, if you're looking at the Python 3.8 source code, you're actually looking at a, almost a canvas of 2.x all the way up to the latest version. Even, actually even before 2.x, some of this stuff is all the way back from version 1.
26:34 Michael Kennedy: Right, right, like object.c, yeah.
26:37 Anthony Shaw: Yeah, exactly, so some things haven't really changed since the first versions, but the vast majority of the code has been drastically rewritten over the last 10 years.
26:45 Michael Kennedy: It makes me think of a canvas that a painter would paint on. And it's been painted over and painted over, right.
26:51 Anthony Shaw: Yeah, exactly.
26:52 Michael Kennedy: Cool. Alright, so the next major area was the Modules folder, and this is where the standard library C implementation goes, right. Yeah, so there we have the main.c, which is the Python interpreter main program, which is pretty cool, and also GC module and stuff like that, right, memory allocation and whatnot.
27:11 Anthony Shaw: Yeah, so one of the ones we dig into quite a lot is main.c. So this is some of the really high level APIs for initializing the Python application, so the binary that you would run. So there's different ways that you can run Python. One that you typically use is just by typing Python on the command line or python.exe. And that basically goes through a high level binary. And then that either takes an argument, which is the name of the file you want to run or the library you want to run or the module. Or you can even do python -c, for example, and actually give it a string with some Python code. So all the wrappers around that are in this, in both this main.c and also another file called pymain. And then also, there's a Python API which you can call at the C level. So the Python binary is actually just a wrapper for the Python C API. And the Python C API you can also import and use from your own application, so you can actually write an application in C that has embedded Python and it uses exactly the same code compilation, parsing, everything that you get when you type Python at the command line. So there are some practical uses for that, so there are some big applications out there that have Python built into them. One being a 3D designer called Houdini, which is a 3D graphics tool that uses Python deeply integrated into it using the C APIs.
28:43 Michael Kennedy: Yeah, that's really cool, I don't think people do that maybe as much as they should, right, because you're saying the way that you extend our application is not to go write tons of C code where you could forget to allocate something and crash the whole program, but instead to just write a little simple Python code and it makes our main app go. I think the folks in the movie and 3D game space use that a lot in a lot of their tools and actually use Python to drive those pipelines, the automation of a lot of the tooling like you're talking about with Houdini. Yeah, that's pretty cool, so you spend a whole lot of time in the article diving into those different things and how that works, that's pretty cool. And then I guess there's lots of interesting stuff, but the last one that I want to call out of the different sections, the different directories, is the whole Python one, and this is where the CPython runtime lives as opposed to the standard library or maybe the stuff that starts up the processes or the builds, this is where the execution happens, right?
29:39 Anthony Shaw: This is, so the Python directory in the source code is really the brain of the whole application. So it's basically how it does the evaluation of the opcodes, which is like the low level assembly code that Python ends up building. The pythonrun.c, ceval.c. So these are highly optimized C files which have been written and changed over time, that basically execute the Python code.
30:10 Michael Kennedy: For sure, and so ceval.c, this is the big one when it comes to execution. This portion of Talk Python to Me is brought to you by the University of San Francisco. Learn how to use Python to analyze the digital economy in the new master's in applied economics at the University of San Francisco. Located at the epicenter of digital disruption, USF is the ideal launching pad for the next phase of your career. Their new STEM designated economics program doesn't just provide technical training and high demand skills like machine learning, causal inference, experimental design, and econometrics. It takes the next step, teaching you how to apply these techniques to understand the economics of platforms, auctions, pricing, and competitive business strategy in the world of big data. The program is open to beginner and to advanced coders looking to apply their skills in a new area. Applications are now open for the fall 2020 classes. To learn more and get an application fee waiver, go to talkpython.fm/usf, that's talkpython.fm/usf. Maybe talk about the process of going from Python source code, what we would think of as what we wrote, to getting to Python bytecode before we get to ceval.c, what's the high level flow there?
31:29 Anthony Shaw: Okay, cool, of course, first things first. So you've written some Python code in a file, I'm assuming. So first of all, it has to read the file, which is actually non-trivial because you've got to think about encodings and all sorts of other fun things. Then basically, the parser will go through and take the file apart and put it into something called an Abstract Syntax Tree. So there's basically, actually, before that, there's a step called tokenization, which is to split the application into components. Then it goes into an Abstract Syntax Tree. You can use the AST module, and in the article, I talk about how to use the AST module. And I wrote a small web application called instaviz, which you can download on GitHub.
32:14 Michael Kennedy: Nice, what does it do?
32:15 Anthony Shaw: It basically represents the Python syntax in a massive tree that you can explore like an interactive tree. And you can write Python code and it will give you the Abstract Syntax Tree in a web application, an interactive D3 graph. So you can play around and see. So it's a bit easier to see how that tree works with the syntax and understanding the difference between a newline and an indent and what a name is and things like that. So yeah, the tokenizer will look at the file and wrap it up into tokens, and then the next step is for the parser to put those tokens into a parse tree, which is the Concrete Syntax Tree. The Concrete Syntax Tree, the CST, is the more literal interpretation, it's basically saying, at the lower level, here are the exact statements and tokens that are in the file. That then gets converted into the Abstract Syntax Tree, which is more about meaning, so an Abstract Syntax Tree says that you've got an if statement, and inside the if statement, you're comparing two variables and you're checking that they equal each other. And then, if that is successful, then you're going and doing these three extra lines of code which you've nested with a tab, for example. So that's what an Abstract Syntax Tree would look like. And then the compiler is another step, which basically walks the Abstract Syntax Tree with a depth-first traversal and converts it into a list of opcodes. And this is the Python bytecode, basically, which is no longer a byte, it's actually a word, but yeah, there's separation.
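All of those stages are reachable from Python itself if you want to follow along at home; a minimal sketch using the tokenize, ast, and dis modules, assuming CPython 3.7 or later:

```python
import ast
import dis
import tokenize
from io import BytesIO
from token import tok_name

source = "answer = 1 + 2\n"

# Tokenizer: the source is split into NAME, OP, NUMBER, NEWLINE... tokens.
for tok in tokenize.tokenize(BytesIO(source.encode("utf-8")).readline):
    print(tok_name[tok.type], repr(tok.string))

# Abstract Syntax Tree: the same source as a tree of nodes.
tree = ast.parse(source)
print(ast.dump(tree))

# Compiler: the AST becomes a code object, and dis shows the opcodes
# that the evaluation loop will eventually execute.
code = compile(tree, "<example>", "exec")
dis.dis(code)
```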
33:56 Michael Kennedy: They're long words, wide words. Yeah, yeah. So I think that's interesting, because a lot of folks think of Python as a scripting language, they think of it as this thing that is not compiled, one of the things that defines Python for them is that it's not compiled. But you just said there's a compiler.
34:13 Anthony Shaw: Yes. So it absolutely is compiled, it's compiled into an intermediate language. So similar to .NET and Java, so .NET and Java both have JITs, so Just In Time compilers and execution, but Python is a bit different. CPython, that is, PyPy does have a JIT. But CPython compiles down to an intermediate language as well.
34:36 Michael Kennedy: It's like C# and .NET or like Java, but where it differs is what happens when you try to execute that, right. In Java, it would JIT that to machine instructions. In Python, it takes this big bunch of these opcodes and feeds them off to ceval.c. There's a switch statement.
34:55 Anthony Shaw: The opcodes end up getting cached in a pyc file or in the, in newer versions, in the __pycache__ folder. So if you've ever noticed that, when you've, when you run a Python application, it creates this cache folder in your application directory. So basically, it goes from the source code all the way through to almost the compiled code, and then it puts that in a cache folder. So it does actually do the compilation, and then the next step is to execute that. There's a literal list of statements that it works through and it has a frame stack and a value stack. Which I talk through in the article, 'cause it takes a bit of time to get your head around those. But in a nutshell, the frame stack is, if you called a function inside your Python code, then you'd expect the local variables to be different, for example, and you couldn't just reference the stuff that you were using beforehand. So there's essentially these frames, and then also there's a value stack, which is used by the opcodes, so it doesn't really understand the different variables you have, they're just all in a pile. And you can add things on top of the pile and you can remove things from the pile. So that's essentially how it works.
36:05 Michael Kennedy: Right, it might load three things onto the value stack and then call the function with that, right? Something to that effect.
36:12 Anthony Shaw: So the opcodes are really low level statements, essentially. So they're the push and pop values from the stack, for example, or to initialize a new list or to initialize a new variable or to call a C function.
36:27 Michael Kennedy: That's cool. If you want to explore those, there's the dis module, right, people can import, they can say from dis import dis, and then they can start taking a function and asking what bytecodes or opcodes make up this thing, right?
36:39 Anthony Shaw: Yeah, I've got a small snippet of code which will work in Python 3.7 and above. So in 3.7 and above, you can actually do, in the sys module, there's a tracing flag which you can enable. And you can run a piece of code and it'll actually print out where you are in the frame stack, what opcodes are being executed, and you can inspect the value stack as well. So if you want to almost play around with the runtime and see what's happening live when I run this piece of code, there's a snippet in the article which we'll link to in the show notes as well where you can basically see the frame stack live.
37:15 Michael Kennedy: That's pretty awesome. So you can just dump every bit of what's it up to, huh?
37:19 Anthony Shaw: Yeah, and I got it to nest as well so that, as you go deeper down in terms of the frame stack, it will pad things out further and further to the right, so you can see where you are in the tree as well.
37:28 Michael Kennedy: Yeah, and that sounds like you'd almost have to have it or you'd just go crazy. That's cool though, it sounds super useful if you just try and understand.
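This isn't Anthony's exact snippet from the article, but a minimal sketch of the 3.7-and-later hooks he's describing: a trace function installed with sys.settrace plus the frame's f_trace_opcodes flag:

```python
import sys

def trace_opcodes(frame, event, arg):
    # Ask CPython to emit a trace event per opcode for this frame (3.7+).
    frame.f_trace_opcodes = True
    if event == "opcode":
        # f_lasti is the offset of the current instruction in the bytecode.
        print(f"{frame.f_code.co_name}: opcode at offset {frame.f_lasti}")
    return trace_opcodes

def add(a, b):
    return a + b

sys.settrace(trace_opcodes)
add(1, 2)
sys.settrace(None)
```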
37:36 Anthony Shaw: Yeah, so then eventually, it goes down to ceval.c. Which is a very complicated piece of C code. And it has a lot of macros as well, which makes it quite hard to follow. But essentially, it's a big loop. So it's a big for loop, essentially, and it goes through each opcode in that frame and executes it. So it's a big for loop, and inside the for loop, there's a massive switch statement which says if it's this opcode, do this, if it's that opcode, do this. And then for each one, it typically calls a C function. So if you're going to load a variable onto the value stack, then it would fetch the variable and push it onto the stack. There would be two or three lines of code for each opcode. And some of them are a bit more complicated, so let's say SET_ADD, for example, is a basic opcode. So if you have SET_ADD, you're adding an item to a set, then basically it would call PySet_Add, which is a C API. So most of the opcodes actually just call C functions.
38:41 Michael Kennedy: Right, they take the variables that are on this value stack and they just go call the C function with that.
38:45 Anthony Shaw: Yup.
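A quick way to see one of those opcode-to-C-function pairs from the Python side; a small sketch, with opcode names as of CPython 3.8:

```python
import dis

def unique_lengths(words):
    return {len(w) for w in words}

# The nested <setcomp> code object contains a SET_ADD opcode for each
# element; in ceval.c that switch case is essentially a call to the
# PySet_Add C API.
dis.dis(unique_lengths)
```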
38:46 Michael Kennedy: Yeah, this is a serious switch statement. If people haven't looked at it yet, it's 3000-4000 lines long. I don't have it pulled up just right this second. Another thing that surprised me is there's some interesting flow control mechanisms in there, like goto.
39:04 Anthony Shaw: There's a lot of gotos.
39:07 Michael Kennedy: How did you feel when you saw that, what is this?
39:09 Anthony Shaw: The gotos make sense 'cause basically, there are some optimizations in here. So sometimes, you will get opcodes that typically come in pairs. So what they've done is over time, they've said okay, if you're going to create a new list, then you're going to create a new name and initialize a new list, and those two opcodes are going to end up being next to each other. So in the switch statement, it actually has shortcuts in the code so that it knows that, if it's running this opcode, the chances are, it's probably going to run that opcode next, so it basically shortcuts a lot of the other inspections and a lot of the other checks. Also, there's a lot of gotos in terms of yields. So if you're yielding values back, as well as doing errors, so if you call a function and the function crashes or if you try and store an attribute but there's some sort of error at the low level, then it will go to a generic error section, which basically starts off the whole exception process.
40:11 Michael Kennedy: It makes sense because it's so highly optimized. Forget doing it the right way. If the goto makes it a little bit faster, get in the goto, right, 'cause this is the hot loop that runs every single thing that happens in the language.
40:25 Anthony Shaw: Yeah, this thing is going to run thousands and thousands and thousands of times. So you want it to be as fast as possible. I'd say that, in terms of micro optimizations, I think they've pretty much done most of what they can. There was some stuff introduced in 3.7 to do with the fast method calls.
40:42 Michael Kennedy: Right, for methods without keywords, right?
40:45 Anthony Shaw: Yeah, so with the fast method calls, if you're calling a method on a class that doesn't have keyword arguments, then it's about 20% faster. And you can see, in this loop, you can actually see that opcode and how it works as well, so it's pretty cool.
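A rough way to see that optimization in the bytecode; a sketch assuming CPython 3.7 through 3.10, where the relevant opcodes are named LOAD_METHOD and CALL_METHOD:

```python
import dis

class Greeter:
    def hello(self, name, excited=False):
        return f"hello {name}{'!' if excited else ''}"

g = Greeter()

# No keyword arguments: compiles to LOAD_METHOD / CALL_METHOD, which
# avoids creating a temporary bound-method object for the call.
dis.dis(compile("g.hello('world')", "<example>", "eval"))

# With a keyword argument the optimization doesn't apply, so you get
# LOAD_ATTR / CALL_FUNCTION_KW instead.
dis.dis(compile("g.hello('world', excited=True)", "<example>", "eval"))
```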
40:58 Michael Kennedy: I would say, if you are trying to understand how CPython executes code, the ceval.c and the switch statement, this is the place to start, right?
41:05 Anthony Shaw: It's a good place to understand, I wouldn't say start here. I'd say get here, I'd say work your way towards it. If you just jump in, it's not going to make a whole lot of sense.
41:16 Michael Kennedy: Alright, okay, fair enough. So another thing that you spent a fair amount of time on, and you tied it back to the underlying C API, was memory management.
41:23 Anthony Shaw: Yeah, this is definitely one of my weaknesses in understanding. I thought I understood how memory management worked in CPython. But the more I looked into it, the more complicated it actually is, there's basically different types, there's different ways it allocates blocks and also arenas. And there's basically a Python version of, or a CPython version of malloc. So instead of calling malloc directly from C, you're supposed to call the CPython version, which has got more governance and also more cleanup.
41:52 Michael Kennedy: Yeah, and it also does a bunch of work to try to avoid fragmentation and stuff like that. So I think it's interesting, I don't think people talk about memory management in Python very much. On one hand, who cares, whatever, you don't have to worry about it, hooray, right. That's one of the reasons we like the language, I'm so done with calling malloc and free, I just don't want to do that again. But on the other hand, just having a conceptual understanding of what is happening at a pretty good level helps you think through this algorithm might be better than that, or if we're having these memory problems, we might be able to do something slightly different in terms of how we're using, how we're defining our code. Maybe to take advantage, not work against the way it works, but to work with it, right.
42:32 Anthony Shaw: Yeah, one of the biggest benefits to CPython is that everything basically comes from PyObject. So the core type of an object, which is used by integers and strings and lists and everything, including the objects that you define. Everything comes from the same type. So the memory management is very optimized, because basically, everything inherits from something, so you know that the structure at least has this fixed size. So what they've done is they've built in these utilities for allocating sections of memory on your machine so that you can store objects easily and fetch them and reference them. So that's something called the PyArena, which is referenced quite a lot in the code and you'll see, when you add objects to the arena, where that goes to. So basically, it's a way of putting Python objects into memory. And also, that's where the PyArena malloc comes from, which is to do object memory allocation. So that's the really low level memory allocation techniques that are used inside CPython, which are optimized around the size of the PyObject type and typically, the types of memory that are requested and the way that they're used. If you're using Python at the Python level, you'd never care about that sort of thing, you just expect that, when you declare an object, it has memory, it figures that out itself. But even at the Python layer, you definitely do need to know about the reference counter and the garbage collector if you're writing applications which run for any long period of time.
44:10 Michael Kennedy: Yeah. Well, you talked about Java and .NET before, those are both mark and sweep garbage collecting type of systems. If you go back to something like C or C++, it's manual. Maybe you could use a smart pointer in C++, and then that's kind of reference counting. But Python's interesting, I think, 'cause it has this blend, right. It's like, well, we're going to do reference counting, which is pretty awesome and predictable and deterministic and fast. Except for when you have cycles, that's the main weakness of reference counting, right, is you cannot break a cycle because the reference count is never going to go below one, because if you have two things that refer to each other, how do they become garbage, right? So we have this GC that also runs but yeah, it's pretty interesting, and you can see some of that happening in the code that you're exploring there.
44:53 Anthony Shaw: I'll go through the garbage collection module, so there's the gc module. And you can actually put debugging in the gc module from the Python layer, so if you import gc and then run gc.set_debug, you can basically turn on debug statistics so that, even at the REPL, if you want to assign variables and stuff like that, you can see what's happening in the garbage collector and what the threshold is and when it runs. And you can also customize how many cycles the garbage collector runs at because basically, the garbage collector, I've taken the easy analogy that it's like the trash trucks that come and pick up your garbage. It doesn't make sense for them to come every time you put something in the bin. They come once a week or once a fortnight. So you can basically customize that, so you can say how many cycles until it goes and checks which objects are no longer needed and which ones don't have any references anymore, and it'll go and clean those up.
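A minimal sketch of turning those knobs yourself; the threshold values in the comments are the usual defaults, but treat them as an implementation detail:

```python
import gc

# Print collection statistics to stderr every time the collector runs.
gc.set_debug(gc.DEBUG_STATS)

# The three thresholds control how many allocations minus deallocations
# happen before generation 0, 1, and 2 collections are triggered.
print(gc.get_threshold())      # typically (700, 10, 10)
gc.set_threshold(1400, 20, 20)

# Force a collection by hand and see how many objects it found unreachable.
print(gc.collect())
```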
45:49 Michael Kennedy: Another thing that's interesting, the two things that are interesting that you can write in Python code that are fun to play with to give you a better understanding, is you can write some code, I think you've got to do some C imports, but you can basically ask, how many references are there to this object ID, right? And it'll tell you there's five or whatever and so on. The other one that you can do that's interesting is to play with the weak reference type, create a weak reference to a thing, then see if it's still live or not, right. Because that doesn't, it'll still let you address that thing, but not actually keep it alive by holding a pointer to it, right?
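Both of those take only a couple of lines; a small sketch using sys.getrefcount and the weakref module:

```python
import sys
import weakref

class Node:
    pass

n = Node()
# getrefcount reports one extra reference, held by its own argument.
print(sys.getrefcount(n))

ref = weakref.ref(n)        # a weak reference doesn't bump the count
print(sys.getrefcount(n), ref() is n)

del n
# With the last strong reference gone, the object is freed immediately
# by reference counting, and the weak reference now returns None.
print(ref())
```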
46:26 Anthony Shaw: The garbage collector's debug stats are a great way of doing that, 'cause it dumps that information to the REPL as well.
46:33 Michael Kennedy: Yeah, another thing you can do is implement the __del__, I think. You can actually get it to print out when an object is finalized or deleted.
46:41 Anthony Shaw: Yeah, __del__ is really useful for doing any of that custom cleanup code. So it's almost like the, like you have a constructor in __init__, and then you have a destructor as well.
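A minimal sketch of that pattern, just a class that announces its own finalization:

```python
class TempResource:
    def __init__(self, name):
        self.name = name
        print(f"allocated {self.name}")

    def __del__(self):
        # Runs when the reference count hits zero (or when the garbage
        # collector finalizes the object), so you can watch objects die.
        print(f"finalizing {self.name}")

r = TempResource("scratch buffer")
del r   # prints "finalizing scratch buffer" straight away
```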
46:52 Michael Kennedy: Yeah, exactly. So while we're on this memory stuff, let me throw out this really quick, this article by Instagram, it's a little bit old, it's a couple of years old now. But it's called Dismissing Python Garbage Collection at Instagram. Maybe I covered this on Python Bytes long long ago, but it says, because you can import gc and say gc.disable or gc.collect now, right, you can take a little bit of control. Not necessarily I think you should, but you can. And over at Instagram, they said they can run 10% more efficiently by disabling GC and reduce the memory footprint and improve the CPU LLC cache hit ratio on their Django servers, that's a very focused use case, but they happen to do that, I'll put a link to that in the show notes, so you can also see about playing around the GC some over there and whatnot.
47:39 Anthony Shaw: Awesome.
47:40 Michael Kennedy: Yeah, pretty wild. Alright, let's see. So another thing that I think will be fun to talk about, we've got just a little bit of time left, not much, is just objects in the Python data model. Right, so the Objects folder is where all that stuff lives, and we talked a little bit about object.c, but there's more stuff that defines the data model, right, like __iter__, __enter__, __exit__, __repr__, all that stuff, right.
48:05 Anthony Shaw: It is basically a list of these core, so in the Python data model, actually, I reference Luciano Ramalho's book, which if you want to understand...
48:14 Michael Kennedy: Yeah, it's Fluent Python, it's a great book, yeah.
48:17 Anthony Shaw: Fantastic book, and if you want to understand the Python data model and how to really leverage it to write fluent Python, then I recommend checking out that book. It's available in pretty much every language now, so that's awesome, I think it's a must read for any Python programmer. But basically, if you are writing a custom type that was a sequence, so it was a sequence of items, then you'd have __len__, for example, so you can customize what the length is, you've got __contains__, or you can do slicing or you can do repeats, concatenation. So you can override the behavior of these core operators. But that stuff is actually built into the object. So there's a list object, if I pick on list as an example, and in the list object type, there is something called sequence methods, which is basically, in the data model, the dunder methods that are in place for anything which is a sequence. So a byte array, for example, or a list.
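A small sketch of what plugging into those sequence hooks looks like for a user-defined type, with a made-up Deck class:

```python
class Deck:
    """A tiny sequence type wired into the data model hooks."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)        # len(deck)

    def __getitem__(self, index):
        return self._cards[index]      # deck[0], slicing, iteration

    def __contains__(self, card):
        return card in self._cards     # "king" in deck

deck = Deck(["ace", "king", "queen"])
print(len(deck), deck[0], "king" in deck, list(reversed(deck)))
```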
49:16 Michael Kennedy: So you can go in there and just look at all the different parts or aspects of the core data model. It's spread across all these different classes, right, it's not just all jammed into object.c. Like you said, there's a listobject.c and an iterobject.c and so on.
49:31 Anthony Shaw: Yeah, so there's basically these core types, I think there's about 20 of them, the core types, so you'd recognize them, things like dictionaries and modules and methods, memory and long objects. So yeah, it's interesting to dig through them. If you want to look at them, I'd say don't jump into the Unicode object first, because it's probably one of the most complicated. So the Unicode object is the string type, the old 2.x string type basically doesn't exist anymore, that's the bytes and byte array object types now. But if you want to look at strings, the Unicode object is where they live, and it's hugely complicated 'cause it has to deal with all the encodings and all that magic.
50:13 Michael Kennedy: Yeah. Wide character pointers and all that, yeah, no thanks. I'll start somewhere else. So that about covers it for our guided tour through the actual source code, but maybe we could talk just really quickly about a couple other things before we're out of time. You talked about doing a lot of stuff with pytest and doing testing, what's the story around testing in Python and the source code?
50:34 Anthony Shaw: In CPython, there's a huge test suite, which takes a lot of time to run. All the tests are written in Python, which is great. They are written as unit tests, using the unittest module. And they run using concurrent processes as well, because there are so many tests to run in the test suite. It doesn't call unittest directly, it actually runs this custom test runner that they've built. So inside the test module for CPython, it'll test both the standard library module behaviors, as well as the parser, as well as the core runtime, as well as the APIs. So yeah, like I said, it's a huge test suite. The simple ones to understand I guess are the tests which are focused on the standard library modules, because it's Python code testing Python code, which is fairly simple. And then if you look at the C layer, basically there are ways that you wrap C code and call it from Python, and the test code essentially uses that to test different functionality. There's also, documented somewhere, the coverage for the different parts of Python. Some standard library modules have pretty low test coverage, so if you do want to get started somewhere, adding tests is always a great place to have a look. And you'll find some of the more obscure modules as well have little to no tests.
51:59 Michael Kennedy: Yeah, so you could write some potentially, right?
52:01 Anthony Shaw: Yeah, definitely you could write some. And when you're writing tests, you might come across a few bugs that you want to fix as well, which is awesome. But the core runtime itself is heavily tested.
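As a rough sketch of the shape those tests take, here's a hypothetical unittest-style test case; the real CPython tests live under Lib/test and run through the project's own test runner rather than unittest directly:

```python
import unittest

class StrBehaviorTest(unittest.TestCase):
    # Hypothetical example, not a real file from Lib/test.

    def test_title_case(self):
        self.assertEqual("hello world".title(), "Hello World")

    def test_split_with_empty_separator_raises(self):
        with self.assertRaises(ValueError):
            "a,b".split("")

if __name__ == "__main__":
    unittest.main()
```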
52:11 Michael Kennedy: I'm sure it's super super heavily tested. The other thing that might be fun to talk about is, now that CPython is over on GitHub, it's really easy to go look at the branches and how they're working at those. Maybe give people an overview of that, it looks like it's pretty much focused around releases and not, say, features branches or something like that.
52:30 Anthony Shaw: Yeah, so they have release branches. So if you were to go and look at CPython now, it would say Python 3.9. Which might take some of you by surprise because 3.8 only came out a week ago. So actually, what they do is they declare a feature freeze in the Python release cycle. So the feature freeze for 3.8 actually happened a few months ago.
52:54 Michael Kennedy: Basically, when they go to beta, right?
52:55 Anthony Shaw: Yeah, basically. So any new enhancements or stuff like that would go into the next version, which is 3.9. And bug fixes would get merged back in. There's a series of bots in the GitHub repository that do merging back of bug fixes and stuff like that, so when you tag the issues or the PRs that you raised with certain tags, there are some really cool bots that Mariatta wrote that will actually go and merge that back into the appropriate releases, which is really cool.
53:25 Michael Kennedy: Yeah, that actually sounds super awesome, that's great. I suspect people who are core developers or make contributions, they might do feature branches, but not here in the main repo.
53:34 Anthony Shaw: Yeah. So basically, most of the core developers have their own fork, and they run feature branches on their own forks. So if you want to look at some of the proposed PEPs, then typically at the bottom of the PEP there'll be a link to an example implementation, and that typically sits on that core developer's fork of the CPython repo. So you can actually see different experimental versions and different experimental features; some of the work that Eric Snow was doing on subinterpreters lived inside his fork, which is really interesting to explore. But it doesn't live in the main CPython repository.
54:11 Michael Kennedy: Right, not till it's officially accepted as part of it.
54:13 Anthony Shaw: Yeah. And then, when it gets officially accepted, they typically rewrite it anyway and clean up the code.
54:18 Michael Kennedy: Yeah, I can imagine. Very cool. And then it's just worth throwing out that we're over on github.com/python/cpython, but if you step up one level to the Python organization, there are actually some other interesting projects there as well, right. The PEPs are over there, typeshed, which has the stubs that define the static types for various Python things, Python.org is there, the dev guide that you referenced, a bunch of stuff to go play with, right.
54:45 Anthony Shaw: Yeah, there's heaps of stuff on there.
54:46 Michael Kennedy: Super cool. Alright, Anthony, well, this was really interesting, thanks for doing all the research, writing this great article and walking us all through it. It's going to be a good resource for years to come, I think.
54:56 Anthony Shaw: Yeah, I'm hoping that people read this and get some value out of it, get some knowledge, and hopefully pick up some PEPs, write some documentation, write some tests, or maybe even add a new feature and get it merged into CPython.
55:08 Michael Kennedy: Yeah, absolutely. One really quick thing, you did mention that, as you were writing this, the source code that you were referring back to was changing and you didn't want it to get out of date, so you also did some cool extensions to keep the article up to date, didn't you?
55:21 Anthony Shaw: In the article, I reference particular functions and files a lot, thousands of times actually. And as I was writing the article, obviously, the code is not static, it's changing all the time. So what I ended up having to do was actually write a markdown preprocessor in Python.
55:41 Michael Kennedy: Were you inspired by the C macros you were seeing all over the place?
55:44 Anthony Shaw: Yeah, I just saw so many macros, so I thought it would be a good idea. So in the article, if I reference a function, you can click on it and it takes you to the GitHub source code, straight to the line where that function is defined. But actually, that work was done using a preprocessor written in Python, so that I can basically refresh the article with newer versions of CPython as they come out and it rewrites the links for me.
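To give a flavor of the idea, here is a minimal sketch of such a markdown preprocessor. This is not Anthony's actual tool: the ``func:NAME`` marker syntax, the local checkout path, the branch in the GitHub URL, and the input file name are all assumptions for illustration, and the definition lookup is a deliberately crude heuristic.

```python
# Hypothetical markdown preprocessor: turn ``func:SomeName`` markers into
# GitHub links pointing at the line where the C function is defined,
# based on scanning a local CPython checkout.
import re
from pathlib import Path

CPYTHON_DIR = Path("cpython")  # assumed path to a local CPython checkout
GITHUB_BASE = "https://github.com/python/cpython/blob/main"

def find_definition(name):
    """Return (relative_path, line_number) of the first plausible C definition."""
    # Crude heuristic: CPython's C style usually puts the function name at the
    # start of the line in a definition, e.g. "PyLong_FromLong(long ival)".
    pattern = re.compile(rf"^{re.escape(name)}\s*\(", re.MULTILINE)
    for c_file in CPYTHON_DIR.rglob("*.c"):
        text = c_file.read_text(errors="ignore")
        match = pattern.search(text)
        if match:
            line = text.count("\n", 0, match.start()) + 1
            return c_file.relative_to(CPYTHON_DIR).as_posix(), line
    return None

def preprocess(markdown):
    def replace(match):
        name = match.group(1)
        hit = find_definition(name)
        if hit is None:
            return f"`{name}`"  # leave as plain code if we can't find it
        path, line = hit
        return f"[`{name}`]({GITHUB_BASE}/{path}#L{line})"
    return re.sub(r"``func:(\w+)``", replace, markdown)

if __name__ == "__main__":
    article = Path("article.md").read_text()          # assumed input file
    Path("article.out.md").write_text(preprocess(article))
```

Rerunning something like this against a fresh checkout regenerates all the links, which is the point Anthony is making: the article can be refreshed as CPython changes.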
56:11 Michael Kennedy: That's awesome, I love it, that's a really really cool way to approach it. Alright, so I think we're going to have to leave it there for the guided tour, but I do have the two questions to ask you, of course, before I let you out of here and write some Python code. Maybe let's change it, if we're going to work on this project, right, where it's got Python and C, what editor are you going to use on it?
56:29 Anthony Shaw: Thanks to the nice people at JetBrains, I've got access to PyCharm and CLion. CLion is the C equivalent of PyCharm, so it's made by JetBrains and it's very similar. I use CLion for really deep debugging and PyCharm for exploring some of the Python code. And then, when I was working on Windows, I would use Visual Studio 2019. So I'd recommend either of those two stacks. You can use VS Code in both environments as well, but for the deep debugging and stuff like that, I found CLion was pretty good.
57:05 Michael Kennedy: Yeah, that's cool, I was definitely considering CLion to explore this as well. But I'm not going to compile it, I just want to walk through it and pull some stuff out, so VS Code. Super cool. Alright, and then notable PyPI package, what have you run across lately that you're like, oh, this is sweet?
57:19 Anthony Shaw: That's a hard one, actually. I'm working on a few at the moment. Right now I'm working on pytest support for Azure Pipelines. So if you search for pytest Azure Pipelines, and you're using Azure's new CI/CD service with pytest, then please check out the module. It basically automates how you run pytest and upload test results, and it gives you coverage automatically and stuff like that. So yeah, it's a really simple package that I put together, and it's ended up being referenced in the documentation, so it's become quite popular pretty quickly.
57:54 Michael Kennedy: That's super cool, so it's becoming officially part of the way to work over there, right?
57:58 Anthony Shaw: Yeah, and there's also now a plugin for the GUI. So in the Azure GUI, if you install this plugin, you can actually get all the pytest information there as well.
58:09 Michael Kennedy: Okay, that's awesome. People definitely need to check that out. Alright, final call to action, people are excited about CPython, you gave them some ideas about going back, adding some documentation, adding some tests, looking for unsolved bugs, things like that, what do you tell them?
58:23 Anthony Shaw: Oh yeah, have fun as well. You can add silly features, you can do experiments. Don't feel like you have to do something that might seem like a chore at first; just experiment, see what you can break, see what you can change. Run your own custom fork, and I think you'll learn a lot just by experimenting.
58:43 Michael Kennedy: Yeah, it's super easy to get started with the article you put together, so it's fun to experiment for sure. Alright, well, thanks, bye.
58:48 Anthony Shaw: Thanks, Michael.
58:50 Michael Kennedy: This has been another episode of Talk Python to Me. Our guest on this episode was Anthony Shaw, and it's been brought to you by Linode and the University of San Francisco. Linode is your go-to hosting for whatever you're building with Python. Get four months free at talkpython.fm/linode, that's L-I-N-O-D-E. Learn how to use Python to analyze the digital economy in the master's in applied economics at the University of San Francisco. Just go to talkpython.fm/usf to find out more. Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps course. Or if you're looking for something more advanced, check out our new Async course that digs into all the different types of async programming you can do in Python. And of course, if you're interested in more than one of these, be sure to check out our Everything Bundle. It's like a subscription that never expires. Be sure to subscribe to the show. Open your favorite podcatcher and search for Python; we should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much for listening, I really appreciate it. Now get out there and write some Python code.