
#179: Python Language Summit 2018 Transcript

Recorded on Sunday, Sep 30, 2018.

00:00 Michael Kennedy: The Python Language Summit is a yearly gathering of about 40 to 50 developers from CPython and other Python implementations and related projects. The summit is typically held on the first day of PyCon. Many of the decisions driving Python forward are made at this summit. On this episode you'll meet Mariatta Wijaya, Lukasz Langa, and Brett Cannon, three well-known core devs, who walk us through the major topics covered in this year's summit. This is Talk Python to Me, Episode 179, recorded September 26, 2018. Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities. This is your host Michael Kennedy. Follow me on Twitter where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm and follow the show on Twitter via @talkpython. This episode is sponsored by Linode and Rollbar. Please check out what they're offering during their segments. It really helps support the show. Mariatta, Brett, Lukasz, welcome to Talk Python.

01:11 Panelists: Hello. Hi. Thanks for having us.

01:12 Michael Kennedy: Yeah, it's great to have you. Brett and Lukasz, good to have you guys back, and Mariatta, welcome to the show for your first time. I'm excited to have you here.

01:19 Panelists: Thank you, yeah. I'm excited to be invited here, thank you.

01:23 Michael Kennedy: Absolutely, so we're going to talk about the Python Language Summit, which I think is not that well known and really, really important. So I'm really excited to talk about that on the show, but I want to give you all a chance to just introduce yourselves for folks who maybe didn't listen to some of the other shows or who don't know you. So Mariatta, let's start with you, just, you know, who are you, and what do you do, day-to-day, real quick.

01:46 Panelists: Yeah, my name's Mariatta, I work at Zapier building integrations for various apps. I work in the Partner Engagement Team, so basically, we're building more tools for our partners.

01:58 Michael Kennedy: Nice.

01:58 Panelists: For CPython I build lots of bots, lots of automations as well so, yeah.

02:04 Michael Kennedy: And you're on the core developer team, right? Very nice. Lukasz?

02:07 Panelists: Hello, I'm Lukasz. I work at Facebook on the Python team there, so we make sure to be using the latest and greatest that is available in a safe way and in a way that sort of enables us to use Python at our increasing scale and whatnot. I am contributing to Python, but you probably don't know this. What you do know is that I am the creator of Black, so this is what sort of everybody, I don't know, for some reason always knows about me, but not the other things! I do write PEPs from time to time. In fact, I'm working on one right now. But you will probably also not know about this because that PEP is very boring. It's just about how we're going to make decisions in the future, we'll see if it gets selected.

02:52 Michael Kennedy: I think that's actually a pretty big topic, actually.

02:53 Panelists: Oh yeah, it is. There's a few, sort of, I don't want to say competing, but, they are competing, in a sense. There are a few possible models of governance that we can end up with, and, the one that I'm working on is revolving around the community. It's assuming that there is not going to be another benevolent dictator and there's not going to be a very small council or triumvirate or however you call it. We're going to survey, look up to the best, like C++ standard committee, ah, no, kidding. Well, there's other projects that actually use this model like Rust, like ECMAScript, and whatnot. And, Python actually is, well, it's mature enough that, in my view, this model would actually ensure that we are addressing the interests of the widest sort of number of our users that way.

03:49 Michael Kennedy: Yeah, I think I read the PEP that you're working on and that's really great. I like the sort of exploration of the other options that are out there, it's really cool.

03:57 Panelists: Cool.

03:57 Michael Kennedy: You're also the release manager for 3.8 right?

03:58 Panelists: Yes, I am. So far it's looking like the most interesting release because it has a very, very controversial PEP implemented in it, and at the same time a rather boring release because now we cannot make any other decisions, so there's probably not going to be any more sort of fundamental features in it.

04:17 Michael Kennedy: It's quite ironic isn't it? Alright, very, very cool. Brett, how about yourself?

04:21 Panelists: Ah, yeah, so my name's Brett Cannon. I work at Microsoft as the dev lead on the Python extension for Visual Studio Code. In terms of my Python contributions, I've been a core developer since April 2003, so, I think that's over 15 years at this point. Unlike both Lukasz and Mariatta, I am not writing a governance PEP.

04:40 Michael Kennedy: Although your name has been thrown around in relation to that, I've heard.

04:43 Panelists: Yes, that's true but, yeah, another reason why I am not writing a governance PEP.

04:46 Michael Kennedy: Perfect, So Brett, why don't you kick us off with our main topic and just tell people what is this language summit that you guys have every year.

04:54 Panelists: Yeah, so, basically, we came to a realization at PyCon, feels like eons ago, that we, didn't really ever have a chance to sit around as a development team and talk things out. Some people would think the sprints were that option, and they were, honestly, early on, but what happened was is as Python's popularity grew, and PyCon's own popularity grew, we had more and more people coming to us during those sprints, saying, hey I want to get started, and contribute, can you help me out. Basically, it's made it such that, the sprints at various conferences are not a place for us to actually get work done, per se, as a team. So in order to make sure that happened, we carved out a day. It always happens at PyCon US during the first day of tutorials where we get together in a room, all day long, and basically, discuss things. Originally, it was kind of a round tabley kind of thing. It's slowly grown into a more structured, people come in with a presentation complete with a question or some specific reason why they're coming to talk about it and present to the room of core developers, and basically, it's just a good way for us to kind of hash out or have discussions about where we see things going. If we need a decision made it's easier to make a decision in the room versus trying to do it over the mailing list, et cetera, et cetera, and, I think it's been useful. Lukasz and Mariatta can answer that since they've been doing it... Pardon? As long as I think they've been core devs and they're going to be leading it starting next year so, they can have more insight on how they see it going forward.

06:26 Michael Kennedy: Yeah, sounds good, you guys want to add a little bit to what Brett said about the language summit?

06:30 Panelists: Sorry, you said, you guys, you mean, you all, right?

06:31 Michael Kennedy: Yes.

06:34 Panelists: Lukasz, perhaps you can say more just because I've only been to two language summits, so. Yeah, so there's several language summits, in fact, we don't always do the European one at EuroPython. The one in the US at PyCon US is, as far as I can tell, an annual event, like as far as I remember. So the first one that I actually witnessed was either in 2008 or in 2009 at EuroPython, so I have a totally different sort of view of what that means because the European one was run by Michael Foord, who had a really sort of, well I don't want to use the word, but it was a really happy, hippy approach to, let's just come on in and discuss things as we go. Everything sort of organically grew, like people overflowed with sort of ranty comments, but it was a very lively conversation. The language summit in the US, at PyCon US, has to be structured a little bit more because there's just way more of us in the room, so there is an agenda, there are talks, as Brett said and whatnot, but what I find very amazing about this is that people who are really some of the most active contributors and some of the most senior contributors to Python and related projects are there in the room so, it's something that even if you wanted, you just can't buy.

08:10 Michael Kennedy: Yeah, it seems like such an amazing gathering of people. Should you have to be a core developer to attend, or can anyone attend these?

08:16 Panelists: So, so far, at PyCon US, the rule has been as follows, if you are a core developer of CPython or any other alternative runtime, you are automatically in. If you are interested and you are in town that day, you can just sort of respond to the invitation, and there's going to be a spot for you. There's also a few other ways in which you can be present at the summit. If you are invited by a core developer to talk about a subject that is of interest to us, then obviously you can get in, or if you are a member of a project that is written in Python of some notability, right? Historically, we've had people from the Twisted project. We've had people from the Mercurial project and whatnot. Sometimes the scientific stack and whatnot, so, like definitely there's some form of deliberate exclusivity to the event but the point of it is mostly that we really have very little time in which we can actually see each other face to face and discuss things that are just way easier to be discussed when you see that other person, so we don't want this to devolve into sort of bikeshedding, or you know like story-of-my-life sort of event for any random attendee of PyCon.

09:44 Michael Kennedy: Yeah, that makes a lot of sense 'cause this is for you all to get together and really be productive and move the language and the runtimes forward, which I think is great. I guess the way I'd like to do this is, let's just go through some of the various sessions. There were quite a few sessions and there's a really nice write-up, actually, on lwn.net, which I'll link to the main write-up and then all the sub-articles about the sessions. And maybe we could just start there so not sure which one of you is most familiar with each session, so I'll let you all jump in. So the first one that I want to talk about there was something with regard to subinterpreters in Python, and I thought that whole concept of a subinterpreter was pretty interesting, and what was the story this time?

10:27 Panelists: I can take that, so that was presented by Eric Snow, a teammate of mine on the Python extension, and basically Eric was kind of giving a status update of what he's trying to do, which is basically, CPython itself currently has the concept of subinterpreters. It's used in the Apache web server, actually, for I believe, mod_python, I don't think, I'm not sure if mod_wsgi uses it, but basically, it's a basic way to run multiple interpreters in a single process but it's never really been built out, and so Eric viewed this as a potential opportunity to deal with a concurrency model by trying to make subinterpreters a first-class citizen of CPython itself. So the idea is basically you will potentially eventually have a module in the standard library that will allow you to actually create other subinterpreters in the process, send data across to those subinterpreters to then be worked on, send results back, and basically give you kind of a message-passing style of concurrency.

11:30 Michael Kennedy: Is it a little bit like multiprocessing, but not as heavy weight, in the sense that you don't share the data structures as much.

11:36 Panelists: Correct. Yep, exactly, and a lot of Eric's work has been actually, trying to tease out a lot of the global state that built up over the years by accident and trying to compartmentalize it better so that it's very obvious what is connected to a specific interpreter. So, regardless of where the subinterpreter work goes, it's been nice to kind of clean up that code to help centralize all the data structures and such.
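
Illustration (not part of the conversation): the message-passing style Brett describes can be sketched with standard-library threads and queues. This is only an analogy and makes no claim about what a subinterpreter module would look like; it does not use subinterpreters at all, and the names `worker` and `square_all` are hypothetical. The point is the shape of the pattern: send data in, let the other side work on it, and receive results back, with nothing shared except the channels.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive numbers from inbox, send their squares to outbox."""
    while True:
        item = inbox.get()
        if item is None:  # sentinel value: no more work is coming
            break
        outbox.put(item * item)


def square_all(values):
    """Send values to a single worker and collect the results in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for value in values:
        inbox.put(value)
    inbox.put(None)  # tell the worker to stop
    thread.join()
    # One worker reading a FIFO queue keeps the results in input order.
    return [outbox.get() for _ in values]


results = square_all([1, 2, 3])
```

Threads here still share one interpreter and one GIL; the subinterpreter proposal discussed in the episode aims for the same communication pattern with stronger isolation.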

12:02 Michael Kennedy: Yeah, that's really interesting. One of my first thoughts when I first saw this, I mean, you were talking about it in terms of concurrency, which is really interesting. My first thought went to compatibility, right? Could we take something that runs in Python 2 and somehow get it to stay closer to other code or is there any thought of that or is it more of a concurrency thing?

12:21 Panelists: It is very much a concurrency thing from our perspective. You know, Dropbox just did a blog post yesterday about how they transitioned from Python 2 to Python 3, and it sounds like they somewhat went down that road in their custom version of CPython, so it sounds like it'd be definitely a potential way of doing it but I think from Eric's perspective, with Python 2 hitting EOL January 1st, 2020, the amount of time he still has to put in to make it work isn't going to be worth his while to look into that but I'm sure someone could potentially make that work, with a lot of effort.

12:56 Michael Kennedy: We don't want to encourage that behavior, do we?

12:57 Panelists: No.

12:59 Michael Kennedy: This portion of Talk Python to Me is brought to you by, Linode. Are you looking for bulletproof hosting that's fast, simple, and incredibly affordable? Look past that bookstore and check out Linode at talkpython.fm/linode, that's L-I-N-O-D-E. Plans start at just $5 a month for a dedicated server with a gig of RAM. They have 10 data centers across the globe, so no matter where you are, there's a data center near you. Whether you want to run your Python web app, host a private git server or file server, you'll get native SSDs on all the machines, a newly upgraded, 200 gigabit network, 24/7 friendly support, even on holidays, and a seven-day money back guarantee. Do you need a little help with your infrastructure? They even offer professional services to help you get started with architecture, migrations, and more. Get a dedicated server for free for the next four months. Just visit talkpython.fm/linode. How about the next session that you all covered was modifying the Python object model. This one came from someone outside of the core developer team, right?

14:03 Panelists: Yes. That is somebody that I actually invited to the language summit. It's Carl Shapiro, who works currently with Instagram on making Python sort of handle the next billion users for the social network, like we've seen some tremendous growth over the years on Instagram. We're like extremely grateful to have this runtime and be able to run the social network on it. In fact, in the year where we switched from Python 2 to Python 3 we've seen some crazy feature growth and user growth at the same time and we actually switched the runtime version just sort of all in parallel, so that was pretty amazing.

14:47 Michael Kennedy: Yeah, I think that whole transformation is really a case study in how large organizations should do it. You guys did an awesome job there.

14:54 Panelists: Yeah so, this was a massive project, like there's a PyCon 20, what is it? 17 or 15?

14:59 Michael Kennedy: 17? I think it was 17, yeah.

15:01 Panelists: I think, yeah, like a keynote about this so I highly recommend it, it was a fun journey. However, like since then, like we've been seeing that we are paying some of the price for things that we don't have to pay the price for. At Instagram's scale, we're running many, many servers with the Django processes that actually serve users, so we observe that some of the things that the CPython interpreter is doing can be optimized away, or rather, we can change the interpreter in a way where those things can be sometimes precomputed before runtime, like it can be done like circa compile time, if you know what I mean, like for Python, right? Like obviously, we are running like out of .pyc files, which are created by the interpreter, so, whatever we can do there instead of every time that we start up the process would be nice. In particular, Carl has a lot of background on various implementations of virtual machines for dynamic languages, for example, the Dart VM, and whatnot, so he has plenty of experience to know like where we could actually find pieces that we can optimize. In particular, he presented ways in which, for example, we could use an observed fact, that most objects in Python don't actually change their shape during runtime. What that means in regular language is that when you're creating an instance at runtime in Python, Python doesn't have any expectation currently, whether this instance is going to grow or remove attributes later on in the life-cycle of the object. It's very dynamically sort of built in the sense that you can just change those things as you go. You can add things, not only to the objects, you can add things and remove things from the classes themselves, you know, like, freely,

17:08 Michael Kennedy: Right, yeah.

17:08 Panelists: like whatever you want, but, in practice what is happening is that a large majority of objects have all their state created in their dunder __init__ methods.

17:20 Michael Kennedy: Right, you could also use something like slots, and like lock them down, and make them more efficient, but people don't do that. They're not even actually recommended to do that, right?

17:29 Panelists: Yeah so, slots, in fact, is sort of an implementation of what we would be after. What we would be after is more akin to the concept of hidden classes in JavaScript that V8 and a bunch of other runtimes implement. What that means is instead of storing your attributes on the object as a hash table, right? Like in the sense that you can just add things and remove things but then the cost of retrieving every attribute is rather high. What you can do is you can understand what the shape of the object is, and use a more efficient data structure in memory so that it's no longer a hash table, it's simply a contiguous array that you can just index, which has a very nice property of, you know, having way faster access at runtime, and in fact, we experimented with stuff like this at Instagram that actually did win us some nice improvements in performance, so that change was suggested as something that could be, in fact, implemented in a future version of Python.
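
Illustration (not part of the conversation): a minimal sketch of the difference between an ordinary instance, whose attributes live in a per-object dictionary, and a class that declares `__slots__`, whose set of attributes is fixed up front. The class names are hypothetical, and this shows only today's opt-in mechanism, not the hidden-class optimization discussed in the session.

```python
class DictPoint:
    """Ordinary class: attributes are stored in a per-instance dict."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    """Fixed shape: only the attributes named in __slots__ can exist."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


d = DictPoint(1, 2)
d.z = 3  # allowed: the instance dict simply grows

s = SlotPoint(1, 2)
try:
    s.z = 3  # rejected: "z" is not a declared slot
    slot_accepts_new_attr = True
except AttributeError:
    slot_accepts_new_attr = False

slot_has_dict = hasattr(s, "__dict__")  # slotted instances carry no dict
```

Both classes set all of their state in `__init__`, which is the common case the speakers mention; the slotted version just makes that fixed shape explicit.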

18:36 Michael Kennedy: That's super interesting. Do you think some of these changes are going to make it into the Python and to CPython in the future?

18:43 Panelists: So, the thing that we need to be careful about is keeping Python what it is, right? So we don't want to become a bit more performant but way less dynamic, like Python's entire success is built around the fact that it's very hackable, right? Like you can change things as you go at runtime, and a lot of our day-to-day work is actually built on top of that frame of mind. For example, if you're mocking things in Python, right? Like during your unit tests and whatnot. This is a very dynamic feature so taking stuff like this away could potentially hurt our users, so we don't want to do that.
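
Illustration (not part of the conversation): a small sketch of why mocking depends on Python's dynamism. `unittest.mock.patch.object` temporarily replaces an attribute on a class at runtime and restores it afterwards. `PaymentClient` and `checkout` are hypothetical names made up for this example.

```python
from unittest import mock


class PaymentClient:
    """Hypothetical client whose real method we never want to run in tests."""

    def charge(self, amount):
        raise RuntimeError("would hit the network")


def checkout(client, amount):
    """Code under test: delegates to the client."""
    return client.charge(amount)


client = PaymentClient()

# Swap the method on the class for the duration of the block only.
with mock.patch.object(PaymentClient, "charge", return_value="ok") as fake:
    result = checkout(client, 10)
    fake.assert_called_once_with(10)

# Outside the block the original method is back in place.
try:
    checkout(client, 10)
    restored = False
except RuntimeError:
    restored = True
```

An interpreter that froze class layouts outright would make this kind of replacement impossible, which is the trade-off being described.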

19:25 Michael Kennedy: Right, a 10% increase in performance is not worth the loss of unit testing capabilities or massive rearchitectures.

19:33 Panelists: Absolutely, so what we want to do instead is, for the kinds of things that don't sacrifice user-visible functionality, we want to have optimizations that actually let us do things faster, do like, do less work! If you can do less work, everybody wins. However, we also have another constraint in CPython that is maybe not always visible which is that the runtime that we maintain has to be maintainable for the team that is actually in it for the long haul, right? So if you were to drop much more performant piece of runtime on our laps, we would thank you, but we would be also rather concerned about, what are you going to do with it now? If this is a very brittle and complex piece of code, we need to take into account that we need to release the next version of Python and then the next version of Python and there's different platforms that we're going to have to run on, in fact, just today, I fixed a bug in Black, where people noticed that, oh, we cannot run on Android phones!

20:41 Michael Kennedy: You're like, I didn't test it on Android phones!

20:44 Panelists: I mean, seriously, like are people formatting Python code on Android phones? Like that didn't occur to me as an important feature, but people really have needs like this and those needs change in time and whatnot, so the point I'm trying to make is that we do put a lot of weight on making sure that we understand the piece of software that we're maintaining, right? So there's definitely some compromise that we need to make there. We cannot bring in another million lines of code like from some benevolent contributor and then be left with maintaining that thing, so, yes.

21:22 Michael Kennedy: This goes all back to Brett's thing about open source expectations and you know, givin' away companies

21:26 Panelists: Oh, yeah!

21:28 Michael Kennedy: and things like that.

21:28 Panelists: Absolutely.

21:29 Michael Kennedy: But it sounds like there's some really interesting possible performance benefits, yeah, that's cool. The next one up is the Gilectomy, Larry Hastings' Gilectomy, the attempt to remove the GIL. Mariatta, you want to take this one?

21:40 Panelists: What I know is, doesn't seem like it's going to happen, but he, Larry had some issues with it, and I think he was looking to get inspiration from Carl's previous talk, but last time I checked from Larry, it doesn't seem like he's making any progress.

21:58 Michael Kennedy: Yeah, I think that's kind of the summary of this whole talk is like, I've been workin' on it, but it's not really goin' anywhere, right? I mean, the subinterpreter thing that Brett mentioned, part of that attempt is to break free of the GIL. We have multiprocessing, we have asyncio, which is not really subject to the GIL because of the way it works and things like that, at least if it's I/O bound, what you're awaiting. So, I don't know, I'm not sure if it's as necessary as it used to be but it sounds like it didn't really go anywhere, right?
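
Illustration (not part of the conversation): a minimal sketch of the I/O-bound case Michael mentions. The coroutines below only wait, standing in for network calls, so a single thread can overlap all three waits and the GIL is not a bottleneck. `fake_io` is a hypothetical stand-in, with `asyncio.sleep` playing the role of real I/O.

```python
import asyncio


async def fake_io(name, delay):
    """Pretend to wait on the network, then return a label."""
    await asyncio.sleep(delay)
    return name


async def main():
    # The three waits overlap; gather returns results in argument order,
    # regardless of which coroutine finishes first.
    return await asyncio.gather(
        fake_io("a", 0.03),
        fake_io("b", 0.02),
        fake_io("c", 0.01),
    )


results = asyncio.run(main())
```

CPU-bound work would not benefit in the same way, which is why projects like the Gilectomy and subinterpreters were being explored.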

22:26 Panelists: No.

22:28 Michael Kennedy: That's too bad, but it's good to... I think it's a cool project and I really hope he makes progress but it doesn't sound like it, right?

22:34 Panelists: Right.

22:34 Michael Kennedy: Alright, so that was, a pretty short update, but that's cool. Why don't you take the next one as well because did you actually do this presentation?

22:41 Panelists: Yes so, I proposed that we started using GitHub Issues instead of Roundup which is bugs.python.org, so BPO, bugs.python.org, is an instance of Roundup that we've been maintaining and it's been working for us, but I think we could, our workflow could be better if we started using GitHub...

23:04 Michael Kennedy: Yeah, Brett actually,

23:04 Panelists: Issues just because...

23:05 Michael Kennedy: had helped drive the project to move the CPython code, so the code is on GitHub already, but the issue tracking had been somewhere else, so you were like, well, why not just, in GitHub, where it is, right?

23:18 Panelists: Exactly, and everything else in Python uses GitHub issue trackers except CPython, like our PEPs are in GitHub, our dev guide, workflow issues, everything is in GitHub already so, I think it's good to start exploring using GitHub for CPython as well.

23:40 Michael Kennedy: I think it's a great idea, I think it's, absolutely good thing that CPython code is on GitHub. I know like it's just a source control system but I think it opens up people's willingness to participate and interact with the code way more for being there.

23:54 Panelists: Yeah, and personally I found it always odd that I have to jump from one interface to another, like looking at the issues in Roundup, which looks very different, and then I have to go to GitHub to create my pull request. For me it's distracting, so I think it's one benefit that we get this unified experience and don't have to jump from one place to another, and in fact I've started writing a PEP for it, PEP 581, so that's our plan for starting to use GitHub Issues, and we got to discuss more about this during the sprint a couple weeks ago. Ezio actually went around and asked all the other core developers who attended whether they would like to start using GitHub Issues and most people are okay, like at least not totally opposing it, so I think we might. There's no decision yet because we don't have a BDFL to pronounce on this PEP but...

24:57 Michael Kennedy: Yeah, that's another session we're comin' up on shortly, Brett, since you were so involved in GitHub, what do you think about this?

25:03 Panelists: The main reason, when we moved over to the Git repo, we didn't move to Issues was two-fold. One was, moving the repo over itself was a good amount of work just because we were moving from Mercurial to Git on top of hosting and all that, but also there was some initial knee-jerk pushback, like, don't change too much underneath me. I basically just only had so much energy in the world. 'Cause I mean the GitHub transition took two years, like, people don't realize this. It took about a year of discussion and a year of actually making it happen, so, I can only imagine how much more time it would have added to do it.

25:38 Michael Kennedy: And how much potential pushback, right?

25:40 Panelists: Well exactly, and I just, only have so much time in the day to deal with pushback, and I got buy-in on the Git stuff, so I decided not to press my luck. I think there's definitely possibilities for having an improved workflow, like a lot of the work Mariatta has shown the team is possible using bots through Miss Islington and the stuff she and I have done with Bedevere really show that a lot of workflows can get automated and made fairly clean. I do know there are some core developers, especially ones that have been around for quite a while, who are kind of attached to Roundup in terms of certain feature sets, so my suspicion is if we can add the missing features that they have latched onto on Roundup and somehow mirror them as appropriate or find an alternative that gives them the same result that they're after, I think it's definitely a good idea. In general, I am for it because I live in GitHub already for work, and as Mariatta said, everything else is over there already, so I'm definitely at least a plus zero to a plus one on this.

26:39 Michael Kennedy: Perfect. That sounds good. So Lukasz, the next one was shortening the Python release cycle. I suspect that might've had some input from you.

26:48 Panelists: Yeah, cool.

26:48 Michael Kennedy: Tell us about that one.

26:49 Panelists: I still do have this idea to like attempt to talk about this at the next language summit, and maybe we're going to get somewhere there with it, but for now I postponed it at least for 3.8 because currently there is nobody to work with on actually making a decision about it or not. So what are we trying to decide? What was the idea about? Well, Python has a release schedule that is currently 18 months, right? So every 18 months, every year and a half, we're going to get a new, major release of Python, like you know 3.7 was just out

27:28 Michael Kennedy: June, yeah.

27:28 Panelists: in the year now, so yeah, like you know, add 18 months, and you're going to get a new release of Python on late next year.

27:34 Michael Kennedy: That'll be exactly when Python 2 goes out of support right, pretty much.

27:40 Panelists: Almost, yes, I'm pushing this a few weeks here and there, just so that we can have some more interesting cadence in terms of doing sprints during PyCon US, and maybe having our annual core sprints somewhere else, so we try to have those events productive, so we're not super tied to actually having the release exactly every 18 months, like we always have some leeway. For example, when we released Python 3.6, we had a core sprint at Facebook, which was just coinciding with the beta 1 release. Beta is when we stop working on new features. We were saying, hey, this is the release that we're going to have, now let's just fix all the bugs in things that we already added. We're not adding any more, we're now stabilizing the release. So, originally that beta was supposed to happen like the week before the sprint, but that would make the sprint very, very boring, and rather sad, you know? So I talked with Ned, who was the release manager for 3.6 and 3.7, about just pushing it just a week and a half into the future so that we can actually sort of rally up during the sprint and finish up all the fancy features that maybe are just almost ready but not quite. In fact, as far as I can tell, that was the most productive week in the project's history, even up to today. We're trying to do this every time pretty much now, but why shorten the release cycle? Well, the problem with having a release every 18 months is that it builds up this mode of work where most of the time spent on the release is just vague ideas, like having things implemented then left in a state of, there are things that still need to be done, and then the few weeks before the beta cut is sort of aggressive sprinting on making sure that we actually make it in time to meet the deadline and have our features shipped. 
But this doesn't really create a very healthy rhythm of development in my experience, and actually working at Facebook I learned that releasing early and releasing often works very well. It does have a lot of good features, like it does decrease the size of our release, right? If we released, say, every six months, that would mean that Python 3.8, 3.9, and 3.10 would have way fewer differences between them, so upgrading between those releases would also be easier for the user. There is a price to pay, right? Like now your tox matrix on your open source project is way bigger because there's more versions of Python, it pushes more work onto the release managers, and so on and so on, so definitely something that people need to agree with, but I still believe that we would be better off having smaller releases that are released more frequently.

30:50 Michael Kennedy: That sounds good to me, especially the year thing. You could schedule the release to be in the fall or something so you always have at PyCon sort of a sprint before the beta period closes, things like that, right?

31:04 Panelists: For example, yes.

31:04 Michael Kennedy: Now it's kind of, it's out of phase, basically, every other year, or something weird like that, right, so yeah, I think it makes it pretty predictable. The major tech conferences are in May and June. We know that Python releases in the fall. It was somethin' like this, right, like that's just, how it might go. This portion of Talk Python to Me has been brought to you by Rollbar. One of the frustrating things about being a developer is dealing with errors, ugh. Relying on users to report errors, digging through log files, trying to debug issues, or getting millions of alerts just flooding your inbox and ruining your day. With Rollbar's full-stack error monitoring, you get the context, insight, and control you need to find and fix bugs faster. Adding Rollbar to your Python app is as easy as pip install rollbar. You can start tracking production errors and deployments in eight minutes or less. Are you considering self-hosting tools for security or compliance reasons? Then you should really check out Rollbar's Compliant SaaS option. Get advanced security features and meet compliance without the hassle of self-hosting, including HIPAA, ISO 27001, Privacy Shield, and more. They'd love to give you a demo. Give Rollbar a try today. Go to talkpython.fm/rollbar, and check 'em out. Next one is about unplugging old batteries, right? Python is said to come with batteries included, that means it has a rich standard library, and all sorts of other stuff you can use, but every now and then those go out of fashion. People stop using them, like what do you do then, right? Do you still have to maintain them? So who wants to take this one?

32:38 Panelists: I'll take it 'cause it's been, I've been ruminating on this in the back of my head for, feels like a decade now.

32:44 Michael Kennedy: Sounds good.

32:44 Panelists: Yeah. So, this was a presentation by Christian Heimes, I'm going to butcher Christian's last name, I'm so sorry, where he basically has a rough draft of a PEP, where he has suggested some modules that we can potentially remove from the standard library, and we've done this once before in the large movement when we went from 2 to 3, where I actually personally went through the modules and deprecated and removed them. And the main reason we do this on occasion is that some things just turn out to not be useful anymore. For instance, a good example is moving from Python 2 to 3, we got rid of the gopherlib module because, who runs Gopher anymore? Heck, who even knows what Gopher is, right? What? I said I know what Gopher is. But the key thing here is, there is a burden of cost, of maintenance, for every single module we have in the standard library. Even if it's just sitting there, there's still the cost of updating it, for instance, we just made async a proper keyword in 3.7. That means any module that was using async as a variable name had to be updated. There's potential bug reports, there's feature requests. Even if all those were ignored they still take time because for instance when you go through and triage issues, you still have to see that it's an issue, you still have to read it, you still have to choose not to do it. So there is a time sink regardless of how much maintenance you actually put into it. So I personally have always wanted to kind of potentially scale back the number of modules that we have in the standard library because I don't think a lot of people realize we literally have hundreds. I wrote a really quick script the other day to count the number of .rst files in the library directory whose names had only letters, numbers, or underscores, and the count was 248, so that's a rough count of how many top-level modules there are in the standard library.
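The quick count described here could look something like the sketch below. It is not the script that was actually run; the function name `count_module_docs` is made up for illustration, and the example path assumes a local CPython checkout where the library docs live under `Doc/library`.

```python
import re
from pathlib import Path

# File stems made up only of letters, digits, and underscores, e.g. "argparse".
# Dotted names such as "os.path" are skipped, which roughly approximates
# "top-level modules only".
MODULE_NAME = re.compile(r"[A-Za-z0-9_]+")


def count_module_docs(library_dir):
    """Count .rst files in library_dir whose stem looks like a top-level module."""
    return sum(
        1
        for path in Path(library_dir).glob("*.rst")
        if MODULE_NAME.fullmatch(path.stem)
    )


# Hypothetical usage, assuming a CPython checkout in ./cpython:
# print(count_module_docs("cpython/Doc/library"))
```

This is only a rough measure: any non-module page whose file name happens to match the pattern is counted too, which fits the "rough count" caveat in the discussion.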
I mean, that's not an insignificant number, and you have to remember that there are only 93 core developers in total, and over the past year only 46 people across the globe have submitted a PR that got merged. So there's definitely a maintenance burden here where the amount of code to be maintained is not in a good ratio to the number of people who are able to help keep it going, so, Christian was basically saying, we've got some modules we need to get rid of. A good example from Christian's PEP is, how many modules do you think there are specifically for processing sound files in the standard library? So there's three.

35:16 Michael Kennedy: There should be one. No, there's three.

35:18 Panelists: Yeah. There's three, there's the aifc module for AIFF and AIFC files, there's the sunau module, and then there's the wave module,

35:28 Michael Kennedy: Wow.

35:28 Panelists: and, exactly. So, why do we have this, why do we even need this kind of thing in the standard library? So Christian's point was, there's some things that literally just don't have a purpose anymore because they were put in back when, basically, we said yes to anything 'cause there was no PyPI and there was no distribution model except CPython itself.

35:47 Michael Kennedy: Right, now you can pip install the tools you need to work with MP3s, or whatever it is, and why does that need to be part of the standard library, right?

35:55 Panelists: Exactly, and we've never had a discussion as a group to really come up with a set of guidelines for what should or should not be in the standard library, and unfortunately as we keep alluding to, lack of governance means there's no one to have that discussion with at the moment, but it is something we're going to probably have to decide, do we want to stay as heavily batteries included now that PyPI exists and is so solid? Do we want to scale it back and be more targeted towards what a potential script writer needs to maintain their machine, but as soon as they go past anything standard they're just going to be expected to go and get it from PyPI, for instance, like...

36:30 Michael Kennedy: Yeah, that's an interesting idea.

36:32 Panelists: Like I don't think for instance we would add argparse necessarily today. We could've stayed with getopt or something simple and said, if your command line interface is going to get that complex, you need like git-level subcommands and stuff, there's probably going to be something better developed on PyPI that you're going to want to grab versus something we should ship in the standard library that we have to maintain. And this is also by the way why we don't have requests in the standard library, 'cause it develops faster, but also pulling in requests would also mean pulling in urllib3, and it just, once again, goes back to maintenance. How do we keep this sustainable and have everyone keep up code quality while not burning out from the fact that there are nearly 200 open issues against argparse alone?
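For readers who have not used them, here is a minimal sketch of the kind of git-style subcommands being described, written with the standard library's argparse. The program name `tool` and the `commit` and `push` subcommands are invented purely for illustration.

```python
import argparse


def build_parser():
    # "tool" is a made-up program name for this sketch.
    parser = argparse.ArgumentParser(prog="tool")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Each subcommand gets its own parser with its own options.
    commit = subcommands.add_parser("commit", help="record changes")
    commit.add_argument("-m", "--message", required=True)

    push = subcommands.add_parser("push", help="send changes to a remote")
    push.add_argument("remote", nargs="?", default="origin")

    return parser


args = build_parser().parse_args(["commit", "-m", "fix typo"])
print(args.command, args.message)
```

Supporting this sort of nested interface is part of what makes argparse a much larger maintenance surface than a simple option parser such as getopt.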

37:11 Michael Kennedy: Right, and the release cycle as well being 18 months means, a new HTTP thing comes out, right? It would take forever to get that actually into Python's HTTP library, which would be weird.

37:24 Panelists: Yeah, so basically Christian's just saying we need to evaluate what's in there right now, and make a decision on what really should still be there and what maybe should go out, and he's fairly conservative, as Lukasz has said, we have not had a conversation about general guidelines for what should qualify something for being in the standard library.

37:40 Michael Kennedy: Yeah, it makes a lot of sense to me, like I have no expectation that I can write a proper web app without going to PyPI and using pip, right, it just doesn't make sense. But there's obviously the compatibility thing, right? You don't want to rip out things that someone might be using and break them, so, definitely worth considering. Another thing that Christian talked about was Linux distributions and Python 2, and with the end of life of Python 2 coming, how are distributions that ship Python 2 going to deal with that? So let's keep moving 'cause I think we have some that are more relevant, but I think people can read up on that, that's pretty interesting. Lukasz, maybe you should take this one 'cause I know you do so much with static typing, the next session was an update on Python static typing.

38:18 Panelists: Static typing is still very much in progress, there's a bunch of things that are happening with it, like one of the things that was discussed was, how should we package external typing information for projects which are themselves not typed yet? Or if they want to be typed now, how do we expose this information to a type-checker like mypy? In fact, this is one of the reasons why TypeScript and its ecosystem is so successful, because the equivalent there, the DefinitelyTyped website and the related integration with npm that provides the types, has proven very successful. So Ethan has been working on a PEP about this, PEP 561, and as far as I can tell it's already landed, and PyPI already has the required support for it and whatnot. So in the future we are going to be able to slowly move away from the model where typeshed, the repository and the library shipped with mypy and other type-checkers, is the sort of library of the collection of types for the entire world. That's obviously very hard to maintain, that's obviously very hard to do well, so we would like to decentralize where the stubs are held.
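To make the packaging idea a little more concrete: PEP 561 lets a package that ships its own type information mark itself with a `py.typed` file, and lets separately distributed stubs live in a package named `<name>-stubs`. The helper below is a simplified sketch of checking for those two conventions. The function name `typing_support` is hypothetical, and real type checkers do considerably more than this (search order, partial stubs, and so on).

```python
from pathlib import Path


def typing_support(site_packages, package):
    """Describe how `package` exposes type information under site_packages.

    Checks only the two PEP 561 conventions: a `py.typed` marker file inside
    the package, or a sibling stub-only package named `<package>-stubs`.
    """
    root = Path(site_packages)
    if (root / package / "py.typed").is_file():
        return "inline"
    if (root / f"{package}-stubs").is_dir():
        return "stub-only"
    return "untyped"
```

Either way, the type information arrives through the normal install process rather than having to be added to one central repository.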

39:44 Michael Kennedy: Yeah, that makes a lot of sense, like, shipping them with say, when you pip install, if I say pip install SQLAlchemy, it would be great if the stub file that defined the types just came as part... That just landed on my file system, I didn't have to think about it.

39:58 Panelists: Yeah, so there are a few other updates, I want to be short now, so like I don't actually steal the entire episode, but there's a number of other PEPs that were typing-related, like PEP 560 moved some of the typing functionality into the core interpreter so that it's now way faster. PEP 563, very dear to my heart, was about making the typing module a bit more usable in terms of just, I don't know, aesthetics, right? Like enabling things like forward references for classes, which means you can actually use a type of a class within that class or before that class is even defined or whatnot. And the most interesting probably is 544 which is protocols, so like duck typing for static type checkers. This was a rather long presentation, it was not one where I wanted to stop listening, so you know, we can do an entire episode on that alone.

40:57 Michael Kennedy: Yeah, absolutely, with the idea of protocols being, that's kind of the way that Python's type system works. As long as I can call this function and it has this field, like, we're good, and so, this is like, bringing that into the type annotations world. Well, whatever you pass here, it has to have this attribute and this method, right? Something to that effect, yeah, nice.
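As a small illustration of "it has to have this attribute and this method", here is a sketch using `typing.Protocol`, which is available from Python 3.8. The names `Closeable`, `TempFile`, and `shut_down` are invented for this example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Closeable(Protocol):
    # Anything with a `name` attribute and a `close()` method matches.
    name: str

    def close(self) -> None: ...


class TempFile:
    # Note: TempFile does not inherit from Closeable.
    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shut_down(resource: Closeable) -> str:
    resource.close()
    return f"closed {resource.name}"


print(shut_down(TempFile("scratch.txt")))
```

A static type checker accepts the call because `TempFile` has the right shape, not because of any declared relationship. The `runtime_checkable` decorator is optional and is only there so that `isinstance` checks also work.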

41:15 Panelists: Yes, exactly.

41:15 Michael Kennedy: There was a discussion on whether virtual environments are serving us well, especially around teaching. Brett, this one actually features Steve Dower, your colleague at Microsoft, so tell us about it.

41:28 Panelists: Yeah so, I believe there's actually already a PEP up now for the proposal. Basically, if you follow certain people on Twitter, one of the common issues they have is trying to explain virtual environments to beginners, and part of that's because pip automatically installs to your global environment, your user environment, by default. There's a long-running issue on pip to change that default.

41:52 Michael Kennedy: That would be nice.

41:52 Panelists: Yeah. Feel free to comment on that issue, I'm sure they'd love it. Basically, virtual environments are really what everyone wants but it's an extra level of work to try to explain how that functions. Steve Dower and Kushal Das and various other people got into a conversation over this, where the idea of more or less having something equivalent to the node_modules directory came up, proposed as dunder pypackages (__pypackages__), and basically the idea is your dependencies should be installed in the local directory, and Python would more or less add that to your sys.path by default, and you would teach tools like pip to just install in there when appropriate, and that's it. It's a fairly simple concept. Right now, once again, no governance means it hasn't been discussed beyond getting the PEP written, but probably the discussion's going to end up being whether that is enough to want the added complexity of having yet another way to specify where your dependencies can live, and does it solve enough of the problem?
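The proposal is for the interpreter to do this on its own, but the effect can be imitated by hand to get a feel for it. The sketch below is an emulation only: the function name `add_local_packages` is made up, and the `__pypackages__/X.Y/lib` layout is an assumption for illustration rather than something any Python release does by default.

```python
import sys
from pathlib import Path


def add_local_packages(project_root):
    """Put <project_root>/__pypackages__/X.Y/lib at the front of sys.path.

    X.Y is the running interpreter's major.minor version. The directory is
    only added if it exists and is not already on sys.path.
    """
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    local_lib = Path(project_root) / "__pypackages__" / version / "lib"
    if local_lib.is_dir() and str(local_lib) not in sys.path:
        sys.path.insert(0, str(local_lib))
    return local_lib


# Hypothetical usage at the top of a script in the project directory:
# add_local_packages(Path(__file__).parent)
```

Compared with a virtual environment there is nothing to create or activate, which is the appeal for teaching. The open question raised in the discussion is whether that is worth having yet another place where dependencies can live.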

42:59 Michael Kennedy: Right, are you just changing one hard way to explain things for another and then making it worse by having pipenv, pip, this other way, virtual environments, it's like, ah! Can't take it anymore.

43:10 Panelists: Exactly, so, that's going to be the real question, I think, is, does it solve the case for enough people to warrant the cost of introducing yet another way to do this, because virtual environments won't go away because they do serve a purpose. It's whether or not this will simplify it for 90, 95% of the world, and that last 5% are going to be advanced users who will know how to use virtual environments. Or do we just have to push the tooling to get better to make it easier to work with virtual environments as is, such as pipenv or something else.

43:41 Michael Kennedy: Yeah, I don't know, exactly what the answer is.

43:42 Panelists: It's a tough question.

43:45 Michael Kennedy: I feel like no matter which of the options you choose they're not quite there, they're close. I just use virtual environments straight up with pip, and it's all good, the venv module, but still, I hear you, it's very interesting. Okay, so I think we have time for just two more. We're gettin' close to the end of the time we have together, so the next one, I'll let whoever wants this take it. This is PEP 572, that was the big in-line assignment PEP that caused a lot of grief, and Guido stepped down over, and now what are we going to do in terms of decision-making, and it sounds like almost everything we've talked about hinges upon this, who wants it?

44:25 Panelists: Well I see everyone's leapin' up to take the lead on this discussion.

44:29 Michael Kennedy: This one's a hot potato man, no one wants this, alright, so, yeah.

44:31 Panelists: I can do it, it's fine. So basically it was a discussion, this was, so, the language summit happened in May. Guido stepped down, I believe in July, so this all predated Guido taking his indefinite vacation as BDFL. And basically it was a discussion of, what can we do going forward to manage our PEP process better, because PEP 572, which is assignment expressions, was, as Michael alluded to, a very hot topic, where a lot of things were discussed on python-ideas and clearly it got resolved and it was actually a reasonable discussion, and then it went to python-dev, as part of the PEP process, and all of it got rehashed yet again. And actually it got a little, I don't want to say pushy is quite the right word, but it got a bit vehement, and some people were making rather grandiose statements, like at one point I think someone, like a core dev, said, if this gets merged I will refuse to review any PR that uses this feature. Like it got really, I personally think, overblown, which is partially why Guido just said, "I never want to have to fight this hard to defend my decisions ever again, I'm taking vacation." So this was a discussion of what can we do to try to get the PEP process to not be so burdensome while keeping it, obviously, for our needs of recording history, making sure we don't have to have discussions about every single solitary suggestion someone has, and try to find a good balance of bloat versus basically proper gatekeeping to make sure we don't feel overloaded.
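For anyone who has not seen the feature at the center of that debate, an assignment expression binds a name and yields the value in a single expression, using the `:=` operator that shipped in Python 3.8. The regular expression and the function name `parse_version` below are made up for illustration.

```python
import re

VERSION = re.compile(r"Python (\d+)\.(\d+)")


def parse_version(text):
    # Bind the match object and test it in one expression, instead of
    # assigning on one line and checking on the next.
    if (match := VERSION.search(text)) is not None:
        return int(match.group(1)), int(match.group(2))
    return None


print(parse_version("Running Python 3.8 here"))
```

The same thing can always be written as a plain assignment followed by an `if`, which helps explain why opinions on whether the new syntax was worth adding were so divided.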

46:03 Michael Kennedy: Yeah, that's definitely important.

46:04 Panelists: I guess, just to add, I think it's also something we're all looking into improving in our governance PEPs. We want to try to look into how the PEP process has been going in the past and whether there are things we can improve on how to make decisions, as well as, where should discussions happen? Those are also being discussed in the governance PEPs.

46:31 Michael Kennedy: Yep, okay, that's an important one. Alright, Mariatta, I'm going to give you the last one because we are out of time but we got about three minutes to talk about a pair of presentations one of which you gave about mentoring and diversity in Python. Want to give us the wrap on that?

46:46 Panelists: Yeah so basically, we've been trying, the whole core team has been trying to improve our diversity. If you remember a few years ago, there were no women at all in the language summit itself, but this year we actually have a few, and we're trying to improve it and be better. I actually consulted with Sage Sharp on getting ideas on how we core developers can help improve diversity among our own contributor team, and I shared some ideas. I got some ideas from Sage and shared those with the core developers, so we've been doing some things. I know Guido himself mentored a few women who now are core developers. I know Victor himself is looking. He told me he's looking for women to mentor, and we have set up office hours over in Zulip, and I know Zach Ware has a calendar where people can sign up to get one-on-one mentorship with him. All of these are...

47:57 Michael Kennedy: That's really cool.

47:59 Panelists: ways we are trying to lower the barrier to contributing and welcome more underrepresented people.

48:08 Michael Kennedy: I would love to see more diversity in the core developer team, but also it's just good for the community overall I think. Can we make sure we put those links into the show notes, like if you help me get those links, so people out there who may want to contact these folks to get mentorship, or are considering it, can do so?

48:26 Panelists: For sure, yes, so we do have the core mentorship mailing list, I will share, and if you look at the dev guide, there is a link for core developers office hours there, so I will share the links.

48:36 Michael Kennedy: Great, thank you. Alright, I think we have to leave it there because we're on a schedule, can't just talk forever, so thank you all for being here, Mariatta, Lukasz, Brett. Thank you for sharing this whole experience, and I'm looking forward to hearing about the 2019 language summit when it all happens.

48:52 Panelists: Thank you so much. Thanks Michael.

48:53 Michael Kennedy: You bet. Bye, everyone. Bye. This has been another episode of Talk Python to Me. Our guests on this episode have been Mariatta Wijaya, Lukasz Langa, and Brett Cannon, and this episode has been brought to you by Linode and Rollbar. Linode is bulletproof hosting for whatever you're building with Python. Get four months free at talkpython.fm/linode, that's L-I-N-O-D-E. Rollbar takes the pain out of errors. They give you the context and insight you need to quickly locate and fix errors that might have gone unnoticed until your users complain, of course. As Talk Python to Me listeners, track a ridiculous number of errors for free at rollbar.com/talkpythontome. Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps, or our brand new 100 Days of Code in Python, and if you're interested in more than one course, be sure to check out the Everything bundle. It's like a subscription that never expires. Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, Google Play feed at /play, and direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much for listening, I really appreciate it. Now, get out there and write some Python code.
