#100: Python past, present, and future with Guido van Rossum Transcript
00:00 Michael Kennedy: Welcome to a very special episode. This is the 100th episode of Talk Python To Me. It's the perfect chance to take a moment and look at where we've come from and where we're going, not just with regard to this podcast, but for Python in general. And who better to do this with than Python's inventor himself, Guido van Rossum (@gvanrossum)? In this episode we discuss how Guido got into programming, where Python came from and why, and Python's bright future with Python 3. This is Talk Python To Me, episode 100, recorded January 18th, 2017. Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython. This episode is brought to you by Rollbar and Hired. Thank them both for supporting the show. Check them out @Rollbar and @Hired_HQ on Twitter, and tell them "thank you." Hey everyone. I just wanna take a moment and reflect on this milestone of 100 episodes, and say a big thank you to everyone out there who is listening. The reason this podcast is successful, the reason I've kept doing it is because so many of you tell me that you appreciate what I'm doing, that you enjoy all the guests that I have on the show, so I wanna say thank you because without you, obviously, I would not have 100 episodes. I get to live basically my dream job. I talk to all these brilliant people in the tech industry, and I get to share it with you, and get all the great feedback that so many of you give me. So thanks, and we really have a special guest this episode with Guido and a look at Python over the years, the past, present, and future, and I really hope you enjoy that. If you're out there thinking, "Hey, I really love the show and I'd like to support it,"
there's a couple of things you can do that are really easy. One, if you give us a review on iTunes, that actually makes a big difference in how we rank within iTunes, so that's kind of like Google rank for podcasts. That would be great. If you wanna do more than that, one of the really great ways to support me and what I'm doing would actually be to buy one of my classes, or recommend my training content to your employer or your team at work. If you're into that, check that out at training.talkpython.fm, and there's even a little Patreon link if you just wanna give a dollar or two a week. So, thank you everybody for making this possible. Thank you for helping me reach 100 episodes. It's been great to share them with you, and I hope you've really enjoyed them yourself. All right, with all that said, here's Guido. Guido, welcome to Talk Python.
03:04 Guido van Rossum: Glad to be here, Michael.
03:05 Michael Kennedy: I'm honored that you're coming on my 100th show to celebrate this special episode, and I know everyone in the community is really going to appreciate this look at the history and the present and future of Python with you.
03:18 Guido van Rossum: Well, let's have it.
03:19 Michael Kennedy: All right, absolutely. So, we're gonna dig into a whole bunch of things about Python, but before we get there, let's start with your story. How did you get into programming in the first place?
03:27 Guido van Rossum: Well, that was a long time ago. In high school, I did not know what a computer was. I believe I had not even ever heard of the word. I was an electronics hobbyist though. And I got started I think around the age of 10, building very simple analog circuits, things like a little radio receiver that I made from a kit. And I gradually discovered simple digital electronics, and integrated circuits were becoming available to hobbyists like me, even with my very small amount of pocket money. And that's where I was when I graduated from high school. Then I went to university to study mathematics, at the University of Amsterdam, and it was like a completely different world. They had a mainframe in the basement, and there were programming classes, and the languages that I remember were ALGOL 60 and Pascal, and I was basically instantly hooked, even though I... The first year, I remember, the only way I could input programs to the computer was through punch cards.
04:29 Michael Kennedy: Yeah, if you start programming basically via hardware, and then you have this machine that you can just feed anything, the whole world opens up, right? Even if it's punch cards, it's like it can do anything I asked it to do almost, right?
04:42 Guido van Rossum: I was so happy that I didn't have to sort of solder stuff together anymore.
04:47 Michael Kennedy: That's right.
04:48 Guido van Rossum: Because that was always my weak point.
04:51 Michael Kennedy: Oh, it's such a difference. So you guys started really in the early days, and when the term hacker meant something entirely different, right?
04:58 Guido van Rossum: I don't think I knew the word hacker. That was like decades later that people told me I had been a hacker.
05:06 Michael Kennedy: "Oh, I didn't know, how interesting." Okay, excellent. And then you got into... You started working on programming languages. How did you go from playing with mainframes and punch cards to working on things like ABC and stuff?
05:19 Guido van Rossum: Well, I guess I developed an interest in learning different programming languages. That probably started out when, in my first year, there were two different languages being taught I think, ALGOL and Pascal. And then I hung out with a bunch of physics students, and their favorite language was Fortran. So, right from the start, there was this discussion about "ALGOL!" "No, Fortran!" "No, ALGOL!" "Pascal!" And somehow that interested me, and I always stayed on the ALGOL and Pascal side actually.
05:52 Michael Kennedy: I was told that, when I was in college, that Fortran was the most important language I'd ever learn in my career, and I should just focus on that. And I was pleading to take some C and C++. I'm like, "Can I do it?" "No, Fortran is where you gotta start. You can do those as an elective afterwards."
06:07 Guido van Rossum: Oh gosh, yeah. In our university, it was more like a split between the math department and the physics department. The natural sciences were using Fortran because they were sort of processing measurements, and math people were more interested in sort of pure computer science.
06:25 Michael Kennedy: Yeah, very interesting, yeah.
06:26 Guido van Rossum: The professors actually sort of crossed that bridge. So, later, actually, I encountered the author of ABC, Lambert Meertens. I encountered him in my personal life, sort of an extracurricular activity where I was helping some volunteer group doing programming, and he was also helping that same group. He was something higher in the organization. And he realized that I was a good programmer and had interest in programming languages and their design and implementation, and had some interesting skills there. And when I was about to graduate, he just offered me a job, and that was that.
07:07 Michael Kennedy: That's fantastic, right? You're probably thinking, "How am I gonna find a job? Where am I gonna apply? What am I gonna do? Oh, it just landed in my lap, how wonderful."
07:16 Guido van Rossum: And then I learned really what the job of a language designer and implementer is. I didn't get to design any part of ABC. I got to discuss it with Lambert and other team members endlessly, but basically the design was already complete when I joined the team, and all they needed was someone who would implement it. But by fighting every aspect of the language that I didn't understand, I prompted Lambert and others to explain what their reasoning process in the language design phase was, and that helped me learn how to be a language designer.
07:55 Michael Kennedy: Yeah, I'm sure it did, because they designed it more or less in the abstract, right? And they said, "All right, now it's your... now it's time for the rubber to hit the road." And you actually make this thing work that we've spec'd out, right?
08:06 Guido van Rossum: Yeah, much like that.
08:07 Michael Kennedy: Yeah, and how much of that experience do you feel made it possible for you to actually create Python? To me, thinking of, "I'm gonna create a language, I'm gonna create the CPython implementation and the standard library and all that." It's a very daunting and very big challenge, but do you feel like you kind of got a first round of practice at it, doing this thing at ABC?
08:26 Guido van Rossum: Absolutely. Without having been on the ABC team for four years, I would never have been able to do that. I wouldn't have felt comfortable. I wouldn't have known how to. I wouldn't have known enough about language implementations. I mean Python really is sort of the next version of ABC with all the things that were great about ABC retained, and all the things I thought were not so successful in ABC removed, and a very small number of my own ideas replacing them.
09:02 Michael Kennedy: I see, so Python was sort of your, "Let me do this. Now knowing what I know, let me make a better version of something similar to this."
09:10 Guido van Rossum: Correct, yeah, that's exactly what I was really thinking. I managed to present it to management in a slightly more objective fashion.
09:22 Michael Kennedy: How did you pitch it?
09:23 Guido van Rossum: Well, part of it was that management was not very closely involved in my day-to-day activities. There was a certain amount of software that had to be written, and however it was written was great as long as we had sort of the running applications to prove it at the end. I was more or less at liberty to invent a different strategy for sort of eventually building that software faster.
09:51 Michael Kennedy: That's fantastic. That's some of the origins of Python. What happened to ABC? It's not around and Python is one of the most popular programming languages in the world. Why did those things take different paths?
10:02 Guido van Rossum: I've written about this in my old Python history blogs a few times. The ABC project actually, four years into it, at least for me, four years into it, was canceled by upper management at CWI. The reason being that there was no observable sort of user uptake. There were very few people interested in the language. There were even fewer people who were actually using it, and the team just couldn't move that needle. And part of that was that this was well before the internet, and there was a little bit of Usenet, but it was difficult to distribute a language implementation and get people to use it.
10:51 Michael Kennedy: Right, there's no GitHub. There's not even SourceForge, right? There's no web there. There's so many other pieces that make these work.
10:59 Guido van Rossum: It's worse than that even. There was no electronic way to distribute the source code at all, when we got started. I remember taking a long vacation to the United States with a nine track computer tape in my luggage. And taking that tape to two or three different places in the US where there were people interested in using ABC, so that they could load that tape onto their computer and get the sources, because the amount of source code was larger than you could possibly send as an email. I think attachments hadn't been invented yet or were still limited in size.
11:40 Michael Kennedy: Yeah, you have to like Base64 encode it. Just put it as text or something, right? I'm gonna send you a thousand emails, numbered one, two, three, four.
11:49 Guido van Rossum: Yeah. Even the first Python distribution had to suffer through some of that, but by then, in '91, it was about 20 compressed maximum-sized messages to some Usenet source code group.
12:03 Michael Kennedy: Yeah, Usenet was just starting to become popular, and the internet was starting to be a thing, but web browsers didn't really come out until '93, '94, so it was quite early days in '91.
12:14 Guido van Rossum: Mm-hmm, correct.
12:14 Michael Kennedy: So, did you envision Python being open-source from the beginning? What was your thinking around that?
12:21 Guido van Rossum: I would say yes. ABC actually, in the sense that the concept even existed, was meant to be open-source. We were not interested in selling it. We were just interested in promoting the language. If the words open-source had existed in the early '80s, we would have said ABC is open-source. As it was, I don't think we had even realized that there was some kind of need of a license.
12:48 Michael Kennedy: Right, that's amazing. Just the nomenclature didn't even exist to describe. You had to use words and a description. This is a thing we're giving away. You don't have to, so on, right?
12:56 Guido van Rossum: And by the time I started on Python, there were a few models that were pretty solid. I remember, I think the big model that I just copied almost literally was the MIT license that went on X Windows. No, you're not supposed to call it X Windows. It was X11. But that was a big essentially open-source piece of free software that was very sort of intentionally open sourced because the authors wanted to unify windowing software across different hardware platforms.
13:33 Michael Kennedy: There were all the different flavors, how do you write apps that run on all of them, and things like that. That was a big problem, huh?
13:38 Guido van Rossum: Exactly.
13:39 Michael Kennedy: Okay. So how do you think having Python open-source has helped or maybe even hindered Python over the years?
13:46 Guido van Rossum: Oh, it's only helped. I mean if it hadn't been open-source, people would not have been interested in picking it up. Because it's one thing to download an application that's not free software, that's not open-source, and use it to process some of your data. Then it's quite different to start writing your own software using a language that hasn't proven itself yet.
14:14 Michael Kennedy: Right. Maybe you're willing to pay for the tooling and whatnot for one of the top three most popular languages, because you know that's kind of where a lot of the momentum is. But in order to break into that space, being open and free really allowed you to wedge yourself in there, right?
14:33 Guido van Rossum: And it was also the model of some languages that I felt I was competing with most directly, like Perl and Tcl/Tk. Those were also open-source. I do not recall any names, but I recall there were several similar-level scripting languages being designed and distributed at the time that were not open-source, and very intentionally so, where there was someone who had a brilliant idea for making a better scripting language. And for all intents and purposes, their language probably was better than what was available, but their model for funding their work was to sell copies of the interpreter, and that just never worked.
15:17 Michael Kennedy: Yeah, in those early days, it was still not entirely clear what business models would work for the developer community and what wouldn't. People were really experimenting. I'm sure some things were lost because bad choices were made, but I'm really glad Python is still growing strong. Did you ever imagine back in 1991 that open-source and Python would be where they are today?
15:40 Guido van Rossum: I in general suffer from a terrible lack of imagination and vision. So, at no point in Python's history have I ever adequately predicted where Python would be five years from there. So, no, I had never thought that this would happen this particular way.
16:00 Michael Kennedy: To me as well, it's just amazing. Just even over the last five years, the way things have changed is so amazing. GitHub has come into existence. We're seeing companies that used to be fiercely proprietary become much more embracing of open-source. I recently saw the CEO of the Linux Foundation, or the head of the Linux Foundation, standing next to Satya Nadella at a Microsoft conference saying they love each other.
16:25 Guido van Rossum: Yeah.
16:27 Michael Kennedy: This is a different place than we were a while ago. I think it's great for everyone though. I think it's a very, very positive path forward.
16:35 Guido van Rossum: I'm very happy with how this all has turned out. I'm hopeful that a lot of technology will continue to be open-source.
16:44 Michael Kennedy: Yeah, I think it will, and I think it's great to see companies taking open-source and building business models alongside it that are sustainable, companies like Continuum or Scrapinghub or these guys that have a great popular open-source project, and somehow they're adding value on top of it, but they're not abandoning open-source.
17:04 Guido van Rossum: Without open-source, you have to do all the work yourself as the company that owns it. And now, mainly if you're a large company like Google, or IBM, or Microsoft, or Apple, you don't mind because you have tons of developers. But for anyone who is smaller, the value of a community is so tremendous because you sort of... You will still be able to make money on a variety of consulting and support projects, so that's how open-source developers support themselves generally. I'm actually an exception. I'm just employed by some large software developer that uses a lot of Python. That's been my personal model. But yeah, companies like Continuum or...
17:50 Michael Kennedy: Canonical.
17:50 Guido van Rossum: Canonical. Everything they produce is open-source, but they make a lot of money through handholding of customers who don't want to sort of hire their own software developers. That's a model that works for many types of open-source software.
18:11 Michael Kennedy: I think it's great to see people being successful in that model and experimenting with other ones as well. Let's talk a little bit about language design and trade-offs.
18:19 Guido van Rossum: Sure.
18:21 Michael Kennedy: Python has been growing in popularity pretty dramatically over the last five, 10 years. It had been around for 15 years before, and it seems like not only is it still relevant and popular, but that popularity and relevance is growing. And I think part of that is expanding into different areas. Like the adoption of Python in the data science space I think has brought many new people to the Python ecosystem where Python is a primary language now. How do you trade off sort of serving these different environments or these different ecosystems? What somebody wants in a language as a data scientist might be very different from what somebody wants as a web developer.
18:59 Guido van Rossum: On the language design side, I don't usually take applications into account that much. I've seen some language designs where people proudly announced, "But in our language, a URL is a standard data structure that is built into the compiler." They say that as if that's a good thing, and all it buys you is that you can leave out some string quotes, and internally it usually is still a string.
19:27 Michael Kennedy: Right, exactly.
19:27 Guido van Rossum: It's a gimmick, and Python has always presented itself as a general purpose language, and you can do many things in Python. And I didn't design Python for web development obviously, because the language is older than the concept of web development, and ditto for data science. And for web development, it turns out enough people from different backgrounds are interested in doing simple web development using Python that we ended up with a bunch of stuff in the standard library. But still, often the most successful APIs even for web development are actually third party packages. And what the standard library provides is more low level than that, like the standard library has to provide things like sockets. And in fact, one funny story is that, I think in the first year that Python existed, before we made it open-source actually, I was teaching myself how sockets worked. Because sockets were sort of a new thing in our environment at that point. We had a bunch of Unix machines, and I had never really known how the networking on those machines worked before. And then some colleagues started writing little C programs that used sockets, and it was so cool, and their programs always crashed because they didn't do the right error checking. And I wanted to know what those sockets were about. And my way of teaching myself was "Oh, I'll write a Python extension that wraps the socket API." And so I read the man pages and said, "Okay, well, there is like the socket call, and the bind call, and the listen call." And I wrapped each one of those in as low level an extension as possible with proper error checking, because Python has always had this philosophy that if something goes wrong you get an exception.
And then I started combining those calls in sort of random combinations and figuring out what errors I would get when, and what forms of simple programs actually worked, and that's how Python's socket module came into existence. Later, once the World Wide Web started being promoted, I joined a variety of mailing lists around that, and started writing my own little web servers and clients, and eventually that turned into urllib, but what people actually use for web stuff is third party frameworks like Twisted, or for web serving, pure web serving, it's like Django or Flask. For web clients, it's requests. And so, the standard library doesn't even contribute that much beyond the sort of low level sockets.
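Editor's note: the low-level calls Guido names (socket, bind, listen) are still exposed almost one-to-one by the standard library's socket module, and failures surface as exceptions rather than return codes. The following is a minimal sketch, not code from the episode; the helper names `echo_once` and `misuse_raises` are hypothetical and chosen only for illustration.

```python
import socket


def echo_once(payload: bytes) -> bytes:
    """Send payload over a loopback TCP connection and return what arrived."""
    # socket(), bind(), listen(): thin wrappers over the calls in the man pages.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
        server.listen(1)
        address = server.getsockname()
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(address)  # completes via the listen backlog
            connection, _ = server.accept()
            with connection:
                client.sendall(payload)
                received = b""
                # recv() may return fewer bytes than requested, so loop.
                while len(received) < len(payload):
                    chunk = connection.recv(1024)
                    if not chunk:
                        break
                    received += chunk
                return received


def misuse_raises() -> bool:
    """Using a closed socket raises OSError instead of returning an error code."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.close()
    try:
        sock.send(b"data")
    except OSError:
        return True
    return False
```

Calling `echo_once(b"hello")` round-trips the bytes through the loopback interface in a single thread, and `misuse_raises()` shows the "if something goes wrong you get an exception" behavior described above.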
22:23 Michael Kennedy: Sure, so, I totally agree with that. It's the packages that have absolutely made Python successful. Just looking on PyPI.org right now, we've got 96,000 plus packages, and that's really a testament to how amazing the whole ecosystem and community is, right? How do you decide when something should be in the standard library or something should be an external package, and have these ever moved in or out?
22:55 Guido van Rossum: Yeah, I have to admit that the standard library is still fairly uneven because long ago, say 15 years ago, I had much less of a filter about what things would be good standard library modules. I had very strong backwards compatibility requirements, like once a module is in the standard library, it can sort of grow, but it can't just change in an incompatible way, or at least that's a major project, with deprecations and all that. But I didn't have much of a bar for including stuff in the standard library. And that started out with various hobby projects that I wrote myself and played with for two months in 1991 or so, which were still in the Python 2 standard library around 2010. Many of those things finally got ripped out for Python 3, but of course Python 2 is still there. More recently, sort of as Python matured and became more popular, and from my perspective, it's just sort of been steady exponential growth. I can't tell you whether it was five or 10, or 20 years ago that Python suddenly started becoming popular. But so, the current rule for inclusion in the standard library is that it has to be something that is useful for multiple application areas. A new API for web development will not make it into the standard library just because that's just one area. It has to be something that's useful for a wide variety of applications. It doesn't necessarily have to be all applications, but something like, well, sockets are obviously something that is useful across the board. Many things in the standard library that really belong there also have to do with sort of the language itself, like introspection tools, partial functions. Those kinds of things are good standard library things.
24:54 Michael Kennedy: The dis module.
24:55 Guido van Rossum: For example, yeah.
24:55 Michael Kennedy: Yeah.
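Editor's note: as a small illustration of the "language itself" tools mentioned here, the sketch below uses three standard library modules: `functools.partial` for partial functions, `inspect` for introspection, and `dis` for bytecode disassembly. The `power` function and the variable names are assumptions made up for this example.

```python
import dis
import functools
import inspect


def power(base, exponent):
    """Raise base to exponent; a stand-in function to inspect."""
    return base ** exponent


# Partial function: fix exponent=2 to get a squaring function.
square = functools.partial(power, exponent=2)

# Introspection: read the parameter names from the function's signature.
parameters = list(inspect.signature(power).parameters)

# Disassembly: collect the bytecode instruction names for the function.
# The exact instruction set varies between CPython versions.
opnames = [instruction.opname for instruction in dis.get_instructions(power)]
```

Running `dis.dis(power)` prints the same instructions in a human-readable listing.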
24:57 Guido van Rossum: Things that are bad for inclusion in the standard library is usually almost any piece of code that is under active development. Because Python only issues a sort of feature release every 18 months or more, and that's just a really slow pace, and you can't even get people to upgrade quickly. So, if someone has a new idea for... Let's take an example. We have asyncio, which is in the standard library, but it doesn't have a built-in web framework. Well, why is there no asyncio-based web framework in the standard library? Because the asyncio-based web framework that exists is under very active development and constantly changes, and not always in compatible ways. And so, people would just be much worse off being stuck with whatever was the asyncio-based web framework around Python 3.5.
26:02 Michael Kennedy: You would actually hinder it by putting it in Python, right?
26:05 Guido van Rossum: Exactly.
26:06 Michael Kennedy: You would force it to freeze its APIs, so that could never change in a breaking way, and it couldn't release more than every 18 months, and probably some third party package will come along, mimic that, but iterate faster, and be better anyway.
26:20 Guido van Rossum: Yeah.
26:20 Michael Kennedy: Okay. This portion of Talk Python To Me has been brought to you by Rollbar. One of the frustrating things about being a developer is dealing with errors. Relying on users to report errors, digging through log files trying to debug issues, or a million alerts just flooding your inbox and ruining your day. With Rollbar's full-stack error monitoring, you'll get the context, insights, and control that you need to find and fix bugs faster. It's easy to install. You can start tracking production errors and deployments in eight minutes or even less. Rollbar works with all the major languages and frameworks, including the Python ones, such as Django, Flask, Pyramid, as well as Ruby, JavaScript, Node, iOS, and Android. You could integrate Rollbar into your existing workflow, send error alerts to Slack or Hipchat, or even automatically create issues in Jira, Pivotal Tracker, and a whole bunch more. Rollbar has put together a special offer for Talk Python To Me listeners. Visit rollbar.com/talkpythontome, sign up and get the bootstrap plan free for 90 days. That's 300,000 errors tracked all for free. But hey, just between you and me, I really hope you don't encounter that many errors. Loved by developers and awesome companies like Heroku, Twilio, Kayak, Instacart, Zendesk, Twitch and more. Give Rollbar a try today. Go to rollbar.com/talkpythontome. Has anything been brought in to Python that was originally external that you can think of?
28:03 Guido van Rossum: I don't have specific examples in mind, but it's definitely happened. That definitely occasionally happens. It's usually more of a threat where we say... If someone comes with an idea and they say, "This functionality should really be in the standard library," nowadays we usually say, "Well, that looks more like an application, and how do we know that it's actually useful? Prove to us that the design you have in mind works by first releasing it as a third party package on PyPI and show us how popular that is." And then there are also maintenance requirements, like we would require that a contributor commit to several years of keeping that code up to date and fixing bugs and stuff.
28:50 Michael Kennedy: Sure.
28:51 Guido van Rossum: We've had things that actually were kicked out of the standard library because they were too much of a maintenance burden for the core development team. I think the last time that happened might have been bsddb, Berkeley DB, which was a very large package. It's currently much happier as a third party package than it ever was in the standard library.
29:12 Michael Kennedy: Yeah, I'm sure it can, like we said, grow much faster, and so on. There are databases in there, for example, SQLite ships with Python.
29:19 Guido van Rossum: SQLite is one of those things that is so popular and so versatile and so useful for so many different application domains that that was definitely the right decision to include that. Also, SQLite itself is incredibly stable.
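Editor's note: for readers who have not used it, the `sqlite3` module that ships with Python needs no server and no third party install. The sketch below is illustrative only; the `episodes` table, its rows, and the `episode_titles` helper are hypothetical sample data, not anything described in the interview.

```python
import sqlite3


def episode_titles(min_number: int) -> list:
    """Return titles of sample episodes numbered min_number or higher."""
    # ":memory:" creates a throwaway database that lives only in RAM.
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute(
            "CREATE TABLE episodes (number INTEGER PRIMARY KEY, title TEXT)"
        )
        # Hypothetical sample rows; "?" placeholders avoid string formatting.
        connection.executemany(
            "INSERT INTO episodes VALUES (?, ?)",
            [
                (99, "Sample earlier episode"),
                (100, "Python past, present, and future"),
            ],
        )
        rows = connection.execute(
            "SELECT title FROM episodes WHERE number >= ? ORDER BY number",
            (min_number,),
        ).fetchall()
    finally:
        connection.close()
    return [title for (title,) in rows]
```

Replacing `":memory:"` with a file path would persist the data in a single file on disk.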
29:36 Michael Kennedy: Yeah, that's one of the things. It's not changing much these days. It's very reliable. That's great. So speaking of contributors and having people commit to a certain amount of support over time, if something is gonna come into the standard library, how do you ensure that the core development community invites and retains the best contributors? I have a lot of respect for the core developers, but how do you make sure that ecosystem is healthy and vibrant?
30:03 Guido van Rossum: I don't think we're doing that very consciously. We have a nominal mentorship program in place, but it's mostly used to get people thinking about contributing to open-source projects in general. I don't think that's where we get most of our sort of new core developers. In practice, new core developers almost always happen because somebody has an itch to scratch, and they happen to be a really good programmer, or at least sort of made a serious study of Python, and start contributing. And the people who review their code, and sometimes that's just one or two core devs who sort of mentor that one person, find that their contributions are of high value, and then at some point the mentor or one of the mentors proposes, "Hey, Python dev, core developers, what do you think of giving person X the commit bit, so that they can commit their own code after it's been reviewed?" There's often real discussion about the sort of potential new contributor's maturity, not just in terms of their pure programming chops and how well they know CPython, or Python, or whatever their area of contribution is, but also, do they have the right character? Are they likely to sort of not commit something when the reviewer says that it's not ready?
31:37 Michael Kennedy: Yeah. I suspect one aspect of that is there's a lot of people that come into programming, they see a shiny new thing, something with Node.js, or some other technology that seems like it's just gonna take over the industry, and then it doesn't necessarily, so there's probably a level of maturity from experiencing tech evolution over the long term so that you can bring that to the language, right?
32:00 Guido van Rossum: That would be too high a bar. You can't say people can only start contributing once they've gone through one technology boom and bust cycle. You need them to have a certain character that is okay with the pace of things, and the thoroughness, and you definitely want them to have a very good attention to detail.
32:22 Michael Kennedy: Sure. I guess you definitely want a different level of attention to detail and being meticulous when you're working on internals of CPython, rather than if you're working on some package that's used by a thousand people a month, right. These have different requirements for that kind of stuff. So, you've been a champion of diversity in the whole Python ecosystem. I just wanted to point that out and say thank you, because I think it's really making the Python space better in some tangible ways than other environments. Just by way of a story, I went to a conference in London last year, and I took my 16-year-old daughter, and this was not a Python conference. It was decidedly not. But I was speaking on Python to these folks, saying, "Hey, you should also learn Python. This is cool." And I brought my daughter. We went to the speaker dinner. And there were 28 or 29 speakers there, and I think one or two women. And my daughter looked at me for a while, and goes, "Where are all the women?" I said, "This is the sad part of the tech industry." When I go to Python conferences, I don't feel that way. I just think that's great.
33:25 Guido van Rossum: I've personally always been a feminist, although never a radical one. Encouraging women has been a pretty natural thing for me. I think there was a specific series of events, not entirely sure when it was, but I remember that there was some upset maybe around a decade ago at OSCON. Some people started pointing out that the open-source community was, despite all its sort of pride in, well, we care about results, rough consensus and working code and all that, it was not a very diverse community, and the number of women that someone counted at OSCON I believe was well below even the already low industry level. Somehow that was a wake-up call for me. There had always been a few active women in the Python community, and I had never really counted how many they were, or if it was always the same two or three. But because of that discussion at OSCON, which I did not participate in or even witness in person, but which came to my attention through various blogs and peripheral vision, I thought, "Hm, what's the situation in the Python community?" Actually, we're not doing great. So, I sort of made a mental note of, well, maybe we ought to think of how to increase the participation diversity in the Python community. And I think that the main consequence of that was that when there were actually women who came to the Python community or were already part of the Python community and said, "We would like to do something for women. We would like this community to be more welcoming." I thought, "That's a great idea, let's do that." Rather than responding in a way that some other tech communities have responded, by feeling threatened or, "Oh, we don't have to do anything. If you're a good coder, you're a good coder, and then you're welcome. And if you're not a good coder, you're not welcome regardless of whether you're a man or a woman." There's actually a fair amount of bullshit in that attitude, because those people don't realize how much bias there is.
So, I've always been very open to the PSF and PyCon, and various groups trying to reverse that bias by giving sort of diversity funding to PyCon attendees, for example. I think I should call out specifically Jessica McKellar.
36:01 Michael Kennedy: Yeah, I was gonna bring her up.
36:02 Guido van Rossum: Who has been a champion of this for a long time, has given some fabulous keynotes about this. And all I had to do was be supportive there, I felt, which came naturally. I remember in real world politics, since I was of voting age, I've always made a point of voting for a woman whenever I could.
36:23 Michael Kennedy: I think the progress that's been made is really fabulous. And you're right that Jessica deserves a lot of credit for that. She's been fabulous at it. We're in a much better place than we were before with regard to that. Although, I still think there's more work to do.
36:38 Guido van Rossum: Absolutely.
36:39 Michael Kennedy: I'm super happy to see the direction.
36:40 Guido van Rossum: There's always more work to do and diversity is not just about women either.
36:44 Michael Kennedy: Absolutely.
36:44 Guido van Rossum: Let's remember that.
36:46 Michael Kennedy: Yep, yep, absolutely. I was just, at the top of my mind, because my daughter was so struck. I took her to this tech conference thinking, "Oh I'd really like to show how cool the tech scene is and stuff." And she did come away with that, but she also came away with a little bit of feeling like, "Maybe this is not for women." which made me a little bit ... I just felt bad about it, right?
37:04 Guido van Rossum: Oh, that's terrible. Yeah.
37:05 Michael Kennedy: Yeah, it certainly is. All right, so let's shift gears a little bit. One of the things that there's a lot of talk about right now, and I think is turning a corner in a very positive way, but has definitely been a hot topic lately and over the last few years is Python 3 and the migration towards Python 3.
37:26 Guido van Rossum: Ha! Yes.
37:28 Michael Kennedy: I love your attitude. When I see you speak at conferences and do keynotes, and there's a great 2.8 sign with like a slash through it. Just say, "We're moving forward. We're not going back, you guys." How do you feel we're doing on that?
37:42 Guido van Rossum: That certainly has been a much harder, more arduous journey than anybody had really anticipated. I think right now we're well over the hump, the destination is in sight. It's clearly better on the other side, but the mountain range was much bigger and colder, whatever, scarier than we had anticipated. I think that, honestly, the mistake that all of us in the Python core and actually the whole Python community, the mistake we made was underestimating Python's popularity. We sort of thought of Python as this sort of relatively small language with a relatively small number of dedicated followers who would all jump at the occasion of making their code more readable, and converting to this new version of the language that was so clearly better in many ways. We just underestimated how much code people had written in Python 2, and really old versions of Python 2 at that, how much documentation there was that would have to be updated, not just the standard library, and the core Python documentation, and the reference manual, but all the hundreds of Python books and websites, and answer sites, and Stack Overflow questions, and this and that. And we just... Initially, when we first started talking about Python 3000 and actually, the first time that the term came up was around the year 2000. It then took seven or eight years before we did anything about that. But initially, from our perspective and certainly from my perspective, but everyone who participated in Python 3 sort of had the same experience. Everybody was excited about this change. We're going to fix the language. There are all these Python warts. Andrew Kuchling, who was an important early core developer, had a very influential post about Python warts. He enumerated maybe a dozen of sort of key issues with the language that we were hoping we could somehow improve. One example was in 2000 we introduced Unicode, and we introduced it in a way that would be incredibly backwards compatible.
By the time, three or four years later, when that design had completely been settled, we started finding out that applications actually became more brittle because they didn't expect Unicode to pop up in places where it happened, and that was one of the things that we wanted to fix in Python 3, but we had no idea that there were people with millions of lines of Python code that was all interrelated and written by people who are no longer with that team, or that company, or that project, and what it would take. I mean, had we known, we could have taken a different tack, we could have made Python 3 somehow... We could have changed a few features in Python 3 to allow somehow Python 2 and Python 3 code to coexist in the same virtual machine.
41:07 Michael Kennedy: Sure, somehow make it just seriously deprecated but not gone, things like that, some of the features, right?
41:13 Guido van Rossum: Yeah, and had we really really wanted that, we could have done that, but we underestimated the difficulty it would be for the average Python user to convert their code, because we thought, "Well, someone has a few scripts here and a few scripts there, and maybe a thousand line library that they're using." But in fact, all the numbers were 10 or 100 times larger.
41:39 Michael Kennedy: Making it so much harder to switch, right?
41:41 Guido van Rossum: Yeah, and once we realized that this was not an ideal situation, there wasn't any chance of sort of backing out, and there was also no chance of suddenly accelerating the conversion process, so we've done the best we felt we could do, which included actually backporting many things to Python 2.7, in terms of libraries that were available. Backporting even more things on PyPI, like you can use the enum34 package I think in Python 2.7, and then you can use enums in Python 2. But we also wanted to give the community and the users a sort of a clear message about the future of Python, and that's where the sort of no 2.8 banner came from.
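[Editor's illustration] As a rough sketch of what the enum34 remark refers to: the snippet below uses the standard-library `enum` module that shipped with Python 3.4. The enum34 backport on PyPI installs a module with the same `enum` import name on Python 2.7. The `Color` class and its members are made up for illustration.

```python
from enum import Enum


class Color(Enum):
    """A hypothetical enumeration used only for this example."""
    RED = 1
    GREEN = 2


favorite = Color(1)        # look up a member by its value
by_name = Color["GREEN"]   # look up a member by its name
```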
42:32 Michael Kennedy: You can always say, "Well, maybe we should've done this differently." But I'm not sure that it would've necessarily been different. You could've said, "Well, we're gonna leave basically some 2.7 or Python 2 in Python 3, and just put new features and clean up around it." But then you still have all this old code that still is written in the 2.7 style. The fact that people adopt the Python 3 features, maybe they wouldn't, and it would've been more of a hindrance, rather than just going, "All right. We're just gonna have to make this break, and just jump the gap to get there."
43:03 Guido van Rossum: It's a complex problem. There's no perfect solution, and you also can't just stop evolving the language.
43:10 Michael Kennedy: Absolutely. I feel like Python 3 is going so fast and doing so much. It's really positive.
43:16 Guido van Rossum: Thank you.
43:16 Michael Kennedy: Yeah, you're welcome. A term that I started using, I got this from Matthias. I'm forgetting his last name, sorry. From some of the Jupyter projects. He's referring to Python 3 as Python, and Python 2 is legacy Python. And I think that's an interesting way to think about it.
43:35 Guido van Rossum: You can try to change the language in the hope that people will subliminally be sort of influenced. I don't know how effective that is. I think the highway administration or whatever it's called have attempted to remove the word accident from our language and replace it with crash.
43:57 Michael Kennedy: Sure.
43:58 Guido van Rossum: That's wishful thinking in my view. I pretty consistently just say Python 2 and Python 3 whenever the distinction is important.
44:09 Michael Kennedy: I'll round up this part of the discussion. I definitely feel like we've crossed the boundary.
44:13 Guido van Rossum: Absolutely.
44:14 Michael Kennedy: It's gained momentum. I see more and more projects that say, "Either we're Python 3 first or Python 3 only, and if people wanna backport it, that's fine." This is the right path.
44:23 Guido van Rossum: The really good news is that basically all important libraries work as well with Python 3 as they do with Python 2 or better. Which means that the early problem with Python 3 adoption was that, well, if you had a thousand lines of pure Python code, that was very easy to port. But if you have a thousand lines of Python code that depended on seven different packages, you have to wait for those packages to be ported. That has taken a long time. That was, I think, one of the issues that we underestimated, but that problem has now pretty much been solved completely. NumPy, Django, Flask, everything you can think of works with Python 3, even dateutil. If you want to start writing application code from scratch, there's nothing to stop you from using Python 3.
45:16 Michael Kennedy: Yeah, and that's really great. I do feel like that was maybe the single biggest barrier, because even if people had the intention of switching to Python 3, they depended upon these libraries, and they just couldn't run well. They're, like, gonna throw their hands up and say, "Well, I can't convert all these libraries. I got work to do."
45:33 Guido van Rossum: Yeah.
45:33 Michael Kennedy: Yeah. So, let's do one more Python 3 topic here. Let's touch on a little bit of your favorite Python 3 features.
45:41 Guido van Rossum: The two that I've been I think most directly involved in were asyncio, until two years ago. Since then, I sort of... asyncio has matured to the point where I don't have to be involved all that much directly. Although, I carefully encouraged and reviewed the developments towards async and await, which has been absolutely marvelous. The other favorite of mine is function annotations. And what we now have in Python 3.6, the variable annotations, PEP 526, and the whole type checking area. I think those are my favorite features. The one other thing that I'm personally very happy with is the proper distinction between bytes and text in Python 3, as opposed to the messy way of dealing with that in Python 2.
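[Editor's illustration] A minimal sketch of the two annotation features mentioned here: function annotations and the Python 3.6 variable annotations from PEP 526. The `mean` function and the `retries` variable are hypothetical names chosen for the example. The annotations are stored on the function but not enforced at runtime.

```python
from typing import List


def mean(values: List[float]) -> float:
    """Function annotations describe the parameter and return types."""
    return sum(values) / len(values)


# PEP 526 variable annotation (Python 3.6+): the type is declared inline.
retries: int = 3

# Annotations are plain metadata that tools such as type checkers can read.
function_hints = mean.__annotations__
```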
46:37 Michael Kennedy: Sure, that absolutely solves that whole, "It's unicode, it's not unicode. What is it?"
46:42 Guido van Rossum: Unfortunately, that has also been a major porting barrier.
46:46 Michael Kennedy: Yeah. I've heard that several times, especially people working on web frameworks or things that touch the network.
46:51 Guido van Rossum: A whole generation of Python 2 programmers grew up, basically between 2000 and 2010 or so, who knew all the ins and outs of the compatibility and incompatibility between Unicode and bytes, and the ambiguity of an 8-bit string which is sometimes bytes data, and sometimes text in ASCII, and sometimes text encoded in UTF-8, and sometimes text encoded in some other encoding, and sometimes it interoperates with Unicode, and sometimes it doesn't. But people got used to the ambiguity and actually exploited it in their APIs, which made their APIs difficult to port forward to Python 3.
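[Editor's illustration] A small sketch of the bytes/text distinction as it behaves in Python 3. Text (`str`) and binary data (`bytes`) are separate types, conversion between them is explicit, and mixing them raises an error instead of silently guessing an encoding. The sample string is arbitrary.

```python
text = "caf\u00e9"             # four characters of text (str)
data = text.encode("utf-8")    # five bytes, because the accented letter takes two

round_trip = data.decode("utf-8")  # decoding must name the encoding explicitly

try:
    text + data                # Python 3 refuses to mix str and bytes
except TypeError:
    mixing_allowed = False
else:
    mixing_allowed = True
```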
47:44 Michael Kennedy: This portion of Talk Python To Me is brought to you by Hired. Hired is the platform for top Python developer jobs. Create your profile and instantly get access to 3,500 companies who will work to compete with you. Take it from one of Hired's users, who recently got a job and said, "I had my first offer on Thursday after going live on Monday, and I ended up getting eight offers in total. I've worked with recruiters in the past, but they've always been pretty hit and miss. I've tried LinkedIn, but I found Hired to be the best. I really like knowing the salary upfront. Privacy was also a huge seller for me." It sounds awesome, doesn't it? Well, wait until you hear about the signing bonus. Everyone who accepts a job from Hired gets a thousand dollar signing bonus. And as Talk Python listeners, it gets way sweeter. Use the link hired.com/talkpythontome and Hired will double the signing bonus to $2,000. Opportunity is knocking. Visit hired.com/talkpythontome, and answer the door. So, does it surprise you that new things are still happening in the CPython internals, like dictionary got a major rework in 3.6, things like that?
48:51 Guido van Rossum: It doesn't really surprise me. Dictionaries in particular are such an incredibly fundamental part of Python. It's used everywhere internally, and it's used everywhere in applications, that it's totally par for the course that at least once every decade, maybe twice, some major innovation happens in that area. I remember when I first started Python, it was like, well, dictionaries were obviously something I needed, and in reaction to ABC, which used B-trees or at least some form of balanced trees, which I thought was very tedious and I had worked on that code forever in ABC, and it was always buggy. So I thought, "Well, for Python, I'll do a hash table. That's a new thing," and I just looked it up in Knuth volume three. "What's the basic hashing algorithm? How do you hash a string? How do you implement a hash table? Well, open hashing or linked lists?" And I made my decisions and implemented it, and then I had to move on to other things, module objects, functions.
49:58 Michael Kennedy: You'd write the whole thing, basically.
49:59 Guido van Rossum: I had to build a whole language, long integers, exceptions, bytecode. So, the first innovation I think happened when someone with an actual mathematical schooling in that area realized that I had copied an old algorithm from Knuth that was no longer state of the art, and there was something like I think I... Knuth had discovered that it was good if you had hash table sizes that were primes or something or relative primes. And it turned out with changes in processor architecture, powers of two were certainly better.
50:35 Michael Kennedy: Interesting, yeah. I've always heard of primes as well. So, that's still in my mind.
50:38 Guido van Rossum: Yeah, that's no longer the state of the art. I mean caches in CPUs and the whole sort of L1, L2, L3 memory architecture has affected language implementations dramatically, and I don't even know all the ins and outs of that area. And fortunately, we have other core developers who keep up with that and do that for me.
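[Editor's illustration] One commonly cited reason power-of-two table sizes are attractive is that the slot index can be computed with a bit mask instead of a remainder. The sketch below shows only that idea. It is not CPython's dictionary code, and the function names are hypothetical.

```python
def slot_by_modulo(hash_value: int, table_size: int) -> int:
    """Map a hash to a slot with the remainder; works for any size, e.g. a prime."""
    return hash_value % table_size


def slot_by_mask(hash_value: int, table_size: int) -> int:
    """Map a hash to a slot with a bit mask; valid only for power-of-two sizes."""
    if table_size <= 0 or table_size & (table_size - 1):
        raise ValueError("table_size must be a power of two")
    return hash_value & (table_size - 1)


# For a power-of-two size, both approaches select the same slot.
h = hash("spam")
same_slot = slot_by_modulo(h, 8) == slot_by_mask(h, 8)
```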
51:00 Michael Kennedy: Yeah, that's great, that's great. I'm sure we could dig into all sorts of those things. There's so many other questions that I'd like to ask you with. I wanna be respectful of your time and not go too long. So, let me ask you one final Python question. What do you see coming in Python 3.7, and what would you like to see there?
51:18 Guido van Rossum: Well, I mentioned earlier that I'm not the greatest visionary. Also, 3.6 only came out a month ago. Unlike some other languages, there's no secret cabal of people who are already planning what the language looks like five years from now. I'm sure somewhere, there's a C++ committee that's anxiously designing C++ 19 or 20, or whatever the next version is going to be. Well, that's not how we do things in Python. And we just, over time, during the alpha stage of the next feature version, we tend to just collect ideas, and PEPs, and proposals, and often, real life experience with the previous major version, or the previous feature version, I should say, directs the evolution. For example, we added asyncio in Python 3.4, which led Yury Selivanov to come up with the idea of async def and await, and introduce that in 3.5.
52:21 Michael Kennedy: That's great, because that really cleans up that API, yeah.
52:24 Guido van Rossum: That is an incredible improvement, and that is not something we could ever have done if we hadn't had the clunky version of asyncio with yield from in Python 3.4, because the generators actually are... I love the story of generators. I can talk for hours about that because it has been such a rich source of language improvements from the very early for loop to iterators, and generators, and coroutines, and yield from, and then async/await, and sort of small improvements to that in 3.6 even. That's been a very gratifying thing. But I don't know what's gonna happen there next.
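[Editor's illustration] A brief sketch of two stops on that path: `yield from` delegating to a sub-generator, and the `async def` / `await` syntax introduced in Python 3.5. The function names are made up for the example. The Python 3.4 style of writing asyncio coroutines as decorated generators is not shown as runnable code, because the `asyncio.coroutine` decorator has since been removed from Python. `asyncio.run` assumes Python 3.7 or later.

```python
import asyncio


def countdown(n):
    """A plain generator: produces n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1


def countdown_twice(n):
    """yield from hands control to a sub-generator until it is exhausted."""
    yield from countdown(n)
    yield from countdown(n)


async def fetch_value(delay, value):
    """A native coroutine: await suspends it without blocking the event loop."""
    await asyncio.sleep(delay)
    return value


async def main():
    first = await fetch_value(0.01, "spam")
    second = await fetch_value(0.01, "eggs")
    return [first, second]


delegated = list(countdown_twice(2))
results = asyncio.run(main())
```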
53:10 Michael Kennedy: Yeah, sure.
53:11 Guido van Rossum: I expect that static typing, optional static typing, what we're doing with Mypy, that's my current project at Dropbox actually, I expect that will also make strides forward, but I don't know what those strides are going to be.
53:29 Michael Kennedy: Yeah, so it really highlights that the language is a journey and a living thing, right?
53:33 Guido van Rossum: You take a step and you look around, you see the landscape that you just ventured into from a different vantage point, and you can pick a different destination, or you can plan your next step, or day trip, or whatever, whatever metaphor you want to use. You can plan your next evolution based on where you are and what you see. And sometimes that just comes like "Whoa! We see all these people using this feature that we just introduced two versions ago in a completely novel and interesting way," and now we suddenly realize, "Oh, there's a better syntax or a better API that's sort of waiting to come out." And that's very exciting, but I don't have very specific plans for 3.7 yet, let alone beyond. The only thing that's sort of possibly way beyond would be the Gilectomy, the removal of the GIL, but that is also not at all certain. It's a very complicated story.
54:41 Michael Kennedy: You talked about the gilectomy for quite a while, and it's definitely an interesting area of focus. This sort of, you take one step, you climb one mountain, and you see a new horizon of possibilities, is that one of the things that keeps you interested in working on Python and maintaining and overseeing the language over time?
55:00 Guido van Rossum: The excitement of actually climbing a mountain and then seeing a whole new valley that you didn't know existed certainly motivates me.
55:08 Michael Kennedy: Yeah. That's excellent. I guess, we'll leave it there. Let me ask you two questions before we get out of here. First, if you're gonna write some Python code, what editor do you open?
55:18 Guido van Rossum: Emacs. Well, actually, I don't open it. It's already open.
55:22 Michael Kennedy: It just stays open.
55:23 Guido van Rossum: Well, my shell also mostly runs in Emacs.
55:26 Michael Kennedy: Yeah, fantastic. Okay. I know you don't like to play favorites but, if there's a notable PyPI package that you really wanna highlight, that maybe people don't know about but you'd like to say, "Hey, you guys really should check this out, because it's a cool example," or something. What comes to your mind?
55:39 Guido van Rossum: Actually the one thing I want to highlight here is Mypy, which is a static type checker for Python that I didn't write, although I have now been contributing to it for well over a year. But it's still mostly Jukka Lehtosalo's creation. And we're using this at Dropbox, and we have over 400,000 lines of annotated code in a code base that totals over 4 million lines of code. We have a ways to go but we also have more than 500 developers who are interacting with this tool, and who, by and large, are very happy with how adding type annotations makes the code more understandable and more readable, and sort of makes them more confident when they want to undertake refactorings and things like that. And Mypy, the specific reason I want to highlight Mypy as a PyPI package is that, until a week ago, if you tried pip install mypy, you would get some completely unrelated package that is named Mypy that had not been maintained for five years. And we finally convinced the leadership of PyPI to give us that project name. And now we can say pip install mypy. Actually, you have to do pip3 install mypy, and it will actually do what you expect it to do.
57:04 Michael Kennedy: Okay, well, that sounds really excellent. It's great to see you guys using the type annotations and things like that at Dropbox. Guido, we have a final chance for you to give a call to action to the community. What would you like people to do? What's on your mind?
57:18 Guido van Rossum: Well, I really do want people to give Mypy a try. That static type checker is pretty amazing, and I need to point out that there are still a lot of misunderstandings about what it can do and what it cannot do. Static types are completely orthogonal, or almost completely orthogonal to unit testing for example. It doesn't mean you have to write fewer tests. Occasionally, there are a few trivial tests you don't need to write, because the static checker catches the same issues better. But by and large, a type checker just catches a very different category of errors than unit tests. So they complement each other nicely, and together, they give you more confidence. One of the misunderstandings about Mypy has also been that when we started PEP 484, the sort of the introduction of type checking was Python 3 only, but we've actually amended that. And for the past year, we've been using Mypy successfully with a Python 2 code base.
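[Editor's illustration] A hedged sketch of how a type checker and a unit test can catch different things. The functions and the `config` data are hypothetical. A happy-path test of `next_port` passes, yet a checker such as Mypy would be expected to flag the addition, because `find_port` may return `None`. Conversely, the checker has no way to know that 81 is the right answer. Only the test asserts that.

```python
from typing import Dict, Optional


def find_port(config: Dict[str, int], name: str) -> Optional[int]:
    """Return the configured port, or None when the name is missing."""
    return config.get(name)


def next_port(config: Dict[str, int], name: str) -> int:
    port = find_port(config, name)
    # A static checker is expected to complain here: port may be None.
    # At runtime this only fails for names missing from config.
    return port + 1


# The happy path works, so a test covering only this case would pass.
happy_path = next_port({"http": 80}, "http")
```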
58:15 Michael Kennedy: Okay.
58:15 Guido van Rossum: So, you can search PEP 484 for Python 2 support, and it's completely there in Mypy. It works as well on Python 2 as it does on Python 3.
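[Editor's illustration] The Python 2 support described in PEP 484 relies on type comments rather than annotation syntax, so the same source file stays valid on both versions. A minimal sketch follows. The `join_names` function is hypothetical, and on Python 2 the `typing` module would have to come from the backport on PyPI.

```python
from typing import List


def join_names(names, sep=", "):
    # type: (List[str], str) -> str
    """Join names with a separator; the signature lives in the comment above."""
    return sep.join(names)


joined = join_names(["Guido", "Jukka"])
```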
58:28 Michael Kennedy: Oh, that's really nice.
58:29 Guido van Rossum: Yeah, the other misunderstanding about Mypy or static typing in Python is that people have predicted that it will turn your Python code into Java, which is obviously nonsense.
58:41 Michael Kennedy: There's no interfaces all over the place. Yeah, so, I think actually moving those type annotations to Python 2 is really important because it provides some foundation for when you do wanna upgrade to Python 3.
58:53 Guido van Rossum: Correct. My secret plan at Dropbox is actually that once we have a large enough fraction of the code base annotated, we can start converting into Python 3 in a semi-automated fashion, in a way that would not be possible without those annotations. We're not there yet, but that's my secret plan.
59:13 Michael Kennedy: That's fantastic. All right, well, I wanna say thank you again for the conversation. I really enjoyed talking with you, and I'm sure everyone out there learned a lot, so thank you so much for being on the show.
59:23 Guido van Rossum: My pleasure.
59:24 Michael Kennedy: Yep.
59:24 Guido van Rossum: Hope it's a good one.
59:25 Michael Kennedy: It's gonna be a great one. Bye.
59:26 Guido van Rossum: Bye.