Transcript for Episode #191:
Python's journey at Microsoft
00:00 Michael Kennedy: When you think about Microsoft, do you think about Python? Maybe not, but you probably should. They've been doing an incredible amount of work to improve Python for folks on Windows as well as the broader community. You can of course look at the wild growth of Visual Studio Code, but did you know that five of the core developers worked there and that the majority of Python development actually happens on Windows? Join me along with Steve Dower, a core developer working at Microsoft who published an amazing retrospective on Python at Microsoft entitled Python and Microsoft: Flying Under The Radar. This is Talk Python To Me, recorded December 6th, 2018. Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the ecosystem and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I'm @MKennedy. Keep up with the show and listen to past episodes at talkpython.fm and follow the show on Twitter via @TalkPython. Steve, welcome back to Talk Python.
01:09 Steve Dower: Thanks for having me back.
01:10 Michael Kennedy: Yeah, it's great to have you back. There's so many interesting things that you're doing and you've been really instrumental in pushing Python on its most popular platform actually on Windows, and I think you actually have a few interesting surprises for the listener, so I'm really excited about what we're going to talk about. But let's set the stage by just talking about what you do day to day. You're a Python core developer and you work at Microsoft. Take it from there.
01:36 Steve Dower: Yes, so I really get two hats in most conversations. I'm either wearing my Microsoft hat or my CPython hat. And I feel like most times when I come and chat with you, I'm normally wearing the Microsoft hat which is fun and that's certainly the one I have on today. So for CPython, I'm one of the Windows experts, do a lot of the Windows support and builds and everything. At Microsoft, I'm one of the Python experts and so I get to kind of roam around the company working with a lot of groups as we really try and ramp up on Python in a big way, getting to help out various teams, make sure they're doing a good job. We don't have a huge base of Python culture, Python experts throughout the company. So there's a lot of really really good engineers like expert engineers and all of their languages. And bridging that gap between I'm an expert C# programmer, I'm an expert C++ programmer and how do we make you look like an expert Python programmer is a really big part of what I'm doing right now, but it's a lot of fun.
02:38 Michael Kennedy: Yeah, I guess two things really strike me there. One is you do kind of live in this middle ground where the context matters so much, right? Like when you're at say, PyCon, people are like, oh my gosh, you're a CPython core developer and you work at Microsoft, what's that like, right? Whereas when you're at Microsoft, they're like what do you mean you're a Python person? Why don't you do C# and C++? What's going on here, right? So there's like this really big context which I bet.
03:03 Steve Dower: Oh yeah, I have virtual hats that I will literally mime putting on and off and I feel like I just need to get real ones.
03:09 Michael Kennedy: Yeah, absolutely. So there's a couple of things we're going to talk about. And the first one, I really want to dig into is Python's journey at Microsoft. Because Microsoft is one of these companies that started out not super open in terms of its open-source contributions and its culture around open-source. Famously Ballmer had not super positive things to say about Linux for example. But I feel like Microsoft has really made this transformation and you've written this really cool article, essay maybe on your personal journey that sort of charts some of those things as well. So maybe we'll start there.
03:49 Steve Dower: Yeah, it felt a bit like writing like the start of my memoirs which I feel like I'm too young to start doing that. But no, I was invited. We've got a series of people's kind of open-source journeys, open-source stories coming out right now. We've already had one about C# and rewriting the compiler for that. One about .NET foundation, if I'm recalling correctly and yeah, now one about Python's journey. I was thrilled to get to write it because it has been an exciting time and it's taken a number of years but it's one of those things where I've at least been able to observe it along the way and participate in a few places.
04:30 Michael Kennedy: Yeah, so where does this journey start? Traditionally, Microsoft's been a kind of, we're going to build it here even if they exist, we're going to create our own version of the thing, right? There's famously Java and C# in the early days. C# was something, a reaction to Java, I think that was more of a legal issue around Sun and Oracle than it was more of a not invented thing but there has been that kind of stuff around say like source control and other things that in the early days I feel like, maybe made Python not as welcome as it could have been.
05:03 Steve Dower: Yeah, yeah, there's certainly... When a company gets to a certain size, there's always a lot of that going on, even within the company like we still have teams all over the place inventing the same thing not realizing that another team at the same company is doing it the same time, which is really interesting. And one of the great things that I get to do because I bounced around a lot of teams is actually connect some of these up and say, hey, this other team is doing the same thing as you. But we're definitely a lot better at looking out than we used to be, 'cause there was certainly a time that we'd hear about problems or we'd recognize problems or we'd have problems ourselves, and go well, the only people who can solve this is us, so let's build a solution. And that's dramatically changed. Like the first thing now is, let's look out and see how other people have solved it and let's help them, let's help our developers use it. Let's help developers outside the company be able to use it as well. And it's also just hiring. I mean 'cause people change at the company, the older people leave, the younger people join and they're coming in with a totally different experience these days. All the open-source stuff is everywhere throughout academia, it's everywhere obviously throughout open-source and that's what people are coming into the company with.
06:18 Michael Kennedy: Right, even scientists are now doing pull requests and stuff on GitHub, right? We're writing Python code instead of MATLAB or Mathematica. All aspects of academia seem to embrace open-source, not just the computer science side.
06:33 Steve Dower: Yeah, no absolutely. And I mean some of the biggest things that we look at and say oh, this is a computer science thing were created by scientists in the first place. Like Travis Oliphant is quite happy to stand up and say he's not a computer scientist, and yet where did NumPy come from? Where did SciPy come from?
06:51 Michael Kennedy: Exactly, exactly and it's made such a massive massive difference. I guess some of the things that I see that sort of highlight some of those contrasts and I only know from the outside, right? I don't bounce around inside the engineering teams there are things like Microsoft created CodePlex, as sort of an alternative to places like GitHub. I don't know exactly the timing of when GitHub came out versus CodePlex, but then eventually they decided, you know what, everybody's at GitHub. Let's just move things like the .NET open-source projects over to GitHub properly, ASP.NET famously for example. And then, well let's just buy GitHub because that's like where the action is. And really embracing this place that has plenty of Microsoft stuff. There's plenty of Python and others as well. There's other examples of those types of things where it's like well, it started out maybe this private build your own thing and now we're going to go do... Another one is like the source control story, like TFS and all that, and now Git and GitHub and so on. So I think those are the outside changes I see, but on the inside you've seen maybe... I guess one of the big stories is Python is really starting to gain true traction, right? It's like, it's starting to show up in lots of products and get some real legitimacy, not just the stepchild thing that we have to care for 'cause some people demand it.
08:12 Steve Dower: Yeah, no, it absolutely is. One of the mandates that came down a little while back is that all of, one of the must support languages for APIs and anything that you might want to manage from outside the company, anything you might want to manage on Azure. So if you want to create virtual machines or create new storage accounts, be able to push pull files from wherever, Python is one of the required languages. You can't call your service ready until it has Python support for it along with a handful of other languages. But that, hitting that point was really exciting and also a little bit terrifying because suddenly there were huge huge code bases that were suddenly deemed not ready. And of course we didn't pull anything. It was an internal, the next thing you need to do is add your Python support.
09:01 Michael Kennedy: Right, and so then all these teams with this complicated code base come to you and a couple of colleagues, and say, we need some help.
09:51 Michael Kennedy: Yeah, that's really great. And in your story you talked a little bit about, they expected this to take a long time. You're like no, no, we can knock this out pretty quickly and they're like uh-huh, sure. Well, just give it a try, right? You want to talk about that bit.
10:05 Steve Dower: Yeah, well I mean any... I feel like most people who've built a command line in Python using Argparse are not going to be too surprised at how quickly you can get something working. But I think they were previously doing it in Node.js and they'd started it much much earlier on, and basically ended up with their entire command line parsing library written by themselves. And so, by comparison without being able to take something out of the standard library for a language and just use it, and be able to use it so dynamically. Like they have thousands and thousands of commands which Python can handle because you haven't statically declared everything, you haven't had to write them all out in full. You can loop over a list of files and read stuff out of it and build up your command line that way. And so it was, I mean it was an intense few days. Don't get me wrong. But yeah, the first version came out, you can dive back into the Git history and see my commits. And I was basically going straight to the repo from the start, so they're all there. And yeah, came out with something that had kind of a scattering of features that they said they were interested in. And then got to stand up and present it and hand it off, which is a really nice feeling to be able to give some code to a team and walk through it. And spend an hour showing off the bits and pieces and have them say, you just saved us months of work, thank you so much. That's definitely one of the best feelings that I've had in what I've got to do at Microsoft.
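The idea Steve describes, building up a command line by looping over data instead of declaring every command by hand, can be sketched with the standard library's argparse. This is only an illustration under stated assumptions: the COMMAND_SPECS table, the command names, and the options are all hypothetical, and it is not the code that went into the Azure CLI.

```python
import argparse

# Hypothetical command table. A real tool might read this from files on disk;
# it is hard-coded here so the sketch is self-contained.
COMMAND_SPECS = {
    "vm": {"create": ["--name", "--size"], "delete": ["--name"]},
    "storage": {"create": ["--name"], "list": []},
}


def build_parser(specs):
    """Build nested subcommands by looping over data instead of writing each one out."""
    parser = argparse.ArgumentParser(prog="demo")
    groups = parser.add_subparsers(dest="group", required=True)
    for group_name, commands in specs.items():
        group_parser = groups.add_parser(group_name)
        actions = group_parser.add_subparsers(dest="command", required=True)
        for command_name, options in commands.items():
            command_parser = actions.add_parser(command_name)
            for option in options:
                command_parser.add_argument(option)
    return parser


# Parse an explicit argument list rather than sys.argv so the example is repeatable.
args = build_parser(COMMAND_SPECS).parse_args(
    ["vm", "create", "--name", "web01", "--size", "small"]
)
print(args.group, args.command, args.name, args.size)
```

Adding a new command in this sketch means adding an entry to the table, not writing more parser code, which is the dynamic quality Steve is pointing at.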
11:32 Michael Kennedy: Yeah, that's a crazy experience. You hear stories about Python having this effect in different ways but that's, that's a pretty stark contrast right there, that's great. Is it still in the Azure CLI? Is that still in Python, that's shipping now?
11:49 Steve Dower: Yeah, it's absolutely still in Python. It bears, I think there's maybe two lines of code that are still there from what I wrote, because they've gone and rewritten all of it. They've actually eventually migrated off Argparse because it turns out that when you are getting up to thousands of subcommands, that has some performance problems.
12:07 Michael Kennedy: Whoever wrote that is like, oh what's the upper limit? What's the at most commands you've ever seen in an app? 24, okay well, this is going to be fine, right?
12:17 Steve Dower: I stumbled into a conversation at a conference between two people working on other command-line libraries. I forget exactly what the libraries were but they were discussing how their performance was kind of falling off around the 50, 60 subcommands level. Was that okay? That's probably okay, and I'm like, hey guys, we got a few more than that. But the library they went and turned it into, as I said, well Microsoft is full of amazing software engineers and a few of them learned Python, didn't necessarily know it to begin with and created what was eventually reflected out into this library called Knack that is a highly scalable command-line parser. The amount of work it takes to get started is higher than Argparse, but the result that you get is much much better.
13:08 Michael Kennedy: How interesting, I had never heard of Knack actually. That's cool.
13:11 Steve Dower: I hadn't actually heard of it either until I was going to the team and saying, hey, there are people who would love your command-line library, have you thought about refactoring out? And they just sent me a GitHub repo back and said here it is. Oh, beautiful. But yeah, I've had real interesting performance discussions with that team. They've pulled a lot of tricks to make Python start up faster and to give some context between how Microsoft thinks about its software and how the community does. This other conversation about 30, 40, 50 subcommands starting up in under a second, it was, yeah, that's acceptable. Meanwhile one of the engineers from the Azure CLI team is desperately trying to reduce the startup time under 250 milliseconds with thousands of subcommands.
13:53 Michael Kennedy: Wow.
13:53 Steve Dower: And I don't actually know that he realizes how much of an achievement he's already made to get it down that far, with something so big.
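The transcript does not say which tricks the team used. One common way to keep startup time flat as the number of subcommands grows is to import a command's module only when that command is actually requested. The sketch below shows that general technique, assuming a hypothetical COMMAND_MODULES table in which standard-library modules stand in for real command implementations.

```python
import importlib

# Hypothetical mapping from subcommand name to the module that implements it.
# Standard-library modules stand in for real command modules here.
COMMAND_MODULES = {
    "stats": "statistics",
    "zip": "zipfile",
}


def load_command(name):
    """Import the module for a single subcommand only when it is requested."""
    return importlib.import_module(COMMAND_MODULES[name])


# Only the module behind "stats" is imported by this call; the module behind
# "zip" is not touched, so its import cost is not paid at startup.
module = load_command("stats")
print(module.mean([1, 2, 3]))
```

With thousands of subcommands, the table itself stays cheap to build because it holds only strings; the expensive imports happen one at a time, on demand.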
14:03 Michael Kennedy: That is so cool. I mean it's projects like that that kind of push the boundaries and make it better for everyone if that gets pushed back into CPython.
14:11 Steve Dower: Yeah, I hope so.
14:12 Michael Kennedy: Yeah, let's talk about the early days. I guess the very first recollection I have of Python and maybe similarly, Ruby, making its way around Microsoft is first I heard of IronRuby and then IronPython. I think that was the order they were created, and then maybe shortly after that was Python tools for Visual Studio. This is not VS Code that people know today, right? This is like traditional Windows only Visual Studio. You had these tools you could plug in for Python, right?
14:43 Steve Dower: Yes, so that's basically the timeline there as well. I came in just after Python Tools for Visual Studio was getting started. I think I was there for the 1.0 ship party. I don't remember exactly, I can probably go back and figure that out but I was definitely there for a party which was one of the earliest releases. But yeah, before that there was a project to make the CLR, the Common Language Runtime, work well for dynamic languages.
15:10 Michael Kennedy: The thing that .NET runs on.
15:11 Steve Dower: The thing that .NET runs on top of to make it work well for dynamic languages which have a whole different style of when you have the code and you compile it into this intermediate language but you still don't know what your name lookups are going to find, which we're totally used to in Python but the common language runtime was not initially designed for that and so there was this project to make a dynamic one that could handle all of that. And IronPython and IronRuby were kind of the test cases for let's show this working on real languages, and both of them got to very good states in terms of supporting the language at that point in time. IronPython is still going in bits and pieces. I do occasionally bump into people at quite large companies with very serious uses of it still. They're actually quite happy with it.
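The point about name lookups can be seen in a few lines of Python: a function is compiled without knowing what a global name will refer to, and the lookup happens each time the line runs. This is a small illustration of the language behavior only, with made-up names; it says nothing about how the dynamic language runtime handled it internally.

```python
def greet():
    # `message` is not bound to anything when this function is compiled;
    # the name is looked up each time this line executes.
    return message


message = "hello"
first = greet()

# Rebinding the global changes what the already-compiled function finds.
message = "goodbye"
second = greet()

print(first, second)
```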
16:02 Michael Kennedy: That's cool, there's some interesting integration cases, right? Like you can do things like the UI framework, WPF in Python on top of IronPython I think. And stuff like that, you're like, wait, you can put these technologies together and, yeah.
16:19 Steve Dower: Yeah, and it works fairly nicely provided you're happy to live in that 2.7 world with severely restricted library support at this point. There were a number of projects to try and bridge that gap which have largely fallen away unfortunately. So the alternate implementations beside CPython are, a little bit weaker in terms of extra library support which is fairly off-putting. The ecosystem is just so critical to Python.
16:44 Michael Kennedy: Yeah, without it, it's just not even hardly the same thing. As a core developer, what are your thoughts on these other runtimes, right? Like PyPy, Jython, IronPython, do you look at them and go, the energy would be better placed just trying to make CPython itself better or they're good experiments, what are your thoughts?
17:06 Steve Dower: My thoughts are very big and complicated on that and I think we'll probably go a little bit deeper in the next topic, when we get to it. But certainly there's been a huge amount of value come out of the fact that they exist, and that people have worked on them and that there's been collaboration between the reference implementation CPython and the other implementations. Everyone has gotten better as a result of that. Now whether the energy is better redirected, honestly there's not that much energy in the other projects which is a real shame. But it also means that redirecting it, doesn't necessarily add a lot. And I'll point out for the dynamic language runtime, Microsoft did two proof of concepts because having IronRuby and IronPython flushed out things that just having one would not, and in the same way I think having CPython, Jython and PyPy, IronPython flush out issues in what Python the language is, that you would never find out if you only had one implementation.
18:05 Michael Kennedy: That's interesting, so if they say run the test suite, actually this part is really vague, we don't actually know what this means that could help make CPython better for sure, right?
18:14 Steve Dower: Yeah, there were a lot of language semantics that were clarified particularly while IronPython was growing up. PyPy has been coming up a little bit later. It has a lot more support, there's a lot more people working on it than the other projects, but they're trying a lot harder to be compatible with whatever CPython does. And so there isn't as much kind of push and pull on what does the language actually mean here as when the other languages were really going.
18:40 Michael Kennedy: Interesting. This portion of Talk Python To Me is brought to you by Linode. Are you looking for hosting that's fast, simple and incredibly affordable? Well, look past that book store and check out Linode at talkpython.fm/linode. That's L-I-N-O-D-E. Plans start at just $5 a month for a dedicated server with a gig of RAM. They have 10 data centers across the globe, so no matter where you are or where your users are, there's a data center for you. Whether you want to run a Python web app, host a private Git server or just a file server, you'll get native SSDs on all the machines, a newly upgraded 200Gbps network, 24/7 friendly support even on holidays and a seven-day money-back guarantee. Need a little help with your infrastructure? They even offer professional services to help you with architecture, migrations and more. Do you want a dedicated server for free for the next four months? Just visit talkpython.fm/linode. Let's talk a little bit about your being a core developer and Microsoft employing you, so basically in some sense being a sponsor of your open-source contributions in that sense. Maybe I'll kick it off with a joke that Brian Okken over on PythonBytes showed me. So it's around Christmas time. So there was this cartoon, and there's like one of these Santas at the mall and kids go and sit on the lap of the Santa. The Santa asks what do you want for Christmas, and the Santa says now what do you want for Christmas, little girl? Come on, be realistic. She says, I want enough donations that I can just work on my open-source project. He goes, okay, what color do you want your dragon to be? But in a sense, you found this place in the world and there's other companies as well, but where Microsoft is basically supporting your work to make CPython better, right?
20:30 Steve Dower: Yeah, and the story of that coming about is very different from what everyone gets now. I feel like now people just... They are going through the interview process to get a job at Microsoft and just kind of tack on, hey, you should give me one, two days a week to work on this open-source project that matters. And we just go, okay sure. Back when I did it, I spent weeks negotiating with lawyers, I had patent attorneys basically interviewing me to find out what I was going to do.
20:57 Michael Kennedy: Yeah, would you somehow taint all of Microsoft by interacting with open-source code and you've seen some GPL, and then all badness breaks loose, right?
21:06 Steve Dower: Yep, a whole lot of advice on, don't have other code up on the screen while you're working on it. The risk management that we were doing at the time makes total sense, it really does. And when you understand that the aim of our legal team is not to help us win the eventual court case, it's to avoid it in the first place, you can see why they're so protective. Because basically once any kind of IP litigation goes to court, everyone's already lost, really. A couple of lawyers have won, but they're not our lawyers so they're not interested in it going to court. It's other people who get paid for that time. So really it's take as much care as possible so that whatever comes up, just gets thrown out as quickly as possible. And so there was a lot of caution there. But again, it's risk management for those of us who were doing that kind of work at that time. At the same time I was doing that, we still had MS Open Tech, which was the totally separate company that just happened to work on the same campus and in some of the same buildings as Microsoft and be on the same email network and everything, who were legally allowed to work on these projects because there was some legal boundary where you couldn't sue all of Microsoft if you decided to sue them. So I was doing it from within the company, like people like Dino Viehland had already done before me, and a handful of others.
22:34 Michael Kennedy: Yeah, I guess it's worth mentioning, there's more than just you as the core developer at Microsoft, right? There's Brett Cannon and who else?
22:41 Steve Dower: Barry Warsaw is at LinkedIn but we count him because we like LinkedIn now. Eric Snow and Trent Nelson is starting whenever immigration gets their act together I believe.
22:52 Michael Kennedy: Right on.
22:53 Steve Dower: The fun of being an Australian working at Microsoft in the US.
22:55 Michael Kennedy: Yeah, a lot of hoops to jump through, right? Yeah okay, so it's not just you who have found this place. There's this group of folks inside the company.
23:04 Steve Dower: And again, the transition within Microsoft over the last six years has gone to the point where we aren't forcing open-source contributors to sit down with lawyers to describe what they're going to do ahead of time. That's really nice, it means we have a lot more contributors to projects who are able to support and help develop projects in ways that benefit, obviously they benefit us otherwise we wouldn't get work time to do it. But they benefit the community as well. We're trying to find that balance between kind of straight-up altruistic development on open-source projects and being able to resolve the issues that people are having that are kind of specific to us.
23:44 Michael Kennedy: Yeah, well some of this, it's kind of like basic scientific research. Some of this comes back to help the world in nice ways. Like for example, the Git stuff that you all contributed back to Git so that you can move Windows to Git because kind of like a command line argument thing, like the number of files. Basically virtualizing the Git file system had to be done for Windows to be part of Git, right? Or be on Git.
23:44 Steve Dower: Yeah, we had a number of repositories internally where just the working directory was being counted in gigabytes of code, and we had one. So one of the repositories I worked on through the transition from TFS to Git, one checkout, when I first checked out that repository, it was something like 180GB, just the working directory.
23:44 Michael Kennedy: Just the main branch.
23:44 Steve Dower: Just the main branch and not even the history. Like just the latest up to date. We had a whole lot of stuff checked in that didn't need to be. Some massive work went on. And I think we got it down to six gig, and said that is the best we can do. And part of that six gig is a script that downloads the other 100 gig of tools into a different directory.
23:44 Michael Kennedy: Wow, that's incredible.
23:44 Steve Dower: Yeah, there's some really really huge code bases that a lot of the big corporations have inside that simply never get heard of outside. Like open-source community run projects are never going to consider those cases because they sound utterly ridiculous. And yet, but all of these really big companies have nothing but utterly ridiculous cases.
23:44 Michael Kennedy: Yeah, and they've been around for so long and so many people worked on it. It's pretty insane, just the number of developers continuously contributing to that stuff. So I feel like one of the big turning points in Python's journey probably hinges around Azure. What do you think?
23:44 Steve Dower: Certainly Azure has created a place for Microsoft to really care about developers in a way that is kind of different from when Windows was the main thing.
23:44 Michael Kennedy: Right, because it used to be well, as long as it works well on Windows and we get that app on Windows, that was the key. But it's hard to say well, all the Python developers are going to come over and run on Windows potentially. Because maybe they're hosting on Linux or whatever. But now, they're all going to run on the internet, right?
23:44 Steve Dower: Exactly, if they're running in our data centers then we're happy to have them and we want to do whatever we can to roll out the red carpet for whichever developers want to come and work with us and it's so much less about the end-user platform, because that's essentially the browser now. It's like everyone's hiding behind the browser, the browsers work everywhere.
23:44 Michael Kennedy: The browser and the phone, yeah.
23:44 Steve Dower: So yeah, it really has created that opportunity to do production apps in Python, which I mean Python was never really designed for rich user facing applications. That's never been one of its use cases. It can certainly do it. There's other languages that have been designed for that kind of thing. And so Python's place has always really been more developers and command-line happy people. And adding the web to that and being able to run your code on Azure enables Python developers in a way that we want to encourage, and I don't want to say it's forced us because most of us are happy about it.
23:44 Michael Kennedy: It's broken down the barriers or the walls to pursuing the stuff you'd like to do anyway maybe. How's that?
23:44 Steve Dower: That's basically it. We have somewhere to put our Python developers that we want to be working with now, which is come and run your stuff on Azure. Well, multi-cloud, hybrid cloud whatever it is but we have somewhere for Python developers to run production code which Windows never really had a place for it before.
23:44 Michael Kennedy: Yeah, for sure. So when I go to Azure these days, I feel like, it's like opening up an encyclopedia. It's just like there's so much stuff in here, what do I even do? How many services are there?
23:44 Steve Dower: This actually came up while I was writing the story because I have a mention in there of Laurent as well, who manages the Azure management SDK in Python. Which basically has modules for every single Azure service there is. And when I drafted it, I'm like it's probably around 50, or right around 50 and then at one point I went and checked. And it's over 100.
23:44 Michael Kennedy: Wow.
23:44 Steve Dower: And so I checked again, I'm like no, it's still over 100. So I asked him and he's like, yeah it's definitely over 100. I don't even know what they all are.
23:44 Michael Kennedy: I feel like there needs to be a song about it, like there's that song that all the US kids learn in school for all the states.
23:44 Steve Dower: Yeah.
23:44 Michael Kennedy: Like we need that song for all the Azure services just so we can remember what's going on.
23:44 Steve Dower: Yeah, in terms of getting started, what I think was a really good move and I'm not sure who made it but I'll go give 'em a high five at some point. There's basically a set of free services on Azure. There's like a free sign up, credit card verification or like .edu email address verification. And you get like a limited set. And the thing about it being a limited set is there's one obvious choice for whatever you want to do within that set. And so it's like, I want to push up a website, so well app services here. That's the obvious choice, you don't have six things that you could make work that are all subtly different and serve different use cases better which you can grow up into. Once you're at the point where it's like, this is a really serious workload that my company or my users are depending on, let me optimize it for certain use cases. It really just is. You get one choice. You want a virtual machine? Have a virtual machine. Oh, you want to choose from the 50 different sized virtual machines and different drive options and different GPU options and different memory options? Step up to the paid thing for that, but having that kind of filter on, this is what you need to just do something. Taking away that choice is one of those things that I think is actually a really good user experience at times.
23:44 Michael Kennedy: Yeah, sometimes less is more in that sense. That's pretty cool that the free tier starts that way. So let's spend just a little bit more time on this journey, maybe focus on the areas that you've worked on. I don't know how many listeners out there know, but when they interact with Python on Windows, they are actually interacting with a lot of your work, right?
23:44 Steve Dower: Yeah, so actually the way I got into core development was I was at PyCon, I think I was at my second PyCon ever, PyCon US that is. And had some annoyances with the installer for Windows and so I basically found some of the core developers and said...
23:44 Michael Kennedy: It was cool that it was there but it definitely was a little bit not, tell us some of the challenges that you ran into because of it.
23:44 Steve Dower: It was a simple installer for a simpler time because there was a time on Windows where you just dropped files anywhere, and they run and anyone who has physical access to the machine can do whatever they like. Then the internet happened and it turns out that even people who don't have physical access to the machine have access to the machine, and we really ought to be locking things down a little bit better, and that set off the whole security kick that eventually brought us to where we are today. Where we get all the way to store packaged apps that are managed in a really really really secure and isolated way that didn't exist at the time when the previous Python installer was written, which I think came straight out of the DOS days. So it absolutely served a very important purpose for making Python available on Windows, but it just hadn't grown up at the same rate as Windows was growing up. So I basically found some core developers and said, hey, I would like to contribute some work to improve this, who do I talk to? How do I go about suggesting ideas, proposing the work, contributing the work? And they basically said, give us your contact, we'll chat and we'll get back to you. And they got back to me the next day, because it happened that the person who was currently maintaining Windows, Martin von Löwis, had decided to step away and they were looking for a replacement. Very good timing and so, I was basically co-opted. I'm one of the rare people who basically gets dropped into core development without having contributed. That's not the usual way people become core developers but I feel like it turns out when you're very much in a minority like that, I'm such an affirmative action hire for the CPython core development team, it's not even funny. Because there's very few Windows developers there and so if someone comes along and says, I would like to do Windows and I'll let you all worry about Linux, I don't care about it, I just want to do Windows, they're like, are you even real?
23:44 Michael Kennedy: Yeah, who's this person?
23:44 Steve Dower: They're like yes please, come and help. And so I basically just got to rewrite it as I wanted, to do things like moving the default install so that it's not writable by anyone on the machine. Because it was installing, and Python 2.7 still installs, to this default directory where any user on the machine can add files that the administrator will then automatically run.
23:44 Michael Kennedy: And it's something out of place, like C:\Python, right?
23:44 Steve Dower: Yeah, it's unconventional now. At the time it certainly wasn't. It made total sense originally. But these days, yeah programs go into Program Files or they go under the user app directory if it's restricted to the user, and that helps keep things secured. It sets the access control right so that people aren't modifying files that other people are going to use. And being able to put it in a per user location means that we don't have to require users to be administrator to install anymore. And I feel like that's been the most exciting feature about the new installer for people, is enabling that, don't make me elevate, don't... For a lot of people it's don't make me hit YES on that box but for others it's don't make me go get the administrator to type in their password so I can install Python.
23:44 Michael Kennedy: Yeah, exactly, when you're at home and it means the UAC kicks in and you click OK, like that's okay. I mean it's not super but it's not the end of the world. But if you're at a company, some of the places where it would make a lot of sense like investment places and whatnot, they don't even get to ask. It's just the answer's no. No, you don't get to install stuff, no, you don't get raised permissions. And then pip install something becomes challenging as well if it's trying to install into like restricted directories. So you basically made it so you can... All that stuff potentially happens in your user profile, which you can control almost all the time.
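The per-user idea discussed here can be inspected from Python itself. What follows is a minimal sketch, not something from the episode: it reports whether the running interpreter's install directory is writable by the current user and where a per-user package install (for example via pip's `--user` option) would land. The helper name `describe_install_locations` is made up for illustration.

```python
import os
import site
import sys


def describe_install_locations():
    """Report the interpreter's install prefix and the per-user package location."""
    return {
        # Directory the running interpreter was installed into.
        "interpreter_prefix": sys.prefix,
        # True if the current user may write there without elevation.
        "prefix_writable": os.access(sys.prefix, os.W_OK),
        # Base directory Python uses for per-user installs.
        "user_base": site.getuserbase(),
        # Where per-user packages are placed.
        "user_site_packages": site.getusersitepackages(),
    }


locations = describe_install_locations()
for label, value in locations.items():
    print(f"{label}: {value}")
```

If `prefix_writable` comes back false, installing packages into the interpreter's own directory would need elevated rights, which is the situation a per-user location avoids.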
23:44 Steve Dower: Yeah, and at this point with 3.7, I made the final change which was to do with the C runtime that would still try and install as an admin if you needed it, as we're down to very few people who needed that already. So 3.7, I just kind of cut that cord completely and said, if you're still running on a Windows 7 machine that isn't receiving updates for whatever reason, then we won't try and install this properly. We'll just give it to you so it works and assume that you're not ever going to run into the trouble that we were concerned about previously. But I think my favorite story about people enjoying this feature, I had a teacher who spends a lot of time teaching Python throughout Africa come up to me and it was actually at EuroPython this year and absolutely gushing with praise and thanks for making it possible for his students who are coming in to like evening classes, they've borrowed their mum's work laptop. It's a 10-year-old laptop, the keyboard is broken so they've got a USB keyboard plugged into this laptop because they don't get to buy the latest MacBook Pro every year, right?
23:44 Michael Kennedy: Yeah, of course.
23:44 Steve Dower: And it's running Windows 7 and it's totally locked down and they can still install Python on it because of that change. It's a little mind-blowing that change which feels like, it's technically a good thing to do but it also enables like kids in Africa to learn to program and that's just mind-blowing.
23:44 Michael Kennedy: Yeah, that is so awesome. That's great, and I think actually that sort of speaks a little bit to the reach of talking about Python on Windows and this journey. I mean we talked about, I talked about, oh Azure is what's like unlocking a lot of the, sort of commercial side of the things and around Python for you. But if you look at who is using Python, like what operating system people are using Python on, you go to the conference, maybe it doesn't feel this way but general studies and surveys say, the majority of them is like massively on Windows, right?
23:44 Steve Dower: I've actually been going around speaking to this at conferences recently, which I think we'll have a link to at least one of the recordings of the talks in the notes for this. But my recent presentation I've been giving at conferences is called Python on Windows Is Okay Actually which is a bit of a bait-and-switch title to be honest. Because most of the talk is not actually about Python on Windows because Python on Windows is okay. And one of the best things when I joined the team was it was already okay.
23:44 Michael Kennedy: I think this installer actually makes a big difference. I think getting it installed properly, it is a big deal.
23:44 Steve Dower: It's the first experience, right? And the first experience matters 'cause if you have a bad first experience, then you may never look twice. The big kind of controversial thing in that talk, and a lot of the reason that I wanted to give it, was I started looking at the stats. And I looked around the conferences and was like, yeah, the entire Python world is using Macs mostly, and occasionally people that have taken a Windows laptop and installed Linux on it instead. And then I'd get up to give a talk and I've got a Windows machine and I can look out and just see all these glowing fruit staring back at me.
23:44 Michael Kennedy: Taunting you.
23:44 Steve Dower: Yeah, but then we start looking at actual numbers and it's so skewed in the other direction, looking at things through downloads from PyPI, downloads from Conda, usage in the PSF survey, usage in IDEs like VS Code and PyCharm. So, PyPI excepted, Windows is a huge, if not the biggest, chunk or just a true majority of Python usage.
23:44 Michael Kennedy: Yeah, so let me, just to put some numbers behind this. Let me just read the operating system usage numbers from the JetBrains and PSF survey, the one that you just mentioned, which was not promoted by JetBrains, they just happened to do some of the analysis. So promoted by PSF. We've got other operating systems, whatever that means at 17%. We have Mac OS at 15% and Linux at 19%. I feel like everyone has a Mac at the conferences but 49% of people said Windows. That's like not just a little bit more than others, that's more than Mac and Linux combined.
23:44 Steve Dower: Yeah, and as you say, it was not promoted by JetBrains so much, it was... This was promoted in regular Python channels and so regular Python users are seeing and responding to this and doing it from a Windows machine, which we just don't have that kind of perception in the community that that's the case. And so it was really exciting. My favorite part about giving talks in general is looking at the tweets afterwards, and the things that people have been most shocked by or most excited by. And the amount of photos of me standing in front of the slide with the 14 million downloads per month of Python for Windows is one of my favorite photos now.
23:44 Michael Kennedy: That's a ridiculous number of downloads, that's awesome.
23:44 Steve Dower: It's huge and a lot of that is going to be CI systems to be fair. But most of the downloads from python.org are CI systems. We don't actually think there are 14 million Python developers anywhere in the world right now. I think the estimates are closer to six to seven million. So there's automatic downloads. At the same time, if people are automatically downloading Python for Windows, that means they're running Python code on Windows. So it doesn't actually spoil the point at all for it to be oh, it was a script. It's like well, every single download of the source for Linux was a script. Every single download of the Mac installer was also a script. That's just the reality. No matter how you slice the numbers, I describe it as at least half of the Python community is actually running on Windows. And we just don't realize that, and a lot of the time we don't act like it. So the rest of the talk is basically around things that Python projects or Python packages do or have done that make them not work well on Windows.
23:44 Michael Kennedy: Like what?
23:44 Steve Dower: Things like using str.split on forward slash to separate paths. So paths on Windows use backslashes, and we have a couple of great modules for this, so I strongly recommend pathlib, which is in the standard library as of 3.4. But we've had the os.path module for dealing with paths, and yet people will still just use the normal string functions to split on a forward slash. And if you do that to a path on Windows you get one element with the entire path still in it, because there's no forward slashes anywhere.
23:44 Michael Kennedy: That's a weird root directory.
23:44 Steve Dower: And it's things like that, and my big concern with that is you get, there's 50% of Python developers will install your package, try and use it and it won't work. And it's just a very unwelcoming experience.
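The path-splitting pitfall Steve describes can be sketched in a few lines. This is a minimal illustration, not code from the talk; the path below is a made-up example, and `PureWindowsPath` is used so the snippet behaves the same on any operating system. In real code, `pathlib.Path` picks the right flavor for the machine it runs on.

```python
from pathlib import PureWindowsPath

# Hypothetical Windows-style path, used only for illustration.
windows_path = r"C:\Users\alice\project\data\input.csv"

# Naive approach: splitting on "/" finds no separators at all,
# so the whole path comes back as a single element.
naive_parts = windows_path.split("/")
print(len(naive_parts))

# pathlib understands the separator rules for the path flavor.
parsed = PureWindowsPath(windows_path)
file_name = parsed.name           # final component of the path
parent_name = parsed.parent.name  # name of the containing directory
print(file_name, parent_name)

# Joining with "/" on path objects avoids hard-coding a separator.
settings = PureWindowsPath("C:/Users/alice") / "project" / "settings.toml"
print(settings)
```

Building and taking apart paths through `pathlib` (or `os.path`) means the same code handles both separator conventions without string manipulation.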
23:44 Michael Kennedy: Yeah, it can be and it can be indirect as well. Like for example, Sanic, it's a cool async web framework for Python, right? But I believe right now, at least last I heard, it didn't work on Windows because uvloop doesn't work on Windows, it doesn't install properly on Windows, and it's just these little glitches, just like, ah.
23:44 Steve Dower: And there's always a very long and reasonable chain of things that would have to work perfectly for the final result to work, which I understand.
23:44 Michael Kennedy: This portion of Talk Python To Me is brought to you by us. Have you heard that Python is not good for concurrent programming problems? Whoever told you that is living in the past because it's prime time for Python's asynchronous features. With the widespread adoption of async methods and the async and await keywords, Python's ecosystem has a ton of new and exciting frameworks based on async and await. That's why we created a course for anyone who wants to learn all of Python's Async capabilities, Async Techniques and Examples in Python. Just visit talkpython.fm/async and watch the intro video to see if this course is for you. It's only $49, and you own it forever, no subscriptions. And there are discounts for teams as well.
23:44 Steve Dower: Part of the problem is that all of those 50% of Python users on Windows, I'm pretty well convinced that 49% of them work at places where they're not actually allowed to release their libraries. I know for a fact that they have fast networking libraries that they've reimplemented things for their platforms that make Python run great, that would be amazing to have out there in the community, that'd be a real enabler for a lot of the rest of us that are not kind of restricted within these companies but they just can't do it.
23:44 Michael Kennedy: Yeah, it was probably built to like drop a half a millisecond on like a trading engine for algorithmic trading or something like this where it's like well, this is our advantage. We're not putting this out.
23:44 Steve Dower: Yeah, absolutely. And that advantage is worth a few millions of dollars for a second as well for a lot of these places.
23:44 Michael Kennedy: Yeah, it's not unreasonable for them from a self-interested perspective, but yeah I hear you. I would actually say in the reverse maybe. Like how many of the core developers or maybe not even core developers, let's just say people working on popular packages like uvloop, not calling them out in particular really but just those types of things, where the developer doesn't even own a Windows machine. So just doesn't bother.
23:44 Steve Dower: And ultimately I think that's the biggest point is when you don't have the machine to run and test on, then you stop thinking about these things. And works on my machine is for volunteers, I think totally reasonable. Like asking people to invest in new hardware, invest in a new software installation, especially the matrix of things that you have to worry about. Even for Linux, I would suspect that most Python packages are not tested on more than one variation of Linux.
23:44 Michael Kennedy: Yeah, for sure.
23:44 Steve Dower: For exactly the same reason. Very few people are going to be running on FreeBSD and Ubuntu for example to make sure that it works on both. So that is a really valid reason for not doing it, but at the same time, it doesn't take a lot of work these days to set up a CI system that is going to run on Windows for you, and it doesn't actually take any money at all. Azure Pipelines is a fairly recent one that a lot of people are really excited about, myself included, because of the open-source offering. If you mark your project as public, so that anyone can see the results, you automatically get unlimited build time, up to 10 concurrent builds, and your choice of Windows, Mac, or Linux. I think there's two Ubuntu images and two Windows images right now. It has Docker support if you want to be running builds inside whatever container you like.
23:44 Michael Kennedy: That's pretty cool. Can I plug it into like my GitHub repo?
23:44 Steve Dower: Yeah, absolutely. You could do that before we bought GitHub, in fact; that's always been one of the primary uses for it. Yeah, that's pretty awesome. We have CPython builds running on that. Those are not the binding required checks yet. Pandas switched over most recently. NumPy is there, and Conda-Forge is investigating moving all of their builds over to it.
23:44 Michael Kennedy: Oh, that would be huge. I know some people know what Conda-Forge is but maybe not everyone and I think they're really interesting in enabling Windows to not have these problems of building these weird tools because they deliver binaries on the right OS. Maybe tell people real quick what Conda-Forge is so they appreciate that if they don't know.
23:44 Steve Dower: So Anaconda is a distribution, and normally how people think about getting to the Conda tool. So you'll get this big fat installer from Anaconda and install that and get a whole lot of packages and you have this Conda tool for creating environments, installing packages. But Conda is really just a pip equivalent, and rather than getting the packages from the Anaconda Channel you could get them from other people's channels as well. Conda-Forge is a community organization that basically fills in the gaps. They have something like 5,000 GitHub repositories that are basically the build steps for 5,000 packages, and they have a set of packages that are built for, I think they're doing five different platforms now just automatically building all of these packages. So if you've ever run into like trying to pip install something and there's no wheel, Conda-Forge has probably done some of the work or someone's contributed the steps to build the pre-built Conda package so that you can Conda install a whole lot of stuff from there that would otherwise not be available, and they've been doing that through this incredible system of basically abusing as many free CI providers as possible for the various Mac and Windows and Linux builds and...
23:44 Michael Kennedy: We're doing it for the good of the community.
23:44 Steve Dower: Yeah, and I think we timed things right when I mentioned this to one of the Azure Pipelines team members, and I don't actually know exactly what our financial motivation would be, but I don't really care. If they're excited to move as many thousands of builds onto one system that does everything that they want and the Conda-Forge team is excited to do that, and everyone's just excited to do it, I don't actually know what anyone's getting out of it. But Conda-Forge is getting builds, users are getting packages and so I'm happy.
23:44 Michael Kennedy: It can at least make the story for installing some of these packages on Windows better by ensuring they have a good infrastructure and it's not too much work, right? So there's maybe something in that area.
23:44 Steve Dower: And it makes installing the packages on Mac and Linux better as well. Anaconda's popular on Windows but not like 90% of their usage.
23:44 Michael Kennedy: Yeah, yeah, it's true.
23:44 Steve Dower: They still have really significant use on Linux and Mac because they'll give you far more reproducible installs across various platforms than using building from source or even wheels on pip are going to give you.
23:44 Michael Kennedy: Sure, I've used it on my Mac because something wasn't installing right. I'm just like, I'm just going to get the pre-built version. Let me ask you a question about something that's bugged me for a while, and as somebody who creates courses and tutorials, I often have to have like a couple of steps. Like okay, here's the command that you type when you register this package so you can develop it in development mode on Mac or Linux. Here's the separate command that you run on Windows so that you can do that. So like python3 setup.py develop, right, for example? But well, python3 was not a command on Windows and that can be super annoying. You're like, on Windows, what you do is you say python, but you've got to make sure the Python that's on the path is the right one, so that Python 3 comes first. We've had these conversations before, right? So this I think is one of those things, but you actually have like a super early announcement or something that's amazing around this, is that right?
23:44 Steve Dower: It's not exactly a secret project but it's... If you're not watching the Python bug tracker then you won't have noticed, but hopefully, when we start promoting this a little bit more, you'll see it. With Windows 10, we actually added, so Microsoft added a store, an app store basically. In fact, it came in Windows 8, but in Windows 10, with some of the recent updates, it's been growing the ability to install regular old apps. So a lot of people think of app stores as you get your Apple approved iPhone app that only runs on the phone and can't actually do interesting stuff beyond what the app does. Like you can't install a full Python interpreter that can access everything on your system through any of the app stores. What's being added to the Microsoft Store, the Windows app store, is the ability to install those. So Paint.NET is actually one of the earlier ones to do this. You can install Paint.NET through the store and it installs really quickly. It's properly isolated between users. And what we're doing, or at least experimenting with at this point, is adding Python to that store. The aim is to do this with Python 3.7.2 initially and then keep updating it from there. So rather than going to python.org, downloading the installer, and running through all of those steps, you'll be able to just go to the Microsoft Store and say, get it, free, obviously it's going to be a free app, and install it, and that will give you Python on your machine managed by the store, with automatic updates.
23:44 Michael Kennedy: Automatic updates, beautiful yeah.
23:44 Steve Dower: It'll do automatic updates. I don't know how much trouble that's going to cause yet, that's why this is still a little bit experimental. I had to make some significant improvements to venv to make it be able to handle that, but I believe it can handle that now.
23:44 Michael Kennedy: Oh, that's great.
23:44 Steve Dower: But the really cool thing that comes with Store apps that isn't available to regular apps is being able to manage path properly. So one of the problems with the path environment variable on Windows or just one of the differences from POSIX like systems which I actually talk about in my Python Windows talk is that on POSIX like systems, you kind of have, here is a directory for my user commands and any commands I've put in there are available to the current user. And here's a directory for my machine commands and any commands in there are available for the whole machine. And Windows hasn't had that. Windows has let me add a directory to the path and then go look in that when you're searching for anything. And so you get apps fighting over who's first on the path, there's no good way to automatically manage it and say well, 3.7 should come before 3.6. But you can't automatically do that and so that's always been messy to the point where that's one of the controversial things I did with the new installer was disable setting that up by default, 'cause it just leads to breakage and confusion that we can't manage. What the store apps let us do is specify aliases for these commands, and those get managed by the operating system. There's like a nice GUI for switching which app gets to have that alias, and they go in a proper directory that is here are my user commands that are available to my user, and it's on path automatically.
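The "who's first on the path" problem Steve describes comes down to a first-match-wins search. Here is a minimal sketch of that behavior using the standard library's `shutil.which`; the directory names `Python36` and `Python37` and the helper functions are assumptions made up for the demonstration, and the executables are empty placeholder files in a temporary directory.

```python
import os
import shutil
import stat
import tempfile


def make_fake_command(directory, name):
    """Create an empty placeholder file that is marked as executable."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def first_match(command, directories):
    """Return the file a PATH-style search picks: the first directory wins."""
    search_path = os.pathsep.join(directories)
    return shutil.which(command, path=search_path)


with tempfile.TemporaryDirectory() as root:
    # Two hypothetical install directories, each providing python.exe.
    py36 = os.path.join(root, "Python36")
    py37 = os.path.join(root, "Python37")
    os.mkdir(py36)
    os.mkdir(py37)
    make_fake_command(py36, "python.exe")
    make_fake_command(py37, "python.exe")

    # Same command, same files; only the directory order differs.
    older_first = first_match("python.exe", [py36, py37])
    newer_first = first_match("python.exe", [py37, py36])

    older_wins = os.path.samefile(os.path.dirname(older_first), py36)
    newer_wins = os.path.samefile(os.path.dirname(newer_first), py37)

print("older directory listed first, older wins:", older_wins)
print("newer directory listed first, newer wins:", newer_wins)
```

Nothing in the search knows that one version is newer than the other; whichever directory happens to be listed first is the one that runs.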
23:44 Michael Kennedy: That's awesome.
23:44 Steve Dower: So this store package when it comes out is going to have python.exe on path, managed properly, so that you'll get that one, you'll get the right one. And then when you update the store package, you'll get the latest version because it goes in the same place and the newer version wins over the older version. But because it's so cheap and easy to do this, I thought hey, why not put python3.exe in there? Well, since we're here, why not put python3.7.exe in there as well? So now we have all of those on path with this package. So that python3 command is going to work. And then I thought well, this is still pretty easy, why not keep going? Let's put pip in there, let's put pip3 in there, let's put pip3.7 in there, let's put IDLE in there. And so it very easily got to a point where when you install this package, you can go to any command prompt and type IDLE and it will start running IDLE in that directory. When you end up with multiple ones, you can type IDLE3.7 and it will give you the 3.7 one, and eventually IDLE3.8 will give you the later version of it. It's just there.
23:44 Michael Kennedy: That is such good news, thank you. Thank you for doing that. That is really really cool, so excited to hear it.
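Until the `python3` and `pip3` names are available everywhere, scripts and tutorials can sidestep the naming difference by asking the running interpreter for its own location. This is a minimal sketch of that common idiom, not something from the episode; the helper name `run_with_current_python` and the package name `some-package` are placeholders.

```python
import subprocess
import sys


def run_with_current_python(*args):
    """Run the interpreter executing this script, whatever its command name is."""
    command = [sys.executable, *args]
    return subprocess.run(command, capture_output=True, text=True, check=True)


# Ask a child interpreter for its major version; no "python" vs "python3"
# guesswork is needed because sys.executable is a full path.
result = run_with_current_python("-c", "import sys; print(sys.version_info[0])")
major_version = result.stdout.strip()
print("child interpreter major version:", major_version)

# The same idea applies to pip: "-m pip" targets this interpreter
# instead of relying on a pip, pip3, or pip3.7 command being on PATH.
# The command is only built here, not executed.
pip_command = [sys.executable, "-m", "pip", "install", "some-package"]
print(" ".join(pip_command))
```

The same script then runs unchanged on Windows, macOS, and Linux, regardless of which command names happen to be on the path.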
23:44 Steve Dower: Today as we're recording, the first preview one is up and so there's more kind of private testing going on but my hope is that with the 3.7.2RC, when that is out which is probably a week or two ago, by the time you're listening to this, it'll be up there publicly so you can just go to the store and type Python and we should be showing up somewhere near the top of the list. And you can use that to install Python on your Windows box. And the cool thing is, that doesn't require admin but it installs in a way that only installs it once for the whole machine which is really nice. So if you have 100 users on a machine, it installs it once and then isolates just the differences per user. So it doesn't take up 100 times the disk space or anything crazy like that.
23:44 Michael Kennedy: That is so cool, because you see so many tutorials and stuff where people just put the POSIX variant up there, right? They type pip3 here or you do this and it's like well, that's not exactly what you need. So you're thinking that this Windows 10 store app of Python 3 is going to be the way to get Python going forward?
23:44 Steve Dower: I hope so. There's certainly going to be some cases where it won't be the best option and so the old installers aren't going to go away any time soon. But we'll just have to see how people use it and what people need. There's already cases where the existing installers aren't ideal, and this is why we have other distros out there. This is why WinPython exists, Anaconda exists because they offer things that Python.org can't offer. So this is another thing to be offered. And we can track usage, one of the really nice things is we can get all the crash reports so when Python is breaking badly, there's a dashboard for store apps that basically says, here's how reliable your app is over time. And so we'll start getting those reports for Python which will be really interesting.
23:44 Michael Kennedy: Yeah, for sure and you also were able to recompile Python with the later tools, like move seven years or five years into the future on the compilers as part of your project of taking over all this.
23:44 Steve Dower: Yeah, that was actually one of the first things I did was let's move off Visual Studio 2010 since it's now 2015 or whenever I started doing that. I've had a number of people thank me profusely for that, mostly other people who were trying to build Python itself.
23:44 Michael Kennedy: Yeah, because if you're on Windows and you're a developer, you probably have the latest version of Visual Studio or the Windows platform SDK or whatever. But none of those are likely like a seven or eight-year-old version of the compiler.
23:44 Steve Dower: It gets harder and harder over time to get the old compilers. Microsoft supports the latest and tries to build in the compatibility that makes it viable for people to move forward as soon as possible. Around the Visual Studio 2015 point is where we actually made some big dramatic changes to make that easier. And so we are at a point now where VS 2015 or 2017 or 2019, whose first preview was recently released, will all work, and they can all kind of work across each other, so that even Python 3.5, which was compiled with the old one, can use extensions built with the newest compiler, because I think we finally got the compatibility right. We went through about six or seven different ideas on how to make it right. And this time I think we've got it in a way that really just helps developers not have to pull all sorts of crazy tricks with their setup to make things work.
23:44 Michael Kennedy: That's also really good news. I think there's so many positive things around here. This Python on Windows is okay actually. I mean it did highlight some of the issues but also some of the great things. So maybe let's close out this conversation around the journey of Python at Microsoft, by talking about this Python summit at Microsoft. What's the story on this thing? I've only barely heard of it.
23:44 Steve Dower: There's rumors floating around and we've put up a blog post with a bit of information. Being at Microsoft gives me an opportunity to speak to and hear from people at companies that don't normally interact with the community. And so, through that we get to hear about problems and solutions that various people have tried, for fixing things in Python or just in their Python projects, and we started hearing some commonalities across a lot of different places. I'm talking like the Facebooks, Instagrams through to like various big banks who are all heavily invested in Python. We started hearing so much in common, but also that they weren't actually aware of everyone else. And we started feeling like hey, maybe we're in the privileged place where we can see all the overlap, and maybe it's on us to do something about it. So we basically put out the call to the people that we knew in these places. So this wasn't like a public open call for content, this was specific invites to companies we know are heavily invested in Python. We said, hey come, let's get in a room together and basically talk through what our struggles have been, what our efforts look like. It's like half conference, half counseling session almost. And do it in a place where we can have kind of an informal NDA going on, about, we're not broadcasting this on the internet, so if you want to actually talk about problems that you're having, you don't have to worry that it's going to be a news headline tomorrow. And people came. And we had a room filled with trillions of dollars worth of Python development, which is something that probably makes absolutely no sense to most of the community because we look around and see the same kind of Python developer all around us at most of the conferences. 
And then in this room, we had the Dropboxes, the Amazons, Facebook, Google show up along with the big banks that are actually making huge amounts of money using Python, to the point where if Python disappeared or if Python fell away, they would be losing like billions of dollars of income or of being able to move money around and investments and stuff. So it was just a great opportunity to hear directly from them, and basically find out that yeah, all of our problems are the same. And a lot of them are shared with the community. A lot of packaging concerns affect everyone, a lot of performance concerns, and the C API for CPython is a hot topic right now both in the community and in the corporate world. And everyone kind of has different ideas about what to do and how to fix it. And so, a lot of people said it was a great time. They were very excited that there was a chance to share some of this. As I said, it's a little bit of a private kind of thing, so we're not putting out reports on specific things from specific companies. But certainly there are those of us like Microsoft who are kind of that channel, right? So we're a little bit of gatekeepers I guess, which is not ideal. But at the same time, if the only way for the community to hear what the corporations are worried about is for us to kind of aggregate it and say, we spoke to corporations that won't publicly talk about this stuff, and you'll just have to trust that they said it's a problem to us and we're kind of repeating it anonymously... The interactions between the corporate world and the community of open-source are actually really, really complicated.
01:00:05 Michael Kennedy: Yeah, I'm sure they are.
01:00:06 Steve Dower: Most of what we've talked about today has been how complicated it can be. It's still so complicated, still so much uncertainty about how to interact. There's a lot of investment that these corporations want to make into community projects like Python, like pip, like NumPy, and really, there's a huge amount of uncertainty about how to actually do it. And there's a lot of reasons for that and probably most of them are solvable. But the only way to solve them is by having some amount of open dialogue, and we're trying to open that up and make it possible for companies to say, hey this is what we're worried about, these are our concerns, this is what we can offer, and try and link up with the community to make some of that actually happen. So apart from the corporations, we did have some community people in the room as well. I think we had seven Python core developers in the room, mostly employees of these companies, but they were there. We were looking around going, did that guy get up and give an entire talk about using Python and not say any dollar amount under a million dollars? The only number less than a million was the version number of Python he was talking about. Did that really happen? Does Python get used for that? Yeah, it does, but how you turn those millions and billions of dollars into community contributions is a really difficult problem.
01:01:33 Michael Kennedy: Yeah, it sounds like it is and I think, if you guys could unlock that somewhat, that would be so amazing. Because there's so many projects that are underfunded or run by one person in their spare time, that are probably critical to the work that these other folks are doing. If you could somehow get that support to flow the other way or get at least their concerns to help see Python better, that'd be great.
01:01:56 Steve Dower: There's been a few attempts to do this. There's a number of companies out there that are set up for kind of the purpose of redirecting corporate funding into open-source projects. Anaconda, originally Continuum is one of those. Travis Oliphant's new venture, Quantsight is trying to do the same thing. Tidelift is also following a similar thing.
01:02:16 Michael Kennedy: Yeah, Tidelift is really cool.
01:02:18 Steve Dower: It largely comes down to corporations will give out money for exactly one reason and it's to get a return, which means you either have to convince them that altruism is going to get them some good PR, or actually sell them a product. And Tidelift and Anaconda and Quansight are all selling the corporations a product and turning that into open-source funding, which is something that individual projects have really struggled to figure out.
01:02:43 Michael Kennedy: Yeah, absolutely like the Patreon model or the PayPal donate button.
01:02:52 Steve Dower: You can't convince a vice president to give money to Patreon. It's not how any of these big companies work.
01:03:00 Michael Kennedy: Yeah, just like, it doesn't make even any sense to how this works, yeah.
01:03:06 Steve Dower: And you have a scale problem as well. We actually faced this with trying to sponsor meetups, is finding the person that can approve a $1,000 sponsorship is actually kind of hard. 'Cause no individual engineer can approve that expense 'cause that's too big. But what if the only people you can find look at that and say, why are you bothering me with something so small? Come back when it's six figures or more. And it's actually this really hard spot to get anything done in a large company, when it's too big to go unnoticed but too small to deserve a decision-maker.
01:03:39 Michael Kennedy: Yeah, it's a weird no-man's land in the middle there. Indeed, well it sounds to me like this summit that you guys had, it's a real positive thing so yeah, keep that up and thanks for the report. Yeah, indeed. Alright, well I think as much as I have a million more questions for you, I think we have to call it an episode on that. So let me ask you the two final questions before you go and maybe we can touch on something else as well. So if you're going to write some Python code, what editor do you use these days? Probably not Visual Studio Tools for Python these days, right?
01:04:05 Steve Dower: That was my project for a long time so...
01:04:08 Michael Kennedy: Yeah? You still use it?
01:04:09 Steve Dower: I have a lot of sympathy. Right, I just still use it. I wouldn't call it my primary editor, like Visual Studio Code right now is probably my primary editor.
01:04:14 Michael Kennedy: That was going to be my guess.
01:04:20 Steve Dower: Yeah, well but the main reason for that is I got up on stage at a conference recently to do demos of it, and just failed miserably at a number of things because I had not spent enough time using it.
01:04:27 Michael Kennedy: It's a really cool editor, but it's one of those you kind of have to live in because there's a lot of... There's not lots of buttons and stuff. You got to kind of know how to interact with it, right?
01:04:37 Steve Dower: Yeah, and I know Visual Studio inside out so I can use that comfortably, but I don't feel like I'm learning anything. So I'm living in Visual Studio Code these days to learn and get a good feel for it. Rightly or wrongly people expect me to know how it works and it would be helpful for that to actually be true when they ask me questions about it. So that is where I'm living right now. But also the extension for that. So the Python extension, when we adopted that, gee, a year, two years ago. It was like a one-person part-time thing. It now has like a hundred million people working on it or something like that. I can't even keep track of the team. We have six people based in Vancouver, most of whom I've never actually met in person. It's moving along so quickly that... It's at that really exciting time where every month when the update comes in, there's new and exciting stuff and it feels totally fresh and it's just exciting to be using right now, so that's also fun.
01:05:38 Michael Kennedy: Yeah, it definitely seems like there's a ton of energy around Visual Studio Code. I suspect most people know but obviously, it runs on all the platforms, right? It's not just on Windows, even though it shares the Visual Studio name.
01:05:50 Steve Dower: Yeah, Windows, Mac, Linux, having .NET Core support for all of those platforms is... I feel like that's the limiting factor right now. Obviously it's Electron based so anywhere that can run, it'll work. But some big pieces, especially the Python support, are running on .NET Core, so that's about the limit. That's still a really big limit. Even got some old versions of Linux distros that we support because it works there where really most people would not dream of using them. But you can run VS Code there. It's a tool, it's got to run where people want to use it. We can't force you to upgrade your operating system just for the editor.
01:06:27 Michael Kennedy: Yeah, that's great. Alright, I definitely think there's a ton of energy around VS Code, and I think that whole Python extension editor side of the story deserves some sort of mention in Python's journey at Microsoft, even though I didn't get to it till the very end here. Then notable PyPI package?
01:06:44 Steve Dower: Yes, so one that I'm really excited about right now, the Azure Machine Learning team recently came out with their new product and the name is very similar to a lot of things we've had before. So it can be tricky to track through exactly what it is, but Azure Machine Learning Service is basically an entire service for being able to do your machine learning tasks. Everything from data cleaning to model management and deployment in a way that works really nicely, balanced between your local editor and running stuff and pushing stuff in the cloud. So it's got a whole lot of nice functionality for hey, run this job in a cluster that is this big, let me know when you're done. And run that training job and then store the model somewhere, you get good history of all the models that have ever been published. And that's one of the products that has come out with only support for Python. If you want to use that, you're using Python. There's actually no other options right now. So I'm excited about that. But one particular part of that package that's really cool is the Azure ML data prep package. So this is one part of it. But it's the part for, I have a file on disk or I have a set of files on disk that have some raw data in it and I need to pre-process. I need to clean, I need to extract data from certain columns and split them into more. I need to add more columns, I need to remove rows, I need to replace missing values. It's a library for doing that. It does a couple of really cool things. So if anyone's used Flash Fill in Excel, which is this cool feature. You can try, just put in a list of people's names, just first name, last name, however many names in one column and then start splitting them out manually. You put a few examples of like type the first name in the next column then the last name, and do that for two or three. And it will suggest splitting up every single row using those examples. And figure out how to do it.
That's been in Excel for a while, it's also in this package. So you can take a raw data file that has, maybe it has the date and time in the same column and you say okay, here's a couple of examples of how I want it split up. Put the date here and the time in a separate column and it will figure out how to do that. And it will use those examples and then you can run it, stream it over a huge huge file that won't fit in memory and it will do it to the whole thing and write out a new file that's been pre-processed.
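Editor's note: the split-by-example idea Steve describes can be illustrated with a toy sketch in plain Python. This is not the Azure ML data prep API, and it is far simpler than what Flash Fill does. The names learn_split, apply_split and CANDIDATE_DELIMITERS are hypothetical, and the only thing the sketch "learns" is a single delimiter that reproduces every example. The generator mirrors the streaming point by handling one row at a time instead of loading the whole file.

```python
from typing import Iterable, Iterator, Sequence, Tuple

# Hypothetical search space: the only "programs" this toy can learn
# are splits on one of these delimiters.
CANDIDATE_DELIMITERS = [" ", "T", ",", ";", "\t", "|"]

Example = Tuple[str, Sequence[str]]  # (raw value, desired columns)


def learn_split(examples: Sequence[Example]) -> str:
    """Return the first candidate delimiter that reproduces every example."""
    for delim in CANDIDATE_DELIMITERS:
        if all(tuple(raw.split(delim)) == tuple(expected)
               for raw, expected in examples):
            return delim
    raise ValueError("no candidate delimiter explains all examples")


def apply_split(rows: Iterable[str], delim: str) -> Iterator[Tuple[str, ...]]:
    """Lazily split each row, so the input never has to fit in memory."""
    for row in rows:
        yield tuple(row.split(delim))


# Two hand-written examples of how a combined date/time column should split.
examples = [
    ("2018-12-06 09:30:00", ("2018-12-06", "09:30:00")),
    ("2018-12-07 14:05:10", ("2018-12-07", "14:05:10")),
]
delim = learn_split(examples)

# Any iterable of strings works here, including an open file object.
for date_part, time_part in apply_split(["2019-01-01 00:00:01"], delim):
    print(date_part, time_part)
```

Passing an open file object as rows would stream a large file line by line, since Python file objects yield one line at a time when iterated.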
01:09:04 Michael Kennedy: Man, that's cool.
01:09:07 Steve Dower: Yeah, it's really really neat. I guess there'll be a link in the show notes to the info on that one. When we look at how far Microsoft has come in terms of Python usage, and this feels like a great ending to this show actually. This is a new point. We have some amazing technology here that is just Python. It's not even .NET first and then Python, it's not even .NET second right now. It's just, it's Python all the way and that's so exciting. I can't wait to see what comes next.
01:09:39 Michael Kennedy: Yeah, well it sounds like there's a ton of momentum going there and you've done a lot to get it going in the right direction, so that's wonderful. Alright, so people are excited, maybe they want to learn more about this history, some more of the details, things like that, what do they do? Final call to action.
01:09:53 Steve Dower: I'd say go read the Medium post, and that's got links scattered throughout to interesting things, and down at the end, there's links to all the current things. So once you've seen where we came from, where we've gotten to, then... I'm trying to make it easy to see what we have on offer today.
01:10:09 Michael Kennedy: Yeah, well, a lot of cool stuff and thanks for doing this historical journey story. It's great.
01:10:14 Steve Dower: Yeah, thanks.
01:10:15 Michael Kennedy: You bet, talk to you later. This has been another episode of Talk Python To Me. Our guest on this episode was Steve Dower, and it's been brought to you by Linode and us at TalkPython Training. Linode is your go-to hosting for whatever you're building with Python. Get four months free at talkpython.fm/linode. That's L-I-N-O-D-E. Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps course. Or if you're looking for something more advanced, check out our new Async course that digs into all the different types of Async programming you can do in Python. And of course, if you're interested in more than one of these, be sure to check out our everything bundle. It's like a subscription that never expires. Be sure to subscribe to the show. Open your favorite pod catcher and search for Python, we should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much for listening, I really appreciate it. Now get out there and write some Python code.