#95: Grumpy: Running Python on Go Transcript
00:00 Google runs millions of lines of Python code. The front-end servers that drive youtube.com and YouTube's APIs are primarily written in Python, and they serve millions of requests per second. On this episode, you'll meet Dylan Trotter, who is working to increase the performance and concurrency of these servers powering YouTube. He just launched Grumpy, a Python implementation based on Go, the highly concurrent language from Google. This is Talk Python To Me, recorded January 12, 2017.
00:37 I'm a developer in many senses of the word, because I make these applications, but I also use these verbs to make this music. I construct it line by line, just like when I'm coding another software design. In both cases, it's about design patterns. Anyone can get the job done; it's the execution that matters. I have many interests.
00:57 Welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython. This episode has been sponsored by Hired and a new sponsor, PyImageSearch.com. They're announcing a Kickstarter campaign called Deep Learning for Computer Vision with Python, launching on Kickstarter right now. Thank both of these companies for supporting the show by checking out what they have to offer during their segments. Dylan, welcome to Talk Python.
01:35 Thanks. Nice to be here.
01:36 Yeah, I'm really excited to talk about Grumpy, your Grumpy Python project. It's gonna be fun.
01:43 Yeah, it's been pretty exciting. It's been a couple of weeks since the release. So yeah, I'm excited to talk about it, too. Yeah, it's
01:48 definitely gotten a lot of attention in the open source world on GitHub. And we're gonna dig into a lot of details behind it. But let's start with you and your story. How did you get into programming and Python?
01:58 I started programming, I guess, when I was in high school. I took, like, an intro programming course and kind of got the bug, and I just took it from there. I was really into programming little games and stuff like that back then. I did not do CS in university; I actually did physics. But I continued to work on programming in my own time a lot. And during university, actually, I got a sort of summer gig at a software company, and that gave me a leg up when I graduated, which was pretty lucky. So I got a job at a visual effects company doing software there. There were a bunch of different things, like a lot of proprietary languages for the different packages, but Python sort of came out as a front runner in terms of integration with different visual effects packages. And so that's where I started to dig into Python, not so much on the effects side, but more on the pipeline and data management side of things. There's a lot of asset management going on in visual effects studios, and Python is great for that sort of stuff.
03:14 Yeah, that's really cool. I did a whole episode on Python in game development studios and movies and production and stuff. I was really surprised how much Python glues all the tooling together for those folks.
03:27 Yeah, it's really deep in there. In fact, when I was working in that area, that's when Python sort of started to come to the fore. Maya, which is a big modeling and animation package, built Python integration around that time. And Houdini is another one with similar use cases that integrated it. Actually, I think from pretty early on Houdini had Python integration. So yeah, it became the de facto visual effects integration language.
03:58 Okay. Yeah. Very cool. And I think that's only growing. It seems like there are a couple of areas where Python is sort of past critical mass, and it's kind of like a black hole now. It's just sucking everything into it. Totally. Yeah. And that's a good thing. So you don't do visual effects anymore, although you kind of work in the video world these days. I want you to tell everybody what you're up to. Yeah, sure. So
04:21 I'm at YouTube now. I started there about seven years ago. It was kind of a big shift for me. Visual effects was a fun environment, but it was always kind of a dream to work at Google. So I took a gig at YouTube, and I've worked on a number of different teams there. I've worked on user-facing features. I was on the channels team for a long time, working on YouTube channels and stuff around that, and eventually got more into the infrastructure side. And so now I'm working in what's called the legacy application infrastructure group. Our team specifically looks after the application server that serves youtube.com and the YouTube APIs and those sorts of things. Excellent. So when we're all watching various things on YouTube, be it cat videos or something educational,
05:16 we have you to thank for keeping the servers running.
05:20 Yeah, well, me and a lot of other people. Yep.
05:23 Yeah, sure. Well, thank you individually, then. Yeah, awesome. That sounds like a really fun place to work. Where's the center of the universe for YouTube? Is
05:33 that Mountain View or somewhere else? San Bruno, actually, is where the main YouTube campus is. There are a few different offices around the world, but the biggest geographical concentration is in San Bruno. There are a few buildings there that are YouTube. Okay,
05:53 yeah. Nice. That sounds like so much fun. So Python actually plays a really important role at YouTube these days. Let's talk about how it's used now. And then how that kind of came to be.
06:02 Sure. Yeah. So Python is what is running the main application server and a lot of the application code for the YouTube front end. That serves the website and the APIs that service your phone and those sorts of things. So it's sort of the gateway for most user traffic.
06:26 Right. And then maybe the Python code branches back into all sorts of Google services behind the scenes that are in a variety of technologies or something like that, right? Yeah, there are a lot of different technologies and servers involved in the whole thing. Okay. So YouTube wasn't initially a Google creation, right? It was created by some other folks.
06:44 It was founded in 2005, I think, by three guys, one of whom was still there when I joined in 2009: Chad Hurley. I think at the time he was the president or something. He left shortly after I joined. But yeah, they built it in 2005, and it gained a lot of traction really early on. I guess Google took an interest at some point in 2006 and ended up buying YouTube in November 2006.
07:18 Yeah, I'd say that was a great move for them, because it's such a central part of the internet. I feel like the idea of YouTube was something that a lot of people probably had. It was a thing that clearly should exist. But when you think of the infrastructure and the bandwidth costs, and just the actual act of creating such a huge video network, it seems prohibitive. But once it came into existence, I guess Google jumped on it. That's cool.
07:48 I actually remember thinking what a simple idea it was, and it seemed so crazy at the time. I think the acquisition cost was like 1.6 billion or something like that. I remember reading about that, and I was like, good Lord, how is something so simple worth so much? But now that number seems so quaint compared to recent stuff.
08:11 Yeah, yeah, of course. I mean, it's easy to go through the thinking of, you could have just built that yourself. Yeah. Right. I mean, Facebook bought Instagram for an insane amount of money, and that's like a team of 12 people, right? Whatever it was, like 19 billion or something. They could have easily paid 12 people to build another Instagram, but it's also got people's interest. It's got the users, it's got the momentum, and that's the thing I think people buy. Absolutely, that's what you're paying for. Yeah. But they didn't write it in Python at first. Did
08:43 they? No, the first implementation, I believe, was PHP. And I don't think that lasted very long. I think most of that was rewritten in Python pretty early on, well before I was there. Sure, sure.
08:55 And I suspect the way YouTube looks today, with the growth of cloud computing and all the different APIs and services, is probably super different from what you guys had back in 2006, right?
09:07 Yeah, I mean, the company has grown a lot. The use cases have grown a lot. It's kind of night and day. When I first joined, everyone was kind of on one floor of one building. And since then, it's distributed all over the world. So yeah, it's changed a lot. Sure.
09:28 Wow. Okay, so that brings us to today at YouTube, and this project. Was this something that you created, this project called Grumpy? Or where did this come from? Yeah,
09:41 I talked about some of the challenges we're having with running Python at scale in the blog post. Basically, there are a few different aspects that affect our ability to run that many Python servers. The CPython runtime is really great, it's highly optimized, and it does a lot of things really well, but our use case has never really been a focus for CPython as a project. So we thought maybe it makes sense to rethink how the runtime is built, with a focus on concurrency and running large server applications.
10:27 Yeah, and you guys are not the first people to have this idea of, well, maybe we could replace the CPython runtime interpreter with something else. There's Jython, there's IronPython, there's PyPy, there's Pyjion. So there's a lot of stuff happening there. But nobody's gone in the direction that you went in, right?
10:49 Yeah, it was interesting. I mean, in a lot of ways, it's kind of crazy. The thing about the Go runtime that Grumpy is based on is that it is designed for use cases very similar to what we are interested in. Go tends to be used for writing highly concurrent server applications with a lot of message passing and things within the application between threads, so it seemed like a good fit. And once I started to flesh things out and build out some of the core functionality, some of the pieces started to fall into place, and it started to look actually really compelling.
11:36 And you're like, hey, we could actually do this. We should stop for just a second. I don't think we've explicitly said your project is called Grumpy, which is a replacement for the CPython implementation with an entirely different Go implementation. Right? Yeah, that's right. Yeah. So, very interesting. Obviously, it makes sense for Google to be the ones experimenting with Go, right? Go comes from Google, doesn't it?
12:03 It does. Yep. It was developed, I think, originally by Rob Pike and, I'm going to mix it up, it's either... no, yeah, it's Ken Thompson, I believe. I guess, similar to the observations that we made about running Python server programs, they had made similar general observations about writing server applications and how the languages that existed didn't quite fit their use cases.
12:40 Yeah, Go is really quite... it's one of the newest languages out there that I would consider a mainstream language. It's not as mainstream as C, but it's definitely getting there, and it came out in 2012 in version one, sort of officially. So it's born within this world of multi-core, microservices, distributed cloud computing stuff, right? Yeah, yeah. Okay, so let's dig in a little bit: what is Grumpy? I can take my Python code, I can write, presumably, some web app or web service, and then I can run that on Grumpy. What does Grumpy do? How does it take my Python code and run
13:24 it? So Grumpy takes a little bit of a different tack than CPython. It's actually a transcompiler and runtime, whereas you can kind of think of CPython as a virtual machine: a bytecode interpreter and runtime. In that sense, it's kind of like a bundling of Cython and CPython, except that it's all in Go.
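To make the "bytecode interpreter" half of that contrast concrete, here is a small sketch using CPython's standard `dis` module to list the bytecode instructions the interpreter would execute for a function. The function `add` is only an illustration, and the sketch assumes a current Python 3 (Grumpy itself targets 2.7). Grumpy would instead emit Go source for the same function; that generated code is not shown because the conversation does not describe it.

```python
import dis


def add(a, b):
    # A tiny function whose compiled bytecode we want to inspect.
    return a + b


# dis.get_instructions yields one Instruction per bytecode operation.
instructions = list(dis.get_instructions(add))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    # The exact opcode names differ between CPython versions.
    print(ins.offset, ins.opname, ins.argrepr)
```

The printed listing is what CPython's virtual machine steps through at run time; a transcompiler skips this step and hands an ahead-of-time compiler ordinary source code instead.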
13:50 Right. So Cython takes a flavor of Python and then compiles it to C directly. And like C, Go is a statically typed, compiled language. So it's no longer interpreted. It's not even JIT-compiled like Java or .NET. It's full-on compiled, right?
14:13 That's correct. Okay. So on the runtime side of things, the correspondence is: there's the Python C API, and there's actually a Go Grumpy API. What it's compiling is code that uses that API to mutate objects, to pull out state, and those sorts of things. So whereas vanilla CPython uses a bytecode interpreter to actually drive those API calls, Grumpy and Cython are actually generating code that drives those API calls. Okay. Yeah, very, very cool. Now in your GitHub repo or the blog post, I
14:54 don't remember where I got this, but you said it's intended to be a near drop-in replacement for CPython 2.7. How's that going? How far are you towards that goal? That's a pretty big set of APIs to cover.
15:07 Yeah, I'm learning every day, like how big Python is.
15:12 Nobody told me about this weird case I'm gonna have to support. Oh,
15:15 yeah, totally. Yeah. I mean, the amount of spelunking I've done in CPython internals, I did not expect at all. But yeah, it's going pretty well. The core functionality is there. So the basic semantics of the language in terms of attribute access, how types work, and how method dispatch works, all of that functions basically fine. The basic types are all there, so lists and dictionaries and things all kind of work. Does it mostly map
15:49 directly to the underlying Go structures? Like, does a list in Python map to a slice in Go, and things like that? Or do you have to do more complicated things to map it?
16:00 It's more complicated. And the reason is that Python is so dynamic, right? Method dispatch is so dynamic, and with attribute access, you can put attributes on just about anything. If it was just the native Go types, then in theory you wouldn't be able to put an attribute on a list, or on a slice. Right? Right. So there are these sort of wrapper types, basically structures that actually map very closely to CPython's object structures. Okay.
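The wrapper idea can be sketched roughly as follows. This is not Grumpy's actual data layout, which the conversation does not spell out; `ObjectSketch` and its fields are hypothetical names chosen only to show why a bare native container is not enough once arbitrary attributes and a dynamic type have to travel with the value.

```python
class ObjectSketch(object):
    """Hypothetical wrapper: a native payload plus dynamic bookkeeping."""

    def __init__(self, type_name, payload):
        self.type_name = type_name  # stands in for a pointer to the type object
        self.attrs = {}             # stands in for the per-object attribute dict
        self.payload = payload      # stands in for the native value (e.g. a slice)

    def set_attr(self, name, value):
        # Attributes live beside the payload, not inside it.
        self.attrs[name] = value

    def get_attr(self, name):
        if name not in self.attrs:
            raise AttributeError(name)
        return self.attrs[name]


wrapped = ObjectSketch("list", [1, 2, 3])
wrapped.set_attr("origin", "user upload")
print(wrapped.get_attr("origin"), wrapped.payload)
```

The payload alone has nowhere to store `origin`; the surrounding structure is what lets a statically typed host language offer Python-style dynamism.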
16:33 Yeah, I can see that, because you're working with a non-dynamic language, and yet it has to support dynamic capabilities. So you've got to somehow put a shim in there for that. Right? That's right. Okay,
16:45 I guess the biggest gaps in terms of being a drop-in replacement are that the standard library still needs a lot of work. A lot of CPython's standard library is actually written as C extension modules, which Grumpy does not support. So that's one area of significant divergence between the two, and we can talk about that more. That's turned out to be sort of a big beast to slay. The nice thing is that with all those other Python runtimes out there, you can find pure Python versions of most things. PyPy, for example, implements a number of libraries in Python that aren't implemented in Python in
17:31 CPython, right. So you could start this transition, or this backfilling of APIs, by just moving to pure Python implementations that then get sent through Grumpy and actually get compiled to run on Go, right? Yep, that's exactly right. And maybe do some profiling and say, well, people use lists a lot, let's write that directly in Go, or something like this. Right? You can optimize later. Exactly. Yeah. Okay. I suspect that there's a long tail of stuff that doesn't really need to be optimized, that last 5%, whereas there are a few things that we really should focus on. Right.
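The "pure Python first, optimize later" approach might look like the sketch below. The choice of `bisect_left` is only an illustration (the conversation does not say which modules were handled this way), and `bisect_left_pure` and `bisect_left_fast` are hypothetical names. CPython's own `bisect` module follows a similar pattern: a pure Python definition that an optional C accelerator can override.

```python
def bisect_left_pure(sorted_items, target):
    """Pure Python binary search: leftmost index where target could be inserted."""
    lo, hi = 0, len(sorted_items)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


try:
    # On CPython this is normally backed by a C accelerator module.
    from bisect import bisect_left as bisect_left_fast
except ImportError:
    # Assumption for illustration: a runtime without the module falls back
    # to the portable version and optimizes it natively later if profiling
    # shows that it matters.
    bisect_left_fast = bisect_left_pure

print(bisect_left_pure([1, 3, 5, 7], 5), bisect_left_fast([1, 3, 5, 7], 5))
```

Callers only ever see one function, so a slow but correct implementation can be swapped for a faster one without touching the code that uses it.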
18:06 Yeah. So right now I'm kind of focused on getting support for the whole, like, I want to be able to run some common libraries that are written in Python. I want some Python programs that are out there, like open source programs, to be able to just use Grumpy. So just getting it to the point where everything runs is the first step, and then you make it fast.
18:30 Okay, yeah, of course. Making it work and then making it fast seems like the right order to me as well. So you said in your blog post that there are going to be some things that Grumpy will never support, and then there are things that it doesn't support yet, but you're working towards. Yeah,
18:45 so one of the things I mentioned already is the C extension support. The API for CPython is a bit different than the API for Grumpy because, well, for one thing, it's a different language. But also the data structures are a little bit different, and the function return values and things are a little bit different. So there wasn't a good mapping between those APIs, and it would be too constraining to try to make Grumpy map perfectly to the C API.
19:19 Sure. Have you looked at the cffi stuff that PyPy was using?
19:25 Right, so I have not looked very closely at that. That is something that we've looked at internally for other reasons as well. But that is an interesting way to approach the problem, and potentially there are ways to bridge the two APIs, and cffi may be one of those.
19:46 Yeah, okay. Go must have a C integration option somewhere, right? It does. Yep. Yeah. Okay. The other thing you said it's not going to support is things like eval. And again,
19:56 this is, like, it is possible to implement something that's a little bit hokey to support eval or exec.
20:04 shell out and compile.
20:06 Oh, yeah, exactly. I mean, it's funny, you think about it, and that's actually what Python is doing, right? Except that it's a bytecode compiler, and then it's executing in a VM. If you instead are doing an actual static compilation and then executing that, it's not conceptually that much different, except that the toolchain that you have to use to do the compilation step is much heavier. So it's going to be slower, and it kind of doesn't make a lot of sense. I think I could see maybe supporting it for debugging use cases and things like that. I kind of want to avoid having to worry too much about making that performant or whatever.
20:52 Yeah, sure. I, for one, don't think I would miss it. I think it's fine.
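The two steps described here, compile to bytecode and then execute in the VM, are visible from Python itself through the built-in `compile`, `eval`, and `exec` functions. The source strings and the `<example>` filename below are arbitrary illustrations.

```python
# Step 1: compile source text into a code object (bytecode plus metadata).
expression_code = compile("1 + 2 * 3", "<example>", "eval")

# Step 2: hand the code object to the interpreter to execute.
result = eval(expression_code)
print(result)  # 7

# The same two steps for statements, run inside an explicit namespace dict.
namespace = {}
statement_code = compile("x = 10\ny = x * 2", "<example>", "exec")
exec(statement_code, namespace)
print(namespace["y"])  # 20
```

In a statically compiled runtime, step 1 would mean invoking a full compiler toolchain at run time, which is the cost being weighed in this part of the conversation.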
20:56 Yeah. The other thing about exec and eval is, there are very few cases I've ever come across in all my years of programming Python where exec or eval was a good idea. So actually, I kind of think that it's an unnecessary aspect of the language. Yeah, that's interesting, and it's kind of in keeping with Go, in the sense that Go is very strict about conventions and some of the best practices that it believes in. For example, if you have an import of a package, and you're not using the package, that's a compilation error,
21:30 things like that, right? Absolutely. So skipping eval seems like that's all right.
21:47 This portion of Talk Python To Me is brought to you by Hired. Hired is the platform for top Python developer jobs. Create your profile and instantly get access to 3,500 companies who will work to compete with you. Take it from one of Hired's users who recently got a job and said, "I had my first offer on Thursday, after going live on Monday, and I ended up getting eight offers in total. I've worked with recruiters in the past, but they've always been pretty hit and miss. I tried LinkedIn, but I found Hired to be the best. I really liked knowing the salary up front, and privacy was also a huge seller for me." Sounds awesome, doesn't it? Well, wait until you hear about the signing bonus. Everyone who accepts a job from Hired gets a $1,000 signing bonus. And as Talk Python listeners, it gets way sweeter. Use the link hired.com/talkpythontome and Hired will double the signing bonus to $2,000. Opportunity's knocking. Visit hired.com/talkpythontome and answer the door.
22:45 Then you said there's a set of things that you're going to support, but don't yet. What are those?
22:50 We talked a little bit about some of this stuff, but the standard library is not there yet. There's a subset of the standard library that's available currently. Can you give us a percentage of how far you are down that path? I
23:04 mean, for everyone listening, this whole project has been on GitHub for three or four weeks. So it's not like you should have implemented it all. I'm just curious how far you've gotten.
23:14 It's really hard to put a percentage on, I guess. I mean, I probably could compare lines of code or something. But I think what's going to happen is you're going to get a core set of libraries that run all the other libraries, and everything will just kind of fall into place. So I think it's more important to count those core libraries, and that's things like types and collections and operator and all those things. Some of those are already there. I mean, it's hard to put a number on it. Yeah,
23:46 sure. Maybe it's one of those things where it seems like you're not very far, and then all of a sudden that kind of unlocks, and things go really quick.
23:55 That's the dream.
23:58 That's a good way to look at it. It's an optimistic view of the future.
24:02 That's right. If you're gonna clone the repo off GitHub and try things out, you may be disappointed that your favorite libraries aren't there. And there's a good chance that if you have a program that's at all complex, there are some libraries that are missing for you. I'd say, I don't know, maybe 20% or something like that.
24:23 Okay. Well, that's good. You said you also want to support all the built-ins. That's right. Yeah, that's obviously a good idea.
24:30 Yeah, those are important. And again, there's a bunch of functions like map and reduce and things like that that I haven't got around to, because I haven't needed to support them yet. But they're actually pretty straightforward to implement, by and large. So I think we're pretty far along on that stuff.
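As a rough indication of why these built-ins are straightforward, here are simplified pure Python versions. `simple_map` and `simple_reduce` are hypothetical names and deliberately cover only the common single-iterable case; Python 2.7's real `map` also accepts several iterables and `None` as the function, which this sketch leaves out.

```python
_MISSING = object()  # sentinel so that None can be a valid initial value


def simple_map(func, iterable):
    # Python 2.7's map returns a list, so this does too.
    return [func(item) for item in iterable]


def simple_reduce(func, iterable, initial=_MISSING):
    iterator = iter(iterable)
    if initial is _MISSING:
        try:
            accumulator = next(iterator)
        except StopIteration:
            raise TypeError("simple_reduce() of empty sequence with no initial value")
    else:
        accumulator = initial
    for item in iterator:
        accumulator = func(accumulator, item)
    return accumulator


print(simple_map(lambda x: x * 2, [1, 2, 3]))
print(simple_reduce(lambda a, b: a + b, [1, 2, 3, 4]))
```

Both functions need nothing beyond iteration and function calls, which fits the remark that they are easy to add once the core language semantics work.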
24:54 So how much of your focus on Grumpy is going to be to make this a project that you guys could use for your specific use cases at Google, and then make that a skeleton or base, so people can come along and add other features and contribute to the open source project to make it more broad, versus how much are you trying to make this, like, we're trying to entirely replace CPython?
25:19 So I think, okay, I'll put it this way. I'm interested in solving some of these concurrent use cases that don't have a great answer in CPython. That's the primary focus. But I guess my optimism is showing again: I feel like once you have some of those use cases locked down, people start to use it for things you didn't expect. I know that scientific computing is an area where Python has really well established libraries, and NumPy is crucial to some of this stuff, and that involves C extensions. In the near term, I don't see Grumpy being useful for numerical analysis, and that's compounded by the fact that Go doesn't have too many inroads in that direction either. But on the other hand, some of the advantages of being statically compiled, type inferencing, and compiling down to native operations are potentially useful for scientific computing and those sorts of things. So I want to focus on our immediate use cases, but I have this idea that there are more opportunities out there once things are working.
26:48 Right. Okay. Yeah, that seems like a good roadmap to me. Makes a lot of sense. So let's talk about the execution engine, which effectively is the execution engine of Go versus CPython. In CPython, the Python code gets converted to bytecode, and those bytecode instructions are sent to, like, a super large for loop with a switch sort of thing, and those are interpreted and run. How does Go work?
27:17 Go has a runtime, which is to say that there's code running that's managing things like goroutines, which are the equivalent of threads in Go programs, and garbage collection, and things like that. But much of what is actually happening in a Go program is just low-level machine instructions. So a Go program, much like a C program, is compiled down to machine code and actually executed natively.
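The interpreter model from the question, one big loop that dispatches on each instruction, can be sketched as a toy stack machine. The instruction names `PUSH`, `ADD`, and `MUL` are made up for illustration and are not CPython's real opcodes.

```python
def run(program):
    """Interpret a list of (opcode, argument) pairs on a value stack."""
    stack = []
    for opcode, argument in program:  # the "super large for loop"
        if opcode == "PUSH":          # the "switch" on the instruction
            stack.append(argument)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown instruction: %r" % (opcode,))
    return stack.pop()


# (2 + 3) * 4 expressed as stack machine instructions.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run(program))  # 20
```

Every instruction pays for the loop iteration and the dispatch. A natively compiled program has no such loop; the processor executes the arithmetic directly.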
27:48 Right. That makes sense. So you say Go has garbage collection, which is awesome. Do you know what kind it is? Is it reference counting? Or is it mark and sweep? Is it deterministic? How does the garbage collector work in Go?
28:05 This is not my area of expertise. Nor mine. But it is not reference counted. And actually this has changed significantly. I believe at 1.7 they significantly retrofitted the garbage collector, mostly around the way that garbage for particular goroutines is managed, garbage that is sort of local to particular goroutines. But otherwise it's a pretty traditional garbage collector, similar to what Java has. It's actually much simpler, though. Java has a number of different algorithms that it supports and a lot of tuning parameters. Go's garbage collection is much simpler and is targeted at the use case of handling requests in a server application and those sorts of things.
29:03 Yeah, that makes sense. I suspect it's highly parallelized in Go as well. One thing you said that's nice about executing ultimately on Go is that the deployment story is a little bit simpler.
29:15 You know, when you deploy a Python program, you are actually including your .py files, or at least your .pyc files, in the deployment. So you have to have some way to package them together and ship them off to production or wherever you're running your program.
29:35 Right. And beyond that, also the dependencies and the runtime, right? So you've got to have all of those things. That's right. Which can make it really tricky. And there are things like py2app, py2exe, cx_Freeze, the BeeWare project. There are a lot of projects trying to make that something you can ship around, but it's not simple.
29:53 Yeah, that's right. I'm sure people who have run Python in production have run into version mismatches, things like using the system Python version, which was different than the one they were developing on, and so on. The nice thing about statically compiled programs in general is that you produce a binary, and you can put that just about anywhere and it will run. And that's very true for Go programs. There are few dependencies; in most cases, most of the runtime is actually linked into the executable.
30:30 Yeah, that's really cool. What's the size of a Hello, World compiled output? Do you know?
30:35 I have not looked at the size myself. I think I saw some comment somewhere that said it was something like three megabytes. So it's pretty substantial. But that includes a lot of overhead for the runtime that wouldn't increase significantly if your program grew, right? Absolutely.
30:54 Like, the next 10,000 lines add 10K or something. Right, exactly.
30:58 Yeah,
30:59 I think three megs is totally fine to get a good deployment story: stability, you run what you shipped, all those things. If this was 1994, three megs would be a problem. But it's not today, right? Yeah, that's really nice. So what sort of optimizations do you think are possible if you run Python on Go, rather than as an interpreted
31:22 system? This is not an area I've dug into significantly yet. My thinking is that if you can determine that, for example, a particular integer counter in a function is only ever an integer type, and it only uses integer operations, like increment or whatever, then there's no need to go through the whole Python method dispatch and create new integer objects every time you increment that counter. You can actually just use a native integer and increment using native operations. That's a really simple example, but not uncommon. I think once you broaden that to a whole-program optimization, that's when things start to get interesting, because then you can think, well, if you know that a function is only ever called with parameters of a particular type, then you can make some assumptions and, again, maybe use native data types.
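A deliberately naive version of that check can be written with the standard `ast` module: decide whether every assignment to a given variable uses an integer literal combined through integer-preserving operators. This is only a sketch of the idea, not how Grumpy does or would do inference. `looks_int_only` is a hypothetical helper, it assumes Python 3.8 or newer (for `ast.Constant`), and it ignores parameters, loop targets, and unpacking, all of which a real analysis would have to handle.

```python
import ast

# Augmented operators that keep an int an int when the operand is an int.
_INT_PRESERVING_OPS = (ast.Add, ast.Sub, ast.Mult)


def looks_int_only(source, name):
    """True if every (augmented) assignment to `name` uses an int literal."""
    seen = False
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            hits = any(isinstance(t, ast.Name) and t.id == name for t in node.targets)
        elif isinstance(node, ast.AugAssign):
            hits = isinstance(node.target, ast.Name) and node.target.id == name
            if hits and not isinstance(node.op, _INT_PRESERVING_OPS):
                return False
        else:
            continue
        if not hits:
            continue
        seen = True
        value = node.value
        # type(...) is int deliberately excludes bool literals.
        if not (isinstance(value, ast.Constant) and type(value.value) is int):
            return False
    return seen


COUNTER_SOURCE = """
def count_items(items):
    count = 0
    for _ in items:
        count += 1
    return count
"""

MIXED_SOURCE = """
def describe():
    count = 0
    count = 'many'
    return count
"""

print(looks_int_only(COUNTER_SOURCE, "count"))
print(looks_int_only(MIXED_SOURCE, "count"))
```

When a check like this succeeds, a compiler is free to keep the counter in a native machine integer instead of allocating a new integer object on every increment.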
32:25 Sure. What about type annotations? I know that's more of a Python three thing, but would you be able to or interested in having some flavor that takes type annotations, and then uses that for certain types of optimizations?
32:39 Yeah, I thought about this, and I'm a little ambivalent, because with type annotations, the way they are used today, they're not intended to actually raise or anything if they're not respected. It's mostly for analysis before you ship your program, to make the linter's job easier and things like that. Right. And so in CPython, once you're actually running your program, the type annotations basically have no effect. So I'm a little hesitant to say that Grumpy should use these in a more strict way, because I think that might affect programs' compatibility and stuff like that. Yeah, absolutely. Yeah. There are some real advantages there. If you do make them strict, and you say that an argument is an integer, then yeah, it makes the optimizer's job way easier, because it doesn't have to do any inferencing to determine that
33:42 relationship. Obviously, it would break the sort of contract with type annotations, that these are just for editors and linters and to help you, but not actually meant to affect the runtime. On the other hand, if you could make some part of the code that's really critical go 10 or 100 times faster by putting in a type annotation that's strict, you might be willing to make that trade-off. So I don't know which way would be the right way to go either. But it's interesting to think about.
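The point that annotations have no effect at run time in CPython is easy to check. The function `double` below is just an illustration, and the snippet needs Python 3, since annotations are not part of 2.7.

```python
def double(value: int) -> int:
    # The annotations document intent; CPython does not enforce them.
    return value * 2


print(double(21))               # used as annotated
print(double("ab"))             # a str is accepted despite the int annotation
print(double.__annotations__)   # the annotations are only stored as metadata
```

A runtime that treated `value: int` as a strict guarantee could compile `double` down to native integer arithmetic, but the second call would then stop working, which is exactly the compatibility trade-off being discussed.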
34:11 Yeah, I'm very curious how that sort of evolves.
34:15 Yeah, yeah. Yeah, I'm gonna keep an eye on it. That's cool. So let's talk about when you launched, so this should be pretty fresh in your mind, right?
34:22 Yeah. This is not a very old project. So about a week and a day. Yeah. Yeah. So, well, I guess we migrated the code to GitHub in mid-December. And I spent some time over the next few weeks cleaning up the code and adding some functionality for the build system, since we were not able to use, obviously, the internal Google build system in the open source project. So I had to build some of that out. And then I guess January 4, yeah, I guess it was the fourth,
35:00 that's eight days ago, as of the day of the recording. Yeah,
35:04 yeah, we sort of coordinated an open source blog post with actually making the GitHub repo public, and got a little bit of traction on Hacker News. And yeah, it was kind of astonishing how great the reception was. Yeah, it's
35:21 going like crazy. When I took notes for this conversation, like four or five days ago, I said there were 5,000 stars on GitHub. Now, maybe that was three days ago. And now there's 6,000, almost 6,300, and 17 contributors. That's a pretty serious uptake for a project that's been out for eight days.
35:40 Yeah, I think the thing that kind of blew me away most was the number of pull requests that I got. I mean, right on day one, people were digging into the code. And, you know, there are tricky parts to the code, and it's not necessarily obvious how you ought to write certain features. And people really dug in and started filling out some of the functionality that's missing, and started talking about, well, how are we going to support third-party Python libraries out of the box, and stuff like that. So it's been great. I've had a really good time working with some of these people that have been contributing.
36:20 Yeah, I would say that's really cool. And you talked about the code a little bit. Looking on GitHub, GitHub thinks it's 77% Go code, 22% Python code, and a bit of a makefile. Yeah,
36:32 about right. Yeah, that's about right. And a lot of that Python code is actually just tests and benchmarks and things. So most of it is Go. And actually, I guess the standard libraries, most of which are copied from other places like CPython, are a pretty substantial amount of Python, but I don't think too much about that code, since we don't have to read it or maintain it.
36:55 Yeah, absolutely. So how do you ensure compatibility in this? Like, are you running the standard CPython tests?
37:01 That's something that we're working toward. So that's sort of milestone number one. I haven't published a roadmap document yet, but getting to the point where we can run the unittest library is going to be a huge milestone, because it means we can then run the unit tests that are written for CPython. That would
37:19 be a huge milestone. Yeah, compatibility.
37:22 Exactly. Before we get there, we've been writing small tests that, you know, demonstrate compatibility concerns and stuff like that, and then running those in both Python and Grumpy. Okay.
37:37 Very cool. So let's talk about why you chose Go, because there are three officially blessed languages at Google: there's Python, there's Go, and Java. Is that, and C++, the story these days? Yeah, right. Of course, and C++, so four. So why did you choose Go? Like, you could have tried Jython or something, right?
37:55 Jython is something that we've looked into. Jython is a really great, mature product. It's our experience that it's better to start a project on Jython than to migrate to Jython. There are a number of compatibility issues, not so much the kind of compatibility issue like, oh, on CPython this function returns a different type or something like that, but more that there are certain constraints of running in the JVM that make certain programs not work very well, or performance issues that crop up, those sorts of things. It sounds like running on the JVM was not as good a concurrency story for servers as running on Go might be, because Go is more focused on concurrency from the beginning and things like that. That might be more important to you guys. Yeah, I think that was part of it. I mean, lightweight goroutines are definitely a big advantage of Go. Java has native threads, which have large stacks, and so it has different performance characteristics for concurrent workloads. And so you have to write parallel programs in a slightly different way for Java. But also, for real-time server applications, the JIT can actually be a liability. It becomes difficult to reproduce certain kinds of issues and debug certain kinds of problems, because consistency of how requests are handled is really important in these kinds of applications. And the JIT can make identical requests behave very differently depending on where in the lifecycle the program is. Sure. Yeah, that makes a lot of sense.
39:51 Being statically typed, you get a little more predictability. Absolutely. Well, sorry, not statically typed: compiled to machine instructions, rather than JITed. Yeah.
40:00 Yeah, that's right. Yep.
40:01 Hey everyone. Let me take just a moment and tell you about a new sponsor with a cool and timely offer. This portion of Talk Python To Me is brought to you by Deep Learning for Computer Vision with Python, a new book from PyImageSearch.com, launching on Kickstarter right now. Have you ever wondered how Facebook can not only detect your face in an image, but also recognize and tag you as well? It's not magic. Facebook uses specialized machine learning algorithms called deep learning, and PyImageSearch wants to pull back the curtain and show you how these algorithms work. Their new book is designed from the ground up to help you reach expert status, even if you've never worked with machine learning or neural networks before. Inside Deep Learning for Computer Vision with Python, you'll find super practical walkthroughs and hands-on tutorials with lots of code, in a no-fluff teaching style that is guaranteed to cut through all the cruft and help you master deep learning for visual recognition. To learn more about this book and back the Kickstarter campaign, just head to pyimagesearch.com/kickstarter. So how do you run apps on Grumpy? Like, if I have Python code and I want to make it go, how do
41:07 I make it go on Grumpy? This is sort of a hot topic right now in the issue tracker on GitHub, because the build system that I have is strictly focused on getting the internal libraries working. And so it doesn't have good support for building a program that's outside that directory structure, or using libraries that are in your Python path, or anything like that. And so we're debating how exactly it should be supported. So right now, if you want to run a program or compile a library, you have to kind of drop it into that directory structure, and the make system will pick up on it and try to compile it into Go. But ideally, you'd have some kind of Python path style construct where it can find Python code and build it in a standard way. And that's something that we're working towards.
42:00 Okay, cool. Now, if people want to contribute to Grumpy, there are like three major areas that make it up. Do you want to talk about the three areas, so they can maybe use it as a roadmap?
42:11 You can kind of think of it as three parts. There's the transcompiler, which is the tool called grumpc. It's actually written in Python, and it uses the ast module, so it's kind of cheating. Another milestone will be when grumpc can compile grumpc. That takes the Python code and spits out some Go code. The second part is the Grumpy runtime, which is kind of the parallel of the C API. The transcompiled Go code depends on that runtime, so it imports the runtime and uses the constructs and functions and things in the runtime. That's another component, which is written strictly in Go, and that's where all the data structures and things are defined. And finally, there's the standard library, which is mostly written in, or actually exclusively written in, Python, but also uses some of the Grumpy native extensions to interface directly with Go packages and functions and things. So those are the three areas, and there's a lot of work to do in all of them. I'd say the standard library is the biggest chunk of work to do at this point.
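Since the transcompiler is described as building on Python's ast module, here is a minimal sketch of what that module provides. It is not grumpc's code; the `SOURCE` string and the variable names are illustrative assumptions, and it only shows the parsing step that any such tool would start from.

```python
import ast

SOURCE = """
def add(a, b):
    return a + b

total = add(1, 2)
"""

# Parse the source text into an abstract syntax tree.
tree = ast.parse(SOURCE)

# Look at the top-level statements, the way a code generator would
# before deciding what to emit for each one.
top_level = [type(node).__name__ for node in tree.body]

# Collect the names of every function defined anywhere in the module.
function_names = [
    node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
]

if __name__ == "__main__":
    print(top_level)
    print(function_names)
```

A transcompiler would visit each of these nodes and emit target-language code for it; that translation step is where the real work lies and is not sketched here.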
43:29 Presumably, you guys chose Go because of the concurrency story, right? And if you have Python code running on Go, you want to leverage that concurrency. Do you have to use a different API? This is Python 2.7, so you don't have things like async/await. How do I interact with the concurrency model of Go?
43:51 Currently, the way that goroutines are made available is through the standard Python threading library. You create a thread and start it, and that actually starts a goroutine instead of a native thread. That will work pretty seamlessly with existing code. I don't foresee huge problems there in terms of the differences between those kinds of threads. And again, Go has the concept of channels, which are sort of a message-passing mechanism, whereas in Python you have the queue data structure. This isn't actually implemented yet, but I plan to implement the queue using channels. And so you should be able to just write concurrent Python code like you always have. But I think to really take advantage of the concurrency model, eventually I'd like to implement the async and await Python constructs. I think that would be a huge win.
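The kind of "concurrent Python code like you always have" referred to here might look like the following sketch: worker threads that communicate through queues. It is ordinary standard-library code written for Python 3 (in Python 2.7 the module is named `Queue`); the names `worker` and `run` are illustrative, and nothing in it is specific to Grumpy. Per the discussion, each thread would be backed by a goroutine under Grumpy, while a channel-backed queue was only planned at the time.

```python
import queue
import threading


def worker(tasks, results):
    """Square each number received until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put(item * item)


def run(numbers, worker_count=2):
    """Fan the numbers out to worker threads and collect the squares."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so each one exits
    for t in threads:
        t.join()
    # All workers have finished, so draining the queue is race-free.
    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)


if __name__ == "__main__":
    print(run([1, 2, 3, 4]))
```

The workers share no mutable state and only pass messages, which is the style that maps most naturally onto channels.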
44:50 Yeah, that would be a huge win. And it seems to me like using the threading API is much more coarse-grained concurrency than Go is really built for. And while it would work, it's not taking full advantage.
45:06 The idea with Go is that starting a goroutine is extremely lightweight, and passing messages back and forth is the way to share state, rather than sharing memory or sharing objects. So I think that programs written with heavyweight threads in mind aren't necessarily going to be the best possible way to express that functionality. And so, long term, I could see, because you can access native Go constructs, that you will be able to use Go channels directly in a Grumpy program. That has upsides and downsides, as far as diverging from the Python language and those sorts of things, but
45:55 yeah, but it's not unlike IronPython or Jython or those things, right, where you can reach down into the underlying JVM or CLR or something like that.
46:05 That's right. Yep, absolutely. Say,
46:08 so if you're going towards async and await, what's the story on Python 3, since I feel like the threading concurrency story is a lot better in Python 3?
46:17 Yeah, I'd love to support Python 3. The long-term goal is definitely to support it. The reason for 2.7 is that YouTube had a large existing Python code base, and that was 2.7. So that was the main reason for choosing 2.7 out of the gate. But certainly, long term, I'd like to see full Python 3 support.
46:39 Right. Now, that'd be fantastic. I'd like to see that as well. And it certainly makes sense. If you're working on the YouTube team, YouTube has a tremendously large and widely adopted deployment of Python 2.7, so you want to work where you can have the biggest impact locally, right? Absolutely. Yeah. So reading the tea leaves, does this mean that Grumpy might someday run YouTube?
47:02 I want to hedge a little bit on that. I think there's sort of a long road ahead before Grumpy's ready to handle the kinds of large applications that we run on YouTube. So I wouldn't want to speculate about the long-term outcomes. Sure. Yeah, of course.
47:20 You know, let's just imagine a world where it did. The first few weeks after it switched to Grumpy would probably be a little bit nerve-wracking, right? Yeah. Definitely. If YouTube goes down and it's your fault, that's gonna be a problem. Yeah, exactly.
47:37 I don't want to be that guy.
47:39 Exactly, exactly. Here's the four pages we're giving, you know, just kidding. But if someday that came to be, that would be a really cool outcome of this project. Yeah, absolutely. That's sort of the dream. Excellent. Okay, so maybe that's a good place to leave it. Let me ask you just a couple of questions before we let you out of here. If you're going to write some code, what editor do you use? Vim. Vim. All right. Very cool. And there are over 96,000 packages on PyPI these days. And I'm sure you've come across some that are kind of unique, where you're like, hey, have you heard about this package? It's pretty cool. You should check it out. You got any coming to mind? You know, it's
48:14 funny. I mean, because I do most of my development inside Google, we kind of have a different set of tools we tend to use. I don't have a ton of experience with a lot of PyPI packages. Yeah.
48:31 So it's a little bit more dark matter. We out here in the larger universe don't get to see a lot of the coolest stuff you guys get to use, I'm sure. It's pretty neat, though. Absolutely. All right. Awesome. So how about a final call to action? How can people get started with Grumpy? What can they do if this resonates with them? Things like that. Yeah,
48:48 I mean, we're super interested in seeing where the project goes. Like I said, I would like to see where Grumpy can be useful besides just large concurrent server applications. Community feedback around that is great. People have been filing issues asking about support for different things, and that's been really illuminating, seeing where people think this might be useful. So that's huge. If you have the time and inclination, try it out: just clone the repo, type make run, try out Python on Go, and report any issues. That's really useful to us. And obviously, there's a ton of work to do. We talked about some of the different things, and contributions via pull requests on GitHub are really appreciated. It's been kind of amazing how much effort people have put in already. So that's been really exciting for us.
49:45 Yeah, it's, it's a cool project. And I think, if we have yet another powerful, flexible runtime that has some different trade offs that we can make for Python. That's great for everyone. So congratulations on your project and thanks for sharing with everyone. Yeah, thanks
50:00 very much, Michael.
50:01 You bet. Talk to you later. This has been another episode of Talk Python To Me. Today's guest has been Dylan Trotter, and this episode has been sponsored by Hired and PyImageSearch. Thank you both for supporting the show. Hired wants to help you find your next big thing. Visit hired.com/talkpythontome to get five or more offers with salary and equity presented right up front and a special listener signing bonus of $2,000. Struggling to get started with neural networks, deep learning, and image recognition? PyImageSearch.com can help with that. To learn more about their new book, Deep Learning for Visual Recognition with Python, and back the Kickstarter campaign, just head to pyimagesearch.com/kickstarter. Are you or a colleague trying to learn Python? Have you tried books and videos that just left you bored by covering topics point by point? Well, check out my online course, Python Jumpstart by Building 10 Apps, at talkpython.fm/course to experience a more engaging way to learn Python. And if you're looking for something a little more advanced, try my Write Pythonic Code course at talkpython.fm/pythonic. And be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. Our theme music is Developers Developers Developers by Cory Smith, who goes by Smixx. Cory just recently started selling his tracks on iTunes, so I recommend you check it out at talkpython.fm/music. You can browse the tracks he has for sale on iTunes and listen to the full-length version of the theme song. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Smixx, let's get out of here.
51:47 standing with my boys having been sleeping. I've been using lots of rest got the mic back