
#95: Grumpy: Running Python on Go Transcript

Recorded on Thursday, Jan 12, 2017.

00:00 Google runs millions of lines of Python code.

00:02 The front-end servers that drive YouTube.com and YouTube's API are primarily written in Python,

00:09 and they serve millions of requests per second.

00:13 On this episode, you'll meet Dylan Trotter, who is working to increase the performance and concurrency

00:19 of these servers powering YouTube.

00:20 He just launched Grumpy, a Python implementation based on Go, the highly concurrent language from Google.

00:28 This is Talk Python to Me, recorded January 12, 2017.

00:56 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities.

01:04 This is your host, Michael Kennedy.

01:06 Follow me on Twitter, where I'm @mkennedy.

01:08 Keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython.

01:15 This episode has been sponsored by Hired and, as a new sponsor, pyimagesearch.com.

01:21 They're announcing a Kickstarter campaign called Deep Learning for Computer Vision with Python,

01:26 launching on Kickstarter right now.

01:28 Thank both of these companies for supporting the show by checking out what they have to offer

01:32 during their segments.

01:33 Dylan, welcome to Talk Python.

01:35 Thanks.

01:36 Nice to be here.

01:37 Yeah, I'm really excited to talk about Grumpy, actually.

01:41 Grumpy, your Python project.

01:42 It's going to be fun.

01:43 Yeah, it's been pretty exciting a couple weeks since the release.

01:47 So, yeah, I'm excited to talk about it, too.

01:48 Yeah, it's definitely gotten a lot of attention in the open source world on GitHub.

01:52 And we're going to dig into a lot of the details behind it.

01:55 But let's start with you and your story.

01:56 How did you get into programming in Python?

01:58 I started programming, I guess, when I was in high school.

02:01 I took, like, an intro programming course and kind of got the bug.

02:07 And I just kind of took it from there.

02:10 I was really into, like, programming little games and stuff like that back then.

02:13 I did not do CS in university.

02:17 I actually did physics, but I continued to work on programming in my own time a lot.

02:23 And during university, actually, I got a sort of summer gig at a software company.

02:30 And that gave me a leg up when I graduated, which was pretty lucky.

02:35 And so I got a job at a visual effects company doing software there.

02:42 So there was a bunch of different things, like a lot of sort of proprietary languages for the different packages.

02:48 But Python sort of came out as a front runner in terms of integration with different visual effects packages and stuff like that.

02:56 And so that's where I started to dig into Python, especially not so much on the sort of, like, effect side,

03:03 but more on the pipeline data management kind of side of things.

03:07 So there's a lot of asset management and stuff going on in visual effects studios.

03:11 And Python's great for that sort of stuff.

03:14 Yeah, that's really cool.

03:15 I did a whole episode on Python and, like, game development studios and movies and production and stuff.

03:22 I was really surprised how much Python glues all the tooling together for those folks.

03:27 Yeah, it's really deep in there.

03:28 In fact, when I was working in that area, that's when Python sort of started to come to the fore.

03:35 And so, like, Maya, which is a big, like, modeling and animation package, built Python integration around that time.

03:44 And Houdini is another one, similar use cases that integrate.

03:48 Or actually, I think from pretty early on, Houdini had Python integration.

03:53 So, yeah, it sort of became the de facto visual effects integration language.

03:58 Okay.

03:58 Yeah, yeah, very cool.

04:00 And I think that's only growing.

04:01 It seems like there's a couple areas where Python is sort of past critical mass.

04:06 It's kind of like a black hole now.

04:09 It's just sucking everything into it.

04:10 Totally.

04:10 Yeah, yeah.

04:11 And that's a good thing.

04:12 So you don't do visual effects anymore, although you kind of work in the video world these days.

04:19 Why don't you tell everybody what you're up to?

04:20 Yeah, sure.

04:21 So I'm at YouTube now.

04:23 I started there about seven years ago.

04:25 It was kind of a big shift for me.

04:28 Visual effects was a fun environment, but it was always kind of a dream to work at Google and stuff.

04:35 So I took a gig at YouTube.

04:36 And I've worked on a number of different teams there, actually.

04:40 I've worked on sort of user-facing features.

04:43 I was on the channels team for a long time, working on YouTube channels and stuff around that.

04:50 And eventually got more into the infrastructure side.

04:54 And so now I'm working on what's called, I guess, the application infrastructure group.

05:01 And our team specifically looks after the application server that serves YouTube.com and YouTube APIs and those sorts of things.

05:10 Excellent.

05:10 So when we're all watching various things on YouTube, be it cat videos or something educational, we have you to thank for keeping those servers running.

05:19 Yeah, well, me and a lot of other people.

05:22 Yep.

05:22 Yeah, yeah, I'm sure.

05:24 Well, I'll thank you individually.

05:26 So, yeah, awesome.

05:27 Yeah, it sounds like a really fun place to work.

05:30 Where's YouTube?

05:31 Where's the center of the universe for YouTube?

05:33 Is that Mountain View or somewhere else?

05:34 San Bruno actually is where the main YouTube campus is.

05:38 So there's a few different offices around the world.

05:42 But the biggest sort of geographical concentration is in San Bruno.

05:49 So there's a few buildings there that are YouTube.

05:52 Okay.

05:52 Yeah.

05:53 Nice.

05:53 That sounds like so much fun.

05:55 So Python actually plays a really important role at YouTube these days.

05:59 Let's talk about how it's used now and then how that kind of came to be.

06:02 Sure.

06:02 Yeah.

06:03 So Python is what is running the main application server and a lot of the application code for

06:10 the YouTube front end that serves, like, the website and the APIs that,

06:18 you know, service your phone and those sorts of things.

06:21 So it's sort of like the gateway for most user traffic.

06:26 Right.

06:27 And then maybe the Python code branches back into all sorts of Google services behind the

06:31 scenes that are in a variety of technologies or something like that.

06:34 Right.

06:34 Yeah.

06:34 There's a, there's a lot of different technologies and servers involved in the whole thing.

06:38 Yep.

06:39 Okay.

06:39 So YouTube wasn't initially a Google creation, right?

06:43 It was created by some other folks.

06:44 It was founded in, in 2005, I think by three guys.

06:50 One of them, who was there when I joined in 2009, is Chad Hurley.

06:55 I think at the time he was the president or something.

06:59 He left shortly after I joined.

07:01 But yeah, they, they built it in 2005 and it gained a lot of traction really early on.

07:08 And I guess Google took an interest at some point in 2006 and, and ended up buying YouTube in

07:16 November, 2006.

07:17 Yeah.

07:18 I'd say that was a great move for them because it's, it's such a central part of the internet.

07:22 Yeah.

07:23 I feel like YouTube,

07:25 the idea was something that a lot of people probably had.

07:30 It was a thing that clearly should exist.

07:33 But when you think of the infrastructure and the bandwidth costs and just the actual act

07:39 of creating such, such a huge video network seems prohibitive.

07:44 But, you know, once it came into existence, you know, I guess Google jumped on it.

07:47 That's cool.

07:48 I actually remember thinking what a simple idea it was and, and how like, it seemed so crazy

07:54 at the time that I think the acquisition cost was like 1.6 billion or something like that.

07:58 And, and I remember reading about that and I was like, good Lord, like, you know, you

08:02 could, how, how is something so simple worth so much?

08:06 But now that number seems so quaint compared to recent, recent stuff.

08:11 So, yeah, yeah, of course, of course.

08:13 I mean, a lot of companies go through the thinking, it's easy to go through the thinking of you

08:18 could have just built that yourself.

08:19 Yeah.

08:20 Right.

08:20 I mean, Facebook bought Instagram for like an insane amount of money and that's like a team

08:24 of 12 people, right?

08:25 For whatever it was like 19 billion or something.

08:28 They could have easily paid 12 people to build another Instagram, but it's also

08:32 got people's interest.

08:35 It's got the users.

08:36 It's got the momentum.

08:37 And that, that's the thing I think people buy.

08:40 Absolutely.

08:40 That's what you're paying for.

08:41 Yeah.

08:42 But they didn't write it in Python at first, did they?

08:44 No, they, the first implementation I believe was PHP.

08:47 And I don't think that lasted very long.

08:49 I think it was, most of that was rewritten in Python pretty early on.

08:53 Well, before I was there.

08:54 Sure.

08:54 Sure.

08:56 I suspect the way YouTube looks today with the growth of cloud computing and all the

09:01 different APIs and services is probably super different from when, what you guys got back

09:06 in 2006, right?

09:07 Yeah.

09:08 It's, I mean, I think, you know, the company has grown a lot.

09:11 The, the use cases have grown a lot.

09:14 It's just, I mean, it's kind of night and day.

09:16 It's when I, when I first joined, you know, everyone was kind of in one floor of one building.

09:21 And since then, you know, it's distributed all over the world.

09:24 And so, yeah, it's, it's, it's changed a lot.

09:27 Sure.

09:27 Wow.

09:29 Okay.

09:29 So that brings us to today, to YouTube and this project. Was this something

09:37 that you created, this project called Grumpy, or where did this come from?

09:40 Yeah.

09:41 I talked about some of the challenges we were having with, you know, running Python at scale

09:46 in the blog post.

09:48 And it basically, there's a few different aspects that, that affects our ability to run, you know,

09:56 that many Python servers.

09:58 The CPython runtime, while it's really great and it's highly optimized and it

10:05 does a lot of things really well, for our use case,

10:09 it's never really been a focus for CPython as a project.

10:14 You know, we thought, you know, maybe it makes sense to rethink how the runtime is built with

10:22 a focus on concurrency and, and running large server applications.

10:27 Yeah.

10:27 And you're not, you guys are not the first people to have this idea of, well, maybe we

10:31 could replace the CPython runtime interpreter with something else.

10:37 There's like Jython, there's IronPython, there's PyPy, there's plugin JIT.

10:44 So there's a lot of stuff happening there, but nobody's gone in the direction that you went

10:48 in, right?

10:49 Yeah.

10:49 Yeah.

10:49 It was, it was, it was an interesting, I mean, it's, you know, in a lot of ways it's kind

10:53 of crazy.

10:53 And the thing about Go, the Go runtime that Grumpy is based on is that it is kind of designed

11:01 for very similar use cases to what we are interested in.

11:06 So Go tends to be, tends to be used for writing highly concurrent server applications with,

11:14 you know, a lot of like sort of message passing and things within the, within the application

11:20 between threads.

11:21 It seemed like kind of a good fit.

11:25 And once I started to flesh things out and to build out some of the core functionality,

11:30 some of the pieces started to fall into place and it started to look actually really compelling.

11:35 And you're like, Hey, we could actually do this.

11:37 We should stop for just a second.

11:39 I don't think we've explicitly said your project is called Grumpy, which is a replacement for

11:45 the CPython implementation with an entirely different Go implementation, right?

11:51 Yeah, that's right.

11:52 Yeah.

11:52 Yeah.

11:53 So, so very interesting.

11:54 I think, you know, Go, obviously it makes sense for Google to be the ones experimenting

12:00 with Go, right?

12:01 Go comes from Google, doesn't it?

12:03 It does.

12:03 Yep.

12:03 It was developed, I think, originally by, well, Rob Pike and I'm going to mix it up.

12:11 It's either Ken Thompson or no, it's, yeah, it's Ken Thompson, I believe.

12:15 Yeah.

12:16 It was developed for, I guess, similar to, you know,

12:23 the observations that we made about running Python server programs,

12:29 they had made sort of more general observations about writing server applications and how languages

12:36 that existed didn't quite fit our use cases.

12:40 Yeah.

12:40 Go is really quite, it's one of the newest languages out there that I would consider a mainstream

12:45 language.

12:46 It's not as mainstream as C, but it's definitely getting there and came out in 2012 in version

12:51 one sort of officially.

12:52 So it's born within this world of multi-core microservices, distributed cloud computing stuff,

13:01 right?

13:02 Yeah.

13:03 Yeah.

13:04 Okay.

13:04 So let's dig into the, what is Grumpy?

13:08 Let's dig in a little bit.

13:09 Like how do I take, so I can take my Python code.

13:13 I can write some, presumably some web app or something in a web service, and then I can run

13:20 that on Grumpy.

13:20 Like what, what does Grumpy do?

13:22 How does it take my Python code and run it?

13:24 So Grumpy takes a little bit of a different tack than CPython.

13:29 It's actually a transcompiler and a runtime, whereas you can kind of think of CPython as

13:36 a, it's like a virtual machine bytecode interpreter and runtime.

13:40 And in that sense, it's kind of like a combination of Cython and, you know, a bundling of Cython and

13:48 CPython, except that it's all in Go.

13:50 Right.

13:51 So Cython takes a flavor of Python and then compiles it to C directly.

13:58 And like C, Go is a statically typed compiled language.

14:04 And so it's no longer interpreted.

14:07 It's not even like JIT compiled like Java or .NET.

14:11 It's full on compiled, right?

14:12 That's correct.

14:13 Okay.

14:14 So the sort of runtime side of things is actually like the correspondence is like this Python

14:18 C API.

14:20 There's actually a Go Grumpy API.

14:22 And so what it's compiling is code that uses that API to mutate objects, to pull out a state

14:31 and those sorts of things.

14:32 And so whereas CPython or vanilla CPython uses a bytecode interpreter to actually drive those

14:40 API calls, the Grumpy and Cython are actually generating code that drives those API calls.
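The bytecode that vanilla CPython's interpreter walks through can be inspected from Python itself. The following is a minimal sketch using the standard `dis` module; the function `add_one` is a made-up example, and the exact instruction names vary between CPython versions, so none are shown here as expected output.

```python
import dis


def add_one(x):
    # A tiny, hypothetical function whose bytecode we want to inspect.
    return x + 1


# CPython compiles the function body to bytecode stored on its code object.
raw_bytecode = add_one.__code__.co_code

# dis decodes that bytecode into the instructions the interpreter loop executes.
instructions = [ins.opname for ins in dis.get_instructions(add_one)]

if __name__ == "__main__":
    print(len(raw_bytecode), "bytes of bytecode")
    for name in instructions:
        print(name)
```

A transcompiler such as Grumpy or Cython skips this decode-and-dispatch step at run time and instead emits source code that makes the equivalent runtime API calls directly.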

14:49 Okay.

14:49 Yeah.

14:50 Very, very cool.

14:51 Now in your GitHub repo or the blog post, I don't remember where I got this, but you said

14:56 it's intended to be a near drop-in replacement for CPython 2.7.

15:00 How's that going?

15:02 How far are you towards that goal?

15:04 That's a pretty big set of APIs to cover.

15:07 Yeah.

15:07 I'm learning every day like how big Python is.

15:10 Nobody told me about this weird case I'm going to have to support.

15:14 Oh yeah, totally.

15:15 Yeah.

15:16 I mean, the amount of sort of spelunking I've done in CPython internals, I did not

15:22 expect all that.

15:24 But yeah, so it's going pretty well.

15:27 The core functionality is there.

15:29 So like the basic semantics of the language in terms of attribute access and how types work

15:35 and how method dispatch works, all of that functions basically fine.

15:42 The basic types are all there.

15:43 So lists and dictionaries and things all kind of work.

15:47 Do those mostly map directly to the underlying Go structures?

15:53 Like does a list in Python map to a slice in Go and things like that?

15:58 Or do you have to do more complicated things to map it?

16:00 It's more complicated.

16:01 And the reason is that Python is so dynamic, right?

16:06 Like method dispatch is so dynamic and attribute access.

16:10 You can put attributes on just about anything.

16:12 You know, if it were just native Go types, then you wouldn't be able to put an attribute

16:20 on a list or on a slice, right?

16:23 Right.

16:23 So it's actually, there's sort of wrapper types, basically structures that actually map very closely

16:29 to CPython's object structures.
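The kind of dynamism being described can be seen in a few lines of Python. This is a small illustration, not Grumpy code; `TaggedList` and `shout` are hypothetical names. In CPython a plain `list` instance does not accept arbitrary attributes, but an instance of any Python-level subclass does, and methods can be attached to the class after the fact, so a runtime cannot simply reuse a fixed native container.

```python
class TaggedList(list):
    """A list subclass; its instances get a __dict__, so they accept new attributes."""


def shout(self):
    # Attached to the class after creation, to show that method lookup happens at call time.
    return [str(item).upper() for item in self]


items = TaggedList(["a", "b"])
items.origin = "user upload"  # attribute attached to the instance at runtime
TaggedList.shout = shout      # method attached to the class at runtime

shouted = items.shout()

if __name__ == "__main__":
    print(items.origin, shouted)
```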

16:32 Okay.

16:33 Yeah, I can see that because you're working with a non-dynamic language and yet it has

16:38 to support dynamic capabilities.

16:40 So you got to somehow put a shim in there for that, right?

16:43 That's right.

16:44 Okay.

16:45 I guess the biggest kind of gaps in terms of supporting or being a drop-in replacement are

16:50 the standard library still needs a lot of work.

16:53 So CPython has a lot of its standard library is actually written as C extension modules, which

17:00 Grumpy does not support.

17:02 So that's one area of significant divergence between the two worlds.

17:06 And we could talk about that more.

17:08 That's turned out to be sort of a big kind of beast to slay.

17:13 The nice thing is that with, you know, all those other Python runtimes out there, there's

17:20 actually, you know, you can find pure Python versions of most things.

17:24 So like PyPy, for example, implements a number of libraries that are in Python that aren't implemented

17:30 in CPython.
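CPython's own standard library follows a similar pattern in places: the `bisect` module, for example, ships a pure Python implementation and swaps in the C-accelerated `_bisect` versions when they are available. The sketch below is a hand-written pure Python binary search (the name `pure_bisect_left` is made up for this example) that can be checked against the standard `bisect.bisect_left`; it only illustrates the idea that a runtime without C extension support can fall back on Python-level code.

```python
import bisect


def pure_bisect_left(sorted_items, target):
    """Return the leftmost index at which target could be inserted to keep order."""
    lo, hi = 0, len(sorted_items)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


data = [1, 3, 3, 7, 10]

if __name__ == "__main__":
    for target in (0, 3, 8, 11):
        print(target, pure_bisect_left(data, target), bisect.bisect_left(data, target))
```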

17:31 Right.

17:32 So you could say, start this transition or this backfilling of APIs by just moving to

17:39 pure Python implementations that then get sent through Grumpy that actually get compiled

17:45 or run on Go, right?

17:46 Yep.

17:47 That's exactly right.

17:48 And maybe do some profiling and say, well, you know, people use lists a lot.

17:51 Let's write that directly in Go or something like this, right?

17:55 You can optimize later.

17:56 Exactly.

17:56 Yep.

17:57 Okay.

17:57 Yeah.

17:57 I suspect that there's a long tail of, like, stuff

18:00 that doesn't really need to be optimized, that last 5%.

18:03 Whereas these are the few things that we really should focus on, right?

18:06 Yeah.

18:06 So right now, you know, I'm kind of focused on getting support for the whole, like I want

18:13 to be able to run some common libraries that are written in Python.

18:17 I want some Python programs that are out there, like open source programs, to

18:22 be able to just use Grumpy.

18:24 So like just getting it to the point where everything runs is the first step and then

18:30 you make it fast.

18:30 Okay.

18:31 Yeah, of course.

18:31 Making it work and then making it fast seems like the right order to me as well.

18:35 So you said in your blog post that there's going to be some things that Grumpy will never

18:40 support and then there's things that it doesn't support yet, but you're working towards.

18:44 Yeah.

18:45 So one of the things I mentioned already is the C extension support.

18:51 The API for CPython is a bit different than the API for Grumpy because it's, well, for one

18:58 thing it's a different language, but also the data structures are a little bit different.

19:02 The function return values and things are a little bit different.

19:05 And so there wasn't a good mapping between those APIs and it would be too constraining for,

19:14 you know, to try to make Grumpy map perfectly to the C API.

19:20 Sure.

19:20 Have you looked at the CFFI stuff that PyPy was using?

19:25 Right.

19:25 So that's, I have not looked very closely at that.

19:29 That is something that we've looked at internally for other reasons as well.

19:33 But that is an interesting way to approach the problem.

19:38 And potentially, you know, there are ways to bridge the two APIs, and CFFI may

19:45 be one of those.

19:46 Yeah.

19:46 Okay.

19:46 Go must have a C integration option somewhere, right?

19:51 It does.

19:51 Yeah.

19:52 Yeah.

19:52 Okay.

19:52 And the other thing you said is not going to support is things like eval.

19:55 And again, this is like, it is possible to implement something that's a little bit hokey to support

20:03 eval or exec.

20:04 Shell out and compile.

20:06 Oh yeah, exactly.

20:07 I mean, like that's, well, I mean, it's funny you think about it and like, that's, that's actually

20:11 what Python is doing, right?

20:13 It's like, except that it's a bytecode compiler and then it's executing in a VM.

20:18 If you instead are actually doing a, you know, an actual static compilation and then executing

20:26 that.

20:26 It's not conceptually that much different, except that the tool chain that you have to use to

20:32 do the compilation and stuff is much heavier.

20:33 So it's going to be slower and it just, it kind of doesn't make a lot of sense.
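The "compile, then execute" shape of `eval` that is being described can be made explicit with the built-in `compile` function. This is a minimal sketch of what CPython does; the expression and the variable names are invented for the example.

```python
source = "price * quantity"

# Step 1: compile the source text to a code object (bytecode), as CPython does internally.
code = compile(source, "<example>", "eval")

# Step 2: run that code object in the interpreter with an explicit namespace.
result = eval(code, {"price": 4, "quantity": 3})

if __name__ == "__main__":
    print(result)  # 12
```

For a statically compiled runtime, step 1 would mean invoking a full compiler toolchain at run time, which is the heavier path the speaker wants to avoid.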

20:38 I think I could see maybe supporting it for, you know, debugging use cases and things like

20:43 that.

20:44 I don't think I, I kind of want to avoid having to worry too much about like, you know,

20:50 making that performant or whatever.

20:52 Yeah, sure.

20:52 I, for one, don't think I would miss it.

20:56 I think it's fine.

20:56 Yeah.

20:57 The other thing about exec and eval is there's very few cases I've ever come across in all

21:04 my years of programming Python where exec or eval was a good idea.

21:08 So actually like, I kind of think that it's an unnecessary aspect of that language.

21:14 Yeah.

21:14 That's interesting.

21:15 And you know, it is kind of in keeping with Go in the sense that Go is very strict about

21:20 conventions and some of the best practices that it believes.

21:23 Like for example, if you have an import of a package and you're not using that package,

21:29 that's a compilation error, right?

21:30 Things like that.

21:31 Right.

21:31 Absolutely.

21:32 Yep.

21:32 Yeah.

21:32 So skipping eval seems like that's all right.

21:35 This portion of Talk Python to Me is brought to you by Hired. Hired is the platform for

21:51 top Python developer jobs. Create your profile and instantly get access to 3,500 companies

21:56 who will work to compete with you.

21:58 Take it from one of Hired's users who recently got a job and said, "I had my first offer on

22:02 Thursday after going live on Monday, and I ended up getting eight offers in total.

22:06 I've worked with recruiters in the past, but they've always been pretty hit and miss.

22:09 I tried LinkedIn, but I found Hired to be the best.

22:12 I really liked knowing the salary upfront.

22:14 Privacy was also a huge seller for me."

22:17 Sounds awesome.

22:18 Doesn't it?

22:18 Well, wait until you hear about the signing bonus.

22:20 Everyone who accepts a job from Hired gets a thousand-dollar signing bonus.

22:24 And as Talk Python listeners, it gets way sweeter.

22:27 Use the link hired.com/talkpythontome and Hired will double the signing bonus

22:31 to $2,000.

22:32 Opportunity's knocking.

22:33 Visit hired.com/talkpythontome and answer the door.

22:37 Then you said there's a set of things that you're going to support, but it doesn't yet.

22:49 What are those?

22:50 We talked a little bit about some of this stuff, but like the standard library is not there yet.

22:56 A subset of the standard library is available currently.

23:00 Can you give us like a percentage of what, how far you are down that path?

23:04 I mean, everyone listening, this whole project has been, like, on GitHub for three

23:09 or four weeks.

23:09 So it's not like you should have implemented it all.

23:12 It's just curious, like how far you've gotten.

23:14 It's really hard to put a percentage on it.

23:16 I guess, I mean, I probably could like, you know, compare lines of code or something, but

23:19 I think that what's going to happen is you're going to get sort of a core set of libraries

23:25 that run all the other libraries and everything will just kind of fall into place.

23:29 So I think it's, it's sort of more important to count those core libraries.

23:33 And you know, that's, that's things like types and collections and operator and all those

23:38 things.

23:38 And some of those are already there.

23:42 I mean, I, I feel like, yeah, it's hard to put a number on it.

23:46 Yeah, sure.

23:47 Maybe it's one of those things where it's, it seems like you're not very far and then all

23:52 of a sudden it kind of unlocks and things go really quick.

23:55 That's the dream.

23:56 That's a good way to, like, it's an optimistic view of the future.

24:02 That's right.

24:03 If you're going to clone the repo off GitHub and try things out, like you may be disappointed

24:09 that, that your favorite libraries aren't there.

24:12 There's a good chance that if you have a program that's, that's at all, you know, complex that

24:18 there are some libraries that are missing for you.

24:20 I'd say, I don't know, maybe 20% or something like that.

24:23 Okay.

24:24 Well, that's good.

24:25 And you said you also want to support all the built-ins.

24:27 That's right.

24:28 Yeah.

24:28 That's obviously a good idea.

24:30 Yeah.

24:30 Those are important.

24:32 And again, you know, there's a bunch of stuff that just isn't there yet, a bunch of functions

24:37 like map and reduce and things like that, that I haven't got around to, haven't needed

24:45 to support yet, but they're actually pretty straightforward to implement by and large.

24:50 So, so I think we're, we're pretty far along on, on that stuff.
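To give a sense of why built-ins like these are described as straightforward, here is a rough pure Python sketch of Python 2.7-style `map` (single iterable, returning a list) and `reduce`. The names `simple_map` and `simple_reduce` are hypothetical and this is not how Grumpy implements them; it only shows the size of the task.

```python
_MISSING = object()  # sentinel so that None can be passed as a real initial value


def simple_map(func, iterable):
    # Python 2.7's map returns a list; only the single-iterable form is covered here.
    return [func(item) for item in iterable]


def simple_reduce(func, iterable, initial=_MISSING):
    iterator = iter(iterable)
    if initial is _MISSING:
        try:
            accumulator = next(iterator)
        except StopIteration:
            raise TypeError("simple_reduce() of empty sequence with no initial value")
    else:
        accumulator = initial
    for item in iterator:
        accumulator = func(accumulator, item)
    return accumulator


if __name__ == "__main__":
    print(simple_map(lambda x: x * 2, [1, 2, 3]))
    print(simple_reduce(lambda a, b: a + b, [1, 2, 3, 4]))
```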

24:54 So how much of your focus on Grumpy is going to be to make this a project that you guys could

25:01 use for your specific use cases at Google and then make that a skeleton or base and people can

25:09 come along and add other features and contribute to the open source project to make it more broad

25:14 versus how much are you trying to make this like, we're trying to entirely replace CPython?

25:19 So I think that we, I want to see, I like, okay.

25:24 So, so I put it this way.

25:25 I'm interested in, you know, solving some of these concurrent use cases that don't have a great answer

25:33 in CPython.

25:34 That's the primary focus.

25:35 But I, again, it might be my optimism is showing again, but like, I feel like once you kind of have

25:41 some of those use cases locked down, now people start to use it for, for things you didn't expect

25:48 right away.

25:49 I know that, like, scientific computing is an area where Python has really well-established

25:57 libraries, and NumPy is sort of crucial to some of this stuff.

26:01 And that, you know, involves C extensions.

26:05 And I think in the near term, I don't see Grumpy being useful for numerical analysis.

26:11 And, you know, that's kind of compounded by Go doesn't have too many sort of inroads in that

26:17 direction either.

26:19 But on the other hand, you know, some of the advantages of being

26:25 statically compiled, and type inference and compiling down to native operations, that is

26:33 potentially useful for, you know, scientific computing and those sorts of things.

26:37 So, so I kind of see, you know, I want to focus on our, our immediate use cases, but I have this

26:42 kind of idea that there's.

26:44 You know, more opportunities out there once that's, once things are kind of working.

26:48 Okay.

26:49 Yeah.

26:49 Yeah.

26:49 That, that seems like a good roadmap to me.

26:51 It makes a lot of sense.

26:52 So let's talk about the execution engine, which effectively is the execution engine

26:59 of Go versus CPython.

27:01 So CPython, the Python code gets converted to bytecode.

27:06 Those bytecode instructions are sent to, like, a super large for loop switch sort of thing.

27:13 And those are interpreted and run.

27:15 How does Go work?

27:17 Go has a runtime, which is to say that there's code that is running, that's managing things

27:25 like goroutines, which are the equivalent of threads in Go programs, and garbage collection

27:32 and things like that.

27:33 But much of what is actually happening throughout a Go program is just, is actually, you know,

27:39 low level machine instruction.

27:41 So the Go program, much like a C program is compiled down to a machine code and actually

27:47 executed natively.
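For contrast with native compilation, the "large for loop with a switch" described for CPython can be sketched as a toy stack machine. The instruction set below (`PUSH`, `ADD`, `MUL`, `RETURN`) is invented for illustration and is far simpler than real CPython bytecode.

```python
def run(program):
    """Interpret a list of (opcode, argument) pairs on a small value stack."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        # The if/elif chain plays the role of the interpreter's big switch statement.
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "RETURN":
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %r" % (op,))
    raise ValueError("program ended without RETURN")


# Computes (2 + 3) * 4.
program = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
    ("RETURN", None),
]
result = run(program)

if __name__ == "__main__":
    print(result)  # 20
```

Every instruction pays for the fetch and the dispatch in the loop; a compiled program has no such loop because the operations are already machine instructions.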

27:48 Right.

27:48 That makes sense.

27:49 So you say Go has a garbage collection, which is, is awesome.

27:55 Do you know what kind it is?

27:57 Is it reference counting or is it like mark and sweep, or what kind of garbage collection? Is it

28:02 deterministic?

28:03 How does the garbage collector work in Go?

28:04 This is not my area of expertise.

28:07 Nor mine.

28:07 But it is not reference counted.

28:10 So I believe that it is a, and actually this has changed significantly.

28:15 I believe in 1.7, they significantly retrofitted the garbage collector,

28:22 mostly just around the way that garbage for particular goroutines is managed, garbage that

28:31 is sort of local to particular goroutines.

28:33 But otherwise it's kind of a traditional garbage

28:40 collector, much like what Java has.

28:46 But it's actually much simpler.

28:47 Java has a, a number of different algorithms it supports and a lot of tuning parameters.

28:52 Go's garbage collection is much simpler and is targeted for the use case of,

28:59 you know, handling requests in a server application and those sorts of things.

29:03 Yeah.

29:04 It makes sense.

29:04 I suspect they highly parallelized that, thinking of Go, as well.

29:07 One thing you said that's nice about executing ultimately on Go is you said the deployment story

29:13 is a little bit simpler.

29:15 You know, when you deploy a Python program, you are actually including your, like,

29:23 .py files or at least your .pyc files in the deployment.

29:28 And so you have to have some way to sort of package them together and ship them off to

29:33 production or wherever you're running your program.
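The `.pyc` files mentioned here are just cached bytecode, and the standard `py_compile` module can produce one explicitly. The sketch below writes a throwaway `hello.py` into a temporary directory (both file names are made up for the example) and compiles it, to show the kind of artifact that has to travel with a CPython deployment alongside the interpreter and dependencies.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "hello.py")
    with open(source_path, "w") as handle:
        handle.write("print('hello')\n")

    # Byte-compile the source; the return value is the path of the .pyc that was written.
    compiled_path = py_compile.compile(
        source_path, cfile=os.path.join(workdir, "hello.pyc")
    )
    compiled_exists = os.path.exists(compiled_path)
    compiled_size = os.path.getsize(compiled_path)

if __name__ == "__main__":
    print(compiled_exists, compiled_size, "bytes")
```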

29:36 Right.

29:36 And beyond that, also the dependencies and the runtime, right?

29:39 So you got to have all of those things.

29:41 That's right.

29:42 Which can make it really tricky.

29:44 And there's things like py2app, py2exe, cx_Freeze, the BeeWare project.

29:48 There's a lot of projects trying to make that something you can ship around, but it's not simple.

29:53 Yeah, that's right.

29:54 And, you know, so I'm sure people who have run Python in production have run into, you

29:59 know, version mismatches, things like that, using the system Python version, which was,

30:04 you know, different than the one they were developing on and so on.

30:07 The nice thing about statically compiled programs in general is that you, you produce a binary

30:13 and you just, you can put that just about anywhere and it'll run.

30:18 And that's very true for Go programs.

30:21 There's few dependencies in most cases.

30:24 Most of the, the runtime is actually compiled or is actually linked into the executable.

30:30 Yeah, that's really cool.

30:31 What's the size of like a Hello World compiled output?

30:35 Do you know?

30:35 I have not looked at the size myself.

30:38 I think I saw some comment somewhere that said it was something like three megabytes.

30:44 So it's, it's pretty substantial, but you know, that, that includes a lot of overhead for

30:49 the runtime that, that wouldn't increase significantly if your program grew.

30:54 Right.

30:54 Absolutely.

30:55 Like, you know, the next 10,000 lines add 10 K or something.

30:58 Right.

30:58 Exactly.

30:59 Yeah.

30:59 I think three megs is totally fine to get a good deployment story, stability.

31:04 You run what you shipped, all those things.

31:06 Like if this was 1994, three megs would be a problem, but it's not today, right?

31:11 Yeah, that's right.

31:12 Nice.

31:13 So what sort of optimizations do you think are possible if you run Python on Go rather

31:20 than as an interpreted system?

31:22 This is not an area I've dug into significantly yet.

31:27 My thinking is that if you can determine that a particular, for example, a particular integer

31:33 counter in a function is only ever an integer type and it only, you know, uses integer operations

31:40 like increment or, or whatever, then there's no need to go through the whole Python method

31:46 dispatch and creating new integer objects.

31:50 Every time you increment that counter, you can actually just use a native integer and increment

31:56 using native operations.

31:57 So that's a, that's a really simple example, but not, not uncommon.
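
To make the counter example concrete, here is a rough Python sketch. The function names are hypothetical and are not taken from Grumpy. The first function is ordinary code in which `counter` only ever holds an integer; the second spells out, approximately, the generic method dispatch that each increment goes through. An optimizing compiler that could prove `counter` is always an integer could, in principle, replace that dispatch with a native add.

```python
def count_evens(values):
    """Count even numbers; `counter` only ever holds an int."""
    counter = 0
    for v in values:
        if v % 2 == 0:
            # Looks trivial, but the runtime still looks up the
            # addition method on the type and builds a new int object.
            counter += 1
    return counter


def count_evens_explicit(values):
    """Same logic, with the dispatch for `counter += 1` written out.

    This is only an approximation of what the interpreter does;
    it is here to show the work a native integer would avoid.
    """
    counter = 0
    for v in values:
        if v % 2 == 0:
            counter = type(counter).__add__(counter, 1)
    return counter


if __name__ == "__main__":
    print(count_evens([1, 2, 3, 4]))
    print(count_evens_explicit([1, 2, 3, 4]))
```

Both functions return the same result; the point is only that the second form makes visible the per-increment overhead being discussed.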

32:02 I think once you kind of broaden that to a whole program optimization, that's when things

32:08 start to get interesting because then you can think about like, well, if you know that a

32:12 function is only ever called with particular parameters or parameters of a particular type,

32:18 then you can make some assumptions and again, use native, maybe use native data types.

32:25 Sure.

32:25 What about type annotations?

32:27 And I know that's more a Python three thing, but would you be able to, or interested in having some flavor that

32:35 takes type annotations and then uses that for certain types of optimizations?

32:39 Yeah.

32:40 I thought about this and, and I'm a little ambivalent because, you know, type annotations, the way that

32:46 they are sort of used today, they're not intended to actually, you know, raise or anything if they're

32:52 not respected.

32:53 It's mostly for analysis before you ship your program to, you know, make the

33:01 linter's job easier and things like that.

33:02 Right.

33:03 And so in CPython, once you're actually

33:09 running your program, the type annotations basically have no effect.

33:12 And so I'm a little hesitant to say that grumpy should use these in a more, in sort of a more

33:21 strict way, because I think that might affect programs' compatibility and stuff like

33:27 that.
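
The point that annotations have no runtime effect in CPython can be shown with a small sketch. This uses Python 3 syntax (the discussion above notes annotations are a Python 3 feature), and the function name `double` is made up for illustration.

```python
def double(n: int) -> int:
    """Annotated as int -> int, but nothing enforces that at runtime."""
    return n * 2


if __name__ == "__main__":
    # The annotations are stored as plain metadata on the function.
    print(double.__annotations__)

    # Passing a str violates the annotation, yet CPython raises nothing:
    # the multiplication simply runs with string semantics.
    print(double("ab"))
```

A strict implementation that compiled `n` down to a native integer would have to reject or mishandle the second call, which is the compatibility concern raised here.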

33:27 Yeah.

33:28 It will absolutely do that.

33:29 Wouldn't it?

33:29 Yeah.

33:29 There's some real advantages there.

33:31 If you, if you do make them strict, then you say that a type, an argument is an integer,

33:36 then yeah, it makes the optimizer's job way easier because it can, you know, it doesn't have

33:40 to do any inferencing to determine that relationship.

33:43 Obviously it would break the sort of contract with type annotations that these are just for editors

33:49 and linters and to help you, but not actually meant to affect the runtime.

33:53 That's right.

33:54 On the other hand, if, if you could make some part of code that's like really critical go,

33:59 you know, 10 or a hundred times faster by putting a type annotation that's strict, you know,

34:05 you might be willing to make that trade off.

34:06 So I have no, I don't know which way would be the right way to go either, but it's interesting

34:10 to think about.

34:11 Yeah.

34:11 I'm very curious how that sort of evolves.

34:15 Yeah.

34:16 Yeah.

34:16 Yeah.

34:16 I'm going to keep an eye on it.

34:18 That's cool.

34:18 So let's talk about when you launched.

34:20 So this should be pretty fresh in your mind, right?

34:22 Yeah.

34:23 It's not a very old project.

34:25 It's about a week, week and a day.

34:27 Yeah.

34:28 Yeah.

34:28 So we, well, I guess we migrated the code to GitHub in mid-December and I spent some time

34:37 over the next month kind of cleaning it or the next few weeks cleaning up the code and adding

34:42 some functionality for the build system that we were not able to use, obviously the internal

34:50 Google build system in the open source project.

34:53 So I had to build some of that out and then I guess January 4th.

34:59 Yeah.

34:59 I guess it was the 4th.

35:01 That's eight days ago just for the day of the recording.

35:03 Yeah.

35:04 Yeah.

35:04 We did.

35:05 We sort of coordinated an open source blog post with actually making the GitHub

35:11 repo public and got a little bit of traction on Hacker News and, yeah, it was kind of

35:18 astonishing how great the reception was.

35:21 Yeah.

35:21 It's going like crazy.

35:22 Like when I took notes to, for this conversation, like four or five days ago, I had said there

35:27 were 5,000 stars in GitHub.

35:28 Now, maybe that was three days ago.

35:31 Now there's 6,000, almost 6,300, and 17 contributors.

35:35 That's, that's a pretty serious uptake for a project that's been out for eight days.

35:40 Yeah.

35:40 I, I think the thing that kind of blew me away most was the number of pull requests that

35:46 I got.

35:46 I mean, right on day one, people were digging into the code and, you know, doing like it,

35:52 it, the code, there are tricky parts to the code and it's not necessarily obvious how you

35:56 ought to write certain features.

35:59 And people, you know, really dug in and started filling out some of this functionality that's

36:04 missing and started talking about, you know, well, how are we going to support programs or

36:09 libraries, Python, third party Python libraries out of the box and stuff like that.

36:15 So it's been great.

36:16 I've had a really good time working with some, some of these people that have been contributing.

36:20 Yeah.

36:21 Yeah.

36:21 I would say that's really cool.

36:22 You talked about the code a little bit. Looking on GitHub, GitHub thinks it's 77% Go

36:28 code, 22% Python code and a bit of a Makefile.

36:31 Yeah.

36:32 That's about right.

36:32 Yeah, that's about right.

36:33 And a lot of that Python code is actually just tests and benchmarks and things.

36:38 So most of it is Go.

36:41 And actually, I guess the standard libraries, most of which are copied from

36:47 other places like CPython, there's a pretty substantial amount of Python, but that's not like, you know,

36:51 I don't think about too much about that code since we don't have to write it or maintain it.

36:55 Yeah, absolutely.

36:56 So how do you ensure compatibility in this?

36:59 Like, are you running the standard CPython tests?

37:01 That's something that we're working toward.

37:03 So that's sort of milestone number one.

37:05 I haven't published a roadmap document yet, but getting to the point where we can run

37:11 the unittest library is going to be a huge milestone because it means we can then run

37:16 the unit tests that are written for CPython.

37:19 That would be a huge milestone just on compatibility.

37:21 Exactly.

37:22 Before we get there, we've been writing small tests that, you know, demonstrate

37:30 compatibility concerns and stuff like that.

37:32 And then running those in both Python and Grumpy.

37:36 Okay.

37:36 Yeah.

37:37 Very cool.

37:37 So let's talk about why you chose Go because are there three sort of officially blessed languages

37:44 at Google?

37:45 There's Python, there's Go and Java.

37:47 Is that?

37:48 And C++.

37:48 Is that the story these days?

37:49 Yeah.

37:50 Right.

37:50 Of course.

37:50 And C++.

37:51 So four.

37:51 So why did you choose Go?

37:53 Like you could have tried Jython or something, right?

37:55 Jython is something that we, we've looked into.

38:00 Jython is a really great mature product.

38:03 It's our experience that it's better to start a project on Jython than to migrate

38:11 to Jython.

38:12 There's a number of compatibility issues, not so much like the kinds of compatibility issues

38:18 like, oh, on CPython, this function returns a different type or something like

38:24 that more that there are certain constraints of running in the JVM that make certain programs

38:31 not work very well or, or those sorts of things.

38:34 So like performance issues that sort of crop up in those sorts of things.

38:37 It sounds like running on the JVM was not the best concurrency server story as it might've

38:45 been running on Go because Go is more focused on concurrency from the beginning and things

38:49 like that.

38:50 That might be more important to you guys.

38:52 Yeah.

38:52 I think that was part of it.

38:53 I mean, lightweight goroutines are definitely a big advantage of Go.

38:59 So Java has native threads, which have large stacks.

39:02 And so it has sort of different performance characteristics for concurrent workloads.

39:09 And so you have to kind of write parallel programs in a slightly different way

39:15 for Java. But also, for real-time server applications, the JIT actually can be a liability.

39:23 It becomes difficult to, you know, reproduce certain kinds of issues,

39:31 debug certain kinds of problems, because consistency of how requests are handled

39:39 is really important in these kinds of applications.

39:42 And, and the JIT can make, you know, identical requests behave very differently depending on

39:48 where in the life cycle the program is.

39:50 Sure.

39:50 Yeah.

39:50 That makes a lot of sense.

39:51 Being statically typed, you get a little more predictability.

39:54 Absolutely.

39:55 Well, sorry, not statically typed: compiled to machine instructions rather than JITed.

39:59 Yeah.

39:59 Yeah.

40:00 Yeah.

40:00 That's right.

40:01 Yep.

40:01 Hey everyone.

40:02 Let me take just a moment and tell you about a new sponsor with a cool and timely offer.

40:06 This portion of Talk Python to Me is brought to you by Deep Learning for Computer Vision

40:11 with Python, a new book from pyimagesearch.com launching on Kickstarter right now.

40:16 Have you ever wondered how Facebook can not only detect your face in an image, but also recognize

40:21 and tag you as well?

40:22 It's not magic.

40:24 Facebook uses specialized machine learning algorithms called deep learning, and PyImageSearch wants

40:30 to pull back the curtain and show you how these algorithms work.

40:32 Their new book is designed from the ground up to help you reach expert status.

40:37 Even if you've never worked with machine learning or neural networks before. Inside Deep Learning

40:41 for Computer Vision with Python, you'll find super practical walkthroughs, hands-on tutorials

40:46 with lots of code and a no fluff teaching style that is guaranteed to cut through all the cruft

40:50 and help you master deep learning for visual recognition.

40:53 To learn more about this book and back the Kickstarter campaign, just head to pyimagesearch.com

40:59 /kickstarter.

41:01 Yeah.

41:01 So how do you run apps on, on Grumpy?

41:03 Like if I have Python code and I want to make it, make it go, how do I make it go on Grumpy?

41:09 This is sort of a hot topic right now in the issue tracker on GitHub because the

41:14 build system that I have is strictly focused on, you know, getting the internal libraries working.

41:20 And so it doesn't have good support for building a program that's outside that directory structure

41:25 or using libraries that are in your Python path or anything like that.

41:29 And so we're debating kind of how exactly it should be supported.

41:32 So right now, if you want to run a program or compile a library, you have to kind of drop

41:38 it into that directory structure and the make system will pick up on it and try

41:45 to compile it into Go.

41:46 But ideally, you know, you have some kind of Python path style construct where it can find

41:53 Python code and build it in a sort of standard way.

41:58 That's something that we're working towards.

42:00 Okay, cool.

42:01 Now, if people want to contribute to Grumpy, there's like three major areas that, that make it up.

42:08 You want to talk about those three areas so they maybe can use it as a roadmap?

42:11 You can kind of think of it as the transcompiler, which is the tool called grumpc, and that takes

42:19 Python code. It's actually written in Python and it uses the ast module.

42:23 So it's kind of cheating.

42:25 Another milestone will be when grumpc can compile grumpc. And that takes the Python

42:33 code and spits out some Go code.
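
As a small illustration of the kind of front-end work the `ast` module does, here is a hedged sketch that parses a line of Python source and walks the resulting tree. It is not grumpc's code; the `NameCollector` class and `collect_names` helper are hypothetical names invented for this example. A transcompiler would do something similar in spirit, visiting each node and emitting target-language code instead of collecting names.

```python
import ast


class NameCollector(ast.NodeVisitor):
    """Hypothetical visitor that records every identifier it encounters."""

    def __init__(self):
        self.names = []

    def visit_Name(self, node):
        # Called once for each Name node (variable reference or target).
        self.names.append(node.id)
        self.generic_visit(node)


def collect_names(source):
    """Parse `source` into an AST and return the identifiers found in it."""
    collector = NameCollector()
    collector.visit(ast.parse(source))
    return collector.names


if __name__ == "__main__":
    source = "total = price * quantity + 1"
    tree = ast.parse(source)

    # List the kinds of nodes in the tree (Assign, BinOp, Name, ...).
    print(sorted({type(node).__name__ for node in ast.walk(tree)}))
    print(sorted(collect_names(source)))
```

The visitor pattern shown here (one `visit_<NodeType>` method per kind of syntax) is the standard way to traverse a tree produced by `ast.parse`.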

42:36 And then the second part is the Grumpy runtime, which is kind of the parallel

42:41 of the C API.

42:42 The transcompiled Go code will depend on that runtime.

42:48 So it imports the runtime and uses the constructs and functions and things in the runtime.

42:53 And so that's another sort of component that's written strictly in Go.

42:58 And that's where all the sort of data structures and things are defined.

43:02 And finally there's the standard library that is mostly written in, or actually exclusively

43:09 written in, Python, but also uses some of the Grumpy native extensions to actually

43:15 interface directly with Go packages and functions and things.

43:19 So those are sort of the three areas and there's a lot of work to do in

43:23 in all of those different areas.

43:24 I'd say like the standard library is, is the biggest chunk of work to do at this point.

43:29 Presumably you guys chose Go because of the concurrency story, right?

43:34 And if you have Python code running on Go, you want to leverage that concurrency.

43:40 Do you have to use a different API?

43:43 This is Python 2.7.

43:44 So you don't have things like async or await.

43:47 How do I interact with the concurrency model of Go?

43:50 Currently, the way that goroutines are made available is through the threading library.

43:57 So the standard Python threading library, you create a thread and start it.

44:00 And that actually starts a goroutine instead of a native thread, and that will work pretty seamlessly with existing code.

44:09 I don't foresee huge problems there in terms of like the differences between those kinds of threads.

44:15 And again, you know, Go has the concept of channels, which are sort of a message passing mechanism.

44:22 Whereas in Python, you have the Queue data structure, and this isn't actually implemented yet, but I plan to implement a queue using channels.

44:31 And so you should be able to just write concurrent Python code like you always have.

44:36 But I think to really take advantage of the concurrency model, eventually I'd like to implement the async and await Python constructs.

44:48 I think that would be a huge win.
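
The "write concurrent Python like you always have" style described above can be sketched with the standard threading and queue modules. This is ordinary CPython code, not Grumpy-specific, and the `worker` and `square_all` names are hypothetical. Per the discussion, on Grumpy each `Thread` would start a goroutine, and a channel-backed queue was planned but not yet implemented at the time of recording, so treat this as a picture of the programming model rather than something verified on Grumpy.

```python
import threading

try:
    import queue            # Python 3 module name
except ImportError:         # Python 2.7 module name
    import Queue as queue


def worker(tasks, results):
    """Pull numbers off `tasks` until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put(item * item)


def square_all(numbers, num_workers=4):
    """Square each number using a small pool of worker threads."""
    numbers = list(numbers)
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(None)     # one sentinel per worker so each one exits
    for t in threads:
        t.join()
    # Workers finish in no particular order, so sort for a stable result.
    return sorted(results.get() for _ in numbers)


if __name__ == "__main__":
    print(square_all([3, 1, 2]))
```

The queues here play the role that channels play in Go: the threads share state by passing messages rather than by touching common objects directly.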

44:50 Yeah, that would be, that would be a huge win.

44:52 And it seems to me like using the threading API is much more coarse-grained concurrency than Go is really built for.

45:02 And while it would work, it's not taking full advantage.

45:06 The idea with Go is that starting a goroutine is extremely lightweight and passing messages back and forth is the way to share state, rather than sharing memory.

45:20 Or sharing objects.

45:21 So I think that programs that are written with sort of heavyweight threads in mind aren't necessarily going to be the best possible way to express that functionality.

45:34 And so, you know, long-term I could see, you know, maybe, well, actually because you can access native Go constructs.

45:44 For example, you will be able to, in a Grumpy program, use Go channels directly.

45:50 You know, that has upsides and downsides.

45:52 It starts to diverge from the Python language and those sorts of things.

45:55 Yeah, but it's not unlike IronPython or Jython or those things, right?

46:00 Where you can reach down into the underlying JVM or CLR or something like that.

46:05 That's right.

46:06 Yep.

46:06 Absolutely.

46:07 Okay.

46:08 So if you're going towards async and await, what's the story on Python 3?

46:13 Since I feel like the threading concurrency story is a lot better in Python 3.

46:17 Yeah.

46:17 I'd love to support Python 3.

46:19 The long-term goal is definitely to support it.

46:22 The reason for 2.7 is that YouTube had a large existing Python code base and

46:29 that was 2.7.

46:31 So that was the main reason for choosing 2.7 out of the gate, but certainly long-term,

46:37 I'd like to see all of Python 3 supported.

46:39 Right.

46:40 Oh, that'd be, that'd be fantastic.

46:41 I'd like to see that as well.

46:42 I mean, it certainly makes sense if you're working on the YouTube team.

46:45 YouTube has a tremendously large and widely adopted deployment of Python 2.7.

46:51 Like you want to, you know, work where you can have the biggest impact locally, right?

46:55 Absolutely.

46:56 Yeah.

46:56 So reading the tea leaves, does this mean that Grumpy might someday run YouTube?

47:02 I want to hedge a little bit on that.

47:04 I think there's a sort of a long road ahead before Grumpy's ready to handle the kinds of

47:10 large applications that we run on YouTube.

47:13 So I wouldn't want to speculate about the long-term outcomes there.

47:18 Sure.

47:18 Yeah.

47:19 Yeah.

47:19 Of course.

47:19 You know, let me just imagine, let's imagine a world where it did.

47:23 Probably the first few weeks after it switched to Grumpy would be a little

47:30 bit nerve wracking, right?

47:31 Yeah.

47:31 It would definitely.

47:33 If YouTube goes down and it's your fault, that's going to be a problem.

47:36 Yeah, exactly.

47:37 I don't want to be that guy.

47:39 Exactly.

47:40 Exactly.

47:40 Here's the four pages we're giving you.

47:42 No, just kidding.

47:43 But it would, if, if someday that, that came to be, that would be a really cool outcome of

47:47 this project.

47:47 Yeah, absolutely.

47:48 That's sort of the dream.

47:50 Excellent.

47:50 Okay.

47:51 So maybe that's, that's a good place to leave it.

47:52 Let me ask you just a couple of questions before we let you out of here.

47:56 If you're going to write some code, what editor do you use?

47:58 Vim.

47:59 Vim.

47:59 All right.

48:00 Yeah.

48:00 Very cool.

48:01 And there's over 96,000 packages on PyPI these days.

48:06 And I'm sure you've come across some that are kind of unique.

48:09 You're like, Hey, have you heard about this package?

48:11 It's pretty cool.

48:11 You should check it out.

48:12 You got any coming to mind?

48:13 You know, it's funny.

48:14 I mean, because I do a lot of my, most of my development inside Google, you know, we

48:20 kind of have a different set of tools we tend to use.

48:26 I don't have a ton of experience with a lot of PyPI packages.

48:30 Yeah.

48:30 So it's a little bit more a dark matter.

48:33 We out here in the larger universe don't get to see a lot of the cool stuff you guys get

48:38 to use.

48:38 I'm sure it's pretty neat though.

48:39 Absolutely.

48:40 All right.

48:41 Awesome.

48:41 So how about a final call to action?

48:43 Like how can people get started with Grumpy?

48:44 What can they do if they, if this resonates with them, things like that?

48:48 Yeah.

48:48 I mean, we're super interested in seeing where the project goes.

48:52 Like I said, I would like to see where Grumpy can be useful

48:57 besides just, you know, large concurrent server applications.

49:01 Community feedback around that is great.

49:04 People have been filing issues asking about, you know, support for different things.

49:08 And that's been really illuminating seeing where people are thinking about where this might

49:12 be useful.

49:12 So that's huge.

49:13 If you have the time and the inclination, try it out, just clone the repo and type make

49:20 run and try out Python on Go and report any issues.

49:25 That's really useful to us.

49:27 And obviously there's a ton of work to do.

49:30 We talked about some of the different things and, you know, contributions via

49:36 pull requests on GitHub are really appreciated.

49:38 It's been kind of amazing how much effort people have put in already.

49:42 So that's been really exciting for us.

49:45 Yeah.

49:45 It's, it's a cool project.

49:47 And I think if we have yet another powerful, flexible runtime that has some different trade

49:54 offs that we can make for Python, that's great for everyone.

49:56 So congratulations on your project and thanks for sharing it with everyone.

50:00 Yeah.

50:00 Thanks very much, Michael.

50:01 You bet.

50:01 Talk to you later.

50:02 This has been another episode of Talk Python to Me.

50:06 Today's guest has been Dylan Trotter.

50:09 And this episode has been sponsored by Hired and PyImageSearch.

50:12 Thank you both for supporting the show.

50:14 Hired wants to help you find your next big thing.

50:17 Visit hired.com/talkpythontome to get five or more offers with salary and equity

50:22 presented right up front and a special listener signing bonus of $2,000.

50:27 Struggling to get started with neural networks, deep learning and image recognition?

50:31 PyImageSearch.com can help with that.

50:33 To learn more about their new book, Deep Learning for Visual Recognition with Python, and back the

50:39 Kickstarter campaign,

50:39 just head to pyimagesearch.com/kickstarter.

50:44 Are you or a colleague trying to learn Python?

50:46 Have you tried books and videos that just left you bored by covering topics point by point?

50:51 Well, check out my online course Python Jumpstart by building 10 apps at talkpython.fm/course

50:57 to experience a more engaging way to learn Python.

50:59 And if you're looking for something a little more advanced, try my Write Pythonic Code course

51:04 at talkpython.fm/pythonic.

51:08 Be sure to subscribe to the show.

51:10 Open your favorite podcatcher and search for Python.

51:12 We should be right at the top.

51:13 You can also find the iTunes feed at /itunes, Google Play feed at /play and direct

51:19 RSS feed at /rss on talkpython.fm.

51:22 Our theme music is Developers, Developers, Developers by Corey Smith, who goes by Smix.

51:28 Corey just recently started selling his tracks on iTunes.

51:31 So I recommend you check it out at talkpython.fm/music.

51:34 You can browse his tracks he has for sale on iTunes and listen to the full length version of the theme song.

51:40 This is your host, Michael Kennedy.

51:42 Thanks so much for listening.

51:43 I really appreciate it.

51:44 Smix, let's get out of here.

52:08 Don't forget.
