#4: Enterprise Python and Large-Scale Projects Transcript
00:00 Talk Python to Me.
00:01 Episode number four, with guest Mahmoud Hashemi.
00:04 Recorded Tuesday, April 7th, 2015.
00:08 Music by Ben Thede. You may have noticed that we have new intro music for this show.
00:45 If you're a fan of the Developers, Developers, Developers music by Smix,
00:49 don't worry, he's not gone. He'll be back next time.
00:52 However, I chose this music for this show because it's actually not a performance,
00:57 but rather sounds from an auditory experience for Wikipedia called Listen to Wikipedia.
01:04 This is a project written in Python by our guest, Mahmoud.
01:08 We'll talk a little bit about that during the show.
01:10 The sounds you hear correspond to edits, additions, and deletions to Wikipedia.
01:14 Reading off a few of the topics that cause those sounds are kind of random, but here they are.
01:20 Asphalt, Geography of Peru, Subobject, Fox, 2013 European Olympics, Pretty Little Liars, and Carrie's Grammar School.
01:29 If you need some sweet Zen sounds to program to, or are just interested in this project,
01:35 check it out at listen.hatnote.com and experience Python and Wikipedia and audio all wrapped up into one for yourself.
01:44 That's listen.hatnote.com.
01:49 Now, on to the show.
01:51 Hello, and welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the ecosystems, and the personalities.
01:59 This is your host, Michael Kennedy.
02:00 Follow me on Twitter, where I'm @mkennedy, and keep up with the show and listen to past episodes at talkpythontome.com.
02:07 This episode, we'll be talking to Mahmoud Hashemi about Enterprise Python and large-scale Python projects.
02:13 Mahmoud Hashemi is the lead developer of the Python infrastructure team at eBay and PayPal,
02:19 where he focuses his development and instruction energies on service frameworks, API design, and system resiliency.
02:27 Outside of work, he enjoys coding on his open-source projects, which you can find at github.com/Mahmoud,
02:34 as well as creating and maintaining several Wikipedia-based projects, such as Listen to Wikipedia and The Weeklypedia.
02:41 You got it.
02:42 Excellent.
02:43 Mahmoud, welcome to the show.
02:44 Oh, yeah.
02:45 Thank you.
02:46 It's great to be here.
02:47 Like I said before, I'm a very big fan of the prior two episodes, just listening to the MongoDB episode on my way here today.
02:55 Oh, that's fantastic.
02:56 Yeah, we just released that a few hours ago.
02:58 It was great.
02:58 Yeah, it popped up on my phone.
03:00 I listened to it right away.
03:01 All right, so it sounds like you have some amazing stuff going on with Python at PayPal.
03:07 And, you know, the way that I sort of got to know you, I just ran across this blog post that you did called 10 Myths of Enterprise Python.
03:14 And I was just, I was seriously blown away at the detail and sort of the power of your message that Python is this really fantastic, powerful, flexible, and, you know, surprisingly to some people, not to me, but some people, large-scale development platform.
03:33 Well, and, I mean, I don't know what took me so long, honestly.
03:36 Like that post, you know, we were looking for that post years ago.
03:41 And, like, when we were just starting to evangelize Python and, like, work on Python here at PayPal.
03:46 But, you know, like, it was instead, it was, like, scattered throughout the whole Internet.
03:51 The evidence was there.
03:52 I was just looking for something that summed it up in that, like, you know, very sort of, like, modern Internet fashion.
03:58 But, you know, and then I guess, like, I think I wrote the first draft of this maybe, like, a year ago and started using it in internal presentations and so forth.
04:08 But, I mean, yeah, we really did need it something like four or five years ago.
04:12 And what is that quote?
04:14 You should be the change you want to see in the world.
04:15 Exactly.
04:16 Eventually, like, you know, I just went through the whole process to get onto the corporate blog and, like, just get it out there to help, like, you know, anyone else who might be in the same sort of bind.
04:26 You know, just looking for, like, a comprehensive set of evidence to sort of quash these very persistent arguments that seem to be coming up in, you know, the sort of corporate meetings that we all know and love.
04:42 Yeah.
04:44 And what I really love about your article is it's so full of evidence.
04:47 Like, you know, every sentence, every, at least every paragraph has a couple of links to other sort of richer articles that back up whatever it is you're talking about.
04:55 That's great.
04:56 Yeah.
04:57 And, I mean, so, I mean, that's the thing.
05:00 It's, like, I don't even really feel, like, that responsible for all this.
05:03 All of the evidence was there.
05:05 It just needed to be collected under one roof and then hopefully have some sort of credible source to it.
05:10 I mean, the real, like, seller for me here at PayPal is often that Bank of America has thousands of Python developers.
05:16 You know, that's a financial institution, arguably, like, much more important and larger than PayPal.
05:25 And they, like, you know, trust Python with all sorts of, like, activities.
05:29 So, if it's good enough for them, you know, it should be good enough for us.
05:32 But the thing is that that's, like, not really advertised that much.
05:35 Even though they're at PyCon, like, doing recruiting, looking for people.
05:39 If you're looking for a job, maybe call Bank of America.
05:42 So, yeah.
05:43 I mean, we'll go through them one by one here, I guess, today.
05:47 But for the most part, I guess, I came to write that blog post by way of, like, you know, originally joining, like, PayPal as a PHP developer.
05:58 Not for PayPal, mind you.
06:01 But that was just what I knew.
06:03 Like, coming into the company, I saw that there was sort of this, it was like code archaeology, if you would, right?
06:11 Like, there was some evidence of Python usage dating back to the early days of PayPal.
06:16 You said prior to HTTP or something like that, right?
06:19 And prior to Java.
06:20 Quite a way back.
06:21 Yeah, no.
06:22 Way before.
06:23 So, I mean, basically, when, like, original PayPal was all C++.
06:29 And I think still the majority of the traffic in the data center is, like, you know, going between, like, C++ services and so forth.
06:36 So, yeah.
06:37 I mean, there's a long history here going back to, like, 1998.
06:42 Pretty much when I came, like, sort of that ancient civilization was, you know, for the most part gone in 2008, 2009.
06:50 Where had it moved to?
06:52 What were people using?
06:52 You know, I mean, people went and founded YouTube and Yelp and LinkedIn and used Python to much, you know, like, more visible good than I guess probably has come from my own role here at PayPal.
07:08 But it all has to start somewhere.
07:10 Yeah, that's pretty amazing.
07:11 Like I was telling you before the show, I just talked to Chris McDonough from the Pyramid guys.
07:18 And he said that also Pyramid is being used at Yelp.
07:20 So, a lot of things coming together in these two shows.
07:23 That's really cool.
07:24 I think YouTube is really one of the major sort of things we can put up on a pedestal and say, before you say that it can't, look at this and see if you could build that with whatever your, you know, pet project or pet technology is.
07:37 Another one is Dropbox.
07:39 Those guys are really doing interesting stuff, especially with Guido moving over there full time, which is pretty cool.
07:45 Yeah, yeah, yeah.
07:46 It's pretty neat the sort of, like, energy-inducing effects he has when he sort of, like, moves between these companies.
07:55 Like, he was previously at Google and, like, you know, worked on App Engine there and, you know, was, I guess, like, probably a very key, probably subconscious influence, I would guess, in PayPal.
08:07 Sorry, in Python becoming sort of the third, like, you know, stack there at Google.
08:13 Yeah, that's really cool.
08:33 Michael here.
08:34 Thank you so much for listening to and spreading the word about Talk Python to Me.
08:39 The response to the podcast continues to be wonderful and humbling.
08:42 I have a quick comment about supporting and sponsoring the show.
08:45 I'm still looking to line up stable corporate sponsorships, but I wanted to tell you about a community-based campaign I'm launching to allow listeners to directly support the show.
08:53 We are running a Patreon campaign.
08:55 You might not have heard about Patreon, but it's kind of like Kickstarter for things like podcasts, which release frequent small deliverables rather than one-off large engineering projects.
09:04 Visit patreon.com slash mkennedy, that's p-a-t-r-e-o-n dot com slash mkennedy, and watch the video to see how you can donate as little as $1 per episode to support Talk Python to Me.
09:16 This is your chance to ensure that the Python community continues to have a strong public voice.
09:22 Consider supporting us today at patreon.com slash mkennedy, and thanks for listening.
09:46 Before we move into the myths, let me just ask you, how did PayPal and eBay sort of adopt Python so broadly?
09:54 So there was this history, and then some of those people kind of moved on to do minor things like YouTube.
10:01 And then it sounds like you guys are doing a lot of Python stuff now, right?
10:07 Oh, yeah, absolutely.
10:08 So, I mean, basically every aspect of PayPal, and it is like a large business with many different concerns.
10:14 So every aspect you can name, we basically have some contingency of like, you know, some contingent of Python in that group.
10:24 So whether it is like, you know, data analysis in the risk group, or it is like, you know, sort of, let's see, I'm trying not to give away too much here.
10:37 Yeah, don't spill any super secret things you can't talk about, but, you know.
10:40 But, yeah, but I mean, you know, admin tools, batch jobs, mid-tier services, even like, let's see, even, I mean, even some web development and web service like front-end type stuff is going on as well,
10:57 even though it's primarily shifted to other stacks at the moment.
11:00 Sure. Okay. Well, that's really interesting.
11:03 Mm-hmm.
11:04 I think that's a nice lead-in to sort of some of the myths.
11:07 So on your post, you say that you have actually over 50 Python projects running there at PayPal and eBay,
11:16 and you sort of go through a list of all the different kinds of things.
11:20 You know, how many people are working on these projects, and are they all active, and what's the story there?
11:24 Well, I mean, Python is sort of, like, you know, kind of renowned for being able to let you move fast,
11:31 and it's interesting how one Python developer can sort of sweep through, like, his new team or his new organization
11:38 and, like, you know, fix a whole lot of things and get a whole lot of balls rolling.
11:43 So, I mean, the 50-plus number is really definitely immediately attributable to, like, Python's own, like, high efficiency.
11:53 Right. It's easy to start and complete 50 projects, more so than, say, C++ or something like Java.
11:59 Right. I think one developer in 2014 got something like three services launched on, like, the eBay side
12:06 using some of our new, like, cloud integration stuff, and that's just one guy.
12:12 He's a team of one.
12:13 And so...
12:15 The meetings are short.
12:16 Oh, yeah, yeah, yeah.
12:17 So, basically, I mean, when I started out, like, there were maybe...
12:23 It started with me making, like, a DL or, like, you know, sort of having an email list on a listserv,
12:28 and they...
12:30 I think maybe, like, 25 people joined up, right?
12:33 And then I think now the DL has on the order of 260 people who use Python on a somewhat regular basis.
12:40 And, you know, I mean, I can't take much...
12:44 I mean, I can't take all the credit for this because, like, a lot of it is due to the sort of, like,
12:50 burgeoning, like, OpenStack wave.
12:52 Like, eBay is one of the main contributors to OpenStack, main users of OpenStack.
12:56 PayPal has also followed suit there.
12:57 And then, also, we've done some acquisitions, you know, of, you know, companies that use Python pretty, like, heavily.
13:05 So, I mean, it sort of, like, had a lot of organic growth.
13:09 I'd say most of the growth is organic, but a lot of it is also attributable to those external projects as well.
13:14 Yeah, sure.
13:15 So, some of it came from the outside and sort of pollinated the internal folks, right?
13:20 Something to that effect.
13:20 Right.
13:21 And that's why I think it's really important that PayPal and eBay sort of participate in that, you know,
13:27 open source ecosystem so that they can continue to derive benefits from that, as we've seen, like, in the past few years here.
13:34 Yeah.
13:35 Do you guys have, like, a corporate GitHub place where you're doing things or anything like that?
13:40 Absolutely.
13:40 GitHub.com forward slash eBay.
13:42 GitHub.com forward slash PayPal.
13:45 You know, I don't personally maintain the Python SDKs by any means for these companies, but I do know the maintainers quite well.
13:52 So, I, you know, I'm willing to take some email spam if you want to email me at mahmoud at PayPal.com.
14:00 And, you know, I'll forward your messages on to the folks in charge of those.
14:06 But they also are pretty responsive on GitHub issues.
14:08 And, I mean, all in all, I think that, like, these Python developers that are in our, like, Githubs are pretty much on the ball when it comes to, like, the whole open source philosophy.
14:20 That's cool.
14:20 Can you open, like, an issue on GitHub for the team?
14:24 Yeah.
14:24 Okay.
14:24 Very cool.
14:25 So, you can participate there.
14:27 Even the Python infrastructure team, you know, which is mostly an internal team, now has, like, a repo up there.
14:32 It's a mid-tier server framework.
14:35 It can be used for front tier as well, but we primarily use it here for mid-tier stuff.
14:39 And it is called support.
14:43 So, you just go github.com forward slash PayPal forward slash support.
14:46 And, you know, hopefully you'll see me updating the docs.
14:50 Fantastic.
14:51 Fervently.
14:51 Yes.
14:52 That sounds really cool.
14:54 Mm-hmm.
14:55 So, you want to touch on some of the myths?
14:57 Absolutely.
14:58 That's what we're here for, after all.
14:59 Yeah.
14:59 I think there's probably 50 myths if we sat down and brainstormed over enough beer and over enough time.
15:06 It's true.
15:06 I had to cut it down to these 10.
15:08 But, yeah.
15:09 I think you really did hit some interesting ones.
15:12 You know, maybe this is just my sort of myopic view of the world.
15:18 But, to me, it feels like Python is becoming quite a bit more popular in the last five years than it has been.
15:24 Yeah, absolutely.
15:25 And I think that is what is leading into what I think I had as myth number one, which was that Python is a new language.
15:31 Exactly.
15:32 People are just now starting to hear about it.
15:34 They feel like, oh, this thing's taken off.
15:35 But it's not new, is it?
15:38 Yeah, yeah, yeah.
15:40 No, I mean, it's technically older than Java.
15:42 You know, its first release happened three years before the first release of Java.
15:47 And, I mean, it comes from, like, a really long history as well.
15:52 Like, it's not just, you know, sort of.
15:55 Yeah, so it's not a, like, sort of, like, designed-in-a-week kind of language that just, like, appeared out of nowhere.
16:02 It's based on, like, Guido's long experience with languages like C and ABC.
16:09 Absolutely.
16:11 And I think he took, like, two years.
16:12 He started in, like, 1989, maybe, I seem to remember.
16:15 And it came out in 91, version 1.0, something like that.
16:18 Yeah, well, I was, you know, just a wee one, right?
16:21 Exactly.
16:22 Like, running around, not even thinking about, like, programming.
16:25 But, yeah, it is very mature language on top of that.
16:29 And, I mean, I don't think I go into this in the post, but it has evolved so remarkably.
16:34 Like, all of the changes after, like, version 2.2 with new style classes and all of the, like, you know, rationalizations that have occurred have really, like, proven that it still has the flexibility of a new language, right?
16:48 Where it's constantly under development.
16:50 And you can see a lot of that in, sort of, like, the interesting, sort of, like, discussions and tensions that arise, like, you know, with the Python 3 issue.
17:00 Yeah, the Python 2, Python 3 issues.
17:02 We probably have to have a whole show on that.
17:04 But, you know, it is interesting.
17:05 I actually have some opinions on that.
17:08 But, you know, I think it's interesting how much the language is still sort of evolving in a positive way.
17:17 So, for example, in, what was it, 3.3, they had added yield from to simplify generator methods, which was really cool.
17:25 In 3.4, they added asyncio.
17:27 In 3.5, there's talk of, like, type hinting.
17:30 I mean, these are major sort of additions that are coming on still with 20, 25-plus years out.
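Editor's illustration (not part of the recording): a minimal sketch of the two language features mentioned here, `yield from` (Python 3.3) and type hints (Python 3.5). The function names `countdown` and `launch_sequence` are made up for the example.

```python
from typing import Iterator


def countdown(n: int) -> Iterator[int]:
    """Yield n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1


def launch_sequence(n: int) -> Iterator[object]:
    """Delegate to countdown() with `yield from`, then yield one more value."""
    # Before Python 3.3 the delegation needed an explicit loop:
    #     for value in countdown(n):
    #         yield value
    yield from countdown(n)
    yield "liftoff"


sequence = list(launch_sequence(3))
print(sequence)
```

The type hints are annotations only; the interpreter does not enforce them at runtime, which is why they could be added so late in the language's life without breaking existing code.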
17:37 And many of them have a really nice realistic timeline to them.
17:42 Because, I mean, yeah, Python 3 adoption, it'll take some time.
17:47 You'll probably do a whole show on that, and you'll have plenty of time to get it out before Python 3 is the majority.
17:51 But, yeah.
17:52 The debate will still be raging if we have that show in six months, I'm sure.
17:55 Absolutely.
17:56 So, yeah.
17:58 But, I mean, Python is not a new language.
18:00 And probably most of the people on this show, like, or who will be listening to this show are going to know that, I think.
18:06 Absolutely.
18:07 But one of the goals I kind of have for this show is, obviously, the people that take the time to, like, sign up and listen to a Python podcast generally are knowledgeable about it pretty well already.
18:18 But, you know, maybe they work in teams where people aren't.
18:21 Like, I have a lot of conversations with folks who are in sort of the more compiled language space.
18:27 And they always think of, oh, Python is like this scripting.
18:30 It's almost like it's a Bash shell script type of thing.
18:33 And I don't think they appreciate sort of the, basically what you laid out in your 10 myths here, which is really cool.
18:39 So, speaking of a compiled language, that's myth number two, right?
18:42 Well, sure.
18:43 And so, myth number two is that Python is not compiled.
18:46 This one is a little bit of a, you know, sort of a, it's a little bit of like sort of a leading myth in a way.
18:54 People do sort of bring this up.
18:56 But the point is that, like, what they're trying to say is that Python is not like C++ and Java.
19:04 I mean, it just ends up coming up always.
19:09 As a company that uses primarily C++ and Java, like, we're going to have a lot of people who sort of protest at, like, the first time they're seeing a REPL used effectively, you know, when you're doing, like, a live demo or something like that.
19:24 So, they say, like, what do you mean there's no compilation?
19:28 What happens if I, like, you know, get something wrong?
19:31 And it's like, well, you got tests and you'll be okay.
19:33 So, like, yeah, Python's not, and the other thing, too, is that Python is technically compiled.
19:39 Like, it does get compiled down to bytecode.
19:41 It happens fast enough that it can happen sort of, like, you know, right before runtime without incurring too much of, like, you know, overhead for most sizes of projects.
19:48 Right, and it does cache that output as the PYCs, right, in the PyCache or whatever it's called in that folder.
19:54 The PYCs and the PyCache in Python 3, yes.
19:58 And so, yeah.
20:00 So, basically, it is a compiled language, and it does execute from bytecode, which is exactly what Java does.
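Editor's illustration (not part of the recording): a small sketch of the point that CPython compiles source to bytecode before running it and caches the result as a .pyc file. The function `add_tax` and the file name `example.py` are hypothetical; the exact opcodes and cache file name vary by Python version.

```python
import dis
import importlib.util


def add_tax(price, rate=0.08):
    return price * (1 + rate)


# Every function carries a code object whose co_code is the compiled bytecode.
bytecode = add_tax.__code__.co_code
print(type(bytecode).__name__, len(bytecode), "bytes of bytecode")

# compile() performs the source-to-bytecode step explicitly.
code_obj = compile("total = 2 + 3", "<example>", "exec")
namespace = {}
exec(code_obj, namespace)
print(namespace["total"])

# Where CPython would cache the compiled form of a (hypothetical) module file.
cache_path = importlib.util.cache_from_source("example.py")
print(cache_path)

# Human-readable listing of the bytecode; instruction names differ by version.
dis.dis(add_tax)
```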
20:06 I mean, the JVM, like, also, like, suffers from a thing called type erasure, which, I'm not sure if I mentioned this in the post.
20:12 It probably did.
20:13 But, yeah, it has type erasure because it's originally based on a Smalltalk VM, which, and, you know, Smalltalk itself, like, you know, isn't really a, like, statically typed language.
20:24 So, basically, effectively, the bytecode that's emitted by Python is kind of on the same level as the bytecode emitted on the Java side, and the main difference between those two runtimes is the JIT, the just-in-time.
20:37 Yeah, the JIT and the GC.
20:38 That's interesting.
20:39 Yeah, the most notable thing that comes to mind when you say that is sort of the Java implementation of generics or templates, where, in the language, it looks like it's, you know, integers and other types of objects, but it's really just, you know, down to objects at the base, right?
20:54 When it runs.
20:55 The other thing I think is interesting about this is, I'm sure one of the hesitations is, what do you mean you're going to give just the source code away as, you know, bare files, right?
21:05 And, you know, Java and .NET and those things, they don't protect you much, right?
21:11 Because all you've got to do is throw it into a tool, and you see basically the same thing.
21:14 Absolutely.
21:15 I don't rely on that for security, like, whatsoever.
21:18 So, yeah, it's interesting that these other languages can be decompiled.
21:23 And so, in some sense, you're more or less giving away your source code as well.
21:26 And although there are obfuscators, if you really want to get to it, it doesn't matter so much.
21:30 I think that brings us to the real actually important matter is, is Python secure or not?
21:36 Yeah, that's the real question that people should be asking rather than wondering if they should distribute source code.
21:41 Is Python secure?
21:42 To which the answer is yes, definitely.
21:44 To the point that we're willing to, like, actually, like, put a lot of our security onto that platform.
21:51 I mean, and it's not just our own assessment.
21:53 Like, there have been, like, studies done, surveys done of, like, you know, the actual Python source code that determined that it was, like, actually very safe.
22:03 Not really prone to many of the traditional, like, you know, sort of C, the C runtime issues that can occur that we've seen perhaps in, like, OpenSSL and so forth.
22:15 Right. A lot of the safety of the language, the fact that you can't work with raw bytes and pointers and offsets is part of it.
22:22 Then also, you also pointed out that the small surface area you can accomplish so much with so little that you don't have to kind of put so much out there in your code.
22:31 So it's easier to guard that smaller code base.
22:33 Right. And that's just, like, a good security policy, a general good security policy.
22:37 So, but, I mean, no, specifically, like, you know, when we were having our discussions about security, using Python with security here at PayPal, like, we did talk about the implications of using CPython.
22:48 And CPython's, like, maturity, as well as, like, the code analysis that has occurred on it, like, you know, was one of the deciding factors.
22:56 Like, basically, if you look at Java or something like that, every couple weeks there's a new Java update because it has tried to, like, you know, introduce a new security model into its runtime involving, like, protected memory and that sort of stuff.
23:11 So, I mean, maybe I should rephrase that.
23:14 Basically, it makes promises that are hard to keep.
23:17 And so they end up having, like, a lot of security releases.
23:20 You don't see those same sort of security releases nearly as often with CPython, even though it has a very wide usage.
23:27 It's on, like, not only all the servers, but also most of, like, you know, consumer computers at this point.
23:33 Right. Definitely all the Macs and Linux machines, anyway.
23:36 Exactly.
23:36 So, I mean, if it were, like, a viable, like, sort of attack, like, if it was sort of, like, a viable vulnerability, if you would, it would probably be exploited by now.
23:48 And so, like, it's, it's, and you can never, like, really prove security.
23:52 You can just sort of go with what seems to be secure.
23:57 And, like, the way that you get a good idea of that is something that has been as widely tested as Python.
24:02 Absolutely.
24:04 Yeah, you cannot prove security by one example where a person's not hacked, right?
24:07 But the more that it stands up over time, the more faith we should put in a system.
24:12 Absolutely.
24:13 So I think number four may be my absolute favorite, or at least the most encountered one for me, the folks that I interact with.
24:21 And that's, Python is really just a scripting language.
24:24 And this is, this is the original myth, like, you know, in my experience.
24:28 I agree.
24:29 Yeah.
24:30 So, I mean, when I was giving my first, my first Tech Talk, I remember it like it was yesterday.
24:35 After I finished, we were demoing a new application that basically controls the prices of PayPal.
24:41 It's still used now.
24:42 It's, like, before this application was released, it would take on the order of weeks to get new pricing schedules out for, like, you know, certain vendors and that sort of stuff.
24:54 And this made it so that it became just a matter of, like, hours or minutes.
24:57 And so we were demoing all of this.
24:59 And by the time we got to the end, the first question out of the first person's mouth was, so wait a second, you did this in Python, right?
25:08 And I'm like, oh, well, yes.
25:10 Like, and how do you mean?
25:11 And they're like, but Python is a scripting language, you know, so what is this?
25:16 Is this, like, CGI or what is this?
25:18 Like, how is it that you're, like, talking these complex network protocols with a scripting language?
25:23 As though I had done something very irresponsible.
25:25 That's really funny.
25:27 But, you know, four or five years later, here we are, and it's still, like, doing a bang-up job.
25:31 Yeah, that's fantastic.
25:32 Yeah.
25:33 And so in this myth, I, like, basically go through, as mentioned, like, earlier in the program here, like, you know, I go through all of the different, like, companies that have used Python in so many different ways.
25:43 It is a general purpose language as it describes itself.
25:46 So it has many, many purposes.
25:49 Yeah, I really like your list, and maybe I'll just read a couple of them.
25:52 So we've got, like, Twilio doing telephony infrastructure, obviously payments with you guys, neuroscience and psychology with tons of examples.
26:01 There's all the numerical analysis stuff with NumPy and SciPy.
26:05 Disney, DreamWorks, and LucasArts are all doing animation rendering type stuff.
26:10 Games, backends, email.
26:12 Let's see what else is here.
26:15 I think the security and penetration testing stuff is interesting.
26:18 Big data, you know, Hadoop.
26:20 And then, obviously, we just talked to the Mongo guys.
26:22 There's great support on those types of systems.
26:25 Yeah, and we're spinning up a Disco cluster here, too.
26:29 I mean, Disco's been around for, like, a little while, but it's sort of like an Erlang and Python-based big data thing.
26:37 We're spinning up a cluster here.
26:38 It's actually been a pretty interesting experience so far.
26:40 Oh, I bet that's an interesting thing to work with.
26:43 Definitely.
26:44 My favorite example of all this, right, was sort of during sort of a contentious time in the Python infrastructure team's history here.
26:53 I think BIND 10 came out and announced that they were going to, like, use Python for a good part of DNS, right, which is basically as close to infrastructure as, like, common infrastructure as the Internet has.
27:07 So, I mean, I'm not sure where that's at right now.
27:10 Yeah, that's pretty cool, though.
27:12 Yeah, but it was, like, a really, like, sort of – it drove a stake through the heart of this myth that Python's just a scripting language.
27:20 Well, I think one of the other concepts that leads to people thinking that is, well, you can't have a real language without strong typing.
27:28 And so people think Python is weakly typed, right?
27:30 No, that's – so this is another one that I guess people, like, you know, in a practical meeting start getting into theoretical aspects, right?
27:39 And they start getting into type systems, which, I mean, you know, Python is definitely the language of, like, sort of pragmatic doing of things.
27:47 And so, you know, that's my way of saying get shit done.
27:51 I'm not sure what rating you have on this podcast.
27:54 But, you know, I mean, and people will bring up that it's weakly typed.
27:58 And the response to this, the real response to this should just be, like, one, it's not.
28:04 And even if it was, like, so what?
28:06 Like, the thing has already been done by the end of the meeting.
28:09 So it's really about getting, like, results.
28:14 And Python's type system, while wonderful, I'm a very big fan, like, has very little to do with, you know, it's – I don't want to rule it out.
28:26 I don't want to rule it out.
28:26 But I, like, in a moment there – JavaScript is an example that you could actually do without real typing and still get stuff done, you know, the node guys and so on.
28:36 I don't know if I'd include that one.
28:37 But, yeah.
28:38 So, no, it's true, though.
28:40 It's – yeah, Python is, like, a strongly dynamically typed language.
28:45 And it works for strong, dynamic people.
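Editor's illustration (not part of the recording): a minimal sketch of what "strongly, dynamically typed" means in practice. The variable names are made up for the example.

```python
# Dynamic typing: a name can be rebound to a value of a different type.
value = 42
value = "forty-two"

# Strong typing: Python will not silently coerce a str and an int.
try:
    result = "1" + 1
except TypeError as exc:
    result = None
    error_name = type(exc).__name__
    print("refused:", exc)

# The conversion has to be spelled out by the programmer.
explicit = int("1") + 1
print(explicit)
```

A weakly typed language would typically pick a coercion for `"1" + 1` on its own; Python raises an error instead and leaves the choice to the code's author.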
28:48 Yeah, that's fantastic.
28:50 So another myth, you know, people think of the scripting language and they think it's just this interpreted thing.
28:55 And so, well, obviously interpreted code is slow.
28:58 And even if you just focus on CPython, I think the performance story is super nuanced and interesting and non-obvious.
29:06 So, you know, take, like, NumPy, for example.
29:08 So if you did all of your sort of mathematics in pure Python and then interpreted that, that would probably be slow.
29:15 But, of course, they've taken the slow parts and rewritten that in C.
29:18 So that's sort of native and that's super fast.
29:21 And if you can just kind of orchestrate calling into these low-level C functions, then you're talking about really fast even CPython.
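Editor's illustration (not part of the recording): a rough sketch of the "orchestrate calls into C" idea using only the standard library as a stand-in for NumPy. In CPython, `sum`, `map`, and `operator.mul` are implemented in C, so the second function keeps the per-element loop out of interpreted bytecode. The function names are hypothetical, and the timings it prints depend entirely on the machine and Python version.

```python
import operator
import timeit


def dot_pure_python(xs, ys):
    """Dot product with the loop written in interpreted Python."""
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


def dot_c_driven(xs, ys):
    """Same computation, but the iteration is driven by C-implemented built-ins."""
    return sum(map(operator.mul, xs, ys))


xs = [float(i) for i in range(1000)]
ys = [float(i) for i in range(1000)]

# Both versions compute the same value.
print(dot_pure_python(xs, ys), dot_c_driven(xs, ys))

# Time each version; no particular ratio is promised.
for func in (dot_pure_python, dot_c_driven):
    seconds = timeit.timeit(lambda: func(xs, ys), number=200)
    print(func.__name__, round(seconds, 4), "s for 200 calls")
```

NumPy takes the same approach much further by also storing the data in contiguous typed arrays, which this standard-library sketch does not attempt.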
29:28 But there's more to it than that, right?
29:29 There's a ton of runtimes or implementations.
29:32 Definitely.
29:33 I mean, Python's demand has led to quite a bit of supply in all sorts of different, like, aspects.
29:39 Runtimes definitely being one of them.
29:41 So, I mean, you've got CPython, which is the standard.
29:44 You've got Jython and IronPython and PyPy.
29:47 And then there are, like, you know, more sort of, like, academic ones, you know, for teaching how a runtime works and so forth.
29:54 So, but the real key here is that, like, calling it interpreted language is sort of a form of micro-labeling that doesn't actually, like, contribute to the overall, like, engineering discussion that should be happening, which is that Python can have such a huge impact on your workflow.
30:10 So, yeah, you can iterate on your projects at such great speed that you end up, like, finding yourself using more advanced techniques or finding out that this area that you would have spent so much time optimizing otherwise is actually not really where the majority of the work of your application is being done.
30:25 So, basically, it allows you to focus on what matters.
30:29 When you take that to a macro level, looking at the whole ecosystem, there are people out there who have gone through that same workflow and have generalized out libraries that you can then take advantage of all of their optimizations.
30:40 So, like, you know, instances where Python has ended up using, like, SIMD, you know, which is, like, sort of vectorized, like, you know, computation that, like, would otherwise be, like, rather difficult to, like, code by hand in C and C++, not to mention distribute.
30:56 Yeah, I mean, it's really important to look at the whole, like, process instead of just, like, individual adjectives about how, like, a given, like, single runtime works.
31:09 Like, the Python way will lead to faster and more efficient code, you know, that makes a big difference on the overall, like, complexity of your application, which can lead to, like, you know, better maintenance aspects as well.
31:26 Yeah, that's really interesting.
31:27 I've talked with some companies that are doing amazing stuff and sort of building their almost entire enterprise foundation on top of Python.
31:35 And they've got, like, the group I was speaking with, I can't, you know, NDA stuff, I can't really talk about it.
31:39 But they had 160 internal enterprise business applications, and they were creating a Python layer to be the foundation of, like, sort of unifying all that data and underlying infrastructure.
31:49 And, you know, if you could do that, then you can do some pretty amazing, amazing stuff.
31:54 That's not a slow system you decide to do that with.
31:56 Yeah.
31:56 And, I mean, with, like, honestly, enterprise is not, like, you know, sort of the domain of, like, performance being king.
32:05 We end up spending, like, more time than we should talking about it because, I guess, people want to go back to their, like, you know, college roots.
32:12 But, honestly, we can afford more machines.
32:15 We put more machines on it because, you know, we need the redundancy anyway, right?
32:19 And we end up having to, like, spend, we should spend more time talking about, like, you know, just how our developers interact with our development tools.
32:30 Because that's really usually where, like, we have a bigger bottleneck, actually, like, getting projects done on time.
32:37 If you do it in Python, you end up having extra time.
32:40 You can engineer your products for correct behavior, and you profile it.
32:44 You know, Python has a decent profiler.
32:45 It's built right in.
32:46 And then you can optimize it as need be.
32:48 And, you know, we've even found spare cycles to write some of our hot loops in C.
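A minimal sketch of the "get it correct, then profile" workflow, using the standard library's `cProfile` and `pstats` modules. The functions `slow_square_sum` and `handle_request` are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Hypothetical hot spot: sums squares in pure Python."""
    return sum(i * i for i in range(n))


def handle_request():
    """Hypothetical request handler that calls the hot spot."""
    return slow_square_sum(50_000) + len(str(12345))


# Profile only the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
result = handle_request()
profiler.disable()

# Render the ten most expensive entries by cumulative time into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

The report shows where the time actually goes, so only the functions at the top of the list become candidates for optimization or a C rewrite.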
32:54 Yeah, maybe even have some time to write some unit tests to make sure it works right when you have help.
32:58 Yeah.
32:58 No, quality is what lets us sort of, like, sleep soundly at night.
33:03 That's what's going to, like, you know, actually make for, like, you know, successful business in the long term, generally speaking.
33:09 Right.
33:10 So, that was myth number six, which was that Python is slow.
33:14 And I think you have kind of a pair of myths that talk about scaling.
33:17 And one of them is kind of, well, number seven is Python does not scale.
33:21 You have some really interesting examples on sort of performance scale there.
33:25 Well, yeah, because this one is really just so easily quashed by a counter example, right?
33:31 YouTube, I think, is, what, the second largest website on the Internet right now, right?
33:35 We talk about Dropbox, Disqus, Eventbrite, Reddit, right?
33:39 They may have, like, Twilio with its telephony, right?
33:42 And Instagram and Yelp.
33:44 And, I mean, even games, right?
33:46 Like, EVE Online and Second Life, right?
33:48 They're actually the areas where you find the most radical, like, scaling stories.
33:54 Because unlike enterprise companies like PayPal and Bank of America, they don't just, they're not made of money, right?
34:01 Like, the game has to be fun and they run a tight margin.
34:04 And they do it for, like, the love of crafting, like, a unique sort of, like, self-contained system.
34:11 And they end up, like, you know, creating such technology marvels as Stackless and Eventlet and all that sort of stuff.
34:18 So, well, Stackless and Eventlet and, you know, Tornado, asyncio, all these sorts of things are in the general realm of concurrency and async processing.
34:28 And that's your myth number eight, right?
34:29 Is that Python, that it does lack good support for sort of concurrency and multithreading?
34:34 Yeah.
34:35 So, this one, I think, is one area where, like, Python probably gets the most legitimate, like, technology flack.
34:42 And that's because, I would say that's because Python sort of has a stated mission of there's only one obvious way to do something.
34:48 And in the realm of concurrency, unfortunately, that's not true.
34:51 So, CPython, like, you know, by itself is sort of a runtime environment.
34:57 When you introduce a concurrency, like, sort of library to that, you change your, like, fundamental, like, runtime behaviors.
35:06 You basically have added a layer on top of CPython's sort of native, like, main thread or whatever you want to call it.
35:13 Now, either you're working with threads or greenlets or promises or deferreds.
35:19 And that really, like, there are so many different opinions that you can have about that, like, area of computing that Python has itself, like, you know, spawned, I think, probably at this point a dozen different ways to do concurrency.
35:34 Right.
35:34 I think one of the things that people immediately jump to, some people anyway, when they hear, I need more concurrency, is let me kick off a bunch of threads.
35:42 And I think, better or worse, the whole Node.js thing that's taken, you know, was really popular sort of coming up a few years ago is showing that you can get really great concurrency with very, very few threads.
35:53 If you're willing to sort of put those threads, you know, reuse those threads when they're generally waiting on a web service call or a database call or disk I.O. or something like that, right?
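A minimal sketch of getting a lot of concurrency out of a single thread, using the standard library's `asyncio`. It assumes `asyncio.sleep` as a stand-in for a web service, database, or disk call; `fake_io_call` and `serve_many` are hypothetical names.

```python
import asyncio
import threading
import time


async def fake_io_call(request_id, delay=0.1):
    """Pretend to wait on I/O; the thread is free for other work meanwhile."""
    await asyncio.sleep(delay)
    return request_id, threading.get_ident()


async def serve_many(count):
    """Start all calls at once and wait for every one to finish."""
    return await asyncio.gather(*(fake_io_call(i) for i in range(count)))


start = time.perf_counter()
results = asyncio.run(serve_many(100))
elapsed = time.perf_counter() - start

# Every call reports the same thread id: one thread served all of them.
thread_ids = {thread_id for _, thread_id in results}
print(f"{len(results)} calls in {elapsed:.2f}s on {len(thread_ids)} thread(s)")
```

Run one after another, the hundred waits would take about ten seconds; because the thread is reused while each call is waiting, the whole batch finishes in roughly the time of a single wait.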
36:03 Yeah, absolutely.
36:04 And, I mean, it's been really amazing how we've sort of rediscovered these techniques.
36:09 One of the guys on the Python infrastructure team here actually worked on the AIM servers, if you remember, AOL Instant Messenger.
36:16 And, yeah, I mean, you know, that was all, and they did it all with C and callbacks.
36:21 And, you know, like, they actually, like, we ended up doing a lot of the stuff that he did back in the day.
36:27 Like, we ended up redoing that sort of now.
36:30 It's been really nice having sort of a gray beard, if you will, around, because basically he sort of, like, I think back in the day he was working with SSL BIO and, like, OpenSSL BIOs.
36:41 And today we're doing sort of the same thing.
36:44 And, you know, back in the day they also had trouble with threads.
36:47 And these days we also, we still find trouble with threads.
36:50 And these aren't Python-specific troubles.
36:52 It's just that threaded programming bears a few risks.
36:55 And so in our infrastructure, anyway, we've sort of, like, taken, we've made some decisions for our developers that will allow us to pre-mitigate those risks, you know.
37:05 And one of those is to, like, basically sort of limit ourselves to a somewhat fixed number of threads.
37:12 You don't want to have, like, one thread per request, for instance, in a server.
37:17 Because that means that as your load goes up, your contention and overhead, like, sort of also start going up.
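A minimal sketch of capping the thread count instead of spawning one thread per request, using the standard library's `concurrent.futures`. The pool size of 4 and the `handle` function are assumptions for illustration, not PayPal's actual configuration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4  # assumed fixed pool size


def handle(request_id):
    """Hypothetical request handler; the sleep stands in for real work."""
    time.sleep(0.01)
    return request_id, threading.current_thread().name


# Fifty requests share the same small set of worker threads.
with ThreadPoolExecutor(max_workers=MAX_WORKERS, thread_name_prefix="worker") as pool:
    results = list(pool.map(handle, range(50)))

threads_used = {name for _, name in results}
print(f"handled {len(results)} requests on {len(threads_used)} threads")
```

However many requests arrive, the number of threads, and with it the stack memory and contention, stays bounded by `MAX_WORKERS`.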
37:25 Sure.
37:25 Yeah, and even just the pure memory from just the stack space for each thread can start to become significant when you're talking tens of thousands or hundreds of thousands of threads.
37:34 It's a problem, yeah.
37:35 Right.
37:36 And so what we end up with in a lot of cases for applications that have gone with a threaded model, and these aren't, like, Python applications, they end up topping out with sort of a hard stop.
37:47 They start falling behind on their work without, like, you know, being able to respond and shed load nicely.
37:52 We can sort of chalk up all of our good server behavior to just having the time to actually, like, analyze how our application works, add additional sort of behaviors, and instrument it appropriately.
38:07 Because we're not spending all of our time wrangling with a thread per request model or, you know.
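A minimal sketch of one way to shed load instead of silently falling behind: a bounded queue that rejects new work once it is full. The limit of 3 and the `submit` helper are hypothetical; the source does not say how PayPal's servers do this.

```python
import queue

MAX_PENDING = 3  # assumed backlog limit

pending = queue.Queue(maxsize=MAX_PENDING)


def submit(request_id):
    """Accept the request if there is room, otherwise reject it right away."""
    try:
        pending.put_nowait(request_id)
        return "accepted"
    except queue.Full:
        return "rejected"


outcomes = [submit(i) for i in range(5)]
print(outcomes)
# ['accepted', 'accepted', 'accepted', 'rejected', 'rejected']
```

A fast rejection lets the caller retry elsewhere, whereas an unbounded backlog only makes every request slower.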
38:14 But you really do have to, I mean, going back to the original issue of concurrency support,
38:19 like, a concurrency library is more than just any other library because you end up having to adopt some aspect of its philosophy.
38:27 And there aren't any real easy answers there.
38:30 You need to look at what they are, find out what, you know, like, sort of is easily digested by you and your brain.
38:37 And then look at examples of how other people have architected their applications if you don't have, like, strong opinions of your own.
38:44 It's a learning process, and this is something I feel that maybe, like, they should spend more time on back in most schools where, unfortunately, they mostly just focus on processes and threads, which, while important, aren't the whole story.
38:57 Yeah, absolutely.
38:58 Well, maybe that's changing in the future.
39:00 A friend of mine has a really interesting saying or way of looking at the world.
39:04 He says, you know, look, when you're writing this multi-threaded concurrent code, if you try to get too tricky, you're writing code, like, right at the limit of your ability to understand what you're doing.
39:14 And debugging code is harder than writing code.
39:17 So you're writing code that you literally can't debug.
39:19 You know what I mean?
39:20 Like, you've just gone right over that barrier.
39:22 Now it's like, I've created this monster I've got to live with.
39:24 So that's pretty interesting.
39:25 And, I mean, the code certainly takes on its own, a soul of its own.
39:30 But, like, basically, when you start running it at scale and it's spread across many pools of many machines, like, it has, it ends up, like, having an almost organic nature when you take into account, like, load balancers and, like, you know, variations in the network and so forth.
39:46 You need to make time for all of those sort of emergent behaviors that are going to come up when you deploy.
39:53 Yeah, absolutely.
39:53 So myth number nine is that Python programmers are scarce.
39:57 Like, I think I mentioned in the myth, like, this is somewhat true, right?
40:02 It depends what you're comparing it to, though.
40:04 Because, but like I said before, you know, we have one developer that goes and creates three production services in one year.
40:11 Do you need a huge team of Python developers to accomplish some project?
40:15 Maybe, maybe not.
40:16 Yeah, maybe not.
40:17 Actually, almost certainly not.
40:19 Almost certainly not as many as you would need for other stacks.
40:23 And that's sort of reflected in, like, every Python team that I've seen, not just here at PayPal, but also, like, across the industries that, like, they end up generally being smaller and more effective.
40:37 And if you, like, the literature sort of backs up that these are actually not bad things to have.
40:44 People worry about people getting hit by buses, and it's nice for, you know, nice of them to be so concerned.
40:50 But, but, but, but really, like, one of the keys with Python is that you can, you can learn it very quickly.
40:56 And it has a really nice learning curve.
40:59 I mean, maybe it's, maybe it's just me, but I had a really nice time transitioning from PHP to Django, digging deeper into the standard library, reading the source code of, like, you know, just sort of, like, modules like itertools and collections and these sorts of things.
41:17 And learning about Python from examples that are not opaque, like they were in, like, you know, C++ and Java.
41:26 I felt that those, like, you know, applications, like, those sort of stacks were more opaque and Python was more open.
41:32 And then, like, you know, with the rise of GitHub and so forth, going and learning, like, you know, the fundamentals of web frameworks from Bottle and Django source code.
41:43 And basically, like, you sort of, it sort of has a natural learning curve to it that basically without any of, like, you know, official training or standard training, like, you know, we were able to rise to the level of, like, effective infrastructure engineers.
41:58 And I had a similar experience, you know, when I first learned C++ way back in the day.
42:02 I remember it was this mountain I climbed, you know, and learning Python, I mean, obviously, learning as a second language makes it, a next language, not the first language, makes it easier to do.
42:13 But I think even if I had learned it originally, it would have been a much more enjoyable experience.
42:18 So, and then, you know, you make some interesting points about, well, there may be not so many Python developers or however many there are, there's going to be more.
42:26 Yeah, yeah.
42:27 And, I mean, especially when, like, there are changes happening in education right now where Python is becoming, like, I think, the top teaching language.
42:35 Exactly.
42:35 That's what I was thinking of is that, you know, just the last couple years ago, it flipped, I think, from Java maybe into Python being the most taught programming language in college.
42:45 So, a little while, that will have some big effects, you know.
42:48 Yeah.
42:49 And, well, already we're seeing it.
42:51 I mean, if your company has a policy that they'll, like, for a while, PayPal had a policy where they would only hire experienced engineers, not straight out of college.
43:00 But, already we're seeing people coming out of college being hired at PayPal whose primary experience is in Python.
43:05 So, like, the effect is already there.
43:08 And, I mean, frankly, of course, I think it's a good thing.
43:12 But, especially because Python has so many language features that are so, like, you know, well-documented and well-thought-through and designed through PEPs, like, you know, the Python Enhancement Proposals.
43:24 So, I mean, it's actually – it's just a more open, approachable language that has so much to teach that even if you're going to end up working in other languages, I recommend studying Python unless it's going to spoil you for other languages, which has been known to occur.
43:42 If you don't want to go to work anymore because you have to go back to writing embedded C, you may want to stay away from Python.
43:48 That's right.
43:49 We like C.
43:50 We like C, too.
43:51 Anyways.
43:51 Okay.
43:52 So, I think your final myth actually had some of the most interesting actual statistics in it.
43:58 That's that Python is not for large projects, as in large number of lines of code.
44:03 Right.
44:03 So, I mean, before we talked about, like, scaling traffic, which YouTube and others, like, you know, clearly disprove that you can just, like – Python has a consistent intuitive runtime and it can scale simply.
44:18 But scaling the developer side of things is not as simple, right?
44:23 People are complex.
44:25 And so – but that said, I mean, there have been many, many, like, you know, examples of Python scaling to the enterprise level.
44:35 And, you know, here at, like, PayPal and eBay is, you know, we have a lot of small teams, but we have some larger ones, too, with, like, multiple experienced developers and a couple of junior developers and so forth.
44:46 So, on the OpenStack side especially.
44:49 So, yeah, Bank of America, like I mentioned before, has, like, you know, 5,000 Python developers.
44:55 And they just spun that out of nowhere.
44:57 Maybe that's why Python developers are scarce.
44:58 Yeah, maybe they grabbed them all up.
44:59 Yeah, you said they have over 10 million lines of Python code, which is kind of crazy.
45:04 I mean, yeah.
45:05 And, I mean, they did that, I think the – I mean, either they did that because JP Morgan was doing it or JP Morgan did it because Bank of America was doing it.
45:12 Or maybe, like, you know, it was just a coincidence.
45:14 But the financial industry has certainly seen, like, you know, a large amount of Python adoption.
45:20 And the important thing, too, is that, like, when your company starts getting big enough, heterogeneity becomes a really important aspect of your recruiting strategy.
45:31 So, like, if you really do actually need 2,000, 3,000 developers, then maybe you shouldn't be banking on all of them, you know, being from one programming discipline, right?
45:47 So, basically, with trends in education and open source dictating a lot of what talent is available, you should focus on having language-agnostic protocols and, like, well-designed larger architectures that you can plug many language stacks into.
46:06 And that's what we have here at PayPal, where we have C++, Java, Python, Node, and there's even been some fledgling work done with, like, Scala and Go.
46:17 And, certainly, as we – like, adding – so, doing C++ was, like, hard.
46:23 That, like, built PayPal.
46:24 Then adding Java was, like, you know, still pretty hard.
46:28 When we added Python, like, you know, we actually created reference implementations and standards for a lot of these emergent protocols.
46:35 And then other stacks, like, can come and sort of, like, follow suit.
46:38 If you're looking – if you're actually someone – if there's actually someone in charge of a big project listening to this podcast right now, you know, as long as you have good talent, good architecture, you can definitely use Python as part of a large project.
46:54 We have many processes detailed in the blog post that can, like, you know, lead to good practices, best practices for a variety of environments.
47:02 Yeah, that's great.
47:03 Can you talk about, like, static code analysis with PyFlakes and other things like that?
47:07 It's a big world out there.
47:09 It's a big industry.
47:10 And Python has a long history and, like, a lot of experience it can bring to enhance basically any – like, a company of any size, if you ask me.
47:20 I mean, it's not all about, like, you know, just evangelizing and, as they say, being religious about, like, you know, a given technology.
47:29 Python is just a really handy piece that fits into a lot of applications.
47:35 Yeah, that's really fantastic.
47:36 And so people out there listening, if they're sort of having these debates at work or on their projects and, like, I'd really like to use Python, but people keep laughing at me and say it's not – you know, whatever, right?
47:45 Yeah, I really recommend checking out Mahmoud's article, 10 Myths of Enterprise Python, which I'll link to from the show notes.
47:52 And you can share that or you can share this longer form conversation that we had.
47:58 I just wish I had a time machine so I could send it back to me a few years ago.
48:04 Yeah, I can see that.
48:05 What would you say to your former self at work?
48:08 I would give him this article.
48:10 That's great.
48:11 It really would simplify things.
48:14 But, yeah.
48:14 And also maybe I'd send some stock.
48:18 Yeah, exactly.
48:19 Of course.
48:20 Anyways.
48:21 All right, well, I think that's – you know, this has been a super interesting conversation.
48:24 And a question I'd like to ask my guests on sort of on the way out the door is so much of Python is driven by open source stuff.
48:31 And there's so much great stuff on PyPI and GitHub and so on.
48:35 Do you have, like, a favorite thing that, you know, you maybe want to call some attention to?
48:39 Favorite project?
48:40 I have many favorites.
48:42 Like, one of the ones that I remember looking at recently and being really impressed with the implementation was – it was a very small thing.
48:49 It's called, I think, netaddr or something like that.
48:51 And it was just a library for working with IPs and IP ranges.
48:54 But, you know, as I'm wont to do, I went and I looked at the code and it was just really exquisitely implemented.
49:00 So, I mean, it's – for some reason, like, my mind always jumps to that, like, because I was like, you know, as I looked at the code, I was just like, I couldn't do a better job.
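A minimal sketch of what working with IPs and IP ranges looks like in code. It deliberately uses the standard library's `ipaddress` module rather than the third-party library mentioned above, so it does not show that library's API; the addresses are arbitrary private-range examples.

```python
import ipaddress

# A small network containing four addresses.
network = ipaddress.ip_network("192.168.1.0/30")
address = ipaddress.ip_address("192.168.1.2")

# Membership test: is the address inside the range?
is_member = address in network

# Usable host addresses (network and broadcast addresses excluded).
hosts = [str(host) for host in network.hosts()]

print(is_member)              # True
print(hosts)                  # ['192.168.1.1', '192.168.1.2']
print(network.num_addresses)  # 4
```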
49:09 I recently saw some guy, I can't remember who, probably in your neighborhood in San Francisco, being interviewed on Bloomberg News about – they had some sort of museum art set up about how code is art and, like, the beauty of algorithms.
49:23 And, you know, maybe that's sort of part of it.
49:25 That's Gleitzman.
49:27 He works with you.
49:28 Oh, he works with you?
49:29 Oh, my gosh.
49:29 That's awesome.
49:30 Yeah.
49:31 Yeah.
49:32 So, he was part of one of the acquisitions that we had that used a lot of Python, actually.
49:36 And so, he uses our code internally as well.
49:40 And, no, I mean, he's a super active guy, lots of great ideas.
49:44 I think he's named Benjamin.
49:46 Yeah, that sounds correct.
49:47 I just – I didn't – I was – it was literally on TV, so I didn't, you know, like, save it.
49:51 It was a weird crossover, for sure.
49:55 Let's see.
49:56 But, I mean, in terms of open source projects, I've been spending a lot of time on.
50:00 I mean, I've worked a lot on this one recently called Boltons.
50:03 Basically, like, these are sort of things that I wish were built into Python.
50:07 Like, you know, over the years, I've just sort of accumulated all of these utils, right, that I've seen, like, you know, implementations of in various, like, you know,
50:17 libraries internally, right?
50:19 They're just like, oh, like, why doesn't this exist?
50:21 They'll throw something together that sort of works for them.
50:23 But it's not as well tested and generalized.
50:26 So, this sort of, like, you know, it's just a meta toolbox.
50:31 It's like a toolbox of all these tools for, like, you know, toolboxes for working with, like, a variety of different things, caches and strings.
50:39 And some of them are designed as extensions to, like, itertools and other built-in, like, modules.
50:46 itertools was certainly one of, like, I mean, if you ask me what my favorite, like, Python modules were, like, the standard library ones, I have a list for sure.
50:54 itertools is up there.
50:55 itertools, collections.
50:57 I mean, I'm a big fan of the select module.
51:01 Like, there are just so many, like, good standard library things built right in.
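A minimal sketch of the two standard library modules singled out above, `itertools` and `collections`, working together. The batches of page titles are made-up sample data, loosely echoing the Wikipedia topics read out at the start of the show.

```python
from collections import Counter, defaultdict
from itertools import chain, islice

# Made-up batches of edited page titles.
batches = [
    ["fox", "asphalt"],
    ["fox", "peru"],
    ["fox", "asphalt", "subobject"],
]

# chain.from_iterable flattens the batches lazily; Counter tallies them.
counts = Counter(chain.from_iterable(batches))
top_two = counts.most_common(2)

# defaultdict removes the "is this key present yet?" boilerplate.
by_initial = defaultdict(list)
for title in counts:
    by_initial[title[0]].append(title)

# islice takes a prefix of any iterable without building a full list first.
first_three = list(islice(counts, 3))

print(top_two)      # [('fox', 3), ('asphalt', 2)]
print(first_three)  # ['fox', 'asphalt', 'peru']
```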
51:07 That batteries included aspect of Python wasn't just responsible for drawing me to Python in the first place,
51:12 but it was, like, responsible for what I feel was almost like a postgraduate education in, like, you know, programming.
51:19 Like, it's great.
51:21 I'm getting nostalgic now.
51:22 Anyways.
51:23 So.
51:23 Getting emotional about the packages.
51:25 Okay.
51:25 So, is there any final thing you want to kind of call the attention to for the listeners or give a shout-out to or anything like that?
51:32 Well, I mean.
51:32 Like you mentioned in the beginning, we've got, like, one of the things, one of my passion projects on the side is I have a little, like, gig.
51:37 It's not a gig.
51:38 I don't get paid or anything.
51:39 It's called Hatnote.
51:40 You know, we're working on Wikipedia-based projects.
51:42 Maybe you've seen or heard, like, Listen to Wikipedia.
51:45 Our new thing is, like, a newsletter that just summarizes all the work that Wikipedians are, you know, working on.
51:52 And so, it's called the Weeklypedia.
51:54 You can visit it weekly, like, you know, once per week.
51:58 weekly.hatnote.
51:59 That's, like, the thing you wear on your head and the thing you write to a loved one.
52:03 Dot com.
52:05 That's fantastic.
52:05 The other one's at listen.hatnote.com.
52:08 Is that right?
52:08 Yeah, yeah, yeah.
52:09 And so, and these are all Python-based as well, open source.
52:14 You can find the code at github.com/hatnote.
52:17 You know, those are fun weekend projects that I don't have to hold to.
52:22 They don't have to process money, for example.
52:25 Right, right.
52:27 Yeah, yeah.
52:28 I was, I mean, they are high-quality projects, I like to think, but they don't have to be held to the same standard as PayPal.
52:33 Right, the consequence of failure is lower, which makes them maybe more relaxing to work on, yeah.
52:37 You nailed it.
52:38 You got me.
52:39 I think that's a wrap.
52:41 Thank you so much, Mahmoud, for being on the show.
52:42 This has been a super interesting conversation.
52:44 My pleasure.
52:44 Yeah, and I think the view inside to some of these big companies, like, it really might give people a different perspective on Python.
52:51 So, I hope so anyway.
52:52 Thank you.
52:54 Yeah, it was great being here.
52:55 Thanks for having me, Michael.
52:56 Yeah, thanks.
52:57 Talk to you later.
52:57 All right.
52:58 Bye-bye.
52:59 This has been another episode of Talk Python to Me.
53:02 This is your host, Michael Kennedy.
53:04 I want to say thank you for listening, and let's let Wikipedia take us on out of here.