#4: Enterprise Python and Large-Scale Projects Transcript
00:42 You may have noticed that we have new intro music for this show. If you are a fan of the Developers Developers Developers music by Smixx, don't worry, he's not gone; he'll be back next time. 0:52 However, I chose this music for this show because it's actually not a performance but rather sounds from an auditory experience for Wikipedia called "Listen to Wikipedia". 1:03 This is a project written in Python by our guest Mahmoud, and we'll talk a little bit about that during the show. 1:10 The sounds you hear correspond to edits, additions and deletions to Wikipedia. Now, a few of the topics that caused the sounds are kind of random, but here they are: Asphalt, Geography of Peru, Subobject, Fox, 2013 European Olympics, Pretty Little Liars and Carrie's Grammar School. 1:28 If you need some sweet Zen sounds to program to, or you are just interested in this project, check it out at listen.hatnote.com and experience Python and Wikipedia and audio all wrapped up in one for yourself. That's listen.hatnote.com. Now, on to the show.
01:50 Hello and welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem and the personalities. 1:59 This is your host, Michael Kennedy. Follow me on Twitter where I'm @mkennedy, and keep up with the show and listen to past episodes at talkpythontome.com. 2:06 This episode we'll be talking to Mahmoud Hashemi about Enterprise Python and Large-Scale Python Projects. 2:13 Mahmoud Hashemi is a lead developer of the Python Infrastructure team at eBay and PayPal, where he focuses his development and instruction energies on service frameworks, API design, and system resiliency. 2:26 Outside of work, he enjoys coding on his open-source projects, which you can find at github.com/mahmoud, as well as creating and maintaining several Wikipedia-based projects, such as Listen To Wikipedia and The Weeklypedia. You got it. Excellent, Mahmoud, welcome to the show.
02:44 Oh yeah, thank you, it's great to be here. 2:47 Like I said before, I'm a very big fan of the prior two episodes; I was just listening to the MongoDB episode on my way here today.
02:54 Oh, that's fantastic. Yeah, we just released that like a few hours ago, it's great.
02:58 Yeah, it popped up on my player and I listened to it right away.
03:01 All right, so it sounds like you have some amazing stuff going on with Python at PayPal. You know, the way that I sort of got to know you, I just ran across this blog post that you did called 3:12 "Ten Myths of Enterprise Python", and I was just seriously blown away at the detail and sort of the power of your message that Python is this really fantastic, powerful, flexible and, surprisingly to some people, not to me but to some people, large-scale development platform.
03:32 Well, I mean, I don't know why it took me so long, honestly. Like, that post, you know, we were looking for that post years ago, 3:41 when we were just starting to evangelize Python and work on Python here at PayPal, but, you know, like it was inside this like 3:50 [inaudible] at the whole end and the evidence was there; I was just looking for something that summed it up in that, you know, very sort of modern internet fashion. 3:58 But, you know, I think I wrote the first chapter of this maybe like a year ago and started using it in internal presentations and so forth, 4:07 but, I mean, I really did need it some four, five years ago, and what's that quote, 4:14 "You should be the change you want to see in the world"? Eventually I just went through the whole process to get it onto the corporate blog and just get it out there to help, you know, anyone else who might be in the same sort of bind, 4:26 you know, just looking for a comprehensive set of evidence to quash these very persistent arguments that seem to keep coming up in, you know, the sort of corporate meetings that we all know and love. So...
04:42 Yeah, and what I really love about your article is it's so full of evidence, like every sentence, at least every paragraph has a couple of links to other sort of rich articles that back up whatever it is you are talking about, that's great.
04:56 Yeah, and, I mean, that's the thing, I didn't even really feel that responsible for all this. All of the evidence was there; it just needed to be collected under one roof and then hopefully have some sort of credible source to it. 5:10 I mean, the real seller for me here at PayPal is often that Bank of America has thousands of Python developers, you know. 5:17 That's a financial institution, arguably much more important and larger than PayPal, and they, you know, trust Python with all sorts of activities, so if it is good enough for them, 5:30 it should be good enough for us. But the thing is that that's not really advertised that much, even though they are at PyCon 5:37 doing recruiting, looking for people. If you are looking for a job, maybe call Bank of America. 5:42 So, yeah, I mean, we'll go through them one by one here I guess today, but for the most part I guess I came to write that blog post by way of, you know, originally joining PayPal as a PHP developer, 5:58 not for PayPal, but that was just what I knew coming into the company. I saw that there was sort of this, it was like code archaeology if you will, right, there was some evidence of Python usage dating back to the early days of PayPal.
06:16 You said prior to HCP or something like that, right? Prior to Java.
06:20 Yeah, way before. So I mean, basically, the original PayPal was all C++, and I think the majority of the traffic in those days was going between C++ services and so forth, 6:36 so, yeah, I mean, there is a long history here going back to like 1998. Pretty much when I came, that sort of ancient civilization was, you know, for the most part gone, in 2008 and 2009.
06:50 [inaudible] What were people using?
06:52 You know, I mean, people went and founded YouTube, and Yelp, and LinkedIn, and used Python to do much, 7:02 you know, more visible good than I guess has probably come from my own role here at PayPal, but it had to start somewhere.
07:10 Yeah, that's pretty amazing. Like I was telling you before the show, I just talked to Chris McDonough from Pyramid, 7:17 and he said that Pyramid is also being used at Yelp, so a lot of things are coming together in these two shows. 7:23 That's really cool. I think YouTube is really one of the major things we can put up on a pedestal and say, before you say that it can't, look at this and see if you could build that with whatever your pet project or technology is. 7:37 Another one is Dropbox: these guys are really doing interesting stuff, especially with Guido moving over there full time, which is pretty cool.
07:45 Yeah, yeah, yeah, it's pretty neat, the sort of energy-inducing effects he has had when he moved between these companies. Like, he was at Google, 7:57 and worked on App Engine there and, you know, was I guess probably a very key subconscious influence, I would guess, in Python becoming sort of the third, you know, stack there at Google.
08:12 Yeah, that's really cool.
08:34 Michael here. Thank you so much for listening to and spreading the word about "Talk Python To Me". The response to the podcast continues to be wonderful and humbling. 8:42 I have a quick comment about supporting and sponsoring the show. I'm still looking to line up stable corporate sponsorships, but I wanted to tell you about a community-based campaign I'm launching to allow listeners to directly support the show. 8:53 We are running a Patreon campaign. You might not have heard about Patreon, but it's kind of like Kickstarter for things like podcasts, which release frequent small deliverables rather than one-off large engineering projects. 9:04 Visit patreon.com/mkennedy and watch the video to see how you can donate as little as one dollar per episode to support "Talk Python To Me". 9:16 This is your chance to ensure that the Python community continues to have a strong public voice. Consider supporting us today at patreon.com/mkennedy, and thanks for listening.
09:46 Before we move into the myths, let me just ask you, how did PayPal and eBay sort of adopt Python so broadly? So there was this history, and then some of those people kind of moved on to do, you know, minor things like YouTube, and then, it sounds like you guys are doing a lot of Python stuff now, right?
10:06 Oh yeah, absolutely. So, I mean, basically every aspect of PayPal, and this is a large business with many different concerns, 10:14 so every aspect you can name, we basically have, you know, some contingent of Python in that group. 10:24 So whether it is data analysis in the risk group or it is, you know, sort of, let's see, I'm trying not to give away too much here...
10:36 Yeah, don't spill any super secret things you can't talk about, but, you know...
10:40 But, yeah, I mean admin tools, batch jobs, mid-tier services, even, let's see, even some web development and web service, front-end type stuff is going on as well, 10:57 even though it's primarily shifted to other stacks at the moment.
11:00 Sure, OK. Well, that's really interesting. I think that's a nice lead-in to some of the myths. 11:07 So in your post, you say that you actually have over fifty Python projects running at PayPal and eBay, and you sort of go through a list of all the different kinds of things. How many people are working on these projects, are they all active, and what's the story there?
11:23 I mean, Python is sort of, like, you know, being able to let you move fast and it's interesting how one Python developer can sort of sweep through like his new team or his new organization and like,11:39 you know, fix a whole lot of things and get a whole lot of balls rolling. So, I mean, the fifty plus number is really definitely immediately attributable to like Python's own, like high efficiency.
11:53 Right, it's easy to start and complete fifty projects, more so than say C++ or something like Java.
11:59 I think one developer in twenty-four team got something like three services launched on the eBay side using some of our new Cloud integration stuff, and that's just one guy. He's a team of one. And so...
12:14 The meetings are short.
12:16 Oh, yeah, yeah, yeah. So, basically, I mean, when I started out, it started with me making a DL, or, you know, sort of having an e-mail list on listserv, and I think maybe 25 people joined up, 12:32 and I think now the DL has on the order of 260 people who use Python on a regular basis. And I can't take all the credit for this, because a lot of this is due to the sort of burgeoning OpenStack wave: eBay is one of the main contributors, so, 12:53 stack users, local stack, PayPal has also followed, and then we have also done some acquisitions of, you know, companies that use Python pretty, 13:05 like, heavily. So, I mean, it has had a lot of organic growth; I'd say most of the growth is organic, but a lot of it is also attributable to those external projects as well.
13:13 Yeah, sure, so some of it came from the outside and sort of pollinated, the internal folks did that ...
13:19 Right, and that's why I think it's really important that PayPal and eBay sort of participate in that you know, 13:26 open source ecosystem so that they can continue to derive benefits from there as we have seen like in the past few years here.
13:34 Yeah, do you guys have like a corporate GitHub place where you are doing things or anything like that?
13:39 Absolutely. github.com/ebay, github.com/paypal. You know, I don't personally maintain the Python SDKs for these, but I do know the maintainers quite well, so I'm willing to take some e-mail if you want to email me at email@example.com and, you know, 14:01 I'll forward your messages to the folks in charge of those. But they are also pretty responsive on GitHub issues, and I mean, all in all, I think that these Python developers in our GitHub organizations are pretty much on the ball when it comes to the whole open source philosophy.
14:19 That's cool. Can you open like an issue on GitHub? 14:22 For the team. Yeah, very cool, so you can participate there?
14:26 Even the Python infrastructure team, you know, an internal team, now has a mid-tier server framework out there. 14:34 It can be used for front tier as well, but we primarily use it here for mid-tier stuff, and it is called support, so you just go to github.com/paypal/support and, you know, hopefully you'll see me updating the docs.
14:49 Fantastic. Yes.
14:53 That sounds really cool. So you want to touch on some of the myths?
14:58 Absolutely, that's what we are here for, after all.
14:59 Yeah, I think there's probably 50 myths if we sat down and brainstormed over enough beer, over enough time...
15:06 It's true, I had to cut it down to these ten.
15:08 But I think you really did hit some interesting ones and, you know, maybe this is just my sort of myopic view of the world, but to me it feels like Python has become quite a bit more popular in the last five years than it had been.
15:23 Yeah, absolutely. And I think that is what is leading into, I think, myth number one, which was "Python is a new language".
15:31 Exactly. People are just now starting to hear about it, they feel like this thing is all just taking off, but, it's...
15:36 It's not new, is it?
15:39 Yeah, yeah, yeah. No, I mean, it's older than Java, you know, its first release happened three years before the first release of Java, and I mean, it comes from a really long history as well, it's not just a sort of... 15:54 Yeah, so it's not like a "designed in a week" kind of language that has just appeared out of nowhere; it's based on Guido's long experience with languages like C and ABC.
16:09 Absolutely, I think he took like two years, he started it like 1989, maybe, I seem to remember that it came out 1991, version 1.0, something like that.
16:17 Yeah, well, I was, you know, just a wee one; like, I ran around without even thinking about programming. 16:24 But yeah, it is a very mature language on top of that, and I mean, I don't think I go into this in the post, but it has evolved so remarkably. Like, all of the changes after, like, version 2.2, with new-style classes and all of the, you know, rationalizations that have occurred, have really proven that it still has the flexibility of a new language, right, it's constantly under development. 16:49 And you can see a lot of that in the interesting sort of discussions and tensions that arise, like, you know, with the Python 3 issue.
16:59 Yeah, the Python 2, Python 3 issues. 17:01 We probably have to have a whole show on that, and I actually have some opinions on that. 17:04 But, you know, I think it's interesting how much the language is still sort of evolving in a positive way. 17:16 So, for example, in, what was it, 3.3, they added "yield from" to simplify generators, which was really cool; in 3.4 they added asyncio; in 3.5 there's talk of type hinting. I mean, these are major additions that are still coming 20, 25+ years out.
17:36 And many of them have a really nice, realistic timeline to them, because, I mean, yeah, Python 3 will actually take some time; you can make your whole show on that and you will have plenty of time to get it out before Python 3 is the majority. But yeah.
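As a rough illustration of the "yield from" feature mentioned above, here is a minimal sketch comparing hand-written re-yielding with delegation. The function names (`read_chunk`, `read_all_explicit`, `read_all_delegating`) are hypothetical and are not from the episode or the blog post.

```python
def read_chunk(start, stop):
    """Inner generator: yields the integers in [start, stop)."""
    for value in range(start, stop):
        yield value


def read_all_explicit():
    """Older style: re-yield every item from each inner generator by hand."""
    for value in read_chunk(0, 3):
        yield value
    for value in read_chunk(3, 5):
        yield value


def read_all_delegating():
    """Python 3.3+ style: delegate to the inner generators with `yield from`."""
    yield from read_chunk(0, 3)
    yield from read_chunk(3, 5)


if __name__ == "__main__":
    print(list(read_all_explicit()))
    print(list(read_all_delegating()))
```

Both outer generators produce the same sequence; the delegating version simply has less boilerplate.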
17:51 The debate will still be raging if we have that show in six months I'm sure.
17:54 Absolutely. So, yeah, but, I mean, Python is not a new language, and probably most of the people who will be listening to this show are going to know that, I think.
18:06 Absolutely. But one of the goals I kind of have for this show is, obviously the people that take the time to sign up and listen to a Python podcast generally are pretty knowledgeable about it already, 18:17 but, you know, maybe they work in teams where people aren't. Like, I have a lot of conversations with folks who are in sort of the more compiled language space. 18:26 They always think, "Oh, Python is this scripty thing, it's almost like a bash script type of thing", 18:32 and I don't think they appreciate basically what you laid out in your ten myths here; it's really cool. 18:38 So, speaking of compiled languages, that's myth number two, right?
18:42 Well, sure, and so myth number two is that "Python is not compiled". This one is a little bit of a leading myth, in a way. People do bring this up, but the point is that what they are trying to say is that Python is not like C++ and Java. 19:04 I mean, it just always ends up coming up at a company that primarily uses C++ and Java: we are going to have a lot of people who sort of protest the first time they see the 19:20 REPL used effectively, you know, when you are doing a live demo or something like that. 19:25 So, they say, "What do you mean there is no compilation? What happens if I, you know, get something wrong?" And I say, "Well, you've got tests and you'll be OK." 19:33 So, yeah, and the other thing too is that Python technically is compiled: it does get compiled down to bytecode. It happens fast enough that it can happen right before runtime without incurring too much overhead for most sizes of projects.
19:48 Right, and it does cache that up as the PYC.
19:53 Right, the PYCs and the __pycache__ in the Python 3 sense. So, yeah, basically, it is a compiled language and it does execute from bytecode, which is exactly what Java does. And I mean, the JVM also suffers from the thing called 20:09 [inaudible], which I should have mentioned in the post, I probably did, but yeah, it has a type erasure issue because it is originally based on a Smalltalk with 20:17 [inaudible], and, you know, Smalltalk itself isn't really a statically typed language. 20:24 So basically, effectively, the bytecode that's emitted by Python is kind of on the same level as the bytecode emitted on the Java side, and the main difference between those two runtimes is the JIT, the just in time...
20:37 Yeah, the JIT, that's interesting. Yeah, the most 20:42 [inaudible] thing that comes to mind when you say that is sort of the Java implementation of generics and templates, where in the language it looks like it's, you know, 20:48 [inaudible] and other types of objects, but it is really just down to object at the base, right, when it runs. 20:54 The other thing I think is interesting about this is, I'm sure one of the hesitations is, "What do you mean you are going to give just the source code away, 21:01 as, you know, bare files?" Right? And, you know, Java, .NET and those things, they don't protect you much, because all you've got to do is throw it into a tool and you see basically the same thing. 21:14 Yeah, it is interesting that these other languages can be decompiled, and so in some sense you are more or less giving away your source code as well, and if someone really wants to get to it, it doesn't matter so much. 21:29 I think that brings us to the real, actually important matter: is Python secure or not?
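To make the bytecode point above concrete, here is a minimal sketch using only the standard library. The source string and the file name `example.py` are made up for illustration; the exact bytecode and cache file name vary by Python version and implementation.

```python
import dis
import importlib.util

# Hypothetical source text, compiled in memory instead of from a file.
SOURCE = "def add(a, b):\n    return a + b\n"

# compile() turns source text into a code object that holds bytecode.
module_code = compile(SOURCE, "<example>", "exec")

# Executing the module-level code object defines `add` in the namespace.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# The raw bytecode lives on the function's code object as a bytes string.
bytecode = add.__code__.co_code

# Where CPython 3 would cache the compiled form of a (hypothetical) example.py.
cache_path = importlib.util.cache_from_source("example.py")

if __name__ == "__main__":
    print(type(bytecode), len(bytecode))
    print(cache_path)
    dis.dis(add)  # human-readable listing of the bytecode instructions
```

Running it prints a disassembly of `add` and a cache path under `__pycache__`, which is the directory the speakers refer to.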
21:35 Yeah, that's the real question that people should be asking, rather than wondering if they should distribute source code: is Python secure? To which the answer is yes, definitely. 21:44 To the point that we are willing to actually put a lot of our security onto that platform. I mean, and it's not just our assessment; there have been studies done, surveys done, 21:56 of, you know, the actual Python source code that determined that it was actually very safe, not really prone to many of the traditional, you know, sort of C runtime issues that can occur, that we have seen perhaps in OpenSSL and so forth.
22:15 Right. A lot of the safety of the language is the fact that you can't work with raw bytes and pointers and offsets, 22:20 that is part of it, but you also pointed out the small surface area: you can accomplish so much with so little that you don't have to put so much out there in your code, so it's easier to guard that smaller code base.
22:33 Right, and that's just a generally good security policy. 22:37 But, I mean, specifically, you know, when we are having our discussions about security, using Python with security here at PayPal, we do talk about the implications of using CPython, and CPython's maturity, as well as the code analyses that have been done on it, was one of the deciding factors. 22:55 Like, basically, if you look at Java or something like that, every couple of weeks there's a new Java update, because it has tried to, you know, introduce a new security model into its runtime involving protected memory and that sort of stuff. 23:11 So, I mean, maybe I should rephrase that: basically, it makes promises that are hard to keep, and so they end up having a lot of security releases. You don't see those same sort of security releases nearly as often with CPython, even though it has very wide usage: it's on not only all of the servers, but also most, you know, consumer computers at this point.
23:33 Right, definitely all the Macs and Linux machines anyway.
23:35 Exactly. So, I mean, if it were a viable sort of attack, 23:40 if it was a viable vulnerability, if you will, it would probably have been exploited by now. And you can never really prove security; 23:51 you can just sort of go with what seems to be secure, and the way that you get a good idea of that is with something that has been as widely tested as Python.
24:02 Absolutely. Yeah, you cannot prove security by one example where a person 24:06 [inaudible], but the more that it stands up over time, the more faith we should put in a system, absolutely. 24:12 So I think number four may be my absolute favorite, or at least the most encountered one for me with the folks I interact with, and that's "Python is really just a scripting language".
24:23 And this is the original myth, like, you know, in my experience.
24:28 I agree.
24:30 So, I mean, when I was giving my first tech talk, I remember it like it was yesterday. We were demoing a new application that basically controls the prices at PayPal; 24:40 it's still used now. Before this application was released, it would take on the order of weeks to get new pricing schedules out for, you know, certain vendors and that sort of stuff, and with this it became just a matter of hours or minutes. And so we were demoing all of this, and by the time we got to the end, the first question out of the first person's mouth was, 25:03 "So wait a second, you did this in Python, right?" I'm like, 25:09 "Oh, well, yes, how do you mean?" and they're like, "But Python is a scripting language, you know, so what is this, is this like CGI or what is this? How is it that you are talking these complex network protocols with a scripting language?" 25:23 As if I had done something very irresponsible. Four or five years later, here we are, and it's still doing a bang-up job, so...
25:30 Yeah, that's fantastic.
25:30 Yeah. And so, in this myth I basically go through, as mentioned earlier in the program here, 25:37 all of the different companies that have used Python in so many different ways. It's a general-purpose language, as it describes itself, so it has many, many purposes.
25:48 Yeah, I really like your list. Maybe I'll just read a couple of them. 25:51 So, we have got Twilio doing telephony infrastructure, obviously payments with you guys, neuroscience and psychology, there are tons of examples; there is all the numerical analysis stuff with NumPy; 26:04 Disney, DreamWorks and LucasArts are all doing animation rendering type stuff; games backends, email, let's see what else is here. I think the security and penetration testing stuff is interesting; 26:17 big data, Hadoop, and then obviously we just talked to the MongoDB guys, and there's great support on those types of systems.
26:25 We are spinning up a Disco cluster here too. I mean, Disco has been around for a little while, but spinning up a cluster here has actually been a pretty interesting experience so far.
26:40 That's an interesting thing to work with?
26:43 Definitely. My favorite example of all this: during a sort of contentious time in the Python infrastructure team's history here, I think BIND 10 came out, and in the announcement they were going to use Python for a good part of DNS, right, 26:59 which is basically as close to common infrastructure as the internet has. So I mean, I'm not sure where that's at right now.
27:10 Yeah, that's pretty cool though.
27:12 Yeah, but it was really sort of, it drove a stake through the heart of this myth that Python is just a scripting language.
27:20 Well, I think one of the other concepts that leads to people thinking that as well is that you can't have a real language without strong typing. 27:27 And so people think Python is weakly typed, right?
27:31 No, this is another one where I guess people, you know, in a practical meeting, start getting into theoretical aspects, right, and they start getting into type systems. Which, I mean, you know, 27:42 Python is definitely a language of sort of pragmatic doing of things, and, so, you know, that's my way of saying get this done. 27:52 I'm not sure what rating you have on this program, but, I mean, people bring up that it is weakly typed, and the real response to this should just be, like, one: 28:03 it's not, and even if it were, so what? Like, the thing has already been done by the end of the meeting, so it's really about getting the results. And Python's type system is wonderful, I'm a very big fan; it has very little to do with...
28:36 I don't know if I included that one, but yeah, it's true though: Python is a strongly, dynamically typed language, and it works for strong, dynamic people.
28:48 Yeah, that's fantastic. So another myth: you know, people think of this scripting language and they think it's just this interpreted thing, and so obviously interpreted code is slow. And even if you just focus on CPython, I think the performance story is super interesting and non-obvious. So, you know, take, like, 29:07 NumPy for example: if you did all of your mathematics in pure Python and interpreted it there, that would probably be slow, but of course they have taken the slow part and rewritten that in C, so that's native and it's super fast, and if you can just kind of orchestrate calling into these low-level C functions, then you are talking about really fast, even on CPython. But there is more to it than that, right? There's a ton of runtimes or implementations.
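A minimal sketch of what "strongly, dynamically typed" means in practice, using only built-in types. The helper `try_add` and the variable names are hypothetical, chosen just for this illustration.

```python
def try_add(left, right):
    """Return left + right, or the name of the exception that was raised."""
    try:
        return left + right
    except TypeError as exc:
        # Strong typing: Python refuses to silently coerce str and int.
        return type(exc).__name__


# Dynamic typing: a name can be rebound to a value of a different type.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__

if __name__ == "__main__":
    print(try_add(1, 2))      # both ints: ordinary addition
    print(try_add("1", "1"))  # both strs: concatenation
    print(try_add("1", 1))    # mixed: Python raises TypeError
    print(first_type, second_type)
```

The mixed-type addition fails loudly instead of guessing, which is the "strong" half; the rebinding of `value` is the "dynamic" half.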
29:31 Definitely. 29:32 I mean, Python's demand has led to quite a bit of supply in all sorts of different aspects, runtimes definitely being one of them. So, I mean, you've got CPython, which is the standard, you've got Jython and IronPython and PyPy, and then there are, you know, more academic ones for teaching how a runtime works and so forth. 29:54 But the real key here is that calling it an interpreted language is sort of a form of micro-labeling that doesn't actually contribute to the overall engineering discussion that should be happening, which is that Python can have such a huge impact on your workflow. 30:10 So yeah, you can iterate on your projects at such a great speed that you end up finding yourself using more advanced techniques, or finding out that this area that you would otherwise have spent so much time optimizing is actually not really where the majority of the work of your application is being done. 30:25 So, basically, it allows you to focus on what matters. When you take that to a macro level, looking at the whole ecosystem, there are people out there who have gone through that same workflow and have generalized that into libraries, so that you can then take advantage of all of their optimizations. So, like, you know, 30:41 there are instances where Python has ended up using things like SIMD, you know, which is sort of vectorized computation that would otherwise be rather difficult to code by hand in C and C++, not to mention distribute. 30:56 Yeah, I mean, it's really important to look at the whole process instead of just individual adjectives about how a given single runtime works. 31:09 Like, the Python way will lead to faster and more efficient code, you know, and that makes a big difference to the overall complexity of your application, which can lead to, you know, better maintenance aspects as well.
31:25 Yeah, that's really interesting. I've talked with some companies that are doing amazing stuff and sort of building almost their entire enterprise foundation on top of Python. And they've got, like, a group I was speaking with, NDA stuff, 31:38 I can't really talk about it, but they had 160 internal enterprise business applications, and they were creating a Python layer to be the foundation for sort of unifying all that data and underlying infrastructure. And if you could do that, then you could do some pretty amazing stuff; that's not a slow system you decide to do that with.
31:56 Yeah, I mean, the enterprise is not, you know, sort of the domain of performance being 32:05 [inaudible]. We end up spending more time than we should talking about it, because I guess people want to go back to their, you know, college roots, but honestly, 32:13 we can afford more machines; we put more machines on it because, you know, we need the redundancy anyway, right? And we should spend more time talking about, you know, 32:25 just how our developers interact with our development tools, because that's really usually where we have the bigger bottleneck in actually getting projects done on time. 32:36 If you do it in Python, you then have extra time: you engineer your projects for correct behavior and you profile them, you know, Python has a decent profiler, and then you can optimize as need be. And, you know, we have even found spare cycles to write some of our hot loops in C.
32:53 Yeah, maybe even have some time to write some unit tests and make sure it works right.
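The "get it correct, profile, then optimize" workflow described above can be sketched with the standard library's `cProfile` and `pstats` modules. This is only an illustration under assumed names: `slow_sum_of_squares` and `profile_call` are hypothetical, and nothing here reflects PayPal's actual tooling.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive pure-Python loop: a stand-in for a hot loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    # Render the top entries, sorted by cumulative time, into a string.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum_of_squares, 100_000)
    print(value)
    print(report)
```

The report shows where time is actually spent, which is the information you would want before deciding whether any loop is worth rewriting in C.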
32:57 No, quality is what lets us sort of sleep soundly at night. That's what's going to actually make for successful business in the long term, generally speaking.
33:09 Right. So, that was myth number 6, which was "The Python is slow", and I think you have kind of a pair of myths that talk about scaling. And one of them is kind of Number 7 "Python does not scale" 33:21 You have some really interesting examples on sort of performance scale there.
33:25 Well yeah, because this one is really just so easily quashed by counter example right- You Tube is I think the second largest website on the internet right now, right. We talk about Dropbox, Disqus, Eventbrite, Reddit, right, Twilio, Instagram and Yelp, and I mean, even games, right, like EVE online and Second Life, right. 33:49 They are actually the areas where you find the most radical like scaling stories because unlike enterprise companies like PayPal or Bank of America, they are not made of money, right,34:00 like the game has to be fun and they run a tight margin and they do it for like the love of crafting like a unique sort of like self contained system and they end up like you know creating such technology marvels as Stackless and Eventlet and all that sort of stuff, so..
34:18 Well, Stackless and Eventlet and, you know, Tornado, I think all these sorts of things are in the general realm of concurrency and async processing and that's your myth number 8, right, it's that "Python lacks good support for concurrency and multithreading".
34:34 Yeah, so this one I think is one area where like Python probably gets the most legitimate like technology flak and that's because, I would say that's because Python sort of has a stated mission of- there is one obvious way to do something, and in the realm of concurrency unfortunately that's not true. 34:50 So CPython like you know by itself is sort of a runtime environment; when you introduce a concurrency like sort of library to that you change your like fundamental like runtime behaviors, you basically have added a layer on top of CPython's sort of native like main thread or whatever you want to call it, 35:12 now either you are working with threads or greenlets or promises or deferreds and that really like, there are so many different opinions that you can have about that like area of computing that Python has itself like you know, probably at this point a dozen different ways to do concurrency.
35:33 Right. I think one of the things that people immediately jump to, some people anyway, when they hear "I need more concurrency" is "let me kick off a bunch of threads", and I think for better or worse the whole thing that got very popular sort of a few years ago is showing that you can get really great concurrency with very, 35:51 very few threads if you are willing to sort of put those threads, reuse those threads when they are generally waiting on a database call, or something like that, right?
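The idea of getting a lot of concurrency out of very few threads, by doing other work while a call is waiting on I/O, can be sketched with the standard library's `asyncio`. This is a swapped-in illustration: the conversation names threads, greenlets, promises and deferreds, not `asyncio`, and `fake_db_call` is a hypothetical stand-in for a real database call.

```python
import asyncio
import threading


async def fake_db_call(i, delay=0.01):
    # Hypothetical stand-in for waiting on a database: while this
    # coroutine sleeps, the event loop runs the other coroutines.
    await asyncio.sleep(delay)
    return i, threading.get_ident()


async def run_many(n):
    # Start n "calls" at once and wait for all of them.
    return await asyncio.gather(*(fake_db_call(i) for i in range(n)))


def main(n=100):
    results = asyncio.run(run_many(n))
    thread_ids = {tid for _, tid in results}
    # Number of completed calls, number of distinct threads used.
    return len(results), len(thread_ids)


if __name__ == "__main__":
    print(main())
```

All of the simulated calls overlap their waiting time on a single thread, which is the trade being described: fewer threads, more bookkeeping handed to the concurrency library.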
36:03 Yeah, absolutely. And I mean it's been really amazing how we've sort of rediscovered these techniques, one of the guys on the Python infrastructure team here actually worked on the AIM servers, if you remember AOL Instant Messenger, and yeah, I mean, you know, 36:18 that was all- and they did it all with C and callbacks and you know, like they, actually like a lot of the stuff they ended up doing back in the day, like we end up redoing that sort of now, 36:29 it's been really nice having sort of a graybeard around because basically he sort of like I think back in the day he was working with SSL BIO, and like OpenSSL's BIOs and today we are doing sort of the same thing. 36:43 And you know, back in the day they also had trouble with threads and these days we also- we still find trouble with threads. And these aren't Python-specific troubles, it's just that threaded programming bears a few risks, and so in our infrastructure anyway we sort of like, we have made some decisions for our developers that will allow us to mitigate those risks, you know. 37:05 And one of those is to like basically sort of limit ourselves to a somewhat fixed number of threads, you don't want to have like one thread per request for instance in a server because that means that as your load goes up your contention overhead like sort of also starts going up.
37:25 Sure, and even just the per-thread memory from just the stack space for each thread can start to become significant when you are talking tens of thousands or hundreds of thousands of threads. It's a problem, yeah.
37:35 Right. And so, what we end up with in a lot of cases for applications that have 37:41 [inaudible] with the threading model, and these aren't like Python applications, they end up tapping out with sort of a hard stop, they start falling behind on their work without like you know being able to respond and shed load nicely, 37:52 we can sort of chalk up all of our good server behavior to just having the time to actually like analyze how our application works, add additional sort of behaviors and instrument appropriately because we are not spending all our time wrangling with a thread-per-request model or you know- 38:14 but, you really do have to, I mean, going back to the original issue of concurrency, like a concurrency library is more than just any other library because you end up having to adopt some aspect of its philosophy and there aren't any real easy answers there, 38:29 you need to look at what they are, find out what you know, like sort of is easily digested by you and your brain, and then look at examples of how other people architected their applications if you don't have like strong opinions of your own. 38:44 It's a learning process and this is something I feel that maybe like they should spend more time on back in most schools where unfortunately they mostly just focus on processes and threads.
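A fixed number of threads combined with graceful load shedding, as opposed to one thread per request, might look roughly like the sketch below. It is an illustration under assumptions, not PayPal's infrastructure: `BoundedServer`, its parameters, and the "return None when full" convention are all hypothetical, and only standard-library calls are used.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedServer:
    """Hypothetical request runner with a fixed pool and a hard cap
    on in-flight work, so overload is rejected instead of queued."""

    def __init__(self, workers=4, max_pending=8):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        # One slot per running or queued request.
        self._slots = threading.BoundedSemaphore(workers + max_pending)

    def submit(self, handler, *args):
        # Non-blocking acquire: if every slot is taken, shed the
        # request right away rather than falling further behind.
        if not self._slots.acquire(blocking=False):
            return None
        try:
            future = self._pool.submit(handler, *args)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot once the request finishes, success or failure.
        future.add_done_callback(lambda _f: self._slots.release())
        return future

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

A caller that gets `None` back can answer immediately with a "busy, try again" response, which is the shed-load behavior contrasted above with applications that simply stop responding.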
38:55 Yeah, absolutely. Well, maybe that's changing in the future. A friend of mine has a really interesting saying or a way of looking at the world, he says: 39:05 you know, when you are writing this multithreaded concurrent code, if you try to get too tricky, you are writing code like right at the limit of your ability to understand what you are doing, and debugging code is harder than writing code so you are writing code that you literally can't debug. 39:19 You just go right over that barrier, like "I've created this monster I've got to live with". So that's pretty interesting.
39:26 And I mean, the code, like basically when you start running it at scale and it's in many pools and on many machines, like it has, it ends up having an almost organic nature, when you take into account like load balancers and like you know variations in the network and so forth, 39:46 you need to make time for all of those sort of emergent behaviors that are going to come up when you ...
39:53 Yeah, absolutely. So, Myth number 9: "Python programmers are scarce".
39:57 I think I mentioned it in the myth that like this is somewhat true, right, it depends what you are comparing it to though. 40:03 Because, like I said before, you know, we have one developer that goes and creates three production services in one year.
39:57 Right, do you need a huge team of Python developers to accomplish some project- maybe, maybe not.
40:15 Yeah, maybe not. 40:17 Actually almost certainly not; almost certainly not as many as you would need for other stacks and that's sort of reflected in like every Python team that I have seen, not just here at PayPal but also like across the industry, that like they end up generally being smaller and more effective and, if you like, the literature sort of backs up that these are actually not bad things to have. People worry about people getting hit by buses and it's nice to be so concerned, 40:49 but really like one of the keys with Python is that you can learn it very quickly and it has a really nice learning curve. 41:00 I mean, maybe it's just me, but I had a really nice time transitioning from PHP to Django, digging deeper into the library, reading the source code of like you know just sort of like modules like itertools and collections and these sorts of things and learning about Python from examples. Whereas with like C++ and Java, 41:25 I felt that those like you know, like those sort of stacks were more opaque and Python was more open. 41:32 And, then like you know with GitHub and so forth, going and learning like you know the fundamentals of web frameworks from Bottle and Django source code, basically like you sort of, it sort of has a natural learning curve to it, that basically without any of like you know official training or standard training like you know we were able to rise to the level of like effective infrastructure engineers.
41:58 I had a similar experience you know, when I first learned C++ way back in the day, I remember it, it was this mountain I climbed. In learning Python, I mean, obviously learning it as a second language makes it, 42:08 as a next language, not the first language, makes it easier to do, but I think even if I had learned it originally, it would have been a much more enjoyable experience, so, and then, you know, you make some interesting points about- well, there may be not so many Python developers, or however many there are, there are going to be more.
42:26 Yeah, yeah, I mean, especially when like there are changes happening in education right now, where Python is becoming like I think the top teaching language.
42:35 Exactly, that's what I was thinking of, is that just in the last couple of years it flipped I think from Java maybe into Python being the most taught programming language in college so in a little while that will have some big effects, you know.
42:48 Yeah, and we already are seeing it, I mean, if your company has a policy, like for a while PayPal had a policy where they would only hire experienced engineers. Not straight out of college, but already we are seeing people coming out of college being hired at PayPal whose primary experience is in Python. 43:04 So like the effect is already there, and I mean frankly of course I think it's a good thing but especially because Python has so many language features that are so like you know well documented and well thought through and designed through PEPs and like you know the Python Enhancement Proposal process so, I mean, it's actually, 43:26 it's just a more open, approachable language that has so much to teach that even if you are going to end up working in other languages I recommend studying Python, unless it's going to spoil you for other languages, which has been known to occur...
43:42 If you don't want to go to work anymore, because you have to go back to write in embedded C, I mean, you want to stay away from Python, that's right.
43:49 We like C, we like C too anyway.
43:52 Ok, so I think your final myth actually had some of the most interesting actual statistics in it: "Python is not for large projects", as in a large number of lines of code.
44:03 Right, so I mean before we talked about like scaling traffic, which YouTube and others like clearly disprove, you can just like, Python is a consistent runtime and it can scale simply. 44:17 But, scaling the developer side of things is not as simple, right, people are complex. And so, but, that said, I mean there have been many many examples of Python scaling to the enterprise level and you know here at like PayPal/eBay we have a lot of small teams and we have some larger ones too, with like multiple experienced developers and a couple of junior developers and so forth. 44:46 So, on the OpenStack side, especially. So, yeah, Bank of America like I mentioned before has like you know 5,000 Python developers and they just spun that out of nowhere and maybe that's why Python developers are scarce.
44:58 Yeah, maybe they grabbed them all. Yeah, you said that they have over ten million lines of Python code, which is kind of crazy.
45:04 I mean, yeah. I mean, they did that, either they did that because J.P. Morgan was doing it or J.P. Morgan did it because Bank of America was doing it or maybe like it was just a coincidence, but the financial industry has certainly seen like you know a large amount of Python, and the important thing too is that like when your company starts getting big enough 45:27 [inaudible] becomes a really important aspect of your recruiting strategy. So, like if you really do actually need 2000, 3000 developers then maybe you shouldn't be banking on all of them you know being from one programming discipline, right? 45:45 So basically, with trends in education and open source dictating a lot of what talent is available you should focus on having language-agnostic protocols, and like well designed larger architectures that you can plug many language stacks into, and that's what we have here at PayPal where we have C++, Java, Python, Node and there is some work done with like Scala and Go. 46:17 And certainly as we like- so doing C++ was like hard, like to build PayPal, and adding Java was like you know still pretty hard and when we added Python we actually created reference implementations and standards for a lot of these emergent protocols and then other stacks like can come and sort of like follow suit. 46:38 If you are looking, if there is actually someone in charge of a big project listening to this podcast right now, you know, as long as you have good talent, good architecture you can definitely use Python as part of a large project. 46:53 We have many processes detailed in the blog post that can like you know lead to good practices, best practices...
47:03 Yeah, that's great. You talk about like static code analysis with PyFlakes and other things like that...
47:07 Mhm. It's a big industry and Python has a long history in it, and like a lot of experience that you can bring to enhance basically a company of any size if you ask me. 47:19 I mean, it's not all about like just evangelizing and as I say being religious about like you know a given technology; 47:29 Python is just a really handy piece that fits into a lot of applications.
47:35 Yeah, that's really fantastic. And so if people out there are listening and they are sort of having these debates at work or on their projects and are like "I would really like to use Python but people keep laughing at me and say it's not or whatever..." 47:45 I really recommend checking out Mahmoud's article "10 Myths of Enterprise Python" which I will link to from the show notes and you can share that or you can share this longer form conversation that we had, so...
47:58 I just wish I had a time machine so I could send it back to me a few years ago....
48:03 Yeah, I can see that. What would you say to your former self at work? I would give him this article. That's great.
48:12 It really would simplify things, but yeah, and also maybe I would sell some stock.
48:19 Yeah, exactly, of course...
48:22 This has been a super interesting conversation and the question I like to ask my guests sort of on their way out the door, is so much of Python is driven by open source stuff and there is so much great stuff on PyPI and GitHub and so on, do you have like a favorite thing that you maybe want to call some attention to? Favorite project?
48:40 I have many favorites, like one of the ones that I remember looking at recently and being really impressed with the implementation is a very small thing, it's called I think netaddr or something like that, and it was just a library for working with IPs and IP ranges, but you know, 48:56 as I looked at the code it was just really exquisitely implemented, so I mean for some reason my mind always jumps to that, like because I was like you know, as I looked at the code I was just like "I couldn't do a better job".
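For a feel of what "working with IPs and IP ranges" involves, here is a small sketch. It deliberately uses the standard library's `ipaddress` module rather than netaddr, so it says nothing about netaddr's own API or implementation; `describe_network` and `contains` are hypothetical helper names.

```python
import ipaddress


def describe_network(cidr):
    """Summarize a CIDR block: first address, last address, size."""
    net = ipaddress.ip_network(cidr)
    return {
        "first": str(net.network_address),
        "last": str(net.broadcast_address),
        "size": net.num_addresses,
    }


def contains(cidr, address):
    """Return True if the address falls inside the CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    print(describe_network("192.168.1.0/24"))
    print(contains("10.0.0.0/8", "10.1.2.3"))
```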
49:09 I recently saw some guy, I can't remember who, being interviewed on Bloomberg News about, they had some sort of museum art setup about how code is art and like the beauty of algorithms and you know maybe that is sort of part of it.
49:25 That's Gleitzman, he works with me.
49:28 Oh, he works with you?! Oh my Gosh, that's awesome.
49:31 Yeah. So, he was part of one of the acquisitions that we had that used a lot of Python actually and so he uses our code internally as well and no I mean he is a super active guy, lots of great ideas, I think his name is Benjamin.
49:46 Yeah, I think that sounds correct, it was literally on TV so I didn't, you know like, save it.
49:51 It was a weird crossover for sure, let's see, but, I mean, in terms of projects I spent a lot of time on, I mean I worked a lot on this one recently called Boltons, basically like these are sort of things I wish were built into Python, like you know over the years I just accumulated all of these utils, right, that I have seen like you know implementations of in various like you know libraries, they are just like "oh why doesn't this exist, I'll throw something together", and it sort of works for them. 50:23 But it is not as well tested and generalized so this is sort of like you know it's just a toolbox for working with like a variety of different things, caches and strings, and some of them are designed as extensions to like itertools and built-in modules like that. If you ask me what my favorite like Python modules were, I have a list for sure. 50:57 [inaudible] tools, collections, I'm a big fan of the select module, like there are just so many like good library things built right in, that batteries-included aspect of Python wasn't just responsible for drawing me to Python in the first place, but it was like responsible for me feeling like I almost got a postgraduate education in like you know programming. Like it's great. Anyways, so ...
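The kind of small utility that tends to get re-implemented from project to project, built on top of the batteries-included `itertools` and `collections` modules mentioned here, might look like the sketch below. These are hypothetical helpers written for illustration; they are not taken from Boltons and make no claim about how Boltons implements anything.

```python
from collections import Counter
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def most_common_words(text, n):
    """Return the n most frequent whitespace-separated words."""
    return Counter(text.lower().split()).most_common(n)


if __name__ == "__main__":
    print(list(chunked(range(5), 2)))
    print(most_common_words("a b a c a b", 2))
```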
51:23 Ok, so is there any final thing you want to kind of call attention to for the listeners or give a shout-out to or anything like that?
51:32 Like I mentioned at the beginning, one of the things I have is a little gig, Listen to Wikipedia. Our new thing is like a newsletter that just summarizes all the work that Wikipedians are working on and so it is called the Weeklypedia, you can visit it at weekly, like once per week, dot hatnote, like the thing that you wear on your head and the thing you write to, dot com.
52:05 That's fantastic. The other one is Listen.hatnote.com is that right?
52:08 Yeah yeah and so these are all Python-based, open source, you can find the code at GitHub.com/hatnote, you know, those are fun weekend projects, I don't have to...
52:22 They don't have to process money for example.
52:26 Right. Yeah, I was, I mean they are high quality projects I like to think but they don't have to be ...
52:41 Thank you so much Mahmoud for being on this show, it has been a super interesting conversation and I think the view inside some of these big companies really might give people a different perspective on Python so, I hope so anyway.
52:52 Thank you. It was great being here, thanks for having me Michael.
52:56 Yeah, thanks, talk to you later.
52:58 Bye, bye.
52:58 This has been another episode of "Talk Python To Me" 53:02 This is your host Michael Kennedy I want to say thank you for listening and let's let Wikipedia take us on out of here...