
#12: Deep Dive into Modules and Packages Transcript

Recorded on Wednesday, May 27, 2015.

00:00 Quick- what's the difference between a module, a package and packaging in Python?

00:00 Ok, maybe you should listen to this episode of Talk Python To Me. Number 12, with David Beazley. It's all about packages and was recorded Monday, May 27th 2015.

00:00 [music]

00:42 Hello and welcome to Talk Python To Me. A weekly podcast on Python: the language, the libraries, the ecosystem and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I am at @mkennedy and keep up with the show and listen to past episodes at talkpythontome.com. This episode we will be talking to David Beazley, about the internals of modules and packages in Python.

01:06 I'm thrilled to tell you that this episode is brought to you by Codeship. Codeship is a platform for continuous integration and continuous delivery as a service. I'll talk more about them later in the show. Please take a moment to check them out at codeship.com or follow them on Twitter where they are at @codeship.

01:24 This episode is also brought to you by Hired. Hired has joined Talk Python as a sponsor because they want to help you find your dream job. Hired is built specifically for developers looking for new opportunities. You can sign up and expect to get five offers within the first week and salary and equity presented right up front. Check them out and get a very special offer at hired.com/talkpythontome. Thank them on Twitter where they are at @hired_hq.

01:24 On the last show, on computer vision with Adrian Rosebrock (that was show number 11), we talked about a laptop sticker from years back which really inspired him. It said "Python will save the world. I don't know how, but it will". Well, I thought that was a great sticker and I tracked down a picture of it for you. Check it out at bit.ly/pythonsave.

01:24 I invited David to come on this show and talk about this subject after watching his tutorial online at PyCon 2015. Remember, I have this list of my essential 30 presentations from PyCon 2015, and of course, David is on it. Check it out at bit.ly/pycon2015mk.

02:32 Now, let's get to the interview. Let me introduce David. David Beazley is an independent software developer, teacher and book author living in Chicago. He primarily works on programming tools and teaches programming courses for software developers, scientists and engineers. He is the author of the "Python Essential Reference" and "Python Cookbook".

02:55 David, welcome to the show.

02:56 Hi. How are you doing?

02:58 I'm doing fantastic. Thank you for being on the show, I'm really glad to have you here.

03:01 Ok, thanks for having me.

03:03 Yeah, you bet. I saw your "Packages and Modules" talk at PyCon 2015 and I thought it was really interesting.

03:10 Ok, you survived-

03:13 I did survive, unfortunately my wife was traveling and we have some small kids so I saw it remotely over YouTube. But, I wish I was able to be there. But, yeah, it was really interesting.

03:25 You didn't put it on for the kids?

03:28 Oh, my kids love it when I put on Python instructional videos for them, yeah, they are like "Dora the Explorer? Forget it. Give me something to do with like async Python, I'll do that."

03:38 Yeah yeah, good.

03:40 So, we are going to talk a lot about packages and other stuff that you've got going on as well, but let's start at the beginning: where did you get started in programming and Python and all that stuff?

03:49 So I got started about 19 years ago with Python. I was actually using it in the context of scientific computing. Around that time I was trying to figure out some way to script scientific software written in C, for lack of a better description- I was trying to create my own version of MATLAB actually.

04:13 So, what kind of science was it?

04:15 I was doing molecular dynamics, material science, sort of on supercomputers and I had done a lot of C programming for that. And, kind of the biggest problem that we had in that project was not the scientific computing part of it, but it was everything else, like just moving files around, and like ripping data formats apart, kind of like all the annoying day to day stuff. And so I was looking for some way to kind of solve that problem. Prior to discovering Python I had actually written my own scripting language, for which I was sort of resoundingly flamed at graduate school, it was sort of like "Dave, why are you making your own programming language, why don't you just use Tcl or some existing thing?" And I had read about Python in an article in, I think, Computers in Physics, or something like that. And I just said "Oh, I'm going to check that out."

05:08 Yeah, and you were like "Hey, somebody else wrote the scripting language, look at that!"

05:10 Yeah, it was way better than mine. The syntax was fairly similar, so-

05:16 Did you also have a white space? Like significant white space?

05:18 I did not have significant white space, but I did have dollar signs and other kind of crazy things too.

05:26 Ok, so you started out in the scientific world and you know, I think Python now is super big for scientific programming; how was it back then?

05:33 Back then it was pretty radical. I mean, this was like '96, and if you go back that far there were basically no tools available, there were pretty minimal tools, and then a lot of what you faced was push back, because people thought like, you know, "I can't write real software in a scripting language, you know, why are you fooling around with this scripting language instead of coding in C++?"

06:02 Right, especially if it is computational, right?

06:03 Yeah. I mean I actually got quite a bit of negative response because I was running interactive Python on a supercomputer. People were like "You are just burning out like hundreds of dollars of CPU cycles just typing on the keyboard, what are you doing?" And they didn't really realize that doing that was actually saving us huge amounts of time later, you know, like preventing us from running bad simulations, or taking problems that used to take like 50 hours and reducing them down to 15 minutes, like that kind of stuff. But yeah, it was a lot of push back.

06:42 That's really funny. It's interesting how people can focus on the wrong things, like "Oh we have to have another server for that or otherwise we have to have another person". Or something like this, right?

06:52 I actually got quite a few people sort of mad at talks. I gave some talks about this and every now and then I would get a question like "How did you get permission from management to do this project?" And I would just say "Well, we didn't ask."

07:08 They didn't explicitly forbid it, so it is permitted.

07:10 Yeah, so it was very much kind of like you know, hidden project, it was not really an approved project.

07:17 Yeah, a guerrilla project maybe.

07:20 Although, you know, the people involved, I mean at the time I was doing this work in Los Alamos, there was a national lab kind of doing the same things, and a lot of that early work sort of led to kind of the precursors of things like NumPy and the SciPy tools that people are using now, so-

07:39 Yeah, think of how the world would have been different if you had that to start from, right? It would have made a big difference for you guys, I'm sure.

07:46 You have got to start somewhere, I guess, somebody has to shake things up I guess, so-

07:52 Well yeah, that's really cool. What are you doing today with Python?

07:57 So right now, well a variety of things, I'm teaching classes, I'm also involved with a startup company, that's been a little bit secretive, but I'm technically co founder of a startup doing some educational tech stuff. And I am actually coding with that, I'm doing a lot of just back end web programming, databases, SQLAlchemy things like that. I'm supposed to be working on a new version of the Python Essential Reference book, I haven't really started that yet, I hope my editor doesn't listen to this, but that is coming soon as well, so-

08:32 Well we'll make it up by promoting the book in the show notes or something like that, how is that?

08:38 Right, right.

08:39 Yeah, the tech education scene is insanely hot right now and VC and people, it's really an amazing place to be and I'm doing a little bit of stuff there as well and just knowing what is going on there is just amazing, right?

08:48 Yeah, a lot of crazy things there, I guess if it wasn't crazy it wouldn't be worth doing I guess.

08:59 Exactly. It would be like going back to high school instead.

09:01 Right.

09:01 [music]

09:01 Codeship is a hosted continuous delivery service focused on speed, security and customizability. You can set up continuous integration in a matter of seconds, and automatically deploy when your tests have passed. Codeship supports your GitHub and Bitbucket projects. You can get started with Codeship's free plan today. Should you decide to go with the premium plan, Talk Python listeners can save 20% off any plan for the next three months by using the code TALKPYTHON. Check them out at codeship.com, and tell them thanks for sponsoring the show on Twitter where they are at @codeship.

09:01 [music]

09:53 Awesome, so what inspired you to do a tutorial on packages, I mean, that seems like you know, doing a tutorial on for loops or something, but obviously, it was a 2, 3 hour tutorial and it was really great I thought, so- but what started you on this path?

10:09 I have 3 hour tutorial on packages-

10:12 Import, ok, now you can just contemplate this for two and a half hours to get the zen of it right?

10:17 So I guess it was like, probably- it was packages, I mean, modules and packages. I mean one part of it, it's something that I've rarely thought about. It's like you learn Python, 19 years ago it was like "yeah there is the import statement and yeah, sometimes you have to fiddle with the system path" and so forth, you know you sort of get over it. But in teaching a lot of classes, and working with other people, it seems like the import statement is like this never-ending hell for people. Like I'll be teaching a class somewhere and as I kind of walk around I'll see someone struggling and they are like "Yeah I've been fooling around with this for like a half an hour or something" and first I will be like "why didn't you get my attention earlier", but then you go over there and you realize that they are just fiddling around with some stupid thing with the import statement. Like you know, the file is named wrong or it is in the wrong directory or they are trying to reload something or...

11:14 It could be subtle as well, as like you might have a number as the starting name for your directory or something like that, right?

11:20 Yeah, subtle things or you know, like working with people on a project, it's like 4 o'clock in the afternoon and I am like "why are they cursing so much over there" and I walk over and they are cursing about why Python isn't running their code correctly, only to realize that they forgot that module loading is a one-time operation.

11:41 Right, exactly.

11:42 You know, it's not reloading...

11:43 I changed it but it is not changing...
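A minimal sketch of that one-time behavior, not code from the show. The module name `settings_demo` and the temporary directory are assumptions made up for illustration; the only real machinery used is the `sys.modules` cache that `import` consults.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway module on disk (hypothetical name: settings_demo).
workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "settings_demo.py")
with open(module_path, "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, workdir)
importlib.invalidate_caches()

import settings_demo
first = settings_demo.VALUE

# Edit the file on disk while the program keeps running.
with open(module_path, "w") as f:
    f.write("VALUE = 2  # edited on disk\n")

# A second import does not re-run the file; it hands back the cached module.
import settings_demo
second = settings_demo.VALUE

print(first, second)                    # prints: 1 1
print("settings_demo" in sys.modules)   # prints: True
```

The edit on disk only takes effect in a fresh interpreter, which is usually the "I changed it but it is not changing" moment.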

11:46 So part of me was just "I should do something on modules and packages, to try and clear up confusion maybe". But I think also part of it is the whole module system in Python has been basically rewritten. It's almost like a from-scratch rewrite where all of the mechanics of it have actually been pulled out of C, and put up into the Python level, so almost all the internals of modules and imports and packages and stuff are sort of way more accessible now, and so I thought it would be kind of interesting to look at that as well, to sort of take like a totally modern eye and just see, you know, what is going on with modules and packages.

12:32 Yeah, before we get into the details, I'm sure everybody kind of get the sense what are modules, what are packages, but just to make sure we are all on the same page, I can import things and sometimes I import a module and sometimes I am importing a package. What is the deal there?

12:47 Well, I mean a module, you know, maybe a simple view of it is, people think that it is like a single file. A single file of source code. A package would be more of a collection of files. Maybe a larger application, or framework or something like that. I think one thing that can be a little bit muddled is that sometimes things look like modules when in fact they're big packages.

13:17 So it seems to me like with modules I am getting just a file. Like "Hey I have written some code over here, I have written some code over there, let me grab the code in the one file and use it." And with packages, it seems like there is a lot more going on potentially. Right? I've got the __init__.py and I can sort of declare this __all__ variable- can you maybe talk a little bit about the structure of the package and why you might build one?

13:43 Ok. One thing with packages, is- let me back up for a second. If you are thinking about modules, typically you are thinking about something kind of small. Like a single file, maybe a small library, maybe not a huge amount of functionality but, you know, just like just a single point of code that is very easy to use. With the package you are getting into something much more complicated, it is like much larger code base and part of what you want to avoid is just putting all of your code in one file. Like, nobody wants to have a file with 50 000 lines of Python code in there.

14:24 Especially not if you inherit that file.

14:26 Yeah, so especially if it has been given to you. So, with the package you are really thinking about kind of breaking up your code into different submodules, different parts of the application and so forth. And it is more of an organizational tool than anything. Also, just keeping your code separate from the rest of the Python universe, you know. Avoiding like naming clashes with other bits of code.

14:52 I think the thing that gets complicated with the package is once you do that, once you separate your code out into multiple files, you have to start worrying about all sorts of side issues, like how do those files interact with each other, you know, like how does one submodule refer to another submodule, or how do they combine pieces together. I think that is where a lot of the tricky parts of packages come into play, you know, relationships within the package cause a lot of complexity that you wouldn't see with a single-file module.

15:29 Right, I totally agree, and there is a lot of stuff that you could do in that init file to sort of make it explicit about what you want to export. I think of these packages as like reusable libraries that you are going to grab, and they have a bunch of functionality. Whereas modules I kind of feel like "Hey I'm just, I'm reusing a file that I wrote, this file wants access to that part of my code, so I pulled that in"

15:49 And so, you talked about a lot of cool tips and tricks that you can kind of do there; you talked about like the __all__ variable, you talked about importing submodules in the main level module, and stuff like that...

16:02 Yeah, I think one of the things that- it's maybe more of a personal complaint, but when you use packages a lot, you can sometimes get like a huge number of imports that start showing up in your code, where you know, instead of just importing a single file, all of a sudden you are putting in like 20 import statements. I'm not a huge fan of doing that, so you know, these init files that you sometimes see in packages, I mean one use of those is to basically coalesce the pieces into one place. And you know sometimes you can cut down on the number of imports that you have to do.

16:39 Right. Maybe you import the top level thing, like import SQLAlchemy and then you could say "SQLAlchemy dot" and get to the sub package, something like that, right?

16:48 Right, you don't have to worry about, you know how they organize that under the covers of SQLAlchemy, it could be spread across 20 different files, for all you care.

16:58 Yeah, exactly. But you have to manually do that in the init file, right? In order to sort of force the top level import to bring in the subpackages, is that correct?

17:05 You have to take some steps there, yeah. I mean, if you are the author of a package you have to find some way to do that. I've done some tricks involving decorators doing that. In one of the applications I am working on now, I have a decorator I can put on functions that automatically hoists them up into the init file; so I mean there are tricks that you could do to do that kind of thing but-

17:34 That's really cool, is that a decorator that you wrote?

17:35 I just cooked it up myself, yeah.

17:37 Is it on GitHub or something?

17:38 I don't know about GitHub, it is in the tutorial though.

17:44 Right, ok.

17:45 It's somewhere in that tutorial so-
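The decorator itself isn't shown in this conversation. A rough sketch of the idea, under stated assumptions: the package name `hoist_demo`, the submodule `shapes`, and the decorator name `export` are all hypothetical, and the package is written to a temporary directory only so the example runs on its own.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "hoist_demo")   # hypothetical package name
os.mkdir(pkg_dir)

files = {
    # __init__.py defines the decorator first, then imports the submodules
    # so that their decorated functions register themselves.
    "__init__.py": (
        "__all__ = []\n"
        "\n"
        "def export(defn):\n"
        "    # Copy the function into the package namespace and list it in __all__.\n"
        "    globals()[defn.__name__] = defn\n"
        "    __all__.append(defn.__name__)\n"
        "    return defn\n"
        "\n"
        "from . import shapes\n"
    ),
    "shapes.py": (
        "from . import export\n"
        "\n"
        "@export\n"
        "def area(width, height):\n"
        "    return width * height\n"
        "\n"
        "def _internal_helper():\n"
        "    return None\n"
    ),
}
for name, text in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(text)

sys.path.insert(0, root)
importlib.invalidate_caches()

import hoist_demo

print(hoist_demo.area(2, 3))   # the function is reachable at the package level
print(hoist_demo.__all__)      # only decorated names are listed
```

The design choice is that each submodule decides what is public, and the init file only has to import the submodules, not list every name by hand.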

17:48 Nice. One thing that I thought was interesting that you spoke about in the tutorial was that the structure guidelines, the PEP8 guidelines, sometimes make your code more brittle in packages than it otherwise should be?

18:01 Oh yeah, yeah, one of the things with PEP8 I think is it predates some of that work that went on in packages, I mean like PEP8 sort of talks that you should- it sort of says things like "oh, you should just do like an absolute import from kind of the top level package name".

18:18 Right. So if we are writing like the package called "Math" and it had a class called calculator, something like that, we might in the init file say import math.calculator? Where math is the actual name of the package, right?

18:33 I don't know if I would use math as the best ...

18:36 Ok yeah, so I'm lacking creativity, pick another example.

18:39 It's more concerning things like imports within the package, like if you have a package where you know, you've got a - like there is a submodule called "graphics" and there is another module called "data" or something like that, you would end up importing from the top level name down, so if your package is named "spam" for instance, you would have to say import spam.graphics, or import spam.data. And you would do that within the package itself, but the thing that I don't like about that is it ends up hard coding the package name.

19:16 Right, for example if you rename the package, what happens?

19:19 Well, then you have to go change all of your code.

19:21 Great.

19:22 That is the thing that I don't like.

19:23 Is there a fix?

19:24 Well you could use these package relative imports. This is this form I- I am actually surprised how many people had not seen it sometimes, but what it looks like is you have these dots where you would say something like from dot import graphics. Or from dot import data and the dot is just telling Python that you want to load relative to your current location for instance.

19:49 Right. So then it doesn't matter what you call the package?

19:52 Right, right.

19:53 Right, that's really excellent.

19:55 I was sort of thinking about things like versioning and stuff, I mean I could imagine situations where I have code where I need to have an older version of some package coexisting with the modern version of the same package, and one of the ways you can deal with that is to just rename the old one.

20:12 Right. Exactly. Graphics.old or graphics_old or something like that. Right?

20:17 Right, right, as long as you don't have the package name hardcoded in there. That all works fine, so
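A small sketch of why package-relative imports survive a rename, not code from the show. It reuses the example names from the conversation (`spam`, `graphics`, `data`); the copy named `spam_old`, the `plot` function and the temporary directory are assumptions for illustration.

```python
import importlib
import os
import shutil
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "spam")
os.mkdir(pkg_dir)

files = {
    "__init__.py": "from . import graphics\nfrom . import data\n",
    "data.py": "VALUES = [1, 2, 3]\n",
    # graphics refers to its sibling with a dot, never with the name 'spam'.
    "graphics.py": (
        "from . import data\n"
        "\n"
        "def plot():\n"
        "    return 'plotting %d values' % len(data.VALUES)\n"
    ),
}
for name, text in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(text)

# Keep an "old" copy of the same package under a different name.
shutil.copytree(pkg_dir, os.path.join(root, "spam_old"))

sys.path.insert(0, root)
importlib.invalidate_caches()

import spam
import spam_old

print(spam.graphics.plot())
print(spam_old.graphics.plot())
print(spam_old.graphics.__name__)   # prints: spam_old.graphics
```

Had `graphics.py` said `import spam.data`, the copy in `spam_old` would have silently reached into the other package instead of its own sibling.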

20:25 Yeah, I think it's really good advice. I propose we amend PEP8 to have the package relative import.

20:32 I wonder when that part was written in PEP8, I mean, I hate to admit this in a podcast, I'm sort of a flagrant PEP8 violator...

20:43 Don't worry, nobody can see the code, so they won't really know about it, it's fine.

20:46 Yeah, I don't mean to violate it, but I think PEP8 comes after the point at which I got involved with Python, so it never really entered my consciousness as something that I would pay attention to.

21:02 The style was already set, by the time they-

21:03 Yes. I have enough trouble just with the- people are always giving me a bad time for using double quotes on all my strings. You know, that's fine in Python, but most people tend to use like the single quote. I use double quotes just because I've done so much C programming.

21:22 Exactly. The single quote in C doesn't do the same thing.

21:25 I just can't get out of C programming. Even though that's not my day to day job.

21:31 Yeah, I have the same problem. I'm kind of back and forth on it.

21:31 [music]

21:31 This episode is brought to you by Hired. Hired is a two-sided curated marketplace that connects the world's knowledge workers to the best opportunities. Each offer you receive has salary and equity presented right up front and you can view the offers to accept or reject them before you even talk to the company. Typically, candidates receive five or more offers in just the first week and there are no obligations, ever. Sounds pretty awesome, doesn't it? Did I mention there is a signing bonus? Everyone who accepts a job from Hired gets a $2000 signing bonus and as Talk Python listeners, it gets way sweeter. Use the link hired.com/talkpythontome and Hired will double the signing bonus to $4000. Opportunity is knocking. Visit hired.com/talkpythontome and answer the call.

21:31 [music]

22:43 One thing that goes hand in hand with packages and trying these things out and so on, is virtual environment. Do you use those often?

22:52 This is going to sound really shocking, but I almost never use them.

22:56 OK.

22:56 I don't know why I never use them. When I want to do something, I tend to just kind of build Python myself, and I might just put it in its own directory somewhere. I don't know- you know, it's probably just a bad thing to be doing but maybe it's more historical, just having used Python for a while, that's something that's never been that hard to do...

23:21 Right, well you understand what virtual environment does more or less right?

23:27 Yeah, I do. One of the things I do like about it, I have actually started using it a little bit more ever since it got built into Python 3. So that is actually a big win I think for sort of the newer versions of Python, since they have the virtual environment feature kind of just baked into the language.

23:56 I think that is really nice. We could go into a whole side conversation on Python 2 versus Python 3.

24:05 Yeah, we probably don't want to go there.

24:06 We probably don't want to go there, but I think it is interesting that things like that are like "hey there is a little less friction if you go down the Python 3 path". I talked with Kenneth Reitz on show number 6. We talked a little bit about that, he was like "I think what we really need is a killer feature that only appears in Python 3 in order to get people really to switch." There are small gains being made, like Django switching their documentation by default to Python 3, and that made a measurable dent in the world, but things like, maybe in Python 3 there is no such thing as a global interpreter lock.

24:41 Oh that would be a big win, I don't know whether that is going to happen any time soon.

24:46 Of course, you know, possibly with something like PyPy, right, or Pyston. Maybe, I don't know, maybe. Anyway, I think it's interesting but yeah, so these things like the virtual environment stuff being built into Python 3 is really cool, and one more check in the box when people consider Python 3.

25:04 So another thing that you talked about is like splitting modules into a bunch of multiple files. And I really like to do this, like you sort of started out the conversation, I really really dislike large files. I would much rather have ten 50-line files than one 500-line file, whatever the math works out to be. It seems like Python kind of dislikes me doing this. The more I break stuff into small files the more I have to put a whole bunch of imports at the top.

25:39 Right. That is one thing you can definitely solve, with the init file. I mean if you put it all in a package you can kind of stitch it all back together in init.

25:49 Right. So I could have like all those in the subfolder which has __init__ so it is a sub-package, and in that __init__ I would have like from myclass import myclass or whatever, and then it would sort of drop into that top level namespace, right?

26:08 Right, right. There is actually quite a few things in the standard library that do that already, you know, things like the collections module does that, multiprocessing does that, I mean if you look at how those are implemented there are actually collections of files but to kind of an end user from kind of the Python side it just looks like you are using a single module basically.

26:32 Yeah, exactly, just collections dot and it is all there. So they pulled that off with the package stuff, with the init file?

26:36 Yeah.
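A minimal sketch of stitching a split-up package back together in `__init__.py`, not code from the show. The package `stitched_demo` and its files are hypothetical; the check on `collections` and `multiprocessing` relies only on the fact that packages carry a `__path__` attribute.

```python
import collections
import importlib
import multiprocessing
import os
import sys
import tempfile

# Both standard library names are packages (directories of files) underneath.
print(hasattr(collections, "__path__"), hasattr(multiprocessing, "__path__"))

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "stitched_demo")   # hypothetical package name
os.mkdir(pkg_dir)

files = {
    # One small file per class...
    "myclass.py": "class MyClass:\n    pass\n",
    "otherclass.py": "class OtherClass:\n    pass\n",
    # ...and the init file pulls them into the top-level namespace.
    "__init__.py": (
        "from .myclass import MyClass\n"
        "from .otherclass import OtherClass\n"
    ),
}
for name, text in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(text)

sys.path.insert(0, root)
importlib.invalidate_caches()

# The caller needs a single import, however many files sit behind it.
import stitched_demo

print(stitched_demo.MyClass, stitched_demo.OtherClass)
```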

26:38 Ok, that's cool. So one thing that you talked about but I know very little about is import hooks. What is the story of import hooks, what can I do with those?

26:45 Well, import hooks. I mean essentially some of the new machinery with import gives you complete control over like locating modules on the system, what happens when you load modules, you can actually completely customize what happens at import. You know, for instance, you can pull modules off of URLs, you could pull modules out of a database, you could even pull in code that wasn't Python. I did a tutorial maybe 3 years ago, 2 years ago. I don't know, it was a Python 3 metaprogramming tutorial. At the end of that I had an import hook that loaded in an XML file and translated it into Python class definitions at import time. So I am not saying this was a good idea, it was bad. It was sort of an example of some crazy thing that was possible so

27:46 That's pretty awesome. Maybe not recommended though.

27:49 Maybe not recommended.

27:51 Let me rephrase it- it is probably awesome that you can do it, but not awesome if you do do it.

27:56 Yeah, I think I pitched the xml example as something that's been very enterprise ready.

28:00 That definitely sounds enterprise ready.

28:03 Yeah.
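A minimal sketch of an import hook, not the XML example from the tutorial. It assumes a made-up in-memory "database" (a plain dict named `SOURCES`) and a hypothetical module name `hello_from_dict`; the finder and loader interfaces are the ones in `importlib.abc`.

```python
import importlib.abc
import importlib.util
import sys

# Stand-in for a database or URL: module name -> source text (hypothetical).
SOURCES = {
    "hello_from_dict": "def greet():\n    return 'hi from a dict'\n",
}


class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finds and loads modules whose source lives in SOURCES."""

    def find_spec(self, fullname, path=None, target=None):
        # Claim only the names we know about; let other finders handle the rest.
        if fullname in SOURCES:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        # Run the stored source inside the new module's namespace.
        exec(SOURCES[module.__name__], module.__dict__)


sys.meta_path.append(DictImporter())

import hello_from_dict  # no such file exists on disk

print(hello_from_dict.greet())
```

Because the hook is appended to the end of `sys.meta_path`, it is only consulted after the normal finders fail to locate the name.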

28:05 So what about threading? Is there any special consideration for modules if I am like working with threads?

28:11 One of the things that I didn't realize until doing the tutorial is just how nasty the import statement combined with threads is under the covers. Module import is not really a totally thread-safe thing to do, like, you know, if you are loading a module and Python is in the process of importing something, potentially you could have another thread that tries to import the same module at the same time, and it actually has to take some steps to avoid just complete chaos with that. Like one of the things that you don't want is for Python to load a module twice, because that violates the way modules work, that they only get loaded once. So if you had two threads trying to import the same module, one of them has to do it, the other one has to wait. The other thing that is tricky is that the thread that has to wait can't use like a half-loaded module either. So it turns out there is a bunch of just nasty stuff with threads and import. I think a lot of that comes from code that does import statements inside functions-

29:27 Yeah, yeah, that's kind of a temporary imported throw-it-away sort of thing right?

29:31 Yeah, right, sometimes people will put like an import inside of a function and then, if that function happens to execute within a thread, the import might not happen until some thread hits that function. And then, all of a sudden you have concurrent imports taking place.

29:50 Yeah, that doesn't sound good. So does Python take care of it for us in the end?

29:53 Python takes care of it for us, thankfully, but some of the code for that is really nasty. I looked at some of the implementation and they are doing all sorts of things like thread deadlock avoidance algorithms, and just all sorts of corner cases. I tried to actually get Python to fail with some of the cases that they checked for. I was not successful in doing that, which I think gives me maybe some bit of relief, thinking "Wow ok, if I can't make it fail, then it is unlikely this would happen in most code", but yeah, there are a lot of just really weird nasty corner cases on that.

30:35 Yeah, the whole threading story in Python looks a little shadowy, like it doesn't get exposed to the light nearly as much as a lot of the other pieces. Less used I guess.

30:46 Yes and especially if it interacts with something like import. I mean, you know, already import is pretty dark magic and you combine that with threads and then you are in a weird place at that point, so...
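A small sketch of concurrent imports from inside a function, not code from the show. The module `slow_demo` is hypothetical and deliberately slow to load; it logs a line each time its body runs so the number of executions can be counted afterwards.

```python
import importlib
import os
import sys
import tempfile
import threading

workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "import_log.txt")

# A deliberately slow module: it logs, sleeps, and only then defines READY.
source = (
    "import time\n"
    "with open({!r}, 'a') as log:\n"
    "    log.write('module body ran\\n')\n"
    "time.sleep(0.2)\n"
    "READY = True\n"
).format(log_path)

with open(os.path.join(workdir, "slow_demo.py"), "w") as f:
    f.write(source)

sys.path.insert(0, workdir)
importlib.invalidate_caches()

results = []


def worker():
    import slow_demo  # the import happens when a thread first gets here
    results.append(slow_demo.READY)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with open(log_path) as log:
    runs = log.read().count("module body ran")

print(runs)      # the module body executed once, not four times
print(results)   # every thread saw the fully loaded module
```

One thread does the load while the others wait on the import lock, which is why none of them trips over a missing `READY`.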

30:59 Yes, you are. So you also said that you can reload modules programmatically, and Python doesn't do that for you but you can. But you also said that's not a super good idea. What's the story of that?

31:10 I would not do that. I mean, so Python has traditionally had this reload function. In Python 2, you have the reload thing that would reload the module, and then there has always been this kind of advice surrounding it, like you can do it but you should basically never do it because puppies will die or something if you do that. There was some kind of vague ominous sort of mystery surrounding that, and then in Python 3 they just took it out altogether as a built-in... Although you can find it in one of the libraries, I think it is importlib or something like that. The thing that I think is kind of interesting about reload is that it's one of these things that somebody could think might be an interesting idea, like "ok I have some Python code running in a server somewhere, and I want to make a code change and then have it load up into my server without restarting or something like that"

32:14 Yeah, maybe I've got a website set in there running and I do not want to have to deal with it, I want to just notice if the new file gets dropped in here just pick it up and run with it, right?

32:21 Right. You know maybe somebody has seen like a demo of Erlang somewhere at some place and they are like I want to be able to do that in Python, you know, do like a hot swap of code or something on the fly. So in the tutorial I talked about reload a little bit, is there anything that you could do to make that work. So, here is some of the problems with the reload. One big problem with it concerns instances of objects that you have created. So let's say you have some code, you have a bunch of instances of classes kind of floating around; if you reload all of the class definitions, what ends up is you have all these existing instances that are basically using the old code. And then any new instance you would make will end up using the new code so you would actually end up with like instances in your code using two different class definitions at the same time.

33:16 Oh, that's kind of crazy. I suspect it runs, but things like static class level data might be kind of broken.

33:23 Static stuff breaks; another thing that breaks horribly is the super call, if you ever used that, one of the arguments to super is the class that you are working with. If you ever do that you end up getting this extremely cryptic error message.
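A minimal sketch of both reload problems, not code from the show. The module `shapes_demo` and its classes are hypothetical; `importlib.reload` is the library function mentioned above. Bytecode writing is switched off only so the quick rewrite of the file cannot be masked by a cached `.pyc`.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep cached bytecode out of the demo

workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "shapes_demo.py")

VERSION_1 = (
    "class Base:\n"
    "    def describe(self):\n"
    "        return 'base'\n"
    "\n"
    "class Circle(Base):\n"
    "    def describe(self):\n"
    "        # super() with the class named explicitly, as discussed above\n"
    "        return 'v1 ' + super(Circle, self).describe()\n"
)

with open(module_path, "w") as f:
    f.write(VERSION_1)

sys.path.insert(0, workdir)
importlib.invalidate_caches()
import shapes_demo

old_instance = shapes_demo.Circle()
old_class = shapes_demo.Circle

# "Deploy" new code and reload it into the running program.
with open(module_path, "w") as f:
    f.write(VERSION_1.replace("v1", "version 2"))
importlib.reload(shapes_demo)

new_instance = shapes_demo.Circle()

# Two different class objects now answer to the name Circle.
print(old_class is shapes_demo.Circle)                # prints: False
print(isinstance(old_instance, shapes_demo.Circle))   # prints: False
print(new_instance.describe())                        # prints: version 2 base

# The old method looks up the name Circle, finds the NEW class, and super()
# rejects old_instance because it is not an instance of that class.
error_type = None
try:
    old_instance.describe()
except TypeError as exc:
    error_type = type(exc)
    print("old instance failed:", exc)
```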

33:44 Something about kittens?

33:45 Yeah, about kittens dying. Although- here is kind of the wild thing- you actually can kind of hack this. I mean, one of the things that you can do with Python objects is that you can change the __class__ attribute. I don't know if anybody's done that, but all objects have this magic attribute __class__ that sort of points to the class; you could do some reloading hacks where on a module reload you go through all of the existing instances and then flip their __class__ attribute to the newly loaded class. All of a sudden it's using the new code.
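A rough sketch of that hack, not code from the tutorial and not a recommendation. The module `greeter_demo` is hypothetical, and scanning `gc.get_objects()` is just one assumed way to "go through all of the existing instances"; a real application would need its own bookkeeping.

```python
import gc
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep cached bytecode out of the demo

workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "greeter_demo.py")


def write_version(message):
    # Write a one-class module whose greet() returns the given message.
    with open(module_path, "w") as f:
        f.write(
            "class Greeter:\n"
            "    def greet(self):\n"
            "        return {!r}\n".format(message)
        )


write_version("hello from the old code")
sys.path.insert(0, workdir)
importlib.invalidate_caches()
import greeter_demo

existing = greeter_demo.Greeter()
old_class = greeter_demo.Greeter

write_version("hello from the reloaded code")
importlib.reload(greeter_demo)
new_class = greeter_demo.Greeter

before = existing.greet()   # still runs the old method

# Flip every surviving instance of the old class over to the new class.
for obj in gc.get_objects():
    if type(obj) is old_class:
        obj.__class__ = new_class

after = existing.greet()    # now runs the reloaded method

print(before)
print(after)
```

This only swaps behavior; any instance data laid out for the old class stays as it was, which is one reason the approach gets messy fast.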

34:25 Yes, maybe it's possible. If somebody really wanted to change the standard library so that it supports this, maybe it could be done, but maybe it's not really the best idea.

34:38 My gut feeling is that you could probably do it if you wanted to surround your application with some 10,000 lines of code to manage it in some kind of sane way, and then maybe it would work. I am just not sure it is even a tractable problem to solve in the big picture.

35:01 Maybe it's not a good idea anyway. With things like Docker and microservices and so on, it's probably less of a big deal to just restart your app?

35:14 Yeah, that's my feeling too. I have to admit that with the code I am working on now, if I want to do a deployment of new code I just kill -9 the old one.

35:25 Yeah, sure.

35:28 I mean, the system actually has sort of a monitor or watchdog or something that just watches to see whether the thing is running or not. If it's not, it restarts it.

35:39 If you want to do a deployment you just kill the old one and then it will automatically respawn itself at some point, so...
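
The kill-and-respawn approach can be sketched as a tiny supervisor loop. This is only an illustration of the idea, not the system described in the conversation: the supervise function, the worker command, and the max_starts limit (there so the demo terminates) are all assumptions. A real watchdog would typically loop forever and is often provided by the operating system or a process manager.

```python
import subprocess
import sys
import time


def supervise(command, max_starts=3, delay=0.1):
    """Start `command`, and start it again each time it exits."""
    starts = 0
    while starts < max_starts:  # a real watchdog would loop indefinitely
        process = subprocess.Popen(command)
        starts += 1
        process.wait()     # returns when the worker exits or is killed
        time.sleep(delay)  # brief pause before respawning
    return starts


# Hypothetical worker: a short-lived Python process standing in for the app.
worker = [sys.executable, "-c", "print('worker running')"]
starts = supervise(worker)
print("worker was started", starts, "times")
```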

35:46 Yeah. That works. Nice. So, you talked a little bit about the reloading stuff being different in Python 3. Is there other stuff that you are aware of, like dramatically different from Python 2 to Python 3 like if I am down at the package level, writing code, worrying about those things, do I have to do something to make my Python 2 code friendly to Python 3 upgrades, some things like that?

36:09 I wouldn't say there is a huge number of differences on the Python 2 versus 3 side. One thing that might impact people is that in Python 3 you can't do an implicit relative import within a package. So if you had a package directory spam and you've got two files in there, foo.py and bar.py, in Python 2 one of those files, like bar.py, can just say import foo and it will find it in the same directory. That does not work in Python 3. You would have to say from . import foo or something. So this relative import feature is one place that might break going from 2 to 3. I'm not really aware of much else though. I have been using 3 for a while and I haven't really noticed anything else that would break across versions like that.

37:04 Ok, that's cool. So that from-dot style of relative import, does that also work in Python 2?

37:10 That also works in 2-

37:13 It's just there is another syntax in 2 that wouldn't work in 3, right?

37:15 No, it is the same syntax in 2 and 3, actually.

37:18 Oh, Ok.

37:20 The syntax works in both. It's just that Python 2 also lets you do the implicit relative import, and they have taken that away from you in Python 3...
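
The difference can be checked on Python 3 with a throwaway package. This is a sketch under assumptions: the package and module names (spam_pkg, foo_sibling, bar, bar_implicit) are invented for the demo, and it assumes no top-level module called foo_sibling is installed.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, "spam_pkg")  # hypothetical package name
os.mkdir(pkg)


def write(name, text):
    with open(os.path.join(pkg, name), "w") as f:
        f.write(text)


write("__init__.py", "")
write("foo_sibling.py", "VALUE = 42\n")
# Explicit relative import: valid in Python 3 (and in Python 2.5+).
write("bar.py", "from . import foo_sibling\nVALUE = foo_sibling.VALUE\n")
# Implicit relative import: Python 2 only; Python 3 treats it as absolute.
write("bar_implicit.py", "import foo_sibling\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

explicit = importlib.import_module("spam_pkg.bar")
print(explicit.VALUE)  # 42

try:
    importlib.import_module("spam_pkg.bar_implicit")
    implicit_ok = True
except ImportError as exc:
    implicit_ok = False
    print("implicit relative import failed:", exc)
```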

37:29 Yeah, Ok, yeah that is what I was thinking about. Ok, cool. What are the things that- I don't know, I wanted to ask you about what surprised you about going through like this deep dive into this world, like, what do you feel like you learned from this adventure?

37:42 I think one takeaway from looking at it is that there has really been an effort at cleaning up a lot of hacks. A lot of the features that have been built into the new import machinery are actually solving problems that people have been solving by hand in Python 2 for maybe 10 years or more. You will see some feature in Python 3 and it's like, "Oh, this is kind of interesting, why did they do that?" And you chase it down through the PEPs and things like that, and you realize that the motivation came from something that somebody had done in Zope or some other big Python package. They just sort of rethought it in Python 3, and what is interesting is that the old hack somebody would have done in Python 2 is just completely unneeded at this point. It's been cleaned up in a totally different way.

38:39 That's really cool, it's just not a problem anymore. Whereas you used to need this kind of special knowledge to survive whatever case you were dealing with, right?

38:46 Yeah, I think that's actually a theme with a lot of Python 3, not just imports. If you look really deeply at a lot of things in Python 3, they are solving problems that people have been dealing with for a long time, just trying to simplify them or make them more sane, if you will. I think that runs throughout the language. You see a lot of cleanup, like "Ok, you no longer have to do this kind of weird hack because it just works in Python 3". The tricky thing is that it's a really hard selling point. If you are trying to convince somebody to go from Python 2 to 3 and you say "Well, Python 3 is better because it cleans up this weird hack that somebody was doing in Python 2 ten years ago", that's often not a compelling story.

39:37 Sure. It's hard to say "You know what, it's easier for the people writing Python 3 the standard library stuff to maintain it so you should use it". "Ok, I don't care about that, that's not my problem!" Right?

39:49 Yeah, and actually, here is one example of some Python 3 stuff that I thought was surprising: right now in Python 3, you can ask the import system to locate a module for you without importing it. You might say "That's kind of an obscure thing", but what is interesting is that it solves a problem that people have when doing what I would call a trial import. You've probably seen this pattern from time to time, where somebody will write a try statement, try to import a module, catch an ImportError exception, and then maybe take some action if the module doesn't exist-

40:33 Right, maybe you do some sort of polyfill, or try to load the Python 2 version versus the Python 3 version, or something like that, right?

40:40 Yeah. And it turns out there are some really weird, obscure failure modes of that. Somebody might try to import a module, and maybe the module exists, but it can't import some other module. Then you end up with these weird failure modes where you get error messages related to the wrong thing, which might point you in completely the wrong direction.

41:05 Yeah, you might tell the user "Hey, make sure you have these packages installed" and they are like "I do have these packages installed"

41:09 Right, right, you have like a false message, saying "Hey I wasn't able to detect your package" and then the users are like looking at their directory and they are cursing because they are like "Wait, it's right there, I'm looking at it, like why can't you find this"

41:21 I'm going to email this guy the pip list and show him it's here.
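
The misleading trial import can be reproduced with a module that exists but depends on something that does not. The module names here (fancy_plugin_demo, dependency_that_is_missing_demo) are made up for the sketch; the name attribute on ImportError shows which module was really missing.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
# The plugin itself exists, but it imports a dependency that does not.
with open(os.path.join(root, "fancy_plugin_demo.py"), "w") as f:
    f.write("import dependency_that_is_missing_demo\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# The classic trial import: any ImportError is read as "plugin not installed".
try:
    import fancy_plugin_demo
    message = "using fancy_plugin_demo"
    failed_name = None
except ImportError as exc:
    message = "fancy_plugin_demo is not installed"  # misleading!
    failed_name = exc.name

print(message)
print("module that was really missing:", failed_name)
```

The user sees a complaint about fancy_plugin_demo even though the file is sitting right there on the path.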

41:24 Yeah, right. And so one of the things that you can do now, I guess in Python terminology it's almost a look-before-you-leap kind of thing with import: you can go to the import system and say "Hey, where is this module, do you have this?" And it can tell you whether it has it or not without actually importing it.

41:47 Interesting. What's the code look like for that, it's not try import, or anything like that?

41:50 You have to import a single function, and you just call it, it's like a function call.

41:56 Ok. Yeah there is some library call, you say "hey does this thing exist, show me where it is".

42:02 Yeah. It actually gives you this thing known as a module spec. It tells you a whole bunch of information about the module, like what path it's in, whether it is Python source code, a C module, or a built-in. You can actually find out a lot of information about a module without actually loading it.
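
The single function is not named in the conversation; in the standard library the call that matches this description is importlib.util.find_spec, which is what this sketch assumes. The missing module name is invented for the demo.

```python
import importlib.util

# Locate a module without importing it; the result is a ModuleSpec.
spec = importlib.util.find_spec("json")
print(spec.name)    # json
print(spec.origin)  # where it lives, e.g. a path to json/__init__.py
print(spec.loader)  # the loader that would be used to import it

# A top-level module that cannot be found yields None, not an exception.
missing = importlib.util.find_spec("no_such_module_demo_xyz")
print(missing)      # None
```

One caveat: for a dotted name such as package.submodule, find_spec imports the parent package in order to search it, so the lookup is only import-free for top-level names.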

42:22 Interesting. Will it tell you the version?

42:24 I don't know, because version is usually inside the file.

42:28 Yeah, it is in the init. Ok, well that's really cool.

42:33 You can do some interesting things with that as well, like module stand-ins, for instance. I don't know if this will make sense, but you can basically ask Python to locate a module, and then make a dummy module to take its place temporarily. You make an empty module and program it to automatically load the source code when it is accessed later on. People have probably seen this phenomenon where you do an import of some module and Python just sits there for 30 seconds while it loads the entire universe of code behind the scenes. You could use this to solve loading time issues, setting up modules so that they don't actually load until they are needed. You could do things like that, which are just kind of interesting.
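
A stand-in module of this kind can be sketched with importlib.util.LazyLoader, which was added to the standard library in Python 3.5, so after this conversation was recorded. The lazy_import helper follows the recipe in the importlib documentation; the heavy_module_demo module and the LAZY_DEMO_LOADED marker are invented so the deferred execution is observable.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def lazy_import(name):
    """Return a stand-in module whose code runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the laziness, does not run the code
    return module


# Hypothetical "heavy" module that leaves a marker when its body executes.
root = tempfile.mkdtemp()
with open(os.path.join(root, "heavy_module_demo.py"), "w") as f:
    f.write("import os\nos.environ['LAZY_DEMO_LOADED'] = '1'\nANSWER = 42\n")
sys.path.insert(0, root)
importlib.invalidate_caches()
os.environ.pop("LAZY_DEMO_LOADED", None)

heavy = lazy_import("heavy_module_demo")
loaded_before = "LAZY_DEMO_LOADED" in os.environ  # module body not run yet
answer = heavy.ANSWER                             # first access triggers the load
loaded_after = "LAZY_DEMO_LOADED" in os.environ

print(loaded_before, answer, loaded_after)  # False 42 True
```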

43:25 Could you do something insane like kick off another thread, to import it, and maybe it will already be loaded by the time the code needs it?

43:33 I hadn't thought of that, but yeah, maybe.

43:36 That probably would be wrong, but-

43:39 Insane actually. Yeah so you could do like

43:41 Like a lazy load. Lazy import.

43:43 It would be sort of a lazy load sort of a lazy concurrence.

43:47 Yeah, I hadn't thought of that but yeah, that is sort of devious. I guess you could do that. You might have to explain that, in a code review or something like that.

44:04 Could you just tell me why this is here? Who knows, maybe someday Python will have a lazy import keyword, but probably not.

44:12 Yeah I'm just trying to think how that would work, I mean, I guess that would be-

44:17 I think you could just kick off a thread and just do nothing but import it, right? And then technically I mean, Python should manage that concurrency and if you don't end up calling the function that actually needed it until later, maybe, I don't know.
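
The thread idea floated here can be written down in a few lines, with the same caveat the speakers give: it may well be a bad idea. The sketch assumes a hypothetical helper import_in_background and uses decimal only as a stand-in for a slow module. The import system takes per-module locks, so a later import of the same name waits for the background one instead of loading the module twice.

```python
import importlib
import sys
import threading


def import_in_background(name):
    """Kick off the import of `name` on a separate thread."""
    thread = threading.Thread(target=importlib.import_module, args=(name,))
    thread.start()
    return thread


thread = import_in_background("decimal")

# ... the main thread could do other start-up work here ...

thread.join()   # in this demo we simply wait for the import to finish
import decimal  # already in sys.modules, so this returns immediately

print(decimal.Decimal("0.1") + decimal.Decimal("0.2"))  # 0.3
```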

44:33 I'm thinking how this could be useful. Like interactively, you could type the import and it would come back instantly.

44:40 Oh look they made it fast.

44:40 The person using it would think "Python is awesome, it's so fast!", not realizing that it is actually importing the universe in a thread.

44:55 It’s probably wrong.

44:58 Well a lot of things are wrong, though.

44:59 It's true. So another thing I wanted to ask you about is the __init__.py file. It kind of feels like it is there mostly to structure your imports, so you say import this module, this module, this module, put them in the top-level namespace and so on. But you can write arbitrary code there. Do you think that kind of thing is abusing the intent of the file, or is it just taking advantage of what it should be able to do?

45:26 I see modules that do that. I don't have a strong opinion on it, although I have to say it kind of rubs me the wrong way a little bit to see tons and tons of code in the init file, partly because that's not where I am expecting to see it. If I am looking at somebody else's code and I want to know "Oh, where is the source code for the database object?" or something like that, I'm more inclined to look for a file called db.py or something. It doesn't occur to me that the code would be in the init file.

45:59 Yeah, I kind of feel the same way.

46:04 I try to keep the init files kind of small.

46:08 Yeah, I agree. Like I said, more small files is my style as well, although sometimes, you know, I feel like I'm importing a thousand things.

46:17 Yes. I mean, I have seen that technique used. I sometimes run into it where somebody started with just a single-file module and then all of a sudden they want to ship unit tests with it, maybe in a sub-directory of unit tests, and they will do things like "Oh, I'll just take my whole module and drop it into the init file."

46:38 How could that be wrong?

46:40 You know, and then I'll have like a separate test sub-directory or something, or I'll put my unit test so... I don't know.

46:48 Ok, there are two questions that I like to ask my guests near the end, so I'll ask you those now. Do you have any favorite PyPI packages or libraries that you think are really cool, where you just want to say "Hey world, check out this package, it's really awesome"?

47:05 Hm, ok, there are the obvious ones, I'm a big fan of Pandas-

47:11 Pandas is cool

47:11 For data analysis. Things like Requests and SQLAlchemy, I think these are used by a lot of people. I'm trying to think of a more obscure cool one...

47:28 While you are thinking, one that I was reminded of yesterday is something called "passlib", which does all the management of hashing passwords correctly and so on. So you can say: here is the password, I want to use SHA-512 hashing, and please iterate that hash 40,000 times so it is super computationally expensive. And that's like one line of code, it's beautiful.

47:50 Ok, I will have to look at that. Actually, here is a standard library module in Python 3: I had some people in a class who were fixated on manipulating IP addresses, and I was like "you should look at the ipaddress module". That is something in the standard library. In 3.4 there is a whole module just for manipulating IP addresses.

48:14 Wow, I don't think I've even touched that thing.

48:16 Yeah, iterating over like subnets, and that might be something to look at, kind of blow people's mind.
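
Iterating over subnets with the ipaddress module looks roughly like this. The network and address values are arbitrary examples chosen for the sketch.

```python
import ipaddress

network = ipaddress.ip_network("192.168.0.0/24")

# Split the /24 into four /26 subnets and iterate over them.
subnets = list(network.subnets(new_prefix=26))
for subnet in subnets:
    print(subnet, "has", subnet.num_addresses, "addresses")

# Usable host addresses in the first subnet (network and broadcast excluded).
hosts = list(subnets[0].hosts())
print(hosts[0], "...", hosts[-1])  # 192.168.0.1 ... 192.168.0.62

# Membership tests work directly between addresses and networks.
address = ipaddress.ip_address("192.168.0.70")
containing = [s for s in subnets if address in s]
print(containing[0])  # 192.168.0.64/26
```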

48:23 Yeah, that is pretty cool. Alright, awesome. And the other question is what editor do you like?

48:28 I use Emacs.

48:29 Emacs, all right. Yeah I used Emacs back in the day, these days I've been doing PyCharm, I guess I'm patient. Waiting for stuff to start to get all the features, I don't know.

48:39 I have too much invested in Emacs, at this point to give it up, you know.

48:52 Absolutely, we could almost have a Python 2 versus Python 3 type debate with Emacs versus Vim but we won't go down that path right now.

48:59 I like to use Emacs to kind of troll people a little bit. It's fun. Although actually my big shameful thing with Emacs I have to admit that I never customize it, I know it has all that stuff in there- never go there. I have too many other projects to work on than that, so ...

49:24 You don't need to change editor just get some work done.

49:27 Yeah, just get the work done. I actually wish they would fix IDLE a little bit.

49:32 Yeah, that could use some help.

49:34 There is a lot of hate on IDLE, and I use it a fair amount to teach classes. It has that one feature: it comes with Python. So if you are looking for an easy way to get started without having to install stuff, it's great, but it could definitely use some love these days, so...

49:55 Yeah, I agree. All right, David, I think that might make a show for us. That was a really interesting conversation. Before we head out there are two things. One: I want to say people can find your tutorial on YouTube. In the previous show I mentioned a playlist of my favorite talks from PyCon 2015, and I'll put that in the show notes, but you could just go to bit.ly/pycon2015mk. That is a list of about 30 really good sessions from PyCon 2015 and yours is definitely in there.

50:27 Ok, great.

50:28 Awesome. Anything else you want to give a shout-out to? Let people know about?

50:30 I don't know, if you like the tutorials come take a class with me in Chicago. That would be the only thing I would say.

50:36 All right, awesome. You have got a website, they should check out for that.

50:39 They can find it on my personal site.

50:41 Ok. Cool, I'll put that in the show notes as well.

50:45 Ok.

50:45 All right David, thanks for being on the show, it's been fun.

50:47 Ok thanks a lot.

50:48 Today’s guest was David Beazley, and this episode has been sponsored by Codeship and Hired. Thank you, thank you, thank you for keeping this show going. Please check out Codeship at codeship.com and thank them on Twitter via @codeship. Don't forget the discount for listeners, it's easy, TALKPYTHON all caps no spaces.

50:48 Hired wants to help you find your next big thing. Visit Hired.com/talkpythontome to get 5 or more offers with salary and equity right up front, and a special listeners' signing bonus of 4000 USD.

50:48 Remember, you can find the links from the show at talkpythontome.com/episodes/show/12. Be sure to subscribe to the show. Open your favorite podcatcher and search for Python, we should be right at the top. You can also find the iTunes and direct RSS feed in the footer of the website.

50:48 This is your host, Michael Kennedy. Thanks for listening!

50:48 Smixx, take us out of here...

50:48 [music]
