00:00 If you run into a problem with some API or Python code, what do you do to solve it? Well, personally, I throw a few keywords into Google, sometimes before even checking the full docs. Works great. But why does it work so well? Because invariably an excellent conversation and answer from Stack Overflow comes back as the top result, and it's usually just what I needed. This week, you'll meet Martijn Pieters, one of the top Python contributors at Stack Overflow, with over 16,500 questions answered and a reputation of over half a million. This is Talk Python To Me, Episode 86, recorded November 2, 2016.
00:40 Developer, in many senses of the word, because I make these applications, but I also use these verbs to make this music. I construct it line by line, just like when I'm coding another software design. In both cases, it's about design patterns. Anyone can get the job done; it's the execution that matters. I have many interests.
01:00 Welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython. This episode has been sponsored by Rollbar and GoCD. Thank them both for supporting the podcast by checking out what they're offering during their segments. Martijn, welcome to Talk Python.
01:30 Hi there. This is exciting.
01:31 Yeah, it's an honor to have you here. You've done some really amazing work in the Python space on Stack Overflow. We're going to spend a lot of time talking about what you've done there and what your experiences were, but also survey some really cool questions that you've pulled out for us to talk about, ones that are representative of good work and, basically, of the Python community at Stack Overflow. But of course, before we get to any of that, let's start with your story. How did you get into Python and programming?
01:59 I started programming quite early on. My father was a computer engineer; he fixed computers the way you fix a bike: you find the part that breaks and you solder in a new one. Back in the 70s and 80s I was given an MSX, and I basically haven't looked back since. As for Python, I started with Python in 1999, I think, when I was a web developer and I found a new platform called Zope, which lets you think of web pages as components, as objects, which was interesting. That got me into Python at the same time. On my first project I gave a time estimate and beat the time estimate, which blew away my manager, because that had never happened before; I always went over time, as all engineers do. That was my start with Python. It was love at first sight, I think.
02:53 Yeah, that's great. And what kind of project was that, that Zope project?
02:56 It was for a befriended company that was selling software on the web. And when you sell software on the web, you need a web shop. So I built a web shop. And because their software was already built in Python, it was very easy to incorporate their license generator into it, too.
03:15 Oh, of course. Yeah. So they were selling Python software, and you needed to run their licensing. It's just like, yeah, we'll just, you know, import that, right? Beautiful. That's great that it was quick and easy. But I think building e-commerce sites today is quite a bit easier than it was back in 1999. What do you think? Have you built one recently?
03:33 I tend to stay away from shops. In my career I've mostly done content management. So think about an intranet website where everybody has to be able to give their input as to what the company is doing, or share things between teams.
03:49 Right? So mostly Content Management stuff, which is really, really valuable and helpful. What do you do today? You work at Facebook, right?
03:57 Today I work at Facebook, and I do very different things. I don't work on the website much anymore. As you know, Facebook uses PHP for that, a strange language. I work on the backend side, on the source control side, where I work on Mercurial, an open source project that lets you do distributed source control.
04:17 Oh, that's cool. Yeah, so Mercurial is kind of a cousin to Git, a similar type of source control. What's Mercurial written in?
04:26 Mercurial is written in Python. It's a Python project. And because it's a Python project, we were able to make it scale. At Facebook, everything is crazy. Everything is just over the top: high numbers, and scale off the charts. To be able to follow the developers and all the changes they make every day, we couldn't actually make Git scale there, but we could make Mercurial follow. Because Mercurial is written in Python and is highly extensible and highly tweakable, we can make it work at the crazy levels Facebook is at.
05:00 That's really cool. I'm sure it's so exciting to work at Facebook.
05:03 Absolutely. It's a great place to be.
05:04 Do you work remotely? Or do you live there in Cambridge and work in London?
05:08 So I commute four days a week to London to work, and one day a week I stay at home.
05:17 How much source code does Facebook have? When you talk about scaling it, is it the number of people using the system? Is it the amount of data? What does the scale look like?
05:24 It's everything. We put everything into one monorepo. A monorepo means that we try to put everything into one place, one repository, so that you can efficiently handle all the sharing and moving of code, and that really helps speed things up. But at the same time, you can easily get into file counts of six figures. So trying to get a full checkout of that is not always the best thing to do. So we have issues with the number of files, the number of directories, the number of commits per day, the number of developers that work on it. All those things play a role.
05:58 Yeah, I can imagine that. Wow. And Google also does this monorepo thing. This is a really interesting idea to me, when you have that many files. Have you open sourced any of these tweaks, or written anything about them?
06:07 Lots of this stuff is open source. Facebook has a repository with their own extensions in there. They are perhaps not immediately something you can use, but there's lots of inspiration there. They also contribute straight back to the Mercurial project itself. Until recently, the founder of Mercurial, Matt Mackall, worked at Facebook. He recently decided that after years of Mercurial work he wants to do something different. So he moved on to other things, and the Mercurial project is now trying to make its way without him.
06:44 Wow, yeah. So you had the guy there to help you build this up. That's really interesting. Let's talk about Stack Overflow. Maybe tell the world what Stack Overflow is, for the one person who doesn't know it.
06:54 Stack Overflow is a site for programmers, and the goal is to help programmers find answers to their common problems. It was born from frustration with forums: the general forum where someone says that they have a problem with this code, and could someone help them, and 20 people pipe in to say they have the problem too and don't know the solution, and maybe on page 15 someone has got the solution. And then it disappears into a huge pile of other posts about different things, and you never find it again. Stack Overflow is very much aimed at trying to surface the question and the answers. So you have just a question, you have answers, and they use gamification to get the best answers to the top. So the general idea is that people thought this was a question worth asking and that it is complete, and these are the answers, and this one's the best because people put so many votes on it. The idea is that you get an encyclopedia of programming knowledge.
07:52 And, you know, Jeff Atwood and Joel Spolsky, the two founders, they have nailed that in some special way, right?
08:00 Like, they found the secret sauce.
08:01 They definitely found a secret sauce. And it's working really, really well. Pretty much any programming question you have, if it has technical detail to it, there's probably a Stack Overflow result in the top five Google results.
08:16 Exactly. For the most common programming problems, you will find the solution there.
08:20 Yeah. And it's not just, well, you should put this curly brace here or whatever. It's much deeper than that, right? I think there are a lot of really interesting questions, as we'll see.
08:31 Absolutely, absolutely. There's such a wide range of things that get answered there, provided you stay on topic, which is sometimes a little harder.
08:39 Yeah, it can be. They're very careful about making sure things stay on target and focused and don't basically become the forum that they're trying to escape from. So let's start with what makes a good question on Stack Overflow. I guess before you answer that: I was looking at your user profile, and there are some pretty astounding statistics there. For people who don't know, you have a little over 500,000 for reputation, you have 16,000-plus answers, and it says you've reached 22.9 million people. That's really amazing.
09:21 The numbers keep on growing because I like answering; I like finding solutions to problems. It is kind of addictive. It is addictive because of the feedback you get, and you learn so much yourself from this. And I love finding the key to the problem. If someone has a particular view on the problem, they have a certain idea about how things would work or not work, and they clearly don't understand something, I love finding the key that helps them suddenly see: oh, that's why, this is how it works. So I prefer not just to give you the solution; I also prefer to tell you why.
09:57 Yeah, I noticed that from your answers.
09:59 Yeah, in doing that, you learn yourself whether or not you know why. And that helps me grow as a programmer.
10:05 Well, yeah, I think that's a really good point. Because it's one thing to say this code doesn't run as I expected, and you could say, well, change it to this, and it would. It's another to say: well, the reason your code doesn't work is, if you really look inside, say, the CPython interpreter, it's doing this, and when it does that, here's what it means, and so here's how you fix it. That enriches both the person's understanding and yours as well, like you say.
10:32 Exactly. I get to learn, and apply it elsewhere as well. I can use it in real life, and my work is informed by what I learn from answering on Stack Overflow.
10:42 Yeah, as I'm sure it is, I'm sure it is. Let's start with what makes a good question on Stack Overflow.
10:47 So Stack Overflow is strict about what it accepts for questions. Over the years, we've learned that certain things don't work. So certain things aren't on topic, and we stay away from those sorts of things, like asking what the best web framework is. "What is the best" is very subjective, and it's very hard to answer without a little more detail about what you're doing, and most people don't give that detail. What also happens is that the spammers come around, the people that have something to sell, and they say ours is the best, and then you don't know which one to believe. So those kinds of things are off topic. The same goes for opinion-based things. And we don't want to end up writing books either. So if you want to know about the best practices for using object-oriented programming, or when to use one technique or another, that would rapidly be closed as too broad. Once you narrow it down to very specific programming problems, then you already start being on topic. And then what you definitely need to do is, first of all, realize that these questions are not just for you; we want them to be there for everybody. So someone else coming in from Google later on needs to be able to recognize that they have the same problem. The other thing is that if you want answers, you need to help the people that answer you. So you want to show the context and the research that you've done, and you want to narrow down your problem to the essentials. So if you come in and say, I have this XML document that I want to parse and get this information from and write it to a file, but I want to transform this information this way and that way, how do I do this? You're going to end up disappointed.
12:25 Yeah, and the XML file is like 1,000 lines long with all sorts of complicated stuff. And they haven't focused in on just the essence of the problem. They're basically like, solve this problem for me, right?
12:35 Exactly: solve this problem for me. But what you need to do at some point is break this down into separate issues. Perhaps you've already succeeded in loading the XML file and parsing it; you've already made choices there. Tell us about those. So now you've narrowed it down to perhaps only finding specific information in that file. Leave out the part where you're going to transform it and write it somewhere else. Those are separate problems, separate steps. And once you narrow it down to one of these steps, and show us what you put in there, what you expected to come out, and what happened instead, with the error message and the full traceback and everything that you see, then we can see it too. And often you will come to a very quick answer. And again, I cannot emphasize this enough, you have to have done your research, because it could be that the question was already asked and answered before, and your question will be closed as a duplicate.
13:30 Yeah, so Stack Overflow is actually pretty good at helping you find existing answers while you're asking, in case you were ignoring the fact that they might exist. I guess that's the way to think of it.
13:41 We don't hate duplicates; we don't dislike duplicates. The internet is, of course, very impersonal. You can't see the other person smile or whether they're warm. So people often see it as a personal put-down when their question is closed as a duplicate. Sometimes people actually go, "I'm sorry I asked a duplicate." And to that I say: it's all right, duplicates happen. You can't always know exactly what you're searching for. We can help with that. And duplicates are actually valuable to Stack Overflow as well, because what happens is that you've now added more keywords for people to search on. So when you created the duplicate, you asked a question; it may have been well asked, but it was a duplicate. Google still has your question indexed. And when someone clicks on it on Google, we actually redirect new visitors straight to the canonical question that yours was a duplicate of. So at that moment you've helped someone else find the same solution.
14:37 Oh, that's interesting. I didn't really think of it that way. So the research part is really important. I suspect it really affects the way that you feel about answering the question, if somebody has just put zero effort, or what appears to be zero effort, into finding their solution, and they're like, I don't know, let's put it on Stack Overflow, see if somebody will do this for me. That probably affects your willingness to give a good, solid answer, right?
15:00 There are just certain classes of questions that we've seen so many times now, so often answered, so many times closed as a duplicate, these common errors, that some people may lose patience a bit. And then you get: had you done your research, you could have found it in the first five results on Google.
15:18 Right. I took your title, I put it into Google, and here's the Stack Overflow question that actually has 10 answers and 100 upvotes and whatnot. Interesting.
15:29 Exactly. So it's the same kind of thing. I think there was a funny video online, "What if Google was an actual person?", where you see someone in an office and people come in and ask "How is babby formed?" to this person's face. And the person behind the desk, being a comedian, does a great job of looking frustrated at having been asked this question so many times already.
15:50 Yeah, exactly. Nice. Okay. So I think everyone's got a good sense of, more or less, what makes a good question. Let's talk about some of the ones that you found that are noteworthy or whatever. First of all, how many Python questions are on Stack Overflow? Do you know?
16:08 Oh, if I look it up very quickly, it's about 650,000 questions today that have been tagged with Python. There can be more: there are several version-specific subtags, Python 2.7 or Python 3.5, and not everybody uses the main tag when they ask about a specific version.
16:33 Right, it might be just tagged Django, but it would also count as Python.
16:36 That might just be Django. Specific library questions about Django, Flask, pandas or NumPy, those kinds of things, they're all Python questions too, but they might lack the tag. So there might be more.
16:49 All right. Probably, yeah, but it's probably a subset. Okay, cool. So out of these 650,000-plus
16:57 questions, you've chosen a few
16:59 that are really good. And you said that some of the questions that are really interesting are these "wat" questions, W-A-T.
17:09 This comes from a talk by a man named Gary Bernhardt, who did a short, very funny talk about "wat" moments. Those are the moments when you go: it really shouldn't be doing that; that is such a crazy response to the code. Usually these are bugs, or obscure corner cases in the language. They are, for me personally, the most interesting ones, because they will definitely cover areas that I know nothing about, or very little about. If they make me go "wat", then it's definitely something interesting.
18:53 It shouldn't. So if you say set, open paren, and you pass an iterable thing like a list or something, you get one result. If you say curly brace, thing, comma, thing, comma, thing, close curly brace, and you put in the same things that you passed to your set initializer, you actually get a different set, given the properly constructed inputs.
19:18 So at that moment I definitely go: that shouldn't happen. That's weird. I'm intrigued, I'm hooked, and the next hour is going to be spent figuring out what's going on. Where does it come from? Why, why, why? In this case, it was a bug.
19:32 It was a bug. Okay. I'm looking at your answer here, and of course, for everybody who's listening, all these questions and answers will be added as links in the show notes. So I look at your answer here and you're like: all right, here I'm trying a few things and trying to understand it, and, all right, let's just open up the dis module and do a straight disassembly on this thing.
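For readers following along, a disassembly of the kind mentioned here can be sketched with the standard-library `dis` module. This is an illustration only, not the code from the Stack Overflow answer; the names `a`, `b` and `c` are placeholders, and the exact instructions listed vary between CPython versions.

```python
import dis

# Compile a set display built from three names (placeholders for this
# sketch) and collect the bytecode instructions CPython generates for it.
code = compile("{a, b, c}", "<example>", "eval")
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

# The elements are loaded onto the stack first; a BUILD_SET instruction
# then turns those stack entries into the set.
for name in opnames:
    print(name)
```

Calling `dis.dis(code)` instead prints the same instructions in the familiar tabular form.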
19:55 Yeah, because at that point, you're going to have to dig into what the Python interpreter is doing. So you need the bytecode; the bytecode, which is built from the source code, drives the interpreter. And then you can start looking at the interpreter source code and see what is going on. For me, it was actually hard to figure out what was going on, because I looked at the most recent version of Python from source control, so from the Mercurial repository, and the bug had been fixed very recently. So if you do this with a release version of Python, you will see the bug. If you look at the source code as it is today, you won't see the bug.
20:34 I see. Did you go to the source code to try to understand it?
20:40 I went to the source code. My hypothesis was that the set literal is processing these things in the wrong order, that it's doing this the wrong way around. But in the source code, I could see that it was processing them in the right order. So at that moment, I'm trying to figure out what's going on. I couldn't reproduce the issue; it clearly wasn't matching what the code was. So I went to search for issues in the bug tracker, found it, and then quickly realized that I was looking at a fixed version of the code, because it had already been fixed. What was happening is that for all currently released Python versions, if you use a set literal, each element in that literal is pushed onto the stack, and then the stack is taken in reverse to build the set. So the items are always added to the set from last to first. Now, a set tests for equality. So if two objects test as equal, even though they look different to you as a developer and are different types of data, then the first one in the set will win and will stay in the set, and the other ones are rejected; they are not added. That is the case here: someone was using zero, then the complex number zero, so 0j, and an instance of the Decimal class, so a Decimal zero. And because the decimal module in Python 2 doesn't support equality tests with complex numbers, only with integers, the Decimal would test equal to zero but not to complex zero.
22:10 And so depending on the order in which they go in there, it would, yeah, change it.
22:15 It would accept one or the other as being equal to what's already in there and reject it, and therefore you get different results. Yeah, wow. These are definitely corner cases. This is obscure stuff that most people won't come across. But that's the kind of question that I personally enjoy.
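The "first equal element wins" rule described above can be sketched as follows. The sketch deliberately uses `0` and `0.0` rather than the Decimal and complex values from the episode, whose equality rules differ between Python versions, and the last assertion-style check assumes an interpreter that already contains the fix discussed here.

```python
zero_int, zero_float = 0, 0.0

# 0 == 0.0 and both hash the same, so a set stores only one of them.
from_list_a = set([zero_int, zero_float])
from_list_b = set([zero_float, zero_int])

# The element that was added first is the one that is kept.
kept_a = type(next(iter(from_list_a)))
kept_b = type(next(iter(from_list_b)))
print(kept_a.__name__, kept_b.__name__)

# Assumption: on an interpreter with the fix, a set display adds its
# elements left to right as well, so it agrees with set([...]).
literal = {zero_int, zero_float}
kept_literal = type(next(iter(literal)))
print(kept_literal.__name__)
```

With the bug present, the display would have added its elements from last to first, which is how the same inputs could produce a different surviving element.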
22:29 Well, one, it's really interesting to think about why that would even be possible and what's happening. But also, you actually found a bug, or the person found a bug and you verified the bug, I guess.
22:39 in this case, I confirmed the bug.
22:40 Yeah, exactly. How often Has that happened?
22:44 It doesn't happen that often. As far as bugs and corner cases go, Python is a pretty solid piece of software. And when I do find bugs, they are usually very minor. There might be documentation bugs, or one of the modules in the standard library might have some unexpected behavior that could be improved. A fundamental issue, like the order in which set literals are processed, that's pretty rare.
23:12 Yeah, I suspect most of the bugs that are found are up above the standard library. They're in some external package or something like that, right?
23:21 Exactly. Usually far fewer people look at it. It might be a different project that has a different standard for testing. And so I'm more likely to find them in there than I would find them in core Python.
23:32 The next one that you said was really interesting, and I agree, is so simple. The question is like eight or nine words, but it has 3,200 upvotes. And the question is: what is a metaclass in Python, and what do you use them for?
23:32 So this, I think, is an example of a really awesome answer. I'm not talking about the one that's marked as accepted, but the one below it that has the most votes. This is one user who has learned that for certain subjects, it is worth it to write an essay, to write a full answer. His name is e-satis, and his answer for metaclasses is absolutely epic. This is why we have Stack Overflow. This is what needs to float to the top. Metaclasses are an unfamiliar concept to most people. Metaclasses are the types that create classes. So classes are a factory thing that you use to make instances; what makes classes? That's a metaclass. A metaclass produces classes. It can be a mind-boggling and hard-to-grasp concept. But e-satis does an absolutely marvelous job of explaining this anyway. He starts with explaining what classes are and what objects are, and goes deeper and deeper and deeper, and includes not only what metaclasses are and how they work, but also excellent advice as to when you want to use them, which usually is not. But when you do need them, metaclasses are a very powerful concept. And he does, I think, one of the best jobs of explaining that.
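The idea that a metaclass is "the type that creates classes" can be sketched in a few lines. This is a minimal illustration in Python 3 syntax, not an excerpt from the answer being discussed; `Point`, `Registry`, `Plugin` and `CsvPlugin` are hypothetical names invented for the example.

```python
# Calling type() with three arguments builds a class on the fly, which
# shows that `type` itself is the default metaclass.
Point = type("Point", (), {"dimensions": 2})


class Registry(type):
    """Hypothetical metaclass that records every class it creates."""

    created = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.created.append(name)
        return cls


class Plugin(metaclass=Registry):
    pass


class CsvPlugin(Plugin):  # subclasses are created by the same metaclass
    pass


print(type(Point), type(Plugin), Registry.created)
```

As the conversation notes, most code never needs this; a class decorator or plain inheritance is usually enough.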
23:32 Yeah, it's like a little book chapter, almost, that he wrote. I think it's like 15 pages, you know, based on my screen size or whatever. That's pretty impressive, and I really do think it's a great answer. Yeah, metaclasses are definitely interesting and useful. But I think we mostly consume them, right, more than create them, as average everyday programmers?
23:32 As average everyday programmers you mostly consume them, and most people don't know that they even exist. This is something that frameworks use. The Django framework or the WTForms framework use metaclasses heavily, because they get access to all the attributes that sit on a class and can transform them and manipulate them and make them do something magic.
23:32 Yeah, I was thinking of SQLAlchemy's declarative base.
23:32 Yeah, SQLAlchemy is very similar to that. Again, that's a model where you define each of the fields to define something about the model that is a little bit more than just the object you put in there. You define a class, it also models the columns in a table, and you usually want those attributes to behave differently in different contexts. Like when you are creating a new instance, or you are displaying something from the database: at that moment, the class has to behave differently. And that complexity can be neatly encapsulated and hidden by using a metaclass.
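How a framework might use a metaclass to collect declared fields can be sketched as below. This is a toy under stated assumptions, not how Django, WTForms or SQLAlchemy actually implement their models; `Field`, `ModelMeta`, `Model` and `User` are hypothetical names.

```python
class Field:
    """Hypothetical column description used only in this sketch."""

    def __init__(self, column_type):
        self.column_type = column_type


class ModelMeta(type):
    def __new__(mcls, name, bases, namespace):
        # Pick out the Field objects declared in the class body.
        fields = {key: value for key, value in namespace.items()
                  if isinstance(value, Field)}
        cls = super().__new__(mcls, name, bases, namespace)
        cls._fields = fields
        return cls


class Model(metaclass=ModelMeta):
    def __init__(self, **values):
        # On an instance, each declared field name holds plain data.
        for field_name in self._fields:
            setattr(self, field_name, values.get(field_name))


class User(Model):
    name = Field("text")
    age = Field("integer")


user = User(name="Ada", age=36)
print(sorted(User._fields), user.name, user.age)
```

The same attribute name therefore means "column description" on the class and "stored value" on an instance, which is the context-dependent behavior described above.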
23:32 Yeah, it's really, really wonderful. Awesome. Okay, how about the next question: accepting user input until you get a valid response?
23:32 Yeah, I picked that one because it's a very good example of, again, what the Python community around Stack Overflow does. So Stack Overflow itself is a site, but next to that we have a chat site as well. There's chat.stackoverflow.com, where there are various rooms where the members of the community come together to hang out or to chat about all sorts of weird things, and sometimes even about Python. And there's a Python chat room. Because the people there talk about and are connected to Stack Overflow, that room sees lots of patterns happening: questions that keep coming up, certain problems that keep coming up. And there's sometimes a class of them, like using raw_input or input in your Python program to get a response from a user. Most homework starts with: take an integer from the user and put it into something. And that is something many new programmers fall over and keep stumbling into. So this particular question was written by a member of the Python chat room, together with an answer, based on their experience helping people again and again with the same related issues. What do you do to validate input? What do you do when the user has entered it wrongly and you want to repeat it? All the pitfalls are collected into one place as a self-answer. So both the question and the answer were written by one person at the start and then refined later on by the members of the community.
23:32 Interesting. So for this one, they made their own answer a community wiki. I've seen that you can make an answer a community wiki, or you can sort of keep it to yourself; you get reputation if you keep it to yourself, and you don't on a community wiki. Do you want to talk about that?
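The pattern that the canonical question covers, asking again until the input is valid, can be sketched like this. It is a minimal sketch rather than the community answer itself; `ask_for_int` and its `read` parameter are hypothetical names, and `read` exists only so the loop can be exercised without a keyboard.

```python
def ask_for_int(prompt, read=input):
    """Keep asking until the reply parses as an integer."""
    while True:
        reply = read(prompt)
        try:
            return int(reply)
        except ValueError:
            # int() raises ValueError for text such as "abc" or "4.5".
            print(f"{reply!r} is not a whole number, please try again.")
```

In a script this would be called as `age = ask_for_int("Please enter your age: ")`; the loop only returns once conversion succeeds.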
23:32 Exactly. The community wiki is definitely there to lower the bar on improving a post. In this case, the community definitely felt this is something we as a community can do, because it's pretty much needed, and it's scattered around the site, and we don't want it to be owned by one person specifically. So you should feel free to actually edit this and improve it. That's why it's made a community wiki. And the moment you invite more people to edit it, you no longer know who should earn more than other people if there are upvotes on this. So that's completely taken away.
23:32 Yeah, that makes sense. I also see that it's protected; the question is protected. What does that mean?
23:32 Yes, that is a trick we use as moderators, but high-rep users can do the same thing. When a post becomes very popular, what can happen is that lots and lots of new people, so people with low reputation, come in and go, "my solution is unique", and they don't have the experience to see that their solution is perhaps not so unique. Or they don't quite understand yet how Stack Overflow works, and they didn't understand the answer, so they post an answer below it that is really another question, a request that someone explain how the whole thing works. Well, answers are not questions, so that is not the place to post this. A question like this can attract lots of those kinds of answers. I have enough reputation that I can see all the answers on that post, which currently says 10 at the top, but most people can only see the first six posts. So there are four more on that same page that have been deleted, where either someone hasn't understood that their solution is not unique and has basically repeated the same thing or explained it wrong, or that are questions: this one does work, but how?
23:32 Yeah, interesting. So what kind of powers do you get as you go up in reputation there? Full disclosure, my reputation is like 2,000. I've answered some questions, I've asked some questions, but nowhere near enough that I would have any exposure to the stuff that you get.
23:32 there's a quite a wide range of things that we can do. But mostly, what we like to see is that people start, when you start getting invested in Stack Overflow, and you start gaining a reputation. You get experience with how things work. And we trust you more and more. And at that moment, we can rope you in to help. So a lot of these, when once you get past the initial kind of points where you can start voting and starts adding comments which be for to keep spammers out don't give it to you immediately, you start getting into what we call community moderation privileges. So you can start helping the elected community moderators keep cycling, most of the sites is kept by the community. So you can start doing that. You're allowed to edit posts at 2000 points you started, we allow you to edit post without review. So you can go to any other posts and start fixing things. If you see a typo, fine, you have 2000 points or more, we trust you that you can make that change. At some point, you can start helping gardening tags can be marked as synonyms for one another, say, the bison tag is called several synonyms, like by full name, it was not seen as something that needed this separate tag is made a synonym of the bison tag at 200,000 points, you can start helping with that. And then you can start helping close posts or put them on hold and also reopen them again, if you feel that it shouldn't be stayed close. Maybe it was approved and fixed. So now you can be open it again, take wikis, the things that we keep a bit of documentation about each tag is called a tag weekend, that's at some points, you're allowed to edit those without review as well. etc, etc, all the way to 20,000 points. And that's the highest amount of 320 5000 points is the highest level you can get to and we give you access to internal data sets statistics, because then we think you're so invested into this, you really want to know how well we're doing. That's really cool.
23:32 Yeah, and you're a little bit past 25,000, I'd say. So you've had those privileges for a while, but that's really great.
23:32 I think it might be a little past that, exactly, like
23:32 400, I don't know, hundreds of thousands past it. That's awesome. So another question that you put up here has to do with creating class-level variables and trying to initialize one of them with a list comprehension from another, or something like this, right?
23:32 Yeah, this is one of those "what?" moments again. I learned a lot about the Python programming language, about how scopes interact, but also a little bit about how the Python developers definitely see the language as something living, and will make changes if that makes sense. So in this case, this is a weird interaction between scopes for classes, functions, and list comprehensions. Now, normally, when you have a function, that's a scope, and we have the global scope. So we have a local and a global scope, and things are simple when you just have functions. But as soon as you start nesting functions, or you start using functions in classes, you introduce more scopes, and you have to start figuring out how you access names that exist outside of the function. So normally, in a function, if you use a global, it's available, because it can be found in the global namespace as part of the normal search. But as soon as you start using classes, a class body is also a scope. So the class definition, building your class object, has a separate scope of names, and different rules apply there. Let's say you have a class that defines a class attribute, and then you define a method in that class. The class scope is not a parent scope, and is not available to the function in the class, because you normally would bind a function as a method, and then you work with the instance or the class, and there might be multiple of these. And therefore, you need to separate the lifetime of the instance and the class from defining it. So the names in the class scope are not normally available to the functions that you define inside of the class; later, they can access those via the class object that has been created. How does this play with list comprehensions, you ask? Well, the Python developers started out seeing list comprehensions as very simple constructs, no different than a for loop.
If you put the for loop in a function, all the names you use in the for loop, including the target for the iteration, live in the same scope. So with "for name in iterable", each time you iterate through the for loop, the name gets assigned the next value in the sequence, and that name lives in the same scope as everything else in the function, so it's just a local name. So after your for loop is done, you see that local name, and you can still access it. So with "for i in range(10)", the loop will run from zero to nine, and at the end of it, i equals nine is still available to the rest of the code. List comprehensions worked exactly the same, just a local thing in your scope, so the for loop target of that is available. The Python developers then next created generator expressions. A generator expression is evaluated lazily, so you can sort of see it as a function: you first create it as an object that will behave in a certain way, but only when you iterate over it later on. So you deferred
23:32 execution, right. So if you try to, say, access the iteration variable in a generator expression, it may have been created, but it may not have been created, depending on whether you've actually executed the generator. Right?
23:32 Exactly. So at that point, you haven't executed your generator, so these variables don't even exist yet. So generators were given a hidden function scope; they behave just like functions. Everything inside it is local to a new function, and that's hidden, so that you can execute it just like a function. And then the local variables are part of the generator object, like a function's would be, and are cleaned up separately, once the generator object is done. And the Python developers, as they built this, then realized that this also really applies to list comprehensions. List comprehensions and generator expressions are basically the same thing, except one of them is immediately executed and the other one is deferred. So in Python 3, list comprehensions were given this scope too. Now why is this important? I already talked about how classes have a scope, and that the methods in the class can't access the class scope; they have to go through the class. This applies to generator expressions and list comprehensions too. Well, at least, it does in Python 3; in Python 2, list comprehensions are still sort of simple things without their own scope, and that behaves differently. So there was a discrepancy there between Python 2 and Python 3, where things changed. And most of the time, people write simple classes, they write class attributes and write methods, and they don't think about trying to access things that they already defined for the class. But when you're writing a list comprehension to build an attribute of your class, it's very surprising that you can't access anything else you already defined for that class, because it's a scope that is not available to you.
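A minimal sketch of the scoping behavior described above, assuming Python 3. The class, function, and attribute names (`Broken`, `Working`, `base`, `build_broken`, `loop_leak`) are made up for illustration and are not from the Stack Overflow question itself.

```python
def loop_leak():
    # A plain for loop shares the enclosing scope, so the loop target
    # is still bound after the loop finishes.
    for i in range(10):
        pass
    return i


def build_broken():
    # In Python 3 the body of a list comprehension runs in its own hidden
    # function scope. Class-level names are not visible from nested scopes,
    # so `base` cannot be found when the comprehension runs.
    class Broken:
        base = 10
        values = [base + i for i in range(3)]

    return Broken


try:
    build_broken()
    outcome = "no error"
except NameError as exc:
    outcome = "NameError: {}".format(exc)


class Working:
    base = 10
    # The outermost iterable is evaluated in the class scope itself,
    # so referring to `base` here is fine.
    values = [i for i in range(base, base + 3)]


print(loop_leak())
print(outcome)
print(Working.values)
```

The second class shows one common workaround: only the outermost iterable of a comprehension is evaluated in the class scope, so moving the class-level name there avoids the hidden scope.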
23:32 I think this is a really interesting example of understanding why this was changed. Do you know if the Python core developers are paying attention to Stack Overflow and possibly making changes to the next version of Python based on things that appear there? Or does it have to bubble up farther through some other channel?
23:32 Stack Overflow is one of the inputs, one of the sources. There's a long tradition of the Python developers looking at how Python is used. There's a long-standing mailing list called Python Tutor, where you can ask questions and people will answer those. It's mainly a forum that answers sent-in questions. And I have seen plenty of evidence that that source already informs the core developers and how they do things, that they see these common problems. A lot of the core developers are themselves consultants that teach Python, teachers, or resources that help companies use Python better. And they use those inputs too, and Stack Overflow is just another input in a way. I know that several of the Python core developers are regularly answering on Stack Overflow as well. So they definitely get that input too. Yeah, that's,
23:32 that's great. I'm sure they are. But this type of question seems like it can really clearly pull out some of those decisions. And maybe you know, if it came earlier before that was changed, maybe highlight some of the problems. Exactly.
23:32 This portion of Talk Python To Me is brought to you by GoCD from ThoughtWorks. GoCD is the on-premise, open source continuous delivery server. With GoCD's comprehensive pipeline modeling, you can model complex workflows for multiple teams with ease, and GoCD's Value Stream Map lets you track changes from commit to deployment at a glance. GoCD's real power is in the visibility it provides over your end-to-end workflow. You get complete control of and visibility into your deployments across multiple teams. Say goodbye to release day panic and hello to consistent, predictable deliveries. Commercial support and enterprise add-ons, including disaster recovery, are available. To learn more about GoCD, visit talkpython.fm/gocd for a free download. That's talkpython.fm/gocd. Check them out, it helps support the show.
23:32 Speaking of newer versions of Python, the next question is: why is Python 3's super function, the super method, magic? Yeah,
23:32 But this difference also came from feedback on how things work and don't work for Python. So the super function, I'm going to introduce it real quick, is the way you can access methods on parent classes. So when you override a method in a subclass and you still want to use the functionality of the original, you can use super to access methods on classes in your method resolution order, that is, the linearization of all your base classes. The linearization means that we put all your classes in a specific order, and that order is always fixed and predictable. So say you have a subclass, and you want to put an __init__ method in it, and you want to initialize a few more attributes that your more specialized subclass uses, but you still want to use the original __init__ method: you use super to get there. Now, with the way Python inheritance works, I already mentioned the method resolution order, you want to be able to find which class is next in line, so you can call its __init__ method. When you're using multiple inheritance and diamond patterns, it's not always that clear, but Python solves that for you. So you don't have to worry about it if you use super, because it will take care of finding the next class for you.
23:32 Right, there's a predictable, well-known order in which it will search for whatever you're trying to call on super.
23:32 Exactly. We call it the method resolution order; that's the order in which the next one will come up. But super has to be able to know where it is, because your class could be subclassed, and subclassed, and subclassed further. It has to know where on the method resolution order for the current object you are, because that order is not just for the current class, it's for the current instance, which may be of a subclass. It has to know where on that line you are at this point, in this method. So if you have a method on Foo, and you want to call the next one in line, you need to tell super: I am Foo. Then it can find Foo in the line of classes in the whole method resolution order, search past that point, and find the next one in line. Just like
23:32 your current type, the type of object you are, isn't enough, because you could have called down into some intermediate level of inheritance, which is then calling super or something. Like, it can get really complicated, right?
23:32 Exactly. Foo could have been subclassed by Bar, and Bar could have called Foo's method via the super trick, and then the type of self is Bar, and therefore you can't use that. You can't say type of self is Bar and therefore I'm going to start searching from there, because you're no longer in the Bar class, you're now in the Foo class. In Python 2 you explicitly have to tell it: start searching here. The first argument is the current class; the second argument is the current instance, so it can find the MRO from that. That means you're constantly repeating yourself. So you're writing class Foo, and in all the methods that need to use super, you keep saying, this is class Foo, this is class Foo, this is class Foo. So that violates the do-not-repeat-yourself principle, the DRY principle. There are other tricks in Python, like using a class decorator, or simply storing your class under a different name. Classes are objects, so you can use a new variable, a new global variable, call it Spam, and assign Foo to it, and then you can call Spam. So you also cannot guarantee at that moment that the old name still exists. What might happen, if you had a class Foo and you have super(Foo, self) everywhere in there, is that at the moment you call a method on the instance, the class is no longer available under the name Foo; it might be called something different, or something else might have been assigned to Foo. So now you have a problem, because when you created the class it was called one thing, and now it's called something else, and the names no longer match. So super is tied to the class existing under the same name, and you have to constantly repeat yourself.
So the Python developers thought long and hard about how to solve this: how can you avoid having to repeat yourself, and how can super somehow record the current class in which to find these methods, so that it no longer depends on runtime circumstances where the class is still available under the same name? And that's where Python 3's super comes from. You no longer have to pass in anything; you call super without arguments, and at that moment something magic happens. And that magic, some people feel, goes against the grain of certain Python principles, like that there should be obvious things happening, that Python is very explicit about things. As Tim Peters once said in the very famous little record of Python philosophy, there should be one obvious way to do it, and explicit is better than implicit. Super in Python 3 is implicit: when you don't pass in arguments, it implicitly finds its arguments. So that's why people call this magic. But the reason why they're doing it is to avoid having to repeat yourself. There are a lot of extra things going on under the hood, in that when you create a method in a class that uses super, a closure is created especially for it, as if the function had been defined in a context where the class is available under a name in a parent context.
23:32 Basically, it means it captures it at setup, when it gets created, and then it can't really be influenced from the outside. It can
23:32 no longer be influenced from the outside, and you no longer have to repeat yourself. And passing it in was also a common source of problems, a common source of errors, where people misunderstood what needs to be passed in: you really have to explicitly name Foo, and not take the current type of self, because that may no longer be Foo, it may be a subclass. At the time, I did a search on code search engines and found tens of thousands of errors, with people passing in either self.__class__ or type(self), thinking: that's good enough, I don't have to repeat myself here and type Foo again, I'll just take it from self. And that leads to problems. Because when you have a subclass that also uses super, you get an infinite loop: say Foo is subclassed by Bar, Bar calls Foo, and Foo passes in Bar each time, and therefore you go around in a circle, and it keeps calling the same thing.
23:32 Do you actually get a Stack Overflow exception?
23:32 You get an infinite recursion error, okay. Python's protection from a stack overflow.
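A short sketch of the super() behavior discussed above, assuming Python 3. The class names (`Foo`, `Bar`, `Baz`, `BadBar`, `BadBaz`) and the `greet` method are hypothetical, chosen only to show the zero-argument form, the explicit two-argument form, and the `type(self)` mistake that ends in a recursion error.

```python
class Foo:
    def greet(self):
        return ["Foo"]


class Bar(Foo):
    def greet(self):
        # Zero-argument form: the compiler records the defining class in a
        # hidden __class__ closure cell, so the name Bar is not repeated.
        return ["Bar"] + super().greet()


class Baz(Bar):
    def greet(self):
        # Explicit form: current class first, current instance second.
        return ["Baz"] + super(Baz, self).greet()


class BadBar(Foo):
    def greet(self):
        # Bug: type(self) is the class of the instance, which may be a
        # subclass, so the MRO search can start at the wrong place.
        return ["BadBar"] + super(type(self), self).greet()


class BadBaz(BadBar):
    def greet(self):
        return ["BadBaz"] + super().greet()


result = Baz().greet()

try:
    # BadBar.greet keeps resolving back to itself for a BadBaz instance.
    BadBaz().greet()
    failure = None
except RecursionError as exc:
    failure = type(exc).__name__

print(result)
print(Bar.greet.__code__.co_freevars)
print(failure)
```

Note that `BadBar().greet()` on its own works fine; the bug only shows up once a subclass also calls super, which is why it is easy to miss.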
23:32 Yeah, good. Nice. Alright. So that is really cool. And, you know, I think what's really interesting, as you mentioned in the beginning, and as we go through these it's really clear, is how much you learn by going through this exercise of reading the question and the answer, or answering it yourself. It's not just what is the question and what is the answer; there are so many additional things and techniques and practices that come through. I think it's great.
23:32 I really learned to understand what the dis module does and how it works, and how the Python evaluation loop works, purely from answering on Stack Overflow.
23:32 Yeah, I can see that. Alright, so another one is about Python 3's range objects, really two questions, and sort of relative to Python 2's, which just turns into a list of the integers.
23:32 Yeah. So in Python 2 you have range, which is just a function that produces a list of integers, and xrange, which is a separate object that models the sequence, but it's quite simple, quite limited. In Python 3 they renamed xrange to range, because you don't really need it as a function; you can just call list on the range object and get the same result. So range in Python 3 is a separate sequence object, and I love the object. I think it's a great idea, and I wonder why it took so long to make this such a first-class object and to make it part of the Python core. In Python 3, ranges are very lightweight: you just tell it, this is the start, this is the end, and this is the step size, how we get from start to end. And that's all there is at the core. It doesn't need to know all the numbers in between, because it can very quickly just calculate those. If you know that you start at one, the end is at 10, and the step size is two, then you know that three, five, seven, and nine are all part of that sequence; you can just calculate that, right? It's just math. So why would you have to keep that in memory? Most people still don't realize that range objects are smart like that. So you do see lots of questions, and these are two example questions, about someone being surprised that a really large range is still fast. You'd think that if you start at one and put the endpoint at 10 million, that's a lot of numbers, right? The human mind will see 10 million numbers in that object; that must take a lot of memory, and it must be really slow to figure out whether or not the number 10,000,001 is part of that range. But it isn't. It's just math; it's very easy to calculate.
23:32 Yeah. And it gives you a good look inside it, because the person asking the question says, well, I could write one with generators and yield and so on, but it takes forever. And so your answer to it was basically, well, let's look at the actual implementation. And it's not just something that generates a sequence, but it can be smarter: it can implement __contains__ and __getitem__ and __len__, and all these different things that have actual, it behaves
23:32 just like other sequences. It's just like lists and tuples and strings. You can ask for any of the elements at any point in that range, but it doesn't have to keep them up front; it can calculate them when it needs to.
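A small sketch of why range membership is "just math", assuming Python 3. The helper `in_range` is a hypothetical, simplified stand-in (positive steps only) for the arithmetic a range object can do; it is not CPython's actual implementation.

```python
import sys


def in_range(value, start, stop, step=1):
    # Simplified membership test for positive steps only: the value must
    # fall inside the bounds and sit exactly on a step boundary.
    return start <= value < stop and (value - start) % step == 0


big = range(1, 10_000_001)   # ten million numbers, none of them stored
odds = range(1, 10, 2)       # 1, 3, 5, 7, 9

print(10_000_000 in big, 10_000_001 in big)
print(list(odds), odds[2], len(odds), odds.index(7))
print(in_range(7, 1, 10, 2), in_range(8, 1, 10, 2))

# The object only records start, stop, and step, so its size does not
# grow with the number of values it represents.
print(sys.getsizeof(big), sys.getsizeof(odds))
```

A generator-based version, as mentioned in the question, would have to produce values one by one to answer a membership test, which is where the difference in speed comes from.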
23:32 Yeah, yeah, that's cool. It's really like, again, a really nice look inside of things. And I guess the last example we want to look at is about the order, or lack thereof, for sets and dictionaries, which is slightly changing, right?
23:32 That's exactly what's happening. For many beginners, it is surprising that collections like dictionaries and sets don't have a set order; that is an implementation detail of how Python has implemented dictionaries. That usually happens to people who are also new to programming, so they don't just come to Python as new; Python is also usually the first thing that they come to as a programmer. Because if you, for example, have done systems programming in C or C++, most people that come from that background already realize this. To make a dictionary efficient, to be able to map a key to a value, you want to be able to store this efficiently and quickly find things. So once you start working on this, you realize that it looks like there's an order, but the dictionary can't promise any order, because the order that the keys are put in is just a property of the storage medium that we put things in. We are putting a bunch of keys into a much smaller space, and then quickly being able to find whether a key exists or not in such a table, because we base it on what's called a hash, the property of any of these keys that lets it be reduced to a simple number. Explaining that is surprisingly often needed. Once people understand that, they can move on, and we can point them to another type, to things like the OrderedDict that also exists in Python, or tell them that in Python 3.6 the implementation changed. So now, dictionaries actually do remember order, just because someone found a more efficient way of building the same kind of data type. So Python 3.6 makes dictionaries ordered, not because everybody wanted them to be ordered, but because it was a more efficient way of storing information. Python 3.6 makes dictionaries an order of magnitude smaller in certain cases. And that's
23:32 actually super important because that's in some sense, the backing store for all instances of objects. Exactly.
23:32 Yeah, exactly. Lots and lots and lots of stuff in Python uses dictionaries, so you can end up with quite a number of those. So making them smaller is a very good idea. Yeah. Though this question is really, like, you need to understand data structures to understand what is happening here, because otherwise it doesn't make sense: well, why wouldn't I just want it to have the order? But of course, you know, at least in the older implementations, you want it to be fast more than you want it to be ordered. Exactly. And it also helped implement a few other new things in Python 3.6: the keyword argument order can be important in certain cases, or the class attribute definition order can be important in certain cases, and those are now also implemented as a property of this new dictionary implementation. So suddenly, when you create a class, all the attributes are kept in the same order that you defined them, which can help form-building libraries, for example. Before, you had to do crazy tricks to keep the order.
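A brief sketch of the ordering guarantees mentioned above. It assumes Python 3.7 or later, where dictionary insertion order is part of the language (in CPython 3.6 it was an implementation detail, while keyword argument order and class attribute definition order were already specified). The names `collect`, `SignupForm`, and the field values are invented for illustration.

```python
from collections import OrderedDict


def collect(**kwargs):
    # Keyword arguments arrive in the order the caller wrote them.
    return list(kwargs)


class SignupForm:
    # A form-style class: the definition order of attributes is kept.
    name = "text"
    email = "text"
    age = "number"


# Plain dicts remember insertion order, not sorted order.
plain = {}
plain["zebra"] = 1
plain["apple"] = 2
plain["mango"] = 3

fields = [key for key in vars(SignupForm) if not key.startswith("__")]

# OrderedDict still exists and adds explicit reordering operations.
ordered = OrderedDict(plain)
ordered.move_to_end("zebra")

print(list(plain))
print(collect(last=1, first=2, middle=3))
print(fields)
print(list(ordered))
```

Sets are a different matter: they still make no promise about iteration order, so code that needs an order should not rely on one there.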
23:32 Yeah, absolutely. Absolutely. And MongoDB, for example, has its own dictionary type thing, because it doesn't want to reorder and entirely rewrite big parts of stuff. And there's a lot of areas where this order, actually would be kind of nice to have. So pretty cool. So those are some great questions. And I feel like I've really learned a lot I hope people have as well, just hearing you talk about them. But why, like, this is a lot of work. Why do you contribute so much time and energy to stack overflow?
23:32 I already mentioned that I learned a lot from this. I didn't start on Stack Overflow because I wanted to learn so much. I started because I was part of the Plone community, and we discovered Stack Overflow early on as a good way to support people that wanted to learn Plone and wanted to use Plone, and then I sort of fell into answering Python questions as well. But I've since heard an interview with another Stack Overflow luminary, from the C# compiler team, Eric Lippert, an interview, I think, with Stack Overflow actually, giving his reasoning as to why he started with Stack Overflow and answering questions. His manager actually told him that it would be helpful for him to become an expert in a certain subject, and that the best method of becoming an expert in anything was to find a source of questions and start answering them. Because then you discover all the things you don't know yet, and start learning those yourself to try and be able to answer these things. And that really resonates with me. That's really why I'm answering on Stack Overflow so much: because it keeps teaching me new things, it keeps pushing me into new directions and new areas of expertise that I didn't know I could do before.
23:32 Yeah, I really enjoy that. Yeah, that makes a lot of sense to me. Your answer there about the expert, especially. I think one of the things that makes somebody an expert is not that they've sat down and done programming for five years, or 10 years, or whatever. It's really how you spend that five or 10 years. If you just do the same thing over and over and over, and don't run into many challenges, and don't force yourself to continue to dig deeper, then you're not that much more of an expert than what you learned in the beginning. It's more the challenges and the problems and the edge cases that have hit you, and I feel like Stack Overflow is that, like, concentrated.
23:32 Yes. And it is a constant source of new challenges. And in that way can be sort of addictive as well. So you have to be careful. Sure.
23:32 So speaking of addictive, what if I was really convinced that 2,000 for my reputation was insufficient, and I decided that next week I'm gonna take eight hours a day and just start answering questions and asking questions and just going crazy? Like, obviously, my reputation would go up. But would there be other effects? Would people start contacting me for jobs, like, hey, I saw you're really doing great on Stack Overflow, you answer these questions, you've got to come work for us, or things like that?
23:32 People have used my Stack Overflow presence as a reason to contact me, but I think that's not really the norm. So I was looking at this from the point of view of hiring people. How would I use Stack Overflow? If I see someone that has a good reputation on Stack Overflow, it works just like a blog. It gives me a great insight into how you think and how you work and how you look at code and how proficient you are in those things, because you have this body of things out there that we can look at, just like open source projects. So a Stack Overflow profile with a good body of answers can be a real asset to let potential employers know that you know your stuff, that you know a thing or two about what you're talking about, because you can actually demonstrate this, which
23:32 can give you a huge advantage because a lot of times people applying for jobs can't demonstrate their skills. Exactly.
23:32 Exactly. So Stack Overflow is actually really building on this, because they give you the ability to develop your CV, or Developer Story, as they call it now on the site. And they actually have a nice side business, or actually their main business, next to it, to get employers connected with those people that have filled in a profile there,
01:00:17 Right, the whole Stack Overflow Careers thing, right?
01:00:20 The whole Stack Overflow Careers thing, and that can give you a little insight into people's work and how they think and see code. So up to a certain level, giving answers on Stack Overflow, provided that you, of course, know your stuff a bit, can help you there. If I see someone with my level of reputation, I would start asking: are they spending too much time there?
01:00:42 Will they actually do work for me? Or do they just go in to work and answer Stack Overflow questions? Interesting.
01:00:49 It does keep people interested, right? Yeah, you would learn a lot, I think, if you seek out questions. There is such a thing as the fastest gun in the West, where someone who already knows the answer can type faster than you and get the first answer in, and then that gets voted up. So if you're in it for the reputation and the upvotes, you're going to have to wait a little while, because you're probably going to be a little slower than the experienced hands, the people that can type out the answer quickly. Sure. But at the same time, at that point you can compare your answer to theirs, and see: did I miss something? Did I miss a detail? Did I miss a trick? Did they use something interesting in the standard library that I didn't know about yet? So next to you having to figure out whether you know the answer, you also get to see how other people are answering. So that's something different than you trying to find the solution to a problem you have; you are actually looking at new questions just as they come in, and seeing how other people are answering them at that moment. It's a slightly different angle: you get to see how other people think about a problem, and you might learn a thing or two from that.
01:01:55 Okay, very interesting. I probably won't take a week and go work on Stack Overflow, but it's interesting to think about what would happen if I did.
01:02:02 It is certainly an interesting thing you can do for half an hour, when you may be sitting waiting for the bus or something like that.
01:02:07 Yep, absolutely. All right. So let me ask you one final StackOverflow question. And I've seen a couple of articles lately that felt like Stack Overflow was unfriendly to beginners, or to newcomers. And you had an interesting thought on that
01:02:21 There's two angles here. First of all, there's often a mismatch of expectations about what Stack Overflow is for. Stack Overflow is often seen as a personal help desk, something where you can quickly put the question up and they'll help me fix my problem. And then people get upset, because that question might get put on hold or downvoted because it didn't meet the expectations and goals Stack Overflow has, which is to build a knowledge base, a repository of good questions and even better answers that help future visitors. So if you forgot to explain what input you gave your program, or what output came out of it, or what the full error messages were, or you show a clear lack of having actually done the research about how to solve your problem, then you might actually find Stack Overflow very disappointing. And then the other thing is that people say that Stack Overflow is only for professionals, and not for newcomers, people new to programming or new to a programming language. Again, that's not true. What usually happens is that people are new to asking questions at the same time, and they don't know how to ask a good question, and that then really backfires on them. And then, of course, because Stack Overflow is so hugely popular, we get maybe a few thousand such people coming in every day and asking questions. So the community, on the other hand, is running out of steam and running out of power to keep helping each and every one of those learn about Stack Overflow. So a lot of people don't get the hand-holding that they might expect when new to the site. "Why didn't you tell me?" Well, Stack Overflow tries to automate this, tries to give you the help up front; you get a lot of information about how to ask a good question. There's lots of information in the Help Center.
But when you have a programming problem, and your homework is due tomorrow, not everybody reads that, and they skip by it pretty quickly. And then they say,
01:04:25 Yeah, I feel there's a huge difference based on whether or not the person has legitimately tried to solve the problem, and you can tell, or if they're just being lazy, you know.
01:04:34 I do want to avoid the term lazy. It is often just a misunderstanding, and a certain sense of urgency, because you have this problem and you're so into this problem and trying to solve it that you can forget to see the larger picture around it. It's not necessarily that you're being lazy. Sometimes, there are a few people that do post their homework on the site, or even their exam questions as they sit in the exam; I've seen that before. I have literally seen posts where the post is nothing more than a photograph of a paper that's clearly taken underneath a desk. "No, no, this is not an exam. This is a real question. But can you give it to me in the next 10 minutes?"
01:05:21 I'm going to need that before nine o'clock because I've got to turn
01:05:25 I'm going to turn this in. Some people are lazy, but most people just don't understand what makes a good question and why Stack Overflow is there. And that can lead to frustration.
01:05:39 Yeah. So I think this conversation has definitely helped everyone who's listened to it understand what makes a good question and so on. So hopefully we've done a small part to reduce the frustration on both sides, including the people answering the questions being frustrated with the unprepared homework folks coming in. I guess we'll leave it there for Stack Overflow. But Martijn, that was very, very interesting. And I think reading through your answers and other people's answers and questions is definitely enlightening. Thanks for the work on that. That's
01:06:13 Before I let you go, two final questions I always ask my guests. First of all, favorite PyPI package: what would you recommend to people out there that they might not know about?
01:06:22 This is one I discovered when someone else used it in an answer, and I love it. I have been using it ever since. It's called ftfy, or "fixes text for you". It is a library to fix mojibake, or text encoding errors. So if you ever tried to make sense of UTF-8 encoded text that was decoded as Latin-1 and then encoded to something else, the ftfy library will do it for you. It automatically detects when an encoding was misapplied and fixes it for you, as well as some other common noise in incoming text. I love that. I love it for handling badly decoded web pages, anything like that. It has saved my backside a couple of times already now. That's fantastic.
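[Editor's note: the kind of encoding error described here can be reproduced with the standard library alone. The sketch below is only an illustration of the UTF-8-decoded-as-Latin-1 case; the helper name `undo_latin1_mojibake` is hypothetical and is not part of ftfy. ftfy's own entry point is `ftfy.fix_text`, which detects many more cases automatically and is not used here.]

```python
# Illustration only: one specific kind of mojibake, reversed by hand.
# `undo_latin1_mojibake` is a hypothetical helper, not an ftfy function.

def undo_latin1_mojibake(text: str) -> str:
    """Reverse UTF-8 text that was wrongly decoded as Latin-1.

    If the text does not look like that kind of mistake, return it unchanged.
    """
    try:
        # Recover the original bytes, then decode them the way they
        # should have been decoded in the first place.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not Latin-1-representable, or not valid UTF-8 underneath:
        # assume the text was fine and leave it alone.
        return text


original = "café"
# Simulate the bug: UTF-8 bytes read back with the wrong codec.
garbled = original.encode("utf-8").decode("latin-1")
repaired = undo_latin1_mojibake(garbled)

print(garbled)
print(repaired)
```

[This naive round trip handles only a single, known wrong codec, and it can misfire on text that merely happens to survive the round trip. The automatic detection the guest describes is what a dedicated library adds on top.]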
01:07:18 All right, favorite editor: if you're gonna write some Python code, what do you open up?
01:07:21 If I'm in a terminal, Vim, and on my desktop, it's Sublime Text 3.
01:07:26 Nice. Yeah, I was just using Sublime Text earlier. I like it a lot. All right, how about a final call to action: what can people do to get more involved with Stack Overflow?
01:07:34 Come on in and try and answer stuff. There are always niche tags that need more people answering. Things like Python may be overflowing with people that know things, but if you have a specific expertise in programming, there usually is a tag for you. And finding more answers to these questions is always helpful.
01:07:54 All right, excellent. Well, thank you for all the work you've done on Stack Overflow around Python. It's really amazing. And thanks for being on the show. Great to talk to you.
01:08:03 Oh, great. Thanks. Bye.
01:08:05 This has been another episode of Talk Python To Me. The guest today has been Martijn Pieters. This episode has been sponsored by Rollbar and GoCD. Thank you both for supporting the show. Rollbar takes the pain out of errors. They give you the context and insight you need to quickly locate and fix errors that might have gone unnoticed until your users complain, of course. As Talk Python To Me listeners, track a ridiculous number of errors for email@example.com slash talk Python to me. GoCD is the on-premise, open source continuous delivery server. It will improve your deployment workflow but keep your code and builds in house. Check out GoCD at talk python.fm slash gocd and take control over your process. Are you or a colleague trying to learn Python? Have you tried books and videos that just left you bored by covering topics point by point? Well, check out my online course, Python Jumpstart by Building 10 Apps, at talk python.fm slash course to experience a more engaging way to learn Python. And if you're looking for something a little more advanced, try my Write Pythonic Code course at talk python.fm slash pythonic. You can find the links from this episode at talk python.fm slash episodes slash show slash 86. Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at slash iTunes, the Google Play feed at slash play, and the direct RSS feed at slash RSS on talk python.fm. Our theme music is Developers, Developers, Developers by Corey Smith, who goes by Smixx. Corey just recently started selling his tracks on iTunes, so I recommend you check it out at talk python.fm slash music. You can browse the tracks he has for sale on iTunes and listen to the full-length version of the theme song. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Smixx, let's get out of here.
01:09:59 standing with you My boys having been sleeping, I've been using lots of rats. pass the mic back