Technical Lessons Learned from Pythonic Refactoring

Episode #150, published Thu, Feb 8, 2018, recorded Wed, Jan 31, 2018

Does your code smell? Have a weird fragrance? It turns out code smells are a real thing and an amazing conceptualization of suboptimal design. This week you'll meet Yenny Cheung who has some practical and real-world advice on using refactoring in Python to improve your code and wash away those code smells.

Episode Deep Dive

Guest Introduction and Background

Yenny Cheung is a seasoned full-stack engineer at Yelp with a strong passion for clean, maintainable code. She has experience leading projects built on Python, JavaScript, and React, and she enjoys mentoring and sharing best practices in Python development. At Yelp, her focus ranges from improving existing codebases through refactoring to building forward-looking features and services.

What to Know If You're New to Python

If you’re not deeply familiar with Python but want to get the most out of this refactoring-focused conversation, here are some essentials:

Understand Python’s function basics: how to define and call functions with named parameters and default values.
Know basic Python data structures: lists, dictionaries, and tuples. Be aware of how mutability can lead to bugs.
Familiarize yourself with Python’s code style guidelines (PEP 8) to recognize and implement clean, idiomatic code.
Appreciate the importance of code clarity: Python emphasizes readability, so be ready to rename or restructure code for clarity rather than strictly for performance.
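
The points above about default values and mutability meet in one well-known pitfall, sketched below. The function names `add_tag_buggy` and `add_tag` are hypothetical and were not discussed in the episode; the behavior itself is standard Python, where a default argument value is evaluated once, when the function is defined.

```python
def add_tag_buggy(tag, tags=[]):
    # The default list is created once and shared by every call
    # that does not pass its own `tags` argument.
    tags.append(tag)
    return tags


def add_tag(tag, tags=None):
    # Use None as a sentinel and build a fresh list on each call.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags


first = add_tag_buggy("a")
second = add_tag_buggy("b")  # same list object as `first`
print(second)                # ['a', 'b']
print(add_tag("a"), add_tag("b"))  # ['a'] ['b']
```

The `None` sentinel is the usual fix: each call gets its own list unless the caller deliberately passes one in.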

Key Points and Takeaways

Refactoring Matters More than You Think: This episode centers on how refactoring boosts productivity and code quality, even though it doesn’t directly add new features. Refactoring means restructuring existing code without changing its external behavior. It reduces technical debt, improves developer happiness, and keeps a codebase flexible for future requirements.
- Relevant Links & Tools:
  - Refactoring by Martin Fowler
  - Working Effectively with Legacy Code by Michael Feathers
Code Smells, the Early Warning Signs: “Code smell” refers to hints in your code that there might be a deeper design or maintainability issue. These smells include overly long functions, large modules, duplicated code, or mysterious parameter lists. While a code smell doesn’t always guarantee something is broken, it’s a reliable signal that the design may be improved.
- Relevant Links & Tools:
  - “Code Smell” concept (by Kent Beck and Martin Fowler)
The Boy Scout Rule and Broken Window Theory: Yenny highlights a common practice at Yelp: “Leave the code cleaner than you found it.” This is often called the Boy Scout Rule. It aligns with the Broken Window Theory, which suggests that small signs of neglect (like messy code or leftover dead functions) invite deeper decay. Keeping code tidy helps the entire team maintain higher standards.
- Relevant Links & Tools:
  - Broken Window Theory background
When to Refactor, Timings and Triggers: Good refactoring is iterative: do it when adding new features, when reviewing code, or whenever you notice growing complexity. It can be tempting to keep postponing it, but waiting too long leads to an unmaintainable legacy code situation. Quick micro-refactors during daily development often prevent much bigger problems down the road.
Selling Refactoring to Management: Teams often fear refactoring if it seems like “lost time” since it doesn’t yield new features. However, refactoring prevents accumulated technical debt, making feature development and bug fixes faster in the long run. Demonstrating clear time savings or risk reduction helps justify the effort to product managers and stakeholders.
Style and Consistency Tools: While PEP 8 is the official style guide, overfocusing on minor rules can cause friction. Instead, rely on automated tools like Flake8 or Pylint to handle small styling details. This frees you to focus on higher-level design and code organization.
Code Review as a Refactoring Catalyst: Refactoring is often best done collaboratively via code review. It’s an opportunity for fresh eyes to catch design issues. Colleagues might spot a long function that’s begging to be extracted or see a repeated pattern that could be centralized. This built-in accountability drives continuous improvement.
Legacy Systems, Strategic vs. Big-Bang Refactoring: Tackling large, outdated codebases can feel overwhelming. Rather than rewriting everything in a “big bang,” it’s often more effective to refactor in small stages. Write tests, especially integration tests, before you begin, so you can confirm you haven’t changed the code’s behavior while modernizing its internal structure.
Testing and Mocking for Confident Refactoring: Before refactoring, robust tests ensure changes won’t break existing behavior. After each significant change, re-run tests to confirm everything still works. Tools like the built-in unittest.mock or libraries such as pytest make it easier to isolate sections of code and reduce the risk of introducing regressions.
- Relevant Links & Tools:
  - pytest
  - Mocking in Python’s Standard Library
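
As a rough illustration of the testing and mocking point above, here is a minimal sketch of a test that pins down behavior before a refactor. The function `format_rating` and the client method `get_business` are hypothetical names invented for this example, not Yelp code or a real API; only `unittest.mock.Mock` and its `assert_called_once_with` method come from the standard library.

```python
from unittest import mock


def format_rating(client, business_id):
    """Build a display string from a rating fetched through a client."""
    response = client.get_business(business_id)  # hypothetical client method
    return f"{response['rating']} stars"


def test_format_rating_uses_client_response():
    # A Mock stands in for the real client, so no network call happens.
    client = mock.Mock()
    client.get_business.return_value = {"rating": 4.5}

    assert format_rating(client, "biz-123") == "4.5 stars"
    client.get_business.assert_called_once_with("biz-123")


# pytest would discover the test by its name; calling it directly
# keeps this sketch self-contained.
test_format_rating_uses_client_response()
```

With a test like this in place, the body of `format_rating` can be restructured and the test re-run to check that the observable behavior is unchanged.
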
Concrete Tips: Extracting Functions & Renaming. The simplest but most powerful refactorings involve extracting large code blocks into well-named functions and renaming unclear variables or method names. Python’s readability focus means naming is key. If you find yourself adding comments to explain what a function does, it may be time for a clearer name or a smaller, more focused function.

Relevant Links & Tools:
- PyCharm’s Duplicate Code Detector
- Beyond PEP 8 talk by Raymond Hettinger (YouTube.com)
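
The extract-and-rename tip can be sketched as follows. The order-summary scenario and every name in it (`process`, `summarize_order`, and so on) are assumptions made up for illustration; the episode does not walk through this code.

```python
# Before: one function mixes filtering, arithmetic, and formatting,
# and leans on comments to explain each step.
def process(d):
    # skip entries without a positive price
    items = [x for x in d if x.get("price", 0) > 0]
    # add everything up
    t = sum(x["price"] * x.get("qty", 1) for x in items)
    # build the summary line
    return f"{len(items)} items, total {t:.2f}"


# After: each step has a descriptive name, so the comments are no longer needed.
def has_valid_price(item):
    return item.get("price", 0) > 0


def order_total(items):
    return sum(item["price"] * item.get("qty", 1) for item in items)


def summarize_order(order):
    billable_items = [item for item in order if has_valid_price(item)]
    return f"{len(billable_items)} items, total {order_total(billable_items):.2f}"


order = [{"price": 2.5, "qty": 2}, {"price": 0}, {"price": 1.0}]
# Both versions must agree, since refactoring should not change behavior.
assert process(order) == summarize_order(order)
```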

Interesting Quotes and Stories

"One of the first things they tell us at Yelp is the Boy Scout rule: You don't want to hand worse code to your colleagues than you found it." -- Yenny Cheung

"Whenever I write a function, it's very few times that I can get it right the first time. After I think a bit more about it, there's always something I can improve." -- Yenny Cheung

"Software tends to rot if you don't refactor. You look away a week or two, come back, and wonder: 'Who wrote this code?' " -- Yenny Cheung

Key Definitions and Terms

Refactoring: Changing the internal structure of code without altering its external functionality.
Code Smell: A surface indicator (like a too-long function or vague naming) of an underlying design problem.
Technical Debt: The accumulated consequences of poor design decisions or shortcuts that require extra work to fix later.
Single Responsibility Principle: Each function, class, or module should have one clear job.
Boy Scout Rule: Leave the code in a better state than you found it; “always clean up camp.”

Learning Resources

Below are some additional ways to grow your Python and refactoring skills:

Python for Absolute Beginners: Ideal if you’re brand new to Python and want to build a solid foundation in the language.
Write Pythonic Code Like a Seasoned Developer: Focus on idiomatic Python and clarity-driven refactoring approaches.
Getting Started with Testing in Python using pytest: Learn how to write reliable tests to support safe, efficient refactoring.
Effective PyCharm: Master advanced IDE features like automated refactorings, duplicate code detection, and more.

Overall Takeaway

Whether you’re a new Pythonista or an experienced developer, consistent, incremental refactoring is crucial for long-term success. It enhances code readability, lowers maintenance costs, and keeps development fun and productive. By watching for code smells, applying targeted fixes, and harnessing test-driven confidence, you can keep your Python projects clean and poised for future growth.

Links from the show

Yenny on twitter: @yennycheung
Yelp Careers: yelp.com/careers/home
PyCon.DE Technical Lessons Learned from Pythonic Refactoring: youtube.com/watch?v=Yq9-b2JKUyU
Python design patterns: toptal.com/python/python-design-patterns
The Zen of Python: python.org/dev/peps/pep-0020
PEP8: python.org/dev/peps/pep-0008
Beyond PEP8: youtu.be/wf-BqAjZb8M
Cloud9: aws.amazon.com/cloud9/
Episode #150 deep-dive: talkpython.fm/150
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

Episode Transcript

00:00 Does your code smell? Does it have a weird fragrance? It turns out code smells are a real

00:04 thing and an amazing conceptualization of suboptimal design. This week you'll meet Yenny

00:09 Cheung who has some practical and real-world advice on using refactoring in Python to improve your

00:16 code and wash away those bad smells. This is Talk Python To Me, episode 150, recorded January 31st, 2018.

00:23 Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the ecosystem,

00:42 and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I'm @mkennedy.

00:47 Keep up with the show and listen to past episodes at talkpython.fm and follow the show on Twitter

00:53 via @talkpython. This episode is brought to you by ParkMyCloud and Rollbar. Please check out what

00:59 they're offering during their segments. It really helps support the show. Talk Python To Me is partially

01:03 supported by our training courses. Python's async and parallel programming support is highly underrated.

01:09 Have you shied away from the amazing new async and await keywords because you've heard it's way too

01:14 complicated or that it's just not worth the effort? With the right workloads, a hundred times speed up

01:20 is totally possible with minor changes to your code. But you do need to understand the internals and

01:25 that's why our course, Async Techniques and Examples in Python, shows you how to write async code

01:31 successfully as well as how it works. Get started with async and await today with our course at

01:37 talkpython.fm/async.

01:39 Yenny, welcome to Talk Python.

01:41 Hey, yeah, very excited to be on here.

01:43 Yeah, it's great to have you. I saw your talk, you know, virtually via YouTube, PyCon.de, and that was

01:49 really, really interesting. So I wanted to make, give you an opportunity to come here and share

01:53 your technical lessons learned from refactoring with everybody.

01:58 Yeah, thanks. It was also a great learning experience for me with that many Pythonistas

02:02 there. And I also learned a lot from people's opinions. We had a great chat.

02:07 You know, one of the things I think is a little bit counterintuitive or ironic is

02:10 probably when you see those presentations given, the person that learned the most from that

02:15 presentation is the person presenting it, right?

02:18 Yeah.

02:19 Because you had to do all the research and the thinking and it's not just what came out,

02:23 but it's like the whole experience, right?

02:25 Yeah, pretty cool.

02:27 So before we get though into the details of all the refactoring stuff that you talked about,

02:31 which is really interesting, let's talk about your story. How'd you get into programming in Python?

02:35 Yeah, that's a long story. So when I was in high school, I don't actually know programming as a

02:41 thing. So I was just really into math and sciences. So yeah, like I like the problem solving part of it.

02:47 I even liked physics as a subject. And yeah, like from there, I kind of know that I like the

02:53 application side of things because I feel like, you know, by doing those high school problems,

02:57 I can solve real world problems. Not really, but that's how I got into it. And in university,

03:03 very naturally, I just applied for engineering. And I didn't really think about pursuing software

03:10 engineering at all because I didn't know it's a profession, but there's one requirement course

03:15 that I need to take for engineering. And that was CS21, Intro to Programming. So after I took it-

03:21 What language was that?

03:21 That was in Python. So yeah, after that, taking that class, I think Python set a very high bar for me.

03:28 So I didn't really get too much into other languages anymore because of how much I liked Python. It's

03:35 such a neat and beautiful language, I would say. So yeah, pretty much after that class, I decided to

03:41 switch my major into computing science instead, and I dropped engineering.

03:45 That's really wild. So it really connected with you and you're like, I found something I like,

03:51 better. I think that's an important part of college actually, is I'm always kind of blown

03:57 away that people just go into college knowing what they want and they just do that. Because to me,

04:01 college was you go and you try a bunch of things and a few things connect and you sort of follow that

04:07 path. Yeah. So what kind of engineering were you doing before you switched into programming?

04:11 So it's general engineering. We didn't really pick the discipline yet. By the time I had to do that, I already switched out. So I didn't get to that part of it.

04:21 Yeah. I think it's a pretty smart thing to use Python as the first language to teach to people.

04:27 Because I really feel like in Python, you spend less time thinking about the syntax.

04:32 You get to think much more about the logic of the code and computer science itself.

04:37 So compared to other languages, which we learn later, like C++, C, which are great because you get more of the happenings under the hood.

04:47 I think Python really took all these distractions out from the get go.

04:52 Yeah, I agree. I think it's a really good learning language. What surprises me about Python is there's a lot of languages that are kind of simple in that regard, like Visual Basic, like six, you know, the early Visual Basics and things like that.

05:04 They were kind of like that, but they would always they would hit this really hard limit as soon as you try to do very complicated stuff with them.

05:11 And Python seems to have struck a balance where it's simple in the beginning, but it can grow into larger applications still.

05:19 Yeah, it's totally an art how to write good Python code.

05:22 Yeah, that's for sure. So what do you do these days?

05:26 Recently, I've been leading a project. So I'm a full stack engineer. I don't only code on Python. I also work on JavaScript and React stuff.

05:34 So recently we've been working on such a project. Sometimes I could be doing interviewing or mentoring some other people in the company or preparing for a Python podcast.

05:45 Of course, of course. And you work at Yelp, right?

05:48 Yeah.

05:49 Yeah. That sounds like a really fun place to be.

05:51 Yeah, I think so. So I work at Yelp in Hamburg. So not in the San Francisco office.

05:56 Here we focus more on the business side of things. So selling ads, helping local businesses find customers and things like that, which I find to be very interesting.

06:12 Yeah, like I really enjoy being here.

06:14 I'm sure. Is Yelp pretty big in Europe? I only know it through the US. When I lived in Germany, I didn't actually use Yelp very much.

06:21 There are definitely like some competitors in Europe as well, but it's growing. It's getting its ground.

06:27 Yeah. And you actually have more offices in Europe, which is kind of cool, right? London and Hamburg.

06:32 Yeah. Well, the good thing about like working in a European office is you get to fly around in Europe, you know, like you get to go to all these like fun European conferences, which is very nice.

06:42 Yeah, that's one of the things I really liked about being there is you're so much closer to many amazing places to visit.

06:50 Like where I live in Portland, it's three hours to Seattle to the north.

06:54 It's 10 hours to San Francisco to the south, and it's eight hours to Boise to the east.

06:59 And those are the three closest cities. Like you're in Paris in two and a half hours, or you're in Vienna or wherever. It's really just such a different experience there. So enjoy it. That's cool.

07:10 Yeah. But like the tech scene, I would say it's like very much concentrated in the US. Like say SF, you have such a huge technical community. But in Europe, it's definitely more spread out. So you need to like fly from one place to another.

07:22 Maybe like a bigger tech hub would be Berlin. But still, it's really, really widespread.

07:27 Yeah, that's an interesting observation.

07:28 Cool. So let's talk about your topic that you covered at your talk, which is refactoring. And I think refactoring, you know, it's interesting that we've had programming.

07:41 I don't know, like what you might consider programming, maybe since the 60s, right? Do you consider punch cards? I don't know.

07:47 But we've had programming for quite a while. But refactoring as an idea didn't really come out until, I don't know, was it early 90s, mid 90s, something like that?

07:58 With the whole Martin Fowler thing? Yeah.

08:00 Yeah.

08:01 So it finally got sort of formalized as an idea. So maybe just really quickly for everyone listening, because I know people kind of use refactoring and just changing your code to be better.

08:11 And maybe adding features and whatnot while you're doing it as interchangeable. So maybe just quickly give us like a definition of refactoring.

08:19 So code refactoring is pretty much the process of restructuring existing computer code.

08:24 So that pretty much means changing the factoring without changing its external behavior. And it also improves the non-functional attributes of the software.

08:33 So from the outside, it should appear as if it didn't change, but you might change the way the algorithm is implemented or something to that effect, right?

08:44 Yeah. Pretty much changing the design, like improving the design of it, but without changing the functionality of what your software is supposed to do.
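
The definition in this exchange, a changed implementation with unchanged external behavior, can be checked mechanically. The sketch below is not from the episode; `average_before` and `average_after` are hypothetical functions made up for illustration.

```python
def average_before(values):
    # Original implementation: manual accumulation.
    total = 0
    count = 0
    for v in values:
        total = total + v
        count = count + 1
    if count == 0:
        return 0.0
    return total / count


def average_after(values):
    # Refactored implementation: same inputs, same outputs, simpler body.
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# From the outside, nothing should appear to have changed.
for sample in ([], [1], [1, 2, 3, 4], [2.5, 3.5]):
    assert average_before(sample) == average_after(sample)
```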

08:51 The book that kind of announced that was Martin Fowler's book. And you talked a little bit about that in the internet talk as well.

08:56 And I just looked it up. It was published in 1999. I mean, that's like height of dot com boom days, right?

09:04 It seems pretty far down the line for when that came into existence.

09:08 Yeah, but this is kind of like, I don't know, the Bible for refactoring. In a lot of ways, I feel it's like a start of discussion.

09:14 But in a way, a lot of those principles can still be applied to other languages and in other situations.

09:21 So I would really like to use this book as like an introduction to what refactoring means.

09:28 But later on in the podcast, I think we'll also touch on Python might not need that many of these refactoring patterns.

09:35 Right. So one of the things that was interesting, you know, so there was a lot of stuff that came around the same time.

09:41 So Martin Fowler's refactoring concepts, we had Kent Beck and I think maybe Martin Fowler was also participating in that with sort of extreme programming, pair programming.

09:51 We had the whole design patterns in the Gang of Four book.

09:54 There were some really interesting foundational ideas brewing in that time.

09:59 And certainly the design patterns thinking.

10:03 And it seems like just from looking at Martin Fowler's work, that it's very much based on languages like Java or Smalltalk or these heavily object oriented sort of object only languages.

10:20 So I think it's interesting to compare that back to Python because it doesn't always make sense or there's problems you'll find in Python that don't appear in the list of, say, the code smells, which we'll talk about or the refactoring techniques because it's fundamentally a different way to structure code in Python.

10:37 Yeah, I would totally agree with that.

10:39 Like a lot of the problems that the refactoring patterns try to solve, I think inherently in Python's language design, it's already solved.

10:47 So we can definitely explore into that a little bit later.

10:50 Right.

10:51 And then at the same, yeah, I think another one, though, so for example, that I think is kind of an issue you might run into is what I don't know, I would call is like a long module or something, right?

11:03 Where everything is just crammed into one file and it probably should be broken apart to be more easily understood.

11:10 Right.

11:11 But that doesn't appear anywhere in the traditional literature because they're talking about things like Java where you just have one class per file.

11:19 So if the file is long, that means the class is long.

11:21 Right.

11:21 But that's not necessarily the case in Python.

11:24 I feel like there's like refactoring obviously applies to Python, but some of the traditional history of it is maybe an 80% match.

11:33 Yeah, I would agree.

11:34 Like a lot of the ideas translate, but like how you achieve that might change because of the language and how to apply that.

11:41 Yeah, for sure.

11:41 All right.

11:42 So let's start by thinking about just why, what are the benefits of refactoring?

11:47 So first of all, I think it cleans up tech debt.

11:51 And for those who are not as familiar, tech debt tends to build up when you take shortcuts during development.

11:58 There might be a better solution, but instead you took like a shorter one so that you can save some developer time.

12:04 But these tradeoffs, you need to pay pretty much like pay for your debt later.

12:08 You need to spend more time to fix the implementation, things like that.

12:12 Or maybe when you have the spec, it's not complete yet or, you know, the products change, the code changes.

12:19 So the thing you design right now might not be applicable for the future.

12:23 And in this case, refactoring really helps it get into shape again.

12:27 But I just think that it's like a good habit to have.

12:30 Whenever I write a function, like it's very few times that I can get it right the first time.

12:35 So upon like thinking a little bit more about it or look at it the next day, there has to be, you know, there can be a lot of improvements that can be done.

12:43 You know, maybe you can extract it to the function, name it better, things like that.

12:47 And also, I just think that it really saves productivity because I've worked on legacy code before.

12:53 And my experience was just, you know, I come to work, I look at all these code and I, okay, finally, I understood what it means.

13:01 Okay.

13:01 The next day I'll start implementing new features.

13:04 And then I go home, like I had a nice sleep.

13:07 I came back and then I just forgot what it does.

13:09 And the code doesn't really help because it's so convoluted.

13:13 So imagine that, but like times 10, you know, the whole team pretty much does the same thing.

13:18 So if some people can be responsible for refactoring this, then everybody doesn't have to face the same problem.

13:25 It's almost like a golden rule of programming in a professional environment is you talked about the Boy Scout rule, which was let's leave the place.

13:36 In this case, the code in a better state than we found it every time we interact with it.

13:41 So that's a pretty interesting idea.

13:43 Yeah.

13:43 That was one of the first things they tell us at Yelp.

13:46 You know, it's like a common courtesy.

13:47 You don't want to hand worse code to your colleagues, you know, than you found it.

13:52 So that's where the Boy Scout rule comes from.

13:55 Yeah, that's pretty cool.

13:57 I think another sort of real world analogy applied to code that would be appropriate there would be the broken window theory.

14:05 So like for those of you who don't know about that, that's in, I believe it came out of New York City.

14:10 And there used to be, the city was kind of run down and there's a lot of crime.

14:13 And they said, what can we do to make these neighborhoods nicer?

14:15 And there was some psychologist or something that realized that just a few broken windows and a little bit of decay sort of communicated to the people of that neighborhood that, oh, this place, we don't care what it's like.

14:27 It's kind of broken.

14:28 And what does it matter if we break it a little bit more?

14:31 Whereas if you come upon really nicely factored code that's really clean and beautiful to read, you don't want to go in there and mess it up, right?

14:38 Yeah.

14:38 The expectation is if I'm going to contribute to this, this is beautiful.

14:41 I'm not just going to throw junk on it, right?

14:44 So I think that's an interesting psychology about it as well.

14:47 Yeah, that's a very good analogy.

14:48 At the same time, I like what Martin Fowler mentioned in his book.

14:52 He said he's a lazy developer.

14:54 So the fact that he's working on this code right now means that it's very likely for him to work on this in the future.

15:00 So just refactoring right now makes sure he can understand it easier in the future.

15:05 Yeah, yeah.

15:06 Yeah, a lazy developer is a good developer.

15:09 Productive laziness.

15:10 Yeah, exactly.

15:12 Yeah, so you said also, in addition to cleaning up technical debt, it'll actually save you productivity now and over time, right?

15:20 Yeah, I would agree.

15:21 Because after refactoring, supposedly your code should look simpler, increase the readability.

15:27 So in this case, the code can also be more usable and easier to maintain.

15:32 It's good for you.

15:33 And this is the good thing about Python.

15:36 Python already reads very nicely, I would say.

15:39 Like semantics is very easy to read.

15:41 So if you can write good Python code, your code pretty much self-documents.

15:46 And that's a huge win.

15:48 Imagine an organization like Yelp, right?

15:51 We have over 300 microservices.

15:55 And then we have even like a monolith of code base.

15:58 It's impossible for you to write documentation to cover every line of what you write, right?

16:04 So in this case, if you can write code that is readable, that is easy to read, then onboarding new people.

16:10 If you have like a new hire or an intern in your team, then they can immediately jump on.

16:16 Like it really, I think it really brings a lot of value to the team.

16:21 This portion of Talk Python To Me is brought to you by ParkMyCloud.

16:26 The last time you parked your car, did you leave it running?

16:28 No?

16:29 Well, then why are you leaving your cloud resources running?

16:32 Every year, $13 billion are wasted on cloud instances that no one is using.

16:38 Don't let any of those be yours.

16:40 ParkMyCloud automatically identifies and eliminates wasted cloud spend, saving you 65% on AWS, Azure, and Google's cloud.

16:49 You're up and running quickly with a 10 minute setup and no scripting required.

16:53 Plus, govern users and easily integrate into your DevOps process.

16:57 See why ParkMyCloud was chosen by McDonald's, Unilever, Fox, and more.

17:02 Start a free trial today at parkmycloud.com/talkpython.

17:07 So you talked about these 300 microservices and some monolithic code as well.

17:15 How do you guys think about, say, refactoring from one larger service into microservices?

17:23 Or maybe the reverse, maybe taking, say, five microservices.

17:27 These are really not doing enough to be independent things.

17:30 We're going to smush them back together.

17:31 Whenever we build new features, definitely, it's always in the back of our mind.

17:36 Can we make that into a separate service?

17:38 Can we modularize it a little bit better so that we don't have to contribute code to the giant monolith?

17:44 But yeah, at the same time, you know, if you have more microservices, there is always overhead.

17:49 So it's really a decision that is, you need to, like, find a good balance between it, I would say.

17:55 Yeah, for sure.

17:56 There's definitely trade-offs there.

17:58 Another benefit to refactoring, I think, is maybe overlooked.

18:03 I'm not entirely sure how often, say, managers take this into account in business, like sort of decision makers.

18:09 But it seems to me, as a developer working at a company, if you always work on code that seems to have no care, no craft, it's just kind of thrown together and not well factored.

18:23 You might decide either you'll be unhappy or you might just leave.

18:26 And so maybe you are actually losing your best developers because they don't want to work with the crappy code.

18:32 And so refactoring it into, you know, for all the reasons you already talked about, cleaning up debt and stuff, it seems like it almost could be like an HR issue.

18:39 Yeah, for sure.

18:41 Like, who doesn't like to work on clean code, nicely designed code?

18:44 Exactly.

18:46 Like, I've certainly worked on some projects where I'm like, wow, this does not look fun to work on.

18:52 Like, why do I, what is even going on here, you know?

18:55 Like, crazy variable names and huge stuff crammed into one function or a stored procedure or something, right?

19:01 Yeah, for sure.

19:02 Yeah, another benefit, maybe, what do you think?

19:04 Flexible software?

19:05 I think there are different thoughts into how you should write software because some people think that, you know, software is kind of like a throwaway thing because products evolve so quickly that, you know, your code probably doesn't live very long.

19:19 Like, after one year, two years, you have to rewrite it.

19:22 Things come up and then, you know, you look at code that's supposedly to be very well designed before.

19:27 But, you know, two years after it's not, you know, code tends to rot.

19:31 And so, one thing about refactoring is the thing that came out from refactoring should be easy to modify.

19:37 So, I think refactoring actually complies with this thinking, you know.

19:42 If it's easy to read, you know exactly what to change.

19:45 You know exactly what to throw away.

19:47 And that's very important.

19:48 Yeah, absolutely.

19:49 And so, a lot of the refactoring stuff also applies to breaking things into smaller, more well-understood pieces, right?

19:59 Instead of one huge long function, maybe, like, a couple of functions and, like, a class that represents the data or something.

20:05 So, it actually just makes it easier to evolve and change over time anyway.
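
One way to picture the split described here, a couple of small functions plus a class that represents the data, is the sketch below. The review format, the `Review` class, and the helper names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Review:
    # A small class that represents the data instead of loose strings.
    stars: int
    text: str


def parse_review(line):
    # Assumed input format for this sketch: "<stars>|<text>".
    stars, _, text = line.partition("|")
    return Review(stars=int(stars), text=text.strip())


def is_positive(review):
    return review.stars >= 4


def count_positive(lines):
    return sum(1 for line in lines if is_positive(parse_review(line)))


lines = ["5|Great food", "2|Slow service", "4| Nice patio "]
print(count_positive(lines))  # 2
```

Each piece can now change on its own: a new input format only touches `parse_review`, and a different threshold only touches `is_positive`.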

20:10 Yeah.

20:11 Like, a counter example would be CSS.

20:13 You know, you can only add lines to it and not really take it out because you don't know, like, you know, once you take out something, it breaks.

20:19 So, this is exactly the situation we want to avoid.

20:23 That's a really good way of thinking of CSS.

20:25 And I definitely noticed that because you're like, well, but what about that one page that nobody realizes is there?

20:32 Or there's that admin section that you don't want to redesign that.

20:35 So, you know, I find, like, the changes go into, like, the bottom of the CSS file.

20:42 Or if it gets really out of control, like, you'll create, like, this page overrides those CSS settings and we'll include that file after this one, right?

20:50 Yeah, I think it's, like, a nightmare.

20:52 But, like, recently I'm working on a React project, which makes it better because you have CSS modules.

20:57 But, you know, again, these theories of, hey, like, modularize your code can apply to every language, even CSS.

21:04 Yeah.

21:04 Do you use anything like Less or Sass or just straight CSS?

21:08 Yeah, we do use Sass as well.

21:11 But I think the good thing about CSS module is you can declare, you know, this element is what needs this class.

21:17 So, in a way, you can know exactly what is used and what is not used.

21:22 Right.

21:22 Yeah, yeah.

21:23 The dead selector stuff is really bad.

21:26 It's even worse than regular code.

21:28 Yeah.

21:29 So, that's sort of the benefits of refactoring.

21:32 When do you think we should refactor?

21:34 Like, what are some of the best times in sort of the software lifecycle?

21:39 I think, yeah, that's a very good question.

21:41 I think it's really an iterative process.

21:44 So, whenever you add code, you know, remove code, or when you're understanding code, when you're fixing bugs, those are always very good timings for it.

21:53 And another thing I would also want to point out is during code review.

21:56 So, if after this podcast, you're more into this refactoring thing during code review, then maybe you can comment, hey, actually, you can do this better.

22:06 And you can raise the code quality for your entire team.

22:10 And, yeah, I would really want to push for the value of code review.

22:14 I think part of this might be, like, deciding what's important during code review, right?

22:20 Like, are you reviewing for performance?

22:22 Are you reviewing just for our code standards?

22:25 Are you reviewing for correctness?

22:28 Like, this sort of cleanliness aspect of refactoring probably would make a really good checkbox in the review process.

22:36 Yeah, definitely.

22:37 I've also read up some articles on how to do code reviews and stuff.

22:41 And your suggestion is perfect.

22:43 I think it's, you know, assigning one reviewer to review that part of the code or that aspect of writing code is very helpful.

22:51 Because, yeah, it can also come with one problem with refactoring is, say, if your whole team is all, like, very enthusiastic about refactoring.

23:01 And you can imagine how many ideas people can have, you know, asking you to rewrite the code.

23:06 And it can also get into not so great of a situation because you'll have a lot of iterations, right?

23:12 One suggestion can be having several people in the design, like, involving several people in the design phase.

23:18 So come up with something.

23:20 And then one person can be responsible to work with the person who's writing code to do refactoring.

23:26 That tends to streamline the process a little bit.

23:29 I think also there's probably what you might consider, like, micro refactorings versus we're going to change the whole architecture, right?

23:37 Like, I have some long function.

23:39 I broke it into three and I changed the parameters to be more clear versus, oh, this actually should be another service.

23:45 We should use queuing here.

23:46 And this, let's make the database NoSQL.

23:48 And, like, that's kind of, like, that's another level, right?

23:51 Yeah, I think that's definitely, like, a do-later.

23:54 Let's follow up on that.

23:57 Exactly.

23:58 But that's the thing about, like, refactoring a legacy system versus a new project.

24:02 Definitely, it's much harder when it comes to a legacy system, especially code that you're not familiar with, code that some other people wrote.

24:10 There are so many things that you think you can improve.

24:13 But how do you, you know, when do you start and when do you stop?

24:16 You know, it's important.

24:17 You can't just, like, it's a black hole.

24:19 You just continue refactoring until the end of time.

24:22 So I think it's important to time box yourself as well.

24:25 One suggestion will be, say, like, this week I spent one day on this.

24:29 And I'll wait a bit, you know, like, I'll learn a little bit more about the code.

24:33 And then next week I'll come back again and do another day of refactoring.

24:37 Yeah.

24:37 And you could probably do a couple of interesting things.

24:41 You could probably run things like Flake8 against it or other tools.

24:45 There's probably not just pure refactoring problems.

24:48 There's probably a host of problems.

24:50 And so, you know, you could approach it sort of, I'm just going to make it a little bit better every day until it's not so bad.

24:58 Right, the Boy Scout rule.

24:59 Yeah, exactly.

25:00 Apply it to this.

25:01 One particularly tricky challenge with legacy systems, I think, you know, this typically happens at larger organizations, is there's some not very high profile project.

25:14 But it's somehow really important that if it stops working, it's going to be a big problem, some kind of back-end thing or something.

25:22 And the person who created it is either no longer on the project or left the company.

25:27 Right.

25:28 It's written in, like, you know, Python 2.5 or some old thing.

25:32 And it's not currently your problem, but you would like to make it better.

25:37 But you know if you break it, it's your problem all of a sudden, right?

25:40 Like, if you try to make it better and you break it, you now own it because it was working and you're the one who made it not work.

25:46 And who even knows how to deploy this thing again, right?

25:48 That can be a real tricky challenge.

25:49 That sounds like a government problem.

25:51 Yes.

25:52 Yeah, I've definitely seen this at some big companies.

25:55 Yeah.

25:56 There's actually a book I want to give a quick shout-out to about this legacy system in particular called Working Effectively with Legacy Code by Michael Feathers.

26:05 And he has some really interesting ideas of how to basically take a huge existing system and partition off little parts you're going to change and make them testable, flexible, refactorable without overwhelming – without trying to, you know, boil the ocean and change everything all at the same time.

26:23 So it's pretty cool.

26:24 A lot of techniques there.

26:26 Oh, yeah.

26:26 I'll definitely check it out.

26:28 Yeah, it's really cool.

26:28 Unfortunately, a lot of it is sort of C and Java, but the ideas in there are really interesting.

26:33 I mean, some of them are so insane.

26:35 It's like we're actually going to change the way the linker works to trick the system to do certain things consistently while we're making other changes in C, right?

26:45 I mean, it's like really quite far out there.

26:47 But you're like, oh, I didn't even think that we could take it that far.

26:51 There's a lot of – I'm sure people will get good ideas from it even if it's not in Python.

26:54 Yeah, for sure.

26:55 Another problem that I think you run into around this kind of improvement but no features has to do with selling this idea to your manager.

27:05 And I think while a lot of the modern software development methodologies are really nice, like Scrum, for example, the concept of a sprint in, you know, like a two-week or a month-long sprint, you know, you're going to sign up for some work, right?

27:22 Like, well, where does refactoring fit?

27:24 Like if I'm already fully booked on time, how do I go and say I'm actually going to do only half as much because I'm going to make things better?

27:31 Like, well, we actually just need new features.

27:33 This is really important.

27:34 So forget the better.

27:35 All right.

27:35 How do you have that conversation, do you think?

27:37 Yeah, I think it's a very important conversation to have.

27:40 But the fact that you're thinking about refactoring, you know, like when you're adding new features, you know, you sense that something is wrong.

27:46 That might be an indication that we need to do it right now.

27:50 So I think the way to communicate with a product manager or with your engineering manager is, hey, if we don't do the refactoring right now, it's actually going to take six weeks.

27:59 But if we do the refactoring, you know, spend one week on it, it'll be easier to add features.

28:04 So four weeks.

28:05 I think that usually communicates the idea across.

28:08 Yeah, I find that does work sometimes.

28:11 What I've done in certain circumstances where it was like, look, we're just really busy right now.

28:16 We just need to go fast and we'll deal with it later sort of mentality.

28:19 It wasn't quite as explicit as that, but where that's kind of implied, you know what I mean?

28:26 Where it's clear that the people would much just rather have that feature right now.

28:31 But that was if that seemed like it was always the case, you know, it's sort of like if everything is urgent, then nothing is urgent.

28:39 It's kind of like that.

28:40 Right.

28:40 There is also the tech debt thing, right?

28:42 Like, you know, we need to deal with it later if we keep on building out this tech debt.

28:47 You know, it's not like we don't have to do it.

28:49 Never.

28:49 It's going to come back to you.

28:51 It's going to haunt you one day.

28:52 Yeah, absolutely.

28:53 I feel like that will work well in a place like Yelp where it's a pretty technical company.

28:58 Let's say I'm just going to completely make up a company that I don't really know whether they're technical or not.

29:03 But let's say I work for like a food production company that makes like cereal.

29:08 Right.

29:08 Like those managers probably don't know or care about what technical debt is.

29:13 They just want like the new feature for their website or something.

29:16 What I found in those situations is I would just start adjusting my estimates.

29:22 To include refactoring and testing instead of saying, well, I'm going to spend this much time on the feature and this much time on refactoring, this much time on testing.

29:30 I'm going to say this feature takes X.

29:31 Exactly.

29:33 If I'm going to work as a professional developer, that means refactoring and cleaning up technical debt and putting in tests.

29:38 And you ask how long it takes, this is how long it takes.

29:42 Right.

29:42 You just don't say it's done.

29:43 Right.

29:44 You're giving a professional opinion.

29:45 I totally agree with that.

29:47 Yeah.

29:47 So I think you've got to adjust maybe per like what kind of environment you're in.

29:52 But it's, it is a little tricky to say, I'm going to take a bunch of time and do nothing in terms of what you see that we get.

29:59 I'm going to not do anything.

30:00 Right.

30:01 But of course, it's actually making it much better for all the reasons we talked about.

30:05 Yeah.

30:05 Yeah, exactly.

30:06 Yeah.

30:07 I do think that your warning earlier was really interesting, though, because there's an absolute possibility that you just, like, go refactoring and pattern crazy and just go like, all right, we're going to keep changing this and keep changing this.

30:21 And I always see this, like it could really be unending.

30:24 So I guess one of the things that might be interesting to talk about is like, when do you know that you should refactor and how should you go about that?

30:33 Yeah.

30:33 This is a very interesting topic.

30:35 So in the same book, actually, Kent Beck coined the term code smell.

30:39 So pretty much means it's a surface indication that usually corresponds to a deeper problem in the system.

30:47 So I would like to make a metaphor with cheese.

30:50 Sometimes like, you know, the key word of this is the indication.

30:54 So sometimes cheese can smell very strong, you know, especially those like French cheese.

30:59 And then like you think there is something wrong, but when you eat it, it's actually good.

31:03 So code smell is an indication of a problem.

31:06 But you really need to take some time to investigate into it if, you know, it's actually a problem.

31:11 So you have like a very long function, but it actually does one thing.

31:15 So, okay, perfect.

31:15 This concept of a code smell absolutely just, like, captured my imagination when I first heard of it because it so perfectly captures what is wrong.

31:25 It's like if you look at some code and your nose kind of wrinkles up, you know, like, oh, look at that.

31:30 Like if it works, it's actually working just fine.

31:33 But to get in there and to be with it is a little unpleasant.

31:36 Like that is just the perfect idea of this code smell, right?

31:40 Yeah.

31:40 Well, that's the thing.

31:41 Like, you know, any programmer can write things the machine can compile.

31:45 But then only good programmers can write code that, you know, humans can read.

31:50 Yeah, absolutely.

31:51 And so I think what's interesting about the code smells is it's not just like, hey, there's this idea of smelly code, but there are actually smells, flavors of smell, like types of smells that then are prescriptive of different refactorings, which is super interesting.

32:09 And that's kind of what we were talking about at the beginning where some of the smells are more applicable to, say, like Java than they are to Python.

32:15 But still, there's plenty of Python analogies here.

32:19 Before we get into the code smells, one thing that also Martin Fowler talks about that just I think is so perfect is he talks about the idea of code comments being deodorant for code smells.

32:31 Yeah.

32:32 I really like that analogy.

32:34 So if, you know, if you have a lot of comments on your code, then that probably indicates that you didn't write it very clearly.

32:41 That's why you need to write comments to explain yourself.

32:44 So it's important to know that, you know, when you write a comment, you should address why you're writing this, but not what you're trying to say.

32:52 If it's a what problem, then maybe you should rename it or try to use some variables to explain what you're trying to do here.
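
The "why, not what" advice above could be sketched roughly like this. It is only an illustration: the function names, the conversion example, and the shipping API mentioned in the comment are all hypothetical and do not come from the episode.

```python
# Before: the comment has to explain *what* the code does,
# because the names do not.
def calc(d):
    # d is a distance in km; returns miles
    return d * 0.621371


# After: the names carry the "what", so the comment is free
# to explain the "why".
KM_TO_MILES = 0.621371


def km_to_miles(distance_km):
    # Why: the (hypothetical) shipping API only accepts miles.
    return distance_km * KM_TO_MILES
```

Both functions compute the same value; only the readability changes.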

32:59 I think this might be the most important idea of this entire code smell thing is literally every time I go to write a comment, I stop and go, why am I writing this comment?

33:09 Is it really that I should just stop and rename the function?

33:12 Is the function badly named?

33:14 Are the parameters badly named?

33:15 Is the function too big?

33:17 And so I could break it into smaller pieces.

33:18 So each one can then be really clearly named because right now it seems like the name would be a paragraph, right?

33:24 All of those types of things.

33:25 Right.

33:26 Exactly.

33:26 Yeah.

33:27 And people all the time try to fix these with code comments.

33:30 And it's just like, just delete the comment, make the variable name three characters longer, but understandable.

33:35 Right.

33:36 Exactly.

33:36 Totally agree.

33:38 That's awesome.

33:39 All right.

33:40 So maybe take us through some of the various code smells and how we might fix them.

33:44 Sure.

33:44 I can like vaguely categorize them into different classes.

33:48 So first you have long and complex code.

33:51 You have useless code, coupled code and inappropriate naming.

33:55 I think we can go through them one by one.

33:57 So let's start with the long and complex code.

34:00 Sometimes in your program, you can see very long functions and classes.

34:05 That might be an indication that your class or function is not doing one thing, just one thing.

34:11 So, you know, that violates the single responsibility principle and the DRY principle, you know, don't repeat yourself.

34:17 So in this case, maybe we can extract the function or classes so that everything is encapsulated well.

34:25 When you're doing this, like when you're using this technique of extracting functions, though, beware of pass by reference versus pass by value.

34:33 It's like this mistake everybody makes and you pass in a dictionary rather than just viewing, just getting it.

34:40 You're modifying what is inside and, you know, lists and dictionaries, they are mutable.

34:45 It's very dangerous.

34:46 It's evil.

34:47 Yeah.

34:47 Yeah, you do have to be careful about that.

34:51 I think this is probably one of the most common things you run into is just something started out small and it grew and it grew and it grew and nobody wanted to really mess with it.

35:01 They just wanted their feature in there.

35:03 And so they added a little bit more, you know, another if clause, another conditional or whatever, right?

35:09 Yeah, adding new keys to your dictionary, like you don't even know when you, like, passed in the dictionary and in the end, the dictionary totally got mutated.

35:17 And yeah, it's very hard to keep track in this way.
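
As a small sketch of the mutation problem described above (the function names and the user dictionary are made up for illustration): a function that receives a dictionary can silently change the caller's data, while a version that returns a new dictionary leaves the original alone.

```python
def add_display_name_in_place(user):
    # Mutates the caller's dictionary: a new key appears "from nowhere".
    user["display_name"] = user["first"] + " " + user["last"]
    return user


def with_display_name(user):
    # Builds and returns a new dictionary; the argument is left untouched.
    return {**user, "display_name": user["first"] + " " + user["last"]}


original = {"first": "Ada", "last": "Lovelace"}

safe_copy = with_display_name(original)
# original still has only the keys it started with here.

mutated = add_display_name_in_place(original)
# Now original has gained a "display_name" key, and mutated is the
# very same object as original.
```

The second function is the one that makes extracted functions hard to reason about, since the caller has to read its body to learn what happened to the argument.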

35:20 Yeah, for sure.

35:20 And some other long and complex code problems as well, say temporary field.

35:25 So that pretty much means you call a function and you assign the result to a variable.

35:30 But sometimes if you're not using this variable more than a few times, you can just call it inline.

35:35 Otherwise, it gets really confusing.

35:37 And when you're trying to extract functions, those are the things that can prevent you from doing it very simply.

35:42 And one other thing I want to point out is the conditional complexity, which I think most developers probably encountered.

35:50 Me too.

35:50 You know, you just want to write something very simple if else and you ended up having three, four layers of nested conditional logic.

35:58 And that's just really, really hard to read.

36:01 Yeah, I find that people do that a lot because they, how is the right word?

36:06 They're testing for success.

36:07 So I'm going to say, I'm going to do a loop.

36:11 And then if this thing I want to work on is true, I'm going to go into that part.

36:16 And then if this other condition also is true, I'm going to go in it.

36:19 You end up like almost scrolling right just to read what the code is doing, you know?

36:24 That's where the line limit came in for PEP 8, I guess, you know.

36:28 But 79 is very, very strict.

36:29 I would have to say, you know, you can probably modify it for your own need.

36:33 But that could possibly stop people from writing too long of a code.

36:37 Maybe you're trying to encompass too much information on one line.

36:41 Yeah, for sure.

36:42 Well, and if people are looking for something concrete, you're like, okay, I know this is not good, but what do I do?

36:48 Like you can reverse the if statements.

36:50 You can do what's called a guard clause that'll say, if it's not good, either skip this time through the loop or return early.

36:57 Instead of if yes, if yes, if yes, you just go: if no, return; if not this, return; if not that, return.

37:04 And then what's flat below is the actual thing you want to do.

37:07 That can really help.
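
A minimal sketch of the guard clause idea just described, assuming a made-up list of order dictionaries (the keys and function names are illustrative only). The first version tests for success and nests; the second rejects the bad cases early and keeps the real work flat.

```python
def total_paid_nested(orders):
    total = 0
    for order in orders:
        if order is not None:
            if order.get("paid"):
                if order.get("amount", 0) > 0:
                    total += order["amount"]
    return total


def total_paid_flat(orders):
    total = 0
    for order in orders:
        # Guard clauses: skip this pass through the loop on any failure.
        if order is None:
            continue
        if not order.get("paid"):
            continue
        if order.get("amount", 0) <= 0:
            continue
        # The actual work sits flat, at one level of indentation.
        total += order["amount"]
    return total
```

The flat version is a few lines longer, which is the trade of brevity for simplicity discussed here.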

37:08 Exactly.

37:08 Yeah, like totally.

37:09 Like, guard clauses are a good way to go.

37:11 And in here, I would even value simplicity.

37:14 And I would even sacrifice, you know, brevity to reduce the conditional complexity.

37:21 So even being more verbose, having more verbose statements, I think it would help instead of introducing the nested conditionals.

37:29 Yeah, nesting is bad.

37:30 Yeah, exactly.

37:31 I think that's also mentioned in the Zen of Python.

37:34 You have flat is better than nested.

37:36 So it's good to remember that.

37:38 Just import this if you find yourself three levels deep.

37:41 Yeah, exactly.

37:42 This portion of Talk Python To Me has been brought to you by Rollbar.

37:48 One of the frustrating things about being a developer is dealing with errors.

37:52 Relying on users to report errors, digging through log files, trying to debug issues,

37:57 or getting millions of alerts just flooding your inbox and ruining your day.

38:01 With Rollbar's full stack error monitoring, you get the context, insight, and control you need to find and fix bugs faster.

38:07 Adding Rollbar to your Python app is as easy as pip install rollbar.

38:11 You can start tracking production errors and deployments in eight minutes or less.

38:16 Are you considering self-hosting tools for security or compliance reasons?

38:20 Then you should really check out Rollbar's compliant SaaS option.

38:23 Get advanced security features and meet compliance without the hassle of self-hosting, including HIPAA, ISO 27001, Privacy Shield, and more.

38:33 They'd love to give you a demo.

38:34 Give Rollbar a try today.

38:36 Go to talkpython.fm/Rollbar and check them out.

38:40 So what are some more in this area?

38:42 As we mentioned, the mutable problem.

38:45 Using dictionary as a param is pretty dangerous.

38:48 Just now we talked about why.

38:49 So I would suggest using named tuple because that's the thing.

38:53 If I see a function and I'm debugging something, and I see a dictionary being passed in as a param, I'll probably cry a little bit inside.

39:01 Say it's just like a location, right?

39:03 I have so much imagination in my mind.

39:06 Could it be like a latitude, longitude?

39:08 Like is it a number?

39:09 A string?

39:10 Could it be like city, country?

39:12 I have no idea.

39:13 I need to throw in the debugger.

39:15 I need to like try to run the program and see what exactly is inside.

39:20 But if you use named tuple, it's defined.

39:22 You know exactly what is inside.

39:24 And when it comes to Python 3.6, there's type annotation.

39:28 So really like no questions asked.

39:31 You don't have to guess anything.

39:32 I think it's a very good way to go to prevent a lot of the work that has to be done and prevent bugs.
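
One way the location example could look with a typed named tuple, as a sketch only: the field names, the describe function, and the coordinates are assumptions chosen for illustration, since the episode does not spell out a concrete definition.

```python
from typing import NamedTuple


class Location(NamedTuple):
    # The fields and their types are visible at a glance,
    # so there is no need to reach for the debugger.
    latitude: float
    longitude: float


def describe(location: Location) -> str:
    return f"({location.latitude:.2f}, {location.longitude:.2f})"


office = Location(latitude=37.79, longitude=-122.40)
print(describe(office))

# Named tuples are immutable: assigning to a field raises AttributeError.
try:
    office.latitude = 0.0
except AttributeError:
    print("Location is read-only")
```

The class-style syntax with annotations is available from Python 3.6 onward; the annotations document intent and are not enforced at runtime.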

39:39 I really like that suggestion.

39:41 It's great because it takes this kind of unknown thing and captures it into something that you know exactly what's there, how to access it.

39:48 And the type annotation reinforces that that is actually the thing that's going there.

39:53 Exactly.

39:54 So you don't have to worry about the, because it's immutable, you don't have to worry about it being changed.

40:00 So fewer bugs will be introduced.

40:02 Yeah, that's true.

40:03 Yeah, very, very cool.

40:04 Another one that I want to throw in here that does not have a code smell, but I want to give it a name.

40:09 I'll run this name by you and see what you think.

40:11 You know, you talked about dictionaries and parameters.

40:14 That's kind of hard.

40:15 When I see a method that takes star args, star star kwargs, just like I'll just take anything.

40:21 You name it, you don't name it.

40:22 I don't care.

40:23 Just give it to me.

40:24 I'm just like, oh my goodness.

40:25 What do I do here?

40:26 Like I had such a hard problem switching data centers in S3 because I needed to change like the encryption mode.

40:35 It was like the craziest thing with the Boto API.

40:37 And I went to look in the, it was like this.

40:40 And I went to look at the documentation.

40:41 And one of the kwargs was itself a dictionary, which only had a name and had no description of what even goes into the dictionary.

40:49 I'm just like, oh, how am I supposed to do this?

40:51 So my proposed code smell name for those types of methods are starry calls.

40:57 Yeah, perfect.

40:58 You're coining the term today.

41:00 Starry call.

41:00 I like it.

41:01 Starry call.

41:02 There you go.

41:02 Yeah, some people call it black magic, you know, the kwargs that got passed in.

41:06 Yeah, I guess like Python, the structure of Python, it really lets you do so many powerful things.

41:12 But at the same time, you really have to be responsible about it because with great power comes great responsibility.

41:18 Yeah, for sure.

41:19 Yeah, I understand why these methods sometimes exist.

41:23 But I feel like a lot of times people are just like, well, it's easier than just, like, making people name the parameters and we'll just let them put whatever and we'll figure it out.

41:29 It's like, yeah, but that doesn't actually help them use it.

41:32 You know what I mean?

41:32 Yeah, exactly.
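
To illustrate the "starry call" smell in general terms, here is a sketch with a made-up upload function. It is not the Boto API; the parameter names and return value are hypothetical. The first signature accepts anything and hides what it needs; the second names its parameters, so a typo fails loudly.

```python
def upload_starry(*args, **kwargs):
    # The caller has to read this body to learn what is expected.
    bucket = args[0]
    key = args[1]
    encryption = kwargs.get("encryption", "none")
    return f"{bucket}/{key} [{encryption}]"


def upload(bucket, key, *, encryption="none"):
    # The signature itself documents what goes in;
    # encryption is keyword-only because of the bare *.
    return f"{bucket}/{key} [{encryption}]"


# A misspelled keyword is silently ignored by the starry version...
print(upload_starry("photos", "cat.png", encrypt="aes256"))

# ...but is rejected by the explicit version.
try:
    upload("photos", "cat.png", encrypt="aes256")
except TypeError as error:
    print("Rejected:", error)
```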

41:33 So the next section is useless code.

41:35 Right.

41:36 So as you mentioned, too many comments is definitely a deodorant.

41:40 Definitely we should write down why, but not what you're trying to do.

41:44 If that's the case, then you probably should consider renaming or explaining it.

41:49 One very common one is just the duplicated code.

41:53 So yeah, adhering to the DRY principle, you should extract functions or consider inheritance.

41:59 Well, in Python, you have other ways to do it as well, which we might cover later.

42:04 It's the composition pattern, which might work a little bit better in Python's case.

42:09 And sometimes you have dead code.

42:11 Well, because your code is not modularized, it's very hard to tell if your line is being executed.

42:18 But there are some IDEs that are smart enough to tell you if the code is not executed or if some variables are not used.

42:24 So that can be a good help.

42:26 Or sometimes you have lazy classes.

42:29 So you have this one class that doesn't really have any functions, only have some fields.

42:35 Yeah, like pretty much some fields.

42:37 And in this case, you can just replace it with a named tuple.

42:40 And that just makes things easier because maintaining classes takes energy, takes time.

42:46 Yeah.

42:46 And actually, named tuples use less memory than a regular class anyway.

42:49 So it's probably slightly more efficient.

42:51 Yeah, I would agree.
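
A rough sketch of swapping a fields-only "lazy class" for a named tuple (the Point example is invented for illustration). It also hints at one likely reason behind the memory remark: a regular instance carries a per-instance attribute dictionary, while a named tuple instance does not.

```python
from collections import namedtuple


class PointClass:
    # A "lazy class": fields only, no behavior.
    def __init__(self, x, y):
        self.x = x
        self.y = y


# The same data as a named tuple: one line, immutable,
# with equality and a readable repr for free.
Point = namedtuple("Point", ["x", "y"])

p_obj = PointClass(1, 2)
p_tuple = Point(1, 2)

print(hasattr(p_obj, "__dict__"))    # regular instances have a __dict__
print(hasattr(p_tuple, "__dict__"))  # named tuple instances do not
print(p_tuple)
```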

42:53 Yeah.

42:53 So one tool, these are all great.

42:54 And they all can drive me crazy.

42:56 Like I've spent untold hours getting hold of some project, looking at some method, going, I don't understand how this is working in this environment.

43:07 Like it really seems like this doesn't work.

43:09 And so I'm trying to understand this code.

43:11 And then it turns out that like after a lot of piecing stuff together, like, oh, the reason it doesn't seem to have any effect is because it's never called.

43:20 Like, oh, my gosh.

43:23 Oh, no.

43:23 Right.

43:23 Yeah.

43:24 You know, no, it's just so frustrating.

43:25 It's a little bit of the broken window syndrome.

43:27 It's just like people left it in there, but they were afraid to take it out.

43:30 Right.

43:30 Yeah.

43:31 The CSS.

43:31 Yeah, exactly.

43:32 It's like the CSS problem.

43:33 The other one I want to give a shout out to, which is really just like it's so delightful,

43:37 is in PyCharm, you can open up a huge project.

43:42 Let's say it has like 100 files.

43:43 You don't have to select anything or do anything.

43:45 You can just go to a menu.

43:46 I can't remember where it is.

43:47 But you can say find duplicate code and it will compare like blocks.

43:52 It'll just go, oh, this sort of test here is done actually in 20 places.

43:56 You could just make that a method.

43:57 Yeah.

43:58 And that's pretty cool because you don't have to guide it.

44:00 You just say go find the duplicates and it'll somehow like put that all together.

44:03 I love all these like tools that can help us refactor code.

44:06 It's awesome.

44:07 Yeah, for sure.

44:08 So what's the next section?

44:09 So we'll talk a little bit more about coupled code.

44:11 So you have something called the message chains.

44:14 Pretty much means function A calls function B and then function B calls function C.

44:20 So when one thing changes, say in the chain, right, function C changes, there is a ripple

44:26 effect.

44:26 So everything has to change.

44:28 All your functions have to change.

44:29 And for functions that have this message chain problem, it's very, very hard to test.

44:35 So in this case, you know, the productivity is just drained away from first writing the function,

44:41 understanding the function and then writing tests, which is really not great.

44:45 Yeah, that can be one of those problems where you try to make some small change down at some

44:50 lower level and it like cascades through every layer of the application.

44:54 You feel like you're changing so many files just in order to like, well, like let's take

44:59 an example.

45:00 Like I want to add an extra parameter to the creation of an object way down low.

45:04 Well, that means the method that calls that has to pass it, but the class that calls it

45:07 doesn't have it.

45:08 So its constructor has to take another parameter, which, and then just creates this like this

45:14 sort of combinatorial explosion of like, why am I just doing this everywhere?

45:17 This is crazy.

45:18 Right.

45:20 Yeah, I would totally agree.

45:21 So if we can flatten it out a little bit, maybe function A can call function B and then

45:26 function A can call function C.

45:28 That'll make it a little bit better and easier to test.
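
The flattening suggested here might look something like the following sketch. The report example and every function name in it are hypothetical. In the chained version each function calls the next; in the flatter version the top-level function calls each step itself, so the small steps can be tested with plain values.

```python
def parse_amounts(raw):
    return [float(part) for part in raw.split(",")]


# Chained: build_report_chained -> total_chained -> parse_amounts.
# A change to parse_amounts ripples up through every caller.
def total_chained(raw):
    return sum(parse_amounts(raw))


def build_report_chained(raw):
    return f"Total: {total_chained(raw):.2f}"


# Flatter: each step takes plain data and knows nothing about the others.
def total(amounts):
    return sum(amounts)


def format_report(total_amount):
    return f"Total: {total_amount:.2f}"


def build_report(raw):
    # The top-level function is the only place that wires the steps together.
    amounts = parse_amounts(raw)
    return format_report(total(amounts))
```

With this shape, total and format_report can be tested on their own without constructing raw input text.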

45:31 Yeah, for sure.

45:32 Or maybe even there's some other mechanism for getting that information deep down there.

45:36 Like maybe it's stored in the database instead of passed or I don't know.

45:39 It really depends.

45:40 Yeah.

45:41 Or create static functions.

45:42 Those are always great.

45:44 Pure functions.

45:45 They're dumber.

45:46 It's easier to test.

45:47 I don't know what to make of the next one.

45:49 Indecent exposure.

45:50 Oh, it's very clear.

45:51 It's exposing your privates to other classes.

45:55 That sounds very indecent.

45:56 Yeah.

45:57 If your class is consistently calling like functions from another class, then it's better

46:02 to combine it probably.

46:03 So that means you have code that's too modularized in a way that maybe if they share the same

46:10 context, they should be put together.

46:12 That's an interesting comment because I feel like one of the things that's funny about

46:17 refactoring and the code smells is they often have what I want to coin as refactorial inverses,

46:25 right?

46:25 Like multiplicative inverses, right?

46:27 Like there's inline variable and there's create variable.

46:30 There's inline method.

46:31 There's extract method.

46:33 There's push this to subclass.

46:35 Pull it down.

46:37 Pull the subclass.

46:38 Push it to all the derivative ones.

46:40 There seems to be like this thing and this undoing thing often in refactoring.

46:45 And it really is context driven, right?

46:47 Yeah.

46:47 That's why I say it's an iterative process because once you have added some code, the situation

46:52 is different again.

46:53 And maybe what could be code smell before is not a code smell now.

46:57 Or if you, upon investigating, the code smell actually doesn't point to anything.

47:02 So it really, it's a constant effort, I would say, to keep your code good along the lines.

47:09 Yeah.

47:09 It's also why you can basically refactor for infinite time because you can do the thing

47:14 and then you undo the thing.

47:15 Then you do the thing in a different way.

47:17 Yeah.

47:18 So the last section you wanted to cover in this area was inappropriate naming.

47:23 Yeah.

47:23 I would love to cover this area because it's one of the three hardest problems in computing,

47:28 right?

47:29 You have cache invalidation, you have threading, and then you have naming.

47:33 That's right.

47:34 And this is something I care a lot about as well.

47:37 Like it's kind of really closely tied to the code comment stuff and so on.

47:41 Yeah.

47:41 That I would totally agree because Python is dynamically typed.

47:45 So in a way, when you create a variable, it doesn't really have type information stuck

47:50 to it.

47:50 And in this case, you know, with great power comes great responsibility.

47:55 We need to name things right because we kind of didn't have this, some extra information

48:00 as other languages would have had.

48:03 You know, naming variables correctly, naming modules correctly.

48:06 One thing you recommend are keyword arguments or at least calling functions in the keyword

48:11 argument style, right?

48:12 Definitely.

48:13 So in this case, you don't have to have like 10 terminals open to see the function definition

48:18 when you stumble upon this code, right?

48:21 Immediately, you know what is being passed into the function.

48:24 So I think it's just a more efficient way because it's always better to be explicit than implicit

48:30 by the Zen of Python.
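A minimal sketch of the keyword-argument style discussed here. The function `send_reminder` and its parameters are hypothetical, invented for this illustration; the bare `*` in the signature is standard Python syntax that makes the following parameters keyword-only.

```python
def send_reminder(user_id, *, days_before=1, retries=3, notify_by_email=True):
    """Hypothetical function; parameters after * must be passed by keyword."""
    return {
        "user_id": user_id,
        "days_before": days_before,
        "retries": retries,
        "notify_by_email": notify_by_email,
    }


# A positional call such as send_reminder(42, 7, 5, False) would force the
# reader to look up the definition (and raises TypeError here because of *).
# The keyword style documents itself at the call site:
reminder = send_reminder(42, days_before=7, retries=5, notify_by_email=False)
```

Making the parameters keyword-only is optional; calling an ordinary function with keywords gives the same readability benefit at the call site.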

48:31 Yeah, for sure.

48:32 Another thing that I think can be challenging is sort of implicit values or magic values.

48:40 See what I mean by that?

48:41 It's like you had an example around a function that took the mood of a person and the mood

48:46 could be like one, two, three, or some set of numbers.

48:50 And is three good?

48:51 You don't know, right?

48:52 Like it's really hard to understand that stuff, right?

48:55 Yeah, I would totally agree.

48:56 Like it's definitely like indicating the direction is important, right?

48:59 So mood bigger than three.

49:01 What does that mean?

49:01 Is it happy?

49:02 Is it sad?

49:03 So casting it into a variable called is_happy equals mood bigger than three can have a

49:09 wonderful effect of documenting your own code.
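As a rough sketch of the "mood bigger than three" example, the comparison can be given a name so the direction of the scale is documented in the code. The constant `HAPPY_THRESHOLD`, the function `greeting_for`, and the returned strings are assumptions made up for this illustration.

```python
HAPPY_THRESHOLD = 3  # assumed cutoff, for illustration only


def greeting_for(mood):
    # Before: `if mood > 3:` leaves the reader guessing what 3 means.
    # After: the named boolean states the intent of the comparison.
    is_happy = mood > HAPPY_THRESHOLD
    if is_happy:
        return "Glad to hear it!"
    return "Hope your day gets better."
```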

49:12 For sure.

49:12 And Python recently added enumerations as well, enum classes.

49:17 And if there's only four moods, you know, making that an explicit enumeration.

49:22 So there's like a sad, sort of blasé or like meh, and then there's happy, super excited, right?

49:28 Like it would be really clear that way.

49:31 And that'd be a pretty good refactoring too.
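One way the enumeration idea might look, using the standard library `enum` module (available since Python 3.4). The class `Mood`, its four member names, their numeric values, and the helper `describe` are assumptions loosely based on the moods mentioned in the conversation.

```python
from enum import Enum


class Mood(Enum):
    # Member names and values are illustrative assumptions.
    SAD = 1
    MEH = 2
    HAPPY = 3
    SUPER_EXCITED = 4


def describe(mood):
    # Comparing against named members replaces bare numbers like 3.
    if mood is Mood.HAPPY or mood is Mood.SUPER_EXCITED:
        return "in a good mood"
    return "not in a good mood"
```

Existing numeric data can still be converted with `Mood(3)`, which looks up the member by value, so a refactoring like this does not require changing stored values.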

49:32 Yeah, for sure.

49:33 So this is all well and good.

49:35 And I would like a better list of Python code smells to guide us, but there's still plenty

49:41 to work with from sort of existing, existing literature and writing and stuff.

49:46 So how do you go about developing your code nose?

49:51 I definitely agree on reading literature on this, but I think it's just a skill that you develop

49:56 over time.

49:57 So if you actually just look at some legacy code, you can find a bunch for sure.

50:02 And just seeing it day to day, you know, like just read more of that and try to refactor

50:09 more of that.

50:09 I guess we can learn, you know, at which point we need to refactor this.

50:13 Is this an indication of a problem?

50:16 It really comes with experience.

50:18 And also the code review process can help you learn from other people in your team or, you

50:23 know, whoever is reviewing.

50:24 Yeah, I think the code review process is super helpful if at least that's being incorporated

50:29 to the code review.

50:30 Yeah.

50:30 If you're working more on your own and, you know, there's plenty of people who work even

50:34 in companies, but there's, they're kind of more or less on their own.

50:38 They're not in a technology company.

50:39 That code review can be more or less absent.

50:42 And so one of the things that came to mind while you're thinking about this is like, or

50:46 you're speaking about this to me is, you know, there's all these code smells.

50:49 You could take like one code smell a week and say, all right, I'm going to work on the conditional

50:55 complexity problem this week.

50:56 So anytime I'm writing code, if I see, I feel myself writing like that fourth indentation

51:01 level, I'm going to like remind myself to apply this refactoring or this, this, this technique.

51:06 And you could just take them one at a time, right?

51:08 Cause they seem overwhelming altogether, but they're pretty simple by themselves.
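For the "fourth indentation level" situation, here is a small sketch of one common option, guard clauses (early returns). The speakers do not name a specific technique at this point, so this choice is an assumption, and the function names and shipping rules are hypothetical.

```python
def shipping_cost_nested(order):
    # Four levels of nesting: the conditional complexity smell.
    if order is not None:
        if order["items"]:
            if order["country"] == "US":
                if order["total"] >= 50:
                    return 0
                else:
                    return 5
            else:
                return 15
        else:
            return 0
    else:
        return 0


def shipping_cost_flat(order):
    # Same behavior, with each special case handled and dismissed up front.
    if order is None or not order["items"]:
        return 0
    if order["country"] != "US":
        return 15
    if order["total"] >= 50:
        return 0
    return 5
```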

51:12 That's definitely a good idea.

51:14 One of the dangers of refactoring, I mean, it's extra high in legacy systems, but in

51:18 general, it's a danger is that your intention is to refactor code, but your actual outcome

51:24 is you've changed code, right?

51:25 It behaves differently.

51:26 So what's the role of testing here?

51:28 I think testing is very important in the workflow.

51:32 So what I would suggest is writing integration tests first for the code you're about to refactor,

51:37 if it's not present yet.

51:38 And then during your refactoring process, you know, after you extract the function, after

51:43 you change the variable names, run it over again to check if the functionality of your code

51:48 base has changed.

51:49 So that really helps you to limit loss.

51:51 You know, like after you change one thing, it's very easy to spot out what has been changed.

51:56 And after refactoring is done, you can start introducing unit tests to test that the functions

52:02 you've introduced actually uses the right logic.

52:05 So with this workflow, I think we got our ground covered.
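The workflow described here (pin down behavior first, refactor in small steps, then unit-test the new pieces) could be sketched roughly as follows. All function names and the sample data are hypothetical, and plain `assert` statements stand in for whatever test framework is in use.

```python
# Step 1: legacy code we are about to refactor (hypothetical).
def legacy_report(orders):
    total = 0
    for o in orders:
        if o["status"] == "paid":
            total = total + o["amount"]
    return "Total paid: %d" % total


# Step 2: an end-to-end check that pins down the current behavior.
def check_report(report_func):
    orders = [
        {"status": "paid", "amount": 10},
        {"status": "refunded", "amount": 99},
        {"status": "paid", "amount": 5},
    ]
    assert report_func(orders) == "Total paid: 15"
    assert report_func([]) == "Total paid: 0"


# Step 3: refactor in small steps, rerunning check_report after each one.
def paid_total(orders):
    return sum(o["amount"] for o in orders if o["status"] == "paid")


def refactored_report(orders):
    return "Total paid: %d" % paid_total(orders)


# Step 4: once behavior is preserved, unit-test the extracted function.
def test_paid_total():
    assert paid_total([{"status": "paid", "amount": 7}]) == 7
    assert paid_total([{"status": "refunded", "amount": 7}]) == 0


check_report(legacy_report)      # passes before the refactoring
check_report(refactored_report)  # still passes after it
test_paid_total()
```

Because the same check runs against both versions, a failure right after a small step points directly at the change that altered behavior.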

52:09 Yeah.

52:10 That sounds like a good, a good way to do it.

52:12 And what test frameworks do you use? Like the built-in one, pytest, Nose?

52:16 Yeah.

52:16 At Yelp, we definitely have a built-in one, but there is pytest as well, which is very, very

52:20 similar.

52:21 You can just assert things.

52:23 You know, there are a lot of inbuilt functions that you can use.

52:26 Also, one thing that Python is great at is you can use mocks.

52:30 So that's, you know, if you have a network call that you don't want to actually make during

52:35 your testing, then you can mock things out.

52:37 And that's really, really convenient.
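A minimal sketch of mocking out a network call with the standard library's `unittest.mock`. The function `fetch_rating`, its `http_get` parameter, and the example.com URL are hypothetical; only `mock.Mock` and `assert_called_once_with` come from the real library.

```python
from unittest import mock


def fetch_rating(business_id, http_get):
    """Hypothetical function; http_get is whatever performs the network call."""
    response = http_get("https://example.com/api/businesses/%s" % business_id)
    return response["rating"]


def test_fetch_rating_without_network():
    # The Mock stands in for the real HTTP client, so no request is made.
    fake_get = mock.Mock(return_value={"rating": 4.5})
    assert fetch_rating("abc123", http_get=fake_get) == 4.5
    fake_get.assert_called_once_with("https://example.com/api/businesses/abc123")


test_fetch_rating_without_network()
```

Passing the dependency in as a parameter keeps the example simple; when the call is hard-wired to a module-level function, `mock.patch` can be used to replace it for the duration of a test instead.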

52:38 Yeah.

52:39 Very nice.

52:39 So maybe let's talk about some of the tools that you could apply here, because in Martin

52:45 Fowler's 1999 book, he literally shows you the manual steps at every level.

52:52 Like, okay, here, first you create this variable, you put this piece here, and then you do this

52:56 step.

52:56 And it's really painful, but there's more we can do these days, right?

53:01 I do think so.

53:02 So one thing is, you know, the styling part of it first.

53:06 So we have PEP 8, right?

53:07 PEP 8 is a Python Enhancement Proposal that talks about, it's essentially a Python style guide.

53:14 So how you're supposed to structure the white spaces, how do you comment things, how do you

53:20 use the string quotes, things like that.

53:23 So that there are a lot of tools that can help you.

53:25 There's the PEP 8 tool, there's Flake8 that can also help you check conditionals as well, I believe.

53:32 Does it check for things like dead code and stuff like that as well, like unused parameters

53:36 or methods not called?

53:37 Yeah, that's big.

53:38 Yeah, or PyLint.

53:39 And there are a lot of things that we can use that can be programmatic about it.

53:44 I would want to mention Raymond Hettinger's talk.

53:47 He was mentioning how PEP 8 can become a nightmare because you just have someone from the team

53:52 that bugs everyone about, hey, your trailing comma is off, you know, your white space is off.

53:58 But you know, there are actually tools that can do that.

54:01 And also, you should PEP 8 unto thyself, you know, not to bug everyone in a team about it.

54:07 It's like a style guide.

54:08 And it should serve like one, right?

54:10 Because if at the same time you're pissing off your colleagues or if readability is not

54:16 improving because of this style guide, then maybe it's not worth it to do it in the first

54:20 place.

54:21 Yeah.

54:21 And you mentioned his Beyond PEP 8 talk.

54:23 That is really quite insightful.

54:26 It's a good example of how just following strict rules can lead to actually less readable code.

54:33 Whereas if you let it slip just a little bit, but you're creative about it, there's actually

54:38 better ways or maybe even not so much that you break PEP 8, but that PEP 8 is not the end.

54:45 There's actually more important stuff to think about than PEP 8 about how you structure code,

54:49 whether it's more Pythonic and the way that it works.

54:52 It's a really good talk.

54:53 I think it's done during PyCon as a PyCon talk, Beyond PEP 8.

54:57 One thing he also mentioned is rather than thinking too much about PEP 8, maybe you should

55:03 think about the problem P versus NP.

55:05 So, you know, like silence.

55:07 What is the P?

55:08 What is the NP?

55:08 It's Pythonic versus not Pythonic.

55:12 That's right.

55:13 That's right.

55:14 Because you could write perfectly PEP 8 compliant, very non-Pythonic code, or you could just think

55:21 beyond it and actually make the code better.

55:23 Yeah, it's a good talk.

55:24 And it was at PyCon, I believe, in that.

55:26 So there's a video version, which I'll try to add in the show notes for people.

55:30 Yeah, I really like that talk.

55:31 Yeah.

55:32 So maybe that's a good place to leave it for the refactoring stuff directly.

55:37 But this was super interesting to talk about.

55:39 Thanks for sharing your thoughts on that.

55:41 One thing I did want to point out to people, folks ask me often about like where to get

55:46 jobs and how to get into Python.

55:48 Maybe they're doing C and they want to like move over to a place that actually does Python.

55:52 So you at Yelp, you guys are hiring, right?

55:55 Definitely.

55:56 We have so many empty seats to fill and we really want you to be here.

56:00 We use a Python backend.

56:01 We have over 300 microservices and we have millions of search requests per year.

56:07 There are a lot of interesting projects, you know, data for you to play with.

56:11 So definitely we would love to have you here.

56:14 Yeah, it sounds like a fun place that cares about code quality and craft.

56:18 Very nice.

56:18 Yeah.

56:18 And we have offices in San Francisco or if you want to go to Europe as well, we have London

56:24 and Hamburg.

56:24 Cool.

56:25 All right.

56:26 So before we get out of here, let me ask you the two questions.

56:29 So if you're going to write some Python code, what editor do you use?

56:32 Well, since I'm a full stack engineer, I would actually suggest using Cloud9.

56:37 I don't know if you've heard of it before.

56:38 So it's pretty much you can code on the browser.

56:41 It's super easy to set up.

56:42 You can just like git clone the whole repo into whatever environment you're working with

56:47 and you can pair program in real time.

56:49 And it's just seamless for full stack development.

56:52 Good for Python down to CSS.

56:54 It's great.

56:55 That's really cool.

56:56 And I really like the real time collaboration.

56:58 It's sort of like Google Docs for code.

57:00 Yeah, definitely.

57:01 And, you know, it's in the cloud, so you can pick up your progress wherever you left it.

57:06 You could code on your iPad if you want.

57:07 Perfect.

57:08 Yeah.

57:09 Yeah, that's right.

57:10 All right.

57:11 And then notable PyPI package.

57:13 Like what's one out there that maybe people don't know about that is pretty cool.

57:16 Just now we mentioned PEP 8 or Flake8.

57:18 They are pretty cool linters that can help you get your code quality there.

57:23 And yeah, I think definitely you should check out the Zen of Python because I think it's

57:28 pretty much the key rules of this Python language.

57:32 There are a lot of things that we covered already, like flat is better than nested.

57:36 Simple is better than complex, or readability counts.

57:40 But whenever you make a decision when it comes to writing code, then when in doubt, import this.
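For reference, `import this` is a real standard library module that prints the Zen of Python (PEP 20) when imported. The sketch below also reads the text as a string; it relies on `this.s`, the ROT13-encoded source text, which is an undocumented CPython implementation detail and should be treated as an assumption.

```python
import codecs
import contextlib
import io

# Importing `this` prints the Zen of Python as a side effect; the output is
# redirected here only to keep the example quiet.
with contextlib.redirect_stdout(io.StringIO()):
    import this

# Assumption: CPython stores the text ROT13-encoded in the attribute `this.s`.
zen = codecs.decode(this.s, "rot13")

# The aphorisms quoted in the conversation are all in there.
quoted = [
    "Explicit is better than implicit.",
    "Flat is better than nested.",
    "Simple is better than complex.",
    "Readability counts.",
]
found = [line for line in quoted if line in zen]
```

In an interactive session, simply typing `import this` is enough to display the whole text.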

57:46 Yeah, that's perfect.

57:47 All right.

57:49 Final call to action.

57:49 People are excited to get started with refactoring or do it better.

57:53 What would you say to them?

57:54 Just start right now.

57:55 Like you don't need a lot of tools.

57:56 You don't need anything.

57:57 Start reading materials.

57:59 Start learning the patterns or code smells one by one.

58:02 I think it's a fun journey ahead.

58:04 Yeah, I totally agree.

58:05 Thanks for that.

58:06 And thanks for being on the show.

58:07 It was great to talk to you.

58:08 Yeah, thank you.

58:08 It was great.

58:09 Yep.

58:09 Bye.

58:10 This has been another episode of Talk Python To Me.

58:13 Today's guest has been Yenny Cheung.

58:16 And this episode has been brought to you by ParkMyCloud and Rollbar.

58:20 Do you hear that sucking noise?

58:23 That's your cloud provider making you pay for your idle instances.

58:26 Turn on ParkMyCloud, plug the leaks, and save money.

58:29 Visit talkpython.fm/park to get started.

58:32 Rollbar takes the pain out of errors.

58:35 They give you the context and insight you need to quickly locate and fix errors that might have

58:40 gone unnoticed until your users complain, of course.

58:43 As Talk Python To Me listeners, track a ridiculous number of errors for free at

58:48 rollbar.com/talkpythontome.

58:51 Are you or a colleague trying to learn Python?

58:53 Have you tried books and videos that just left you bored by covering topics point by point?

58:57 Well, check out my online course, Python Jumpstart, by building 10 apps at talkpython.fm/course

59:03 to experience a more engaging way to learn Python.

59:06 And if you're looking for something a little more advanced, try my Write Pythonic Code course

59:11 at talkpython.fm/pythonic.

59:13 Be sure to subscribe to the show.

59:16 Open your favorite podcatcher and search for Python.

59:18 We should be right at the top.

59:20 You can also find the iTunes feed at /itunes, Google Play feed at /play, and

59:25 direct RSS feed at /rss on talkpython.fm.

59:29 This is your host, Michael Kennedy.

59:31 Thanks so much for listening.

59:32 I really appreciate it.

59:33 Now, get out there and write some Python code.

59:35 I'll see you next time.
