Learn Python with Talk Python's 270 hours of courses

#66: Faster Python Programs: Measure, Don't Guess Transcript

Recorded on Monday, Jul 4, 2016.

00:00 Python is a wonderful programming language that is often underestimated because it's so clear and

00:05 simple. People mistake the simplicity for being too simple for real programs. After all, you didn't

00:11 even struggle to get your program to link against an incompatible static library or battle a DLL

00:16 version mismatch in your Python app at all today, did you? Usually we find the simple and clear

00:22 programming language to be powerful and fast. But what happens when it's not fast enough? Do you have

00:27 to stop and rewrite your code in C, C# or Java? Well, before you do something drastic,

00:32 Mike Mueller is here to teach us the techniques and steps to determine why our Python programs

00:37 might be slow and give us some tips to make them faster. This is Talk Python to Me, episode 66,

00:43 recorded Monday, July 4th.

00:46 Developers, developers, developers, developers, developers.

00:49 I'm a developer in many senses of the word because I make these applications, but I also use these

00:55 verbs to make this music. I construct it line by line, just like when I'm coding another software

01:00 design. In both cases, it's about design patterns. Anyone can get the job done. It's the execution

01:06 that matters. I have many interests. Welcome to Talk Python to Me, a weekly podcast on Python,

01:12 the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy.

01:18 Follow me on Twitter where I'm @mkennedy. Keep up with the show and listen to past episodes

01:22 at talkpython.fm and follow the show on Twitter via at Talk Python. This episode has been brought

01:28 to you by Rollbar and SnapCI. Hey everyone, it's great to be with you today. I have a couple of

01:34 news items for you this week. First up, Stitcher. If you've been listening to Talk Python to me on

01:39 Stitcher, I have some news for you. Unfortunately, due to some business practices that I believe are

01:44 harmful to the podcasting space in general, I've asked them to delete my listing and remove the show

01:48 from Stitcher. If you're unfamiliar with the service, it's basically a podcasting client. But instead of

01:53 just serving up my content from my servers like most players do, unmodified, Stitcher downloads my MP3

02:00 files, re-encodes them into a low bandwidth format, and then slices it apart and inserts their own audio

02:06 ads into my show without any form of revenue share whatsoever. I don't get a penny from them for this.

02:11 To me, this is unacceptable. I did a big write-up about it and you can find the link to the write-up

02:17 in the show notes. The article was entitled Stitcher and Talk Python Podcast, a farewell letter.

02:22 Next up, on a positive note, a review. I want to say thank you to Jamal Moyer who did an excellent

02:28 review of my Write Pythonic Code course called The Course Everyone New to Python Desperately Needs to

02:33 Take. It's an interesting view on what's good and what's not so good about my course. I gave Jamal access

02:39 because I was interested in his opinion. I didn't know he'd write a review, but I'm really glad he did.

02:43 Thank you, Jamal. You can find the link to his review in the show notes. Jamal actually has a

02:47 bunch of great Python content, so I encourage you to go check out his site. Now, let's hear from Mike

02:52 about making Python programs faster. Mike, welcome to the show. Hello. It's great to have you finally

02:59 on the show. I've seen a lot of your presentations and I've always thought that they were really great.

03:04 Today, we're going to take one of your PyCon 2016 presentations and sort of have a conversation

03:10 around it about making Python programs faster. Yeah. Very nice to be on the show. I listen

03:15 to most of your podcasts. It's very, very interesting. It's an honor to be here.

03:19 Thanks so much. Thanks so much. Well, let's get started by sharing your story. How did you get

03:23 into programming in Python? I know you've been doing Python for a long time, right?

03:26 Yeah. Actually, I pretty much know. I started Python in 1999.

03:30 So I worked on my PhD thesis and the task was to couple numerical models. We had existing

03:36 numerical models. One was written in Fortran, one was written in C and I looked for a language

03:41 to couple. I knew I'm not going to write it in Fortran or C. I just want to have something

03:44 else. And I looked at different languages, Java and other things. And then I somehow hit Python

03:50 and I asked on a mailing list whether it would be a good language to couple things. And then I got a call

03:56 from Martin von Löwis. He's German, and he's been a Python core developer for quite a while now.

04:02 And he just talked me into Python. And that's how I got started. So I used it for my PhD thesis

04:08 and coupled models. And it got to work, even though it was not really clear beforehand that it would, because these were

04:12 numerical models from different fields. And I put them together and made them work as one

04:17 model, as one program. And it worked out pretty well.

04:20 Oh, that's really cool. What was the subject of those models?

04:22 So actually, my background is hydrology. So I coupled a lake model, a groundwater model,

04:28 and a hydrogeochemical model. So you have this, after mining, you have, you do surface mining.

04:35 And then you take something out of the subsurface, which is the purpose of mining. And then you have

04:41 a void left. And this void typically fills with water, with groundwater and forms a lake. Now you have

04:46 this, what's called the pit lake. And the pit lake is pretty different than an actual lake in terms of

04:50 water quality. And that was the purpose to find out the water quality of this lake.

04:55 And there was this lake model, which kind of takes care of the lake itself.

05:00 And the groundwater model, which takes care of the groundwater part and the hydrochemistry,

05:04 which is pretty different from a lake model. Typical lake models are very different systems, determined

05:10 by algae growth and stuff like this. And now this lake is different because you have these chemicals

05:15 going in there. Actually, you have a very acidic lake most of the time that depends on the

05:19 nature of this thing. So if you draw down the groundwater table, there are some kinds of reactions going on.

05:23 And they take things out of the subsurface and then the groundwater moves over the stuff into the lake.

05:28 And you have a very acidic system, which is very different than the natural system.

05:32 That's what I was working on. And for this one, I need a new model.

05:35 And I coupled them and I did it with Python. And Python worked out very well for this.

05:41 Was it controversial or a big risk that you took to try to do this with Python?

05:46 No, I was pretty free. So in academia, just the end result counts. Nobody really was into programming.

05:53 All these people are experts in groundwater and lakes and stuff. But the programming part,

05:58 I was pretty much on my own and I could decide whatever I want. And nobody asked how I do it.

06:02 Just as long as it works, it works.

06:03 Yeah, that's cool. Yeah, I suppose probably the major alternatives were things like MATLAB or

06:08 not other programming languages necessarily.

06:10 Yeah, yes. I looked at different systems and first you have to get into this, how you're going to do

06:15 this. And then you say you can extend Python with C and then you can connect C to Fortran. And that's

06:20 how I did it actually. At this time, we didn't have the tools yet. At least I wasn't aware of

06:25 this F2Py tool, which I use now regularly. It didn't exist yet. So I wrapped every single

06:30 subroutine in Fortran with C and then every single C function with Python by hand, which is a

06:35 very tedious undertaking.

06:37 I can't imagine, but you probably learned a lot doing it, right?

06:41 I learned a lot. I learned a lot. I learned a lot about the C API and I tried to keep it very simple,

06:46 which was still complicated enough because for every function, I had to study this,

06:49 how to do it and looked at some examples. And this time we didn't have stack overflow or anything

06:54 like this. You know, everything was just email lists at best, and everything was much,

06:59 much slower than nowadays in terms of getting help on the internet.

07:02 Yeah, absolutely. Learning to program or learning the right techniques was completely different back

07:07 then, wasn't it?

07:08 Yeah.

07:08 Yeah, cool. So we're going to talk about making Python programs faster.

07:14 Yes.

07:15 Yeah. And you did a really great presentation at PyCon and we'll kind of go through the details

07:21 there. But in general, you know, how much does performance matter? Like on one hand, it might be

07:28 great to have faster code, but on the other, maybe we just get more VMs or pay a little more for our

07:33 cloud bill or something like this.

07:34 The answer is clear: it depends. It very much depends on what you're going to do. For a lot of people,

07:39 Python is plenty fast. So I think for most cases, actually, Python is pretty fast. For most web

07:43 developers, I talk to them and they're pretty happy with it because most of the time the database or the

07:49 network or something, it's what causes the problem. Most of the time, Python is fast enough for this.

07:54 But I also teach a lot of scientists and engineers and depends what they're doing. If they just move data from A to B,

08:01 then Python might be fast enough. But if they do real simulations, then Python is way too slow for a lot of things.

08:06 So it depends very much what you're doing. And if they have the right case, then you need to do something to make Python

08:13 faster. That's one thing. But having it faster is always a good thing. Typically, faster is better than slower.

08:18 And sometimes it doesn't take a lot of effort to make things faster. That's one case.

08:22 So sometimes it's just making things a bit nicer and as a side effect, making it faster.

08:27 And the other thing is you put a lot of effort in to make it faster. So you make it actually more difficult to

08:33 understand, but you make it faster because you need it. I think these are two areas here.

08:38 Yeah, I think you're right. Certainly the computational stuff is really, really important. And I suppose with the

08:45 websites, it depends a little bit if you're a web developer on what is fast enough and how fast you have to be.

08:51 There was a really interesting study. I think it was by Amazon. It was something around like the price of latency.

08:58 And it said something like for every hundred milliseconds that your site is slower, you lose, you know, some percentage, like

09:07 something like one percent of business. And

09:14 you know, if you have a lot of traffic that also really matters, right?

09:17 Yeah. So performance matters. Just a question if Python is the one that that is a bottleneck for the

09:22 performance, it's not totally clear. There can be many other things that cause your site to be slow.

09:27 It's not necessarily the programming language. There can be other considerations.

09:31 Absolutely. For example, if you've got a million records and you're doing a non-indexed query on your

09:36 database, it probably doesn't matter what language you're using.

09:38 Yeah.

09:39 Yeah.

09:39 All right. Cool. So let's talk about your tutorial. What was it called?

09:44 Yeah. It was Faster Python Programs: Measure, Don't Guess. So the stress is actually on the

09:50 measuring part. I started out actually in 2007. It was my first two tutorials at PyCon US in Dallas,

09:58 Texas, my first PyCon. And I gave two tutorials right away, which was amazing. And actually it was

10:02 a two part one. The first one was more or less what I have here, the measuring. And the second part was

10:08 actually extending Python with other languages, like C extensions and using other tools there.

10:14 And now it's just the first part. And the stress is actually on measuring, just getting a grip on your

10:19 system to find out what's going on and where are potential points to improve. I think it's very

10:24 important to have something to quantify what's going on and see where the problems are and then

10:29 maybe come up with a solution. So it's very important to measure things and to quantify them.

10:35 If you just have a gut feeling that it's slow and act on that, it's not really going to work.

10:38 Yeah, I totally agree. When I started out doing professional programming, I did,

10:44 sort of scientific visualization tools. I found that lots of people's intuition, including my own,

10:51 was really quite often wrong about what was fast and what was slow. Maybe you have some really

10:57 complicated like wavelet algorithm. You think, oh, this has got to be just super slow and some other

11:03 place where you're just working with a basic data structure. And it turns out like the majority of

11:07 the time you're fiddling with the data structure or something like this, right? So this measurement

11:11 idea is really important.

11:12 Yeah, I can confirm that. I've been doing it for quite a while, and most of the time I have a kind of feeling

11:17 what's going on. But very often actually I'm wrong. So for some reason there's something else I didn't

11:22 think about it, which is very clear afterwards. You say, of course, that's the reason, but you didn't

11:26 think about it beforehand.

11:27 Yeah, and that's coming from somebody who spent a lot of time doing this optimization, right?

11:33 So even with the experience, sometimes it still can be hard.

11:37 Experience helps. It helps. It helps. But it's no guarantee that you get it right. So you have to

11:42 measure.

11:42 Right. And of course, you can look at code. You can say, well, I'm sure that this version of that code

11:46 is faster and this version of the code is slower. But if it's 10 microseconds versus 20 microseconds,

11:52 who cares? It really matters how you're using the code as well.

11:56 So one thing I thought was interesting when you talked about your PyCon tutorial was that

12:01 you said, we're going to use Python 3. And I've been on a big push to evangelize Python 3

12:07 and so on lately as well. And what was the reason you said it?

12:11 Yeah. So of course, I think you have Python 3 and you have legacy Python. So I would like

12:16 to use the current version of Python 3. I always, because I teach Python for a living, and I always

12:22 use Python 3 whenever possible. Because when you're outside in the real world, there's

12:26 still a lot of systems running 2.7. But still, I like to use Python 3 for teaching. And so

12:31 that's a nice way to do it. And if you want to do Python 2, then it's not that difficult

12:35 to write programs with the same source code that run with Python 2 and Python 3 if it just

12:39 takes a few steps. So it's all this deprecated stuff in Python 2. And you're mainly there.

12:44 It pulls from future prints, something like this. And then you are there. So most of the things

12:50 you can write. And that's why you are ready to run in Python 3. And you still have to

12:54 support Python 2 for some reason. And you still can do this. And that's very important, because

12:59 otherwise, you might redo a lot of work later on when you now kind of focus on Python 2.7,

13:04 even though all your colleagues are still on 2.7.

13:07 Yeah, absolutely. I agree with you. I think that's great. You know, from futures import, was it Python?

13:12 Sorry, print function. And then maybe, maybe range equals x range.

13:18 Those kind of things.

13:19 Those kinds of things. And then you can write your Python 3 code in a way that is compatible

13:24 with Python 2, long as you, the other way, you can get yourself more into trouble, I think.

13:28 Yeah.

13:29 Yeah, very interesting.
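
To make the idea concrete, here is a minimal sketch of that kind of single-source boilerplate; the exact set of imports you need depends on your own code.

```python
# Minimal single-source boilerplate: the same module runs on Python 2.7 and 3
# for simple cases.
from __future__ import print_function, division, unicode_literals

try:
    range = xrange  # Python 2: rebind range to the lazy xrange
except NameError:
    pass            # Python 3: range is already lazy

print("squares:", [x * x for x in range(5)])
```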

13:30 The PyData balloon, I just gave it a troll about writing code that runs with Python 2 and

13:34 3 at the same time. And actually, I think the best thing is use the Python future org library.

13:39 Maybe you're familiar with this. This gives you a lot of whatever you said, but on a much higher

13:43 level, there are a thousand tests on it. And instead of making your own kind of compatibility,

13:48 they just use this library and they take care of a lot of details, including rearranging the standard

13:54 library imports and stuff like this, which can be really useful. So you pretty much write Python 3 code

13:59 that also runs on Python 2. So in Python 3, nothing changes, but on Python 2, you get all this

14:06 compatibility stuff in there automatically, which is very nice. And I think it's a good solution.
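
A short sketch of the python-future idioms being described here, assuming the package is installed (`pip install future`); the urlopen import is just an illustrative case.

```python
# With python-future, you write Python 3 style code that also runs on 2.7.
from future import standard_library
standard_library.install_aliases()   # rearranges standard library imports on Python 2

from builtins import range, str      # Python 3 semantics for built-ins on Python 2

# Python 3 style imports now work on Python 2 as well:
from urllib.request import urlopen
```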

14:11 Yeah, that's fantastic. And in 2020, coming up, there's going to be no more Python 2 support.

14:17 Yeah, yeah.

14:18 And that seems like that's way in the future. It seems like there could be a science fiction

14:23 show set in 2020, but that's like three and a half years away.

14:26 Yeah, I still remember 10 years ago, like yesterday, so it's not a long time.

14:31 Yeah, that's right. That's right. Let's dig into some of the topics of your tutorial.

14:36 You started out by saying there are some general guidelines for measuring and understanding

14:41 performance.

14:42 The first thing is the question: how fast is fast enough? So the question is,

14:46 do I have a case in the first place? Do I need to do something? And that involves a lot of things.

14:51 Just look at your system and say, do I need the performance? What am I going to do with it?

14:56 If it's a one-off thing that I'm going to throw away, it doesn't matter if it runs half an hour or one

15:00 hour, because I throw it away. Why spend another day improving it, just to be half an hour faster, and

15:05 never use it again? Scientists write a lot of one-off scripts, just moving data from here to there or something.

15:10 That could be the case. Or the other, you have to have some realistic use cases. Is this really

15:15 too slow? What am I going to do with it? Those kind of questions. It seems very common sense,

15:20 but actually most of the things are common sense. You just need to think through. Do I have a case?

15:24 Do I need to improve it? And look through. And then again, as I said, if you're in a web

15:28 application, maybe it's the database. Check how fast the database is, the network connection. Is it really

15:33 Python? Can you gain anything out of it? And how much can you potentially gain?

15:37 Sometimes you can say, okay, I can gain not more than 20% anyway. Is it worth the effort?

15:42 If you have to pay for the CPUs and you use a lot of CPU, maybe 20% is worth the effort.

15:48 If you just wait overnight and if it takes six hours or seven hours to calculate,

15:53 so scientific calculation can easily take as long. It doesn't really matter because it runs

15:58 overnight. It doesn't matter if it's finished in the morning. It's fine. Those kinds of things.

16:02 Those are very common things. So those things you need to maybe reiterate after a while,

16:06 then you do something if you have a case. That's the first thing.

16:08 Right. And then the next one you said was sort of the whole story around premature optimization.

16:13 Like don't optimize as you go. Write your code in the most understandable way

16:17 so that it works. And then think about performance, right?

16:21 Yeah. First make it right, then make it fast. That's very important.

16:24 Yeah. Yeah. And the make it fast part is all about the measurement. Yeah?

16:28 Yeah. The focus here is on the measurement, how to measure.

16:33 There are several tools out there. I use cProfile most of the time, which comes with Python,

16:39 which is a good tool. Intel is actually doing a big thing now; they have this

16:43 beta-stage kind of profiler, which is probably quite a bit better, because when you measure something,

16:50 you always influence your system. There's no way out of it. So the problem, the question is just how much

16:56 you influence the system. If it's 1%, it seems to be okay. If it's 100%, it doesn't sound very right.

17:02 Yeah? So the question is you don't know how much you influence the system most of the time.

17:07 And then you want to have some kind of profiler that doesn't interfere too much with your things,

17:12 especially if you measure time and things are very short, then the timing might be too coarse to give

17:19 you some reliable results.

17:21 Right. Or the overhead to time, very, very small function calls might be 10 times the cost of the

17:26 function calls themselves.

17:27 Yeah. And that's kind of tricky how to do this, how to measure this actually.

17:32 And you can't avoid this, right? This is sort of the Heisenberg uncertainty principle kind of like

17:37 it behaves in one way until you observe it, then it behaves in a different way.

17:41 So it seems really important to me to just look at the difference, to measure one version,

17:46 change your algorithm and measure it again, rather than try to just, you know,

17:51 poke at it and say, well, I ran this test and now this is slow, right?

18:08 This portion of Talk Python to Me has been brought to you by Rollbar. One of the frustrating things

18:13 about being a developer is dealing with errors, relying on users to report errors, digging through

18:18 log files, trying to debug issues, or a million alerts just flooding your inbox and ruining your day.

18:23 With Rollbar's full stack error monitoring, you'll get the context insights and control that you need

18:28 to find and fix bugs faster. It's easy to install. You can start tracking production errors and deployments

18:34 in eight minutes or even less. Rollbar works with all the major languages and frameworks,

18:39 including the Python one such as Django, Flask, Pyramid, as well as Ruby, JavaScript, Node,

18:44 iOS, and Android. You can integrate Rollbar into your existing workflow, send error alerts to Slack or

18:50 HipChat, or even automatically create issues in Jira, Pivotal Tracker, and a whole bunch more.

18:55 Rollbar has put together a special offer for Talk Python to Me listeners. Visit rollbar.com

18:59 slash talkpython to me, sign up, and get the bootstrap plan free for 90 days. That's 300,000

19:05 errors tracked all for free. But hey, just between you and me, I really hope you don't encounter that

19:10 many errors. Loved by developers at awesome companies like Heroku, Twilio, Kayak, Instacart,

19:15 Zendesk, Twitch, and more. Give Rollbar a try today. Go to rollbar.com slash talkpython to me.

19:21 Yeah, that's very, very important. You compare to something: okay, with this

19:33 version, the difference is very small, but this version is twice as long. Why do we spend so much

19:38 effort to make it just a little bit faster? Maybe it's much easier to understand the

19:43 shorter version than the longer version. And again, if it's 10%, just ask how often this is called and what

19:49 does 10% mean for my whole application?

19:52 Right. Absolutely. You've gone through and you've measured it. Now you've decided, okay,

19:57 it's time to make it faster. And you had some really good advice that you probably shouldn't

20:01 start trying to make it faster before you have good test coverage, right?

20:04 Yeah. That's a big problem, especially with scientists. So even though science is supposed

20:09 to be reproducible, test coverage is not something that a lot of scientists use. You cannot generalize

20:15 this, but very often a lot of scientists, depends how much they program, they might not even use

20:21 version control in any way or something because they're not aware of it. Some people do for sure,

20:26 but some don't. And also, they don't know what testing means; they just run it and look at their

20:31 results and say, that's right. So when I say tests, I mean automated tests or something

20:37 that you can reproduce later on. And that's a big problem in the scientific community. It depends;

20:41 some scientists do this, but some don't do it at all. And then if you try to increase the

20:47 performance, you change your code. And when you change your code, you might change behavior and you

20:52 don't want to change the behavior. You want to have the same result at the end, but just a bit faster.

20:56 Yeah, absolutely. You don't want it to be really fast and wrong. That would be terrible.

21:00 Yeah. But I think testing scientific results can be super tricky, right? You change the algorithm and

21:07 you've got floating point cutoffs and different heuristics and it might still model the thing

21:14 correctly, but in a slightly different way. And so you can't just do like a old value, equal,

21:19 equal, new value sort of analysis, right?

21:21 You need some help. So you need to do something like what NumPy provides: a few helpers. NumPy is a package

21:28 used in scientific computing quite a bit, and it offers some helpers to compare

21:32 arrays with some tolerance. So you can say, I want to compare, and the tolerance can be X. So it doesn't have

21:38 to be exactly the same. There can be small variations, which is totally normal if you do numerical calculations

21:43 and you do a lot of, a lot of them, then there will be some small differences, those kinds of things.

21:49 There are some tools there that make the work easier.
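
For example, the NumPy helpers Mike mentions look roughly like this; the tolerances here are just placeholders you would tune to your own model.

```python
import numpy as np

old = np.array([0.1 + 0.2, 1.0 / 3.0])
new = np.array([0.3, 0.33333333])

print(np.array_equal(old, new))  # exact comparison: False due to rounding

# Compare with a tolerance instead; raises AssertionError only if the
# difference exceeds the given relative/absolute tolerance.
np.testing.assert_allclose(old, new, rtol=1e-6, atol=1e-9)
```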

21:51 Yeah. I think there's a couple of really interesting speed ups that you talked about.

21:56 And one of them, I'm sure it's obvious once you know about it, but depending on your background,

22:02 where you're coming from, maybe you just have no awareness or it's not on your mind in terms

22:05 of thinking about performance, but alternate interpreters or runtimes, right? Like under certain

22:11 circumstances, you might be able to switch your code from say CPython to PyPy and get pretty

22:15 serious performance improvements, right?

22:18 Yeah, that's true. So PyPy can be very nice, especially for numerical computations, for

22:23 loops. And when you write a for loop in pure Python and use only integers or floats in there,

22:29 typically PyPy can find this out and just make it faster. So it compiles it to machine code in the

22:34 background. And very often you get close to C speed, somewhere in the range of C speed, as if you had written the

22:41 same loop in C, which can easily be a factor of 10 faster or even more than pure Python.

22:46 Yeah. So if you need a 20% speed up and you can get your code right on PyPy, maybe you're already done,

22:52 right?

22:52 Yeah. Yeah. Very likely you're done. PyPy, of course, they have this benchmark, and I think it's a very

22:58 big benchmark, and they're a factor of about 6.5 times faster. So that's way more than 20% for the average benchmark.

23:05 I think they don't have a single one where they are slower, actually. The worst case is you get about the

23:11 same speed. And as soon as you have a problem like Mercurial, it's not going to work,

23:17 where the time is in starting up the interpreter and the program. So of course, PyPy, I think it's slow

23:24 on startup. It works for things that are computationally expensive and repeated many times. If you do

23:29 something only twice here and there, PyPy doesn't have any chance to speed it up. It has to be done like 10,000

23:35 times, and then PyPy says, okay, it's doing almost the same thing, I can compile this to machine

23:39 code, and then you can get a speedup which is pretty close to C. Actually, they had this one contrived

23:45 example where PyPy was faster than C, because PyPy managed to inline some function calls where C just

23:52 was statically compiled and called some library functions. And PyPy was a bit faster than C, which

23:59 is, of course, pretty contrived, but still, you can get close to C.

24:02 Yeah, it's really cool to see those examples though, that that's a possibility, right? And if you use it

24:09 in the right circumstances, it's very cool. Let's suppose we've decided that we need to make our code

24:14 faster. And just to keep the conversation a little bit simpler, let's say we'll stick to CPython for

24:20 for whatever reason. So you talked about a couple of tools that we can use to understand our code

24:25 and a variety of them. And the first thing you talked about was this thing called PyStone. What's that?

24:30 Yeah, PyStone is just a benchmark, a simple benchmark. And benchmarks are always

24:33 wrong in one way, because benchmarks are designed to measure some particular kind of performance.

24:38 But maybe they're better than nothing. PyStone comes with Python. So it comes with the Python standard

24:43 installation. And you can run PyStone, and this gives you a feeling, because it's pure

24:48 Python. So it runs with CPython, with PyPy, with Jython, with IronPython, whatever you want to test it

24:54 with. And you can see how fast the Python installation is. And of course, it's also connected to your

24:58 hardware. But if you run it on the same machine, then you get a number of pystones; you can also have this

25:04 kilo-pystones number. So you can, of course, get a big number, but it also gives you a time.

25:09 The bigger the pystones, the better, or the shorter the time, of course. And you can see, if you run

25:14 this PyStone, you can have a very rough comparison of different Python interpreters.

25:20 And of course, if you run on different hardware, the hardware will be included in this thing.

25:24 Right. For example, if you're working on your dev machine, but then you have some big virtual machine

25:29 in the cloud that you're actually going to run your code on, you could say, well, if it's taking this long

25:34 here and I can compare the PyStones from my dev machine versus production, then you get like a

25:39 little bit of a sense for the scaling factor up or down, right?

25:42 Yeah. So it's pretty controversial whether PyStone is a good benchmark. It's one of the

25:47 benchmarks. If you want to, you can take other benchmarks. You could go for the PyPy benchmark

25:51 and see how this works, or other benchmarks around, and run them. But the thing is simple

25:56 because it's there. You don't have to go and make it work, because it's supposed to work;

26:01 it's part of the standard library. And as far as I know, it works with

26:05 Jython, it works with IronPython, it works with PyPy, it works with CPython, from old

26:10 Python 2 to Python 3, everything. So you can get a feeling for how fast your Python is.
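
Running it looks roughly like this on the CPython versions that still ship the benchmark in the test package (it was removed from newer releases, so treat this as a sketch).

```python
# PyStone ships inside CPython's test package on older versions.
from test import pystone

benchtime, stones = pystone.pystones(loops=50000)
print("time: %.2f s, pystones/second: %.0f" % (benchtime, stones))
```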

26:14 And then the next thing you talked about is using cProfile to measure your code.

26:19 There are a bunch of profilers around. One of them is profile, which is written in Python and has a lot of

26:23 overhead, so it's not really recommended. And cProfile is, as the name suggests, written in C.

26:28 So the overhead is much less.

26:30 But it only works on CPython, right? It won't work on something like PyPy, will it?

26:33 It works only on CPython. So I focus this on CPython because it's more about the principles

26:39 than about the tools. In particular, I would say just get people kind of a workflow how to do this.

26:45 And if you have a different setup, you might need to use a different system. Like if you work on

26:49 Jython, you might use some profilers that Java gives you or something. I'm not an expert in this field,

26:55 but the procedure is the same. So you want to measure. And my caution is always,

26:59 those tools can be wrong. Sometimes you get strange results and you always have to question it. But you

27:06 might find a second or third way to measure pretty much the same thing and see, look at the results

27:13 correspond somehow. If you get totally different results with different tools for the same

27:18 kind of measurement, then there must be something wrong with the tool or how you apply the tool.

27:22 Maybe you just use it in the wrong way.

27:25 Right. Or even there's some really crazy overhead like you're calling a function a million times,

27:30 but the measurement of it is so much larger that it's just completely out of whack, right?

27:34 Yeah.

27:35 Yeah.

27:35 Yeah.

27:35 Okay. And so if I had like some method and I wanted to profile this, what's the steps? Like,

27:42 what do I do with cProfile to get this rolling?

27:44 So first I show the long way to do it, just with the standard library. So you just import cProfile. You

27:51 have to make an instance of the class Profile, and then you have several possibilities for

27:57 how to call the method. You can just put the method in as an argument. You can also use a string,

28:02 which it evaluates as Python code, and get the results. And then you get an object that

28:09 represents the results, and you call different kinds of print or other evaluation methods on it,

28:14 and it shows you the results. cProfile only goes by function. So the finest resolution is the function:

28:20 you see how much time it takes to run one function call. It shows you how many times a function is being

28:25 called, how long it took per call, and how long it took in total. You have a lot of ways to sort this,

28:34 by the most calls, the longest times. And you can also sort who called whom. So you can have a

28:38 relationship, this function called this function. So you get some kind of a graph there.
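
In code, the standard-library workflow he is describing looks roughly like this; `work()` is just a stand-in function.

```python
import cProfile
import pstats

def work(n=200000):
    return sum(i * i for i in range(n))

prof = cProfile.Profile()
prof.runcall(work)           # hand it the function (and arguments) directly
# prof.run("work(200000)")   # or give it a string of Python code instead

stats = pstats.Stats(prof)
stats.sort_stats("cumulative").print_stats(10)   # ten entries, sorted by cumulative time
```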

28:43 Yeah. Okay. Yeah. That's really nice. And then you also use Jupyter Notebooks to do some cool stuff

28:48 there. Yeah, to make it easier. So the Jupyter Notebook is actually the killer application of

28:55 Python, I would say. In Jupyter Notebooks, they have what's called %prun, which is kind of a wrapper

29:01 around cProfile. There's no new functionality in this case, but it makes it much more convenient. So instead of

29:07 doing three steps, importing, making an instance, you just say %prun. And %prun takes some

29:12 command line arguments. So you can specify how many times you want to repeat or whatever you want to do.

29:16 And then it runs your function call and gives you all the results in the same way, which is very nice.

29:22 So that's if you use the Jupyter Notebook; it used to be called the IPython Notebook, but nowadays it's the Jupyter Notebook.

29:29 You should use it anyway. It comes from the scientific field and it's used a lot. So if you go to a

29:35 SciPy or EuroSciPy conference, pretty much everybody's using the Notebook. But it can be useful for everybody who's

29:42 programming, just to try something out. And profiling means trying something out, trying something

29:48 again and again. So the Notebook is great for this. And this gives you pretty much everything you can do with

29:54 cProfile, just much shorter and much more convenient. And if things are shorter, then you're

29:59 more inclined to do something. So if it makes it more convenient to do something, then you will do it more

30:05 often and you measure more often, I would say.
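
In a notebook cell, the same profiling run shrinks to a one-liner; the options shown are the common ones (limit, sort key, write to a file), and `work()` is again just a placeholder.

```python
# Inside a Jupyter / IPython notebook cell:
def work(n=200000):
    return sum(i * i for i in range(n))

# %prun -s cumulative work()          # profile one call, sort by cumulative time
# %prun -l 10 -T profile.txt work()   # show 10 lines and also save them to a file
```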

30:07 Yeah, the easier you can make this and the more automated you can make it, the more people will

30:11 do it, right? And with Jupyter Notebooks, you can go back and you see the cells that you ran before

30:15 and you just kind of like rerun them again if you want to retry it. And yeah, it's really nice.

30:18 Yeah. And then of course you get a table output, but then you say, you know, a picture saves more

30:25 than a thousand words. There are some nice tools to actually visualize it. And I use this one called

30:30 SnakeViz. So you just run SnakeViz on the output. With cProfile, you can store all this profiling data

30:37 in a file; I think it's just marshaled and put there. And then you read it back with SnakeViz, for instance,

30:43 and it gives you a nice picture. So you can look at it in the browser, and it's an interactive thing. It's

30:48 running, and this page is interactive. So you can zoom in and look at different things. It gives you everything

30:53 in a nice colorful picture, which is really helpful because you can see in a part of a second what's

30:59 going on, what function takes most time. So you can look at it and then you can click it and you see,

31:04 have all the information. It's nothing different than the table. But if you have a big table with

31:09 a lot of numbers, it takes you a lot of time to make sense out of it. And this diagram you get there,

31:14 it's much more convenient to use.

31:16 Yeah. Certainly when you have tons of data, almost anything like having a really fantastic

31:21 way to slice and dice and visualize it makes all the difference. And this SnakeViz is really cool. So

31:27 the way it works is basically you run the profiler and instead of printing out the stats to the terminal

31:32 or something, you save them to a binary file. And then this SnakeViz can like suck that up and it runs a

31:39 little local web server that then opens your browser and you cruise around in there, right?
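
That workflow, sketched in code (assuming SnakeViz is installed with `pip install snakeviz`):

```python
import cProfile

def work(n=200000):
    return sum(i * i for i in range(n))

# Save the profiling data to a binary stats file instead of printing a table:
cProfile.run("work()", "work.prof")

# Then, from a shell, SnakeViz starts a small local web server and opens the
# interactive visualization in the browser:
#
#   snakeviz work.prof
```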

31:42 Yeah. Yeah. Yeah. It's very nice. And you have the table, and you can also sort the

31:47 tables there, so you have the raw numbers if you want to, but you also have this picture, and

31:51 they have two different ones. One is a circle: the outermost function is the innermost

31:57 circle, and it goes outwards, and you see the circles getting smaller and smaller, like fractions of

32:01 a circle. It's a very nice visualization. The other one is called icicle. So instead of a circle, you have rectangles

32:08 of different areas stacked on top of each other, which also kind of symbolizes that this

32:15 function is calling this function. You see where the time goes.

32:18 Yeah. That's really cool. And I think, you know, if you have a program that you care about performance,

32:23 run this through it really quick and just look at the picture and you'll probably learn something right away.

32:30 The case is if you see one big function that uses all the time, then you know where to go.

32:34 If it's very evenly distributed among a lot of functions, then it's much harder, because where are you

32:39 going to start? You have to look at all these functions in the end. So that's the first

32:43 thing. You see whether you have a case. If all the time is in this one function, then it's very worthwhile to

32:48 look at this function. If you have 20 functions and they all take about 5% of the time, where are you

32:54 going to start? It will be much more work to go through all these functions.

32:58 Right. It's, it's not just low hanging fruit, but it's, it's something, it's something different.

33:03 Right. So that's CPU profiling. And one of the ways your code can be slow is it's just

33:09 computationally expensive, or you could also discover, I guess, external things that are slow.

33:15 It could be, you know, the cumulative time for like a disk IO or a database call or a web service call

33:21 could be really large. And that would tell you that that thing is slow. Right.

33:24 Yes. Yeah. Actually, you can do this with cProfile. cProfile typically measures wall clock time,

33:29 so the time that actually elapses from the start to the end, but you can also provide your own

33:35 timing function and you could do something like CPU time. So CPU time is where the CPU spends doing

33:41 something. So if you use something like time.sleep in your code, say time.sleep for

33:46 two seconds, then the wall clock time will be two seconds, but the CPU time will be nearly zero, because

33:51 the processor is not doing anything; it's actually just waiting for this time.sleep to finish.

33:56 So this is very important to see. If you do input/output, you won't see the waiting in CPU time;

34:01 like with time.sleep, the processor rests, in terms of your process.

34:06 Of course, the processor is still working, but not for your process, not for what you're measuring.

34:10 And that's an important thing. So that's what I try to get across that most of the time you measure

34:14 wall clock time. Unfortunately, there's a difference between Windows and Unix systems.

34:19 Especially in Python 2, if you just use time.time, which gives you a timestamp, it's nice on Unix,

34:26 but on Windows it's way too coarse to do anything useful. Therefore you should use timeit's

34:31 default_timer. That's something I'd like to get across, because very often you see a lot of online

34:37 tutorials that use time.time, which is totally fine on Unix. But if somebody is trying this on

34:42 Windows, you might not get useful results because it's pretty coarse. And also there's time.clock, which is much finer-grained

34:49 on Windows and measures wall clock time there, but measures CPU time on Unix. So it's pretty messy

34:55 here. So you might mix things up and measure the wrong thing depending on your platform, which is

35:01 not good. And therefore I give you a small helper function that gives you the CPU time also on

35:07 Windows. It's completely not obvious that the time is evaluated differently and sometimes even means

35:13 something different on the different platforms. Yeah. Actually, I think it has a reason, because it's

35:16 just a thin wrapper around the C library, and the C library is different on each platform. In Python 3, it's better.

35:21 There's time.perf_counter, I think, which kind of abstracts this away a little bit. So it's getting better.

35:26 That's another reason to use Python 3: you have a better platform abstraction of this measurement.
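
A small sketch of the difference being discussed, using the Python 3 timers; the two-second sleep stands in for waiting on I/O.

```python
import time

start_wall = time.perf_counter()   # portable wall clock timer (Python 3.3+)
start_cpu = time.process_time()    # CPU time used by this process only

time.sleep(2)                      # the process waits; it burns almost no CPU

print("wall clock: %.2f s" % (time.perf_counter() - start_wall))   # about 2.00
print("CPU time:   %.4f s" % (time.process_time() - start_cpu))    # close to 0

# On Python 2, the portable choice for wall clock timing is:
# from timeit import default_timer
```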

35:48 Continuous delivery isn't just a buzzword. It's a shift in productivity that will help your whole team become

35:53 more efficient. With SnapCI's continuous delivery tool, you can test, debug and deploy your code quickly and reliably.

36:00 Get your product in the hands of your users faster and deploy from just about anywhere at any time.

36:06 And did you know that ThoughtWorks literally wrote the book on continuous integration and continuous delivery?

36:11 Connect Snap to your GitHub repo and they'll build and run your first pipeline automagically.

36:17 Thanks SnapCI for sponsoring this episode by trying them for free at snap.ci/talkpython.

36:23 So that lets us measure things like web service calls and databases as well, which is really interesting.

36:36 But sometimes it's more of a memory pressure or a memory issue. Like maybe the reason our code is slow is because actually we're running out of memory and it's like going to page on memory on disk for the virtual swap files and all that kind of stuff. Right? So can we profile memory as well?

36:52 Profiling memory is not the simplest thing, but there are some tools out there. And I used to use Heapy, which is part of the Guppy project. But as far as I know, it's Python 2 only so far, if I'm not mistaken.

37:07 But there's another project called Pympler, which is a merger of three different projects from the past, and this supports Python 3.

37:13 They do pretty much the same thing. And if you have the chance, you can actually use both of them and compare the results, which is always good.


38:48 So you can write your own tooling that is totally tuned to what you want to do.

38:55 So you have these basic tools, and then Python makes it so easy to write some nice special tools for your use cases.

39:03 Yeah, that was really cool.

39:04 And I really liked the way that you're using decorators in your tutorial.

39:08 Like here you have one called like, you know, measure memory or something like that.

39:11 So you want to measure some function and all the functions it calls.

39:14 Of course, you just say @measure_memory in front of it.

39:17 Right. And then you get this little summary, which is really cool.
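
The decorator idea looks roughly like this. Mike builds his helper on top of Pympler; the sketch below uses the standard library's tracemalloc instead, just to show the shape of such a decorator.

```python
import functools
import tracemalloc

def measure_memory(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracemalloc.start()
        try:
            result = func(*args, **kwargs)
            current, peak = tracemalloc.get_traced_memory()
        finally:
            tracemalloc.stop()
        print("%s: current %.1f KiB, peak %.1f KiB"
              % (func.__name__, current / 1024, peak / 1024))
        return result
    return wrapper

@measure_memory
def make_numbers(n=1000000):
    return list(range(n))

make_numbers()
```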

39:21 Yeah. So I guess if you want to answer the question of how many bytes is this, you know, five or 10 byte thing I've created, like this wouldn't be the way to do it.

39:29 This is a way to get like large scale pictures of, oh, I have 10 million integers and I didn't expect that.

39:36 What's the problem, right?

39:37 Yeah.

39:38 But there's also ways to like actual get object size from this library, isn't there?

39:43 Like, yeah.

39:44 Yeah.

39:44 Yeah.

39:45 Difficult to know the full like closure of an object graph size, right?

39:49 Yeah.

39:50 You can use getsizeof: the sys module in the standard library has a getsizeof, which gives you the memory size of one object, but it's only the object itself.

39:59 Like you have a list, you get the size of the list, but you don't get the size of all the objects stored inside the list.

40:05 And the objects that they store and the objects that those objects store, right?

40:08 And so on.

40:09 Yeah.

40:10 Pympler is giving you this, so you can actually measure it.
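
For instance, the difference looks like this (assuming Pympler is installed with `pip install pympler`):

```python
import sys
from pympler import asizeof

data = [list(range(1000)) for _ in range(100)]

print(sys.getsizeof(data))     # size of the outer list object only
print(asizeof.asizeof(data))   # outer list plus everything it references
```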

40:12 And actually I have a very nice example.

40:14 They're very interesting.

40:16 So I start with an empty list and then I keep appending to this list, one integer after the other, up to a million or 10 million actually.

40:24 And then every time I append, I measure the size of this object and see how it changes.

40:29 And use two different ways.

40:30 I use the kind of built-in sys.getsizeof, which gives you just the size of the list.

40:35 And then the other one, from Pympler, gives you the whole memory that the list and everything, the integers in my case, use up.

40:42 And you can see a very nice step function because that's how Python actually works.

40:47 So you have a list and Python always allocates a bit more, roughly 50.

40:51 There's a formula, but roughly 50% or so more memory.

40:54 It fills this memory.

40:55 And then when it hits the limit, it allocates more and more.

40:59 So if you have 10 million appends, I get like 104 or so allocations, which matters because allocations are pretty expensive in terms of time.

41:07 So this makes lists so efficient if you do append.

41:10 If you use some other structure that would need to be rebuilt, or if you use, instead of appending to a list, you insert zero.

41:16 You insert at the beginning of a list.

41:18 That means every time you do this, you have to rebuild the whole list and you have to allocate all the memory again, which is a big anti-pattern actually.

41:25 And it's very nice.

41:26 You can visualize this.

41:27 Why is it like this?

41:28 And you can see a picture.

41:30 And I think it always, if somebody tells you that's the case, it's one thing.

41:34 If you do it yourself and you measure yourself, it's a very big difference in terms of experience.

41:38 You see, okay, I measured myself and I can play around with it and see how things change.
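
A stripped-down version of that experiment, just counting the reallocation steps rather than plotting them:

```python
import sys

lst = []
steps = []
previous = 0
for i in range(1000000):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != previous:          # the size only jumps when CPython over-allocates
        steps.append((i, size))
        previous = size

print(len(steps), "reallocations for one million appends")
print(steps[:5])
```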

41:42 Yeah.

41:43 I thought the graph was really cool because it made it very clear what the algorithm was.

41:47 You can see that it's basically trading a little bit of memory for a lot of performance.

41:51 Yeah.

41:52 In the general case, right?

41:53 In the general case.

41:54 Yeah.

41:55 And obviously you can buy memory.

41:56 So buying a bit more memory just costs you a few bucks, but making a CPU a thousand times faster is pretty hard.

42:02 I would say.

42:03 Yeah, absolutely.

42:04 So the other thing that you showed in your tutorial that I thought was really cool that I don't think we've touched on yet is line by line measurements.

42:12 So you could do line by line CPU profiling as well as allocation management, right?

42:18 Yes.

42:19 You can do this.

42:20 So there's the line_profiler from Robert Kern.

42:23 He's the author, and actually he ported it to Python 3 now, because I asked him several times to do this.

42:28 And now the line profiler works with Python 3. It gives you a lot of overhead,

42:32 so it makes your program much, much slower, but gives you a line by line breakdown, because with cProfile the smallest increment you get is a function.

42:41 And you can do a line by line profiling, which is especially important if you write a function and you have like calls to NumPy.

42:48 So in a function, you might have three, four, five calls to NumPy.

42:50 Then you can see which call takes the most time, for instance.

42:54 Right.

42:55 And this is what the line by line profiling is doing.

42:57 Yeah.

42:58 So maybe you use cProfile and those sorts of techniques to go and figure out, all right, well, I know it's really this function that is a problem, but it's 10 or, you know, hopefully not a thousand.

43:07 It's 10 lines long.

43:08 What's actually slow about that, right?

43:10 Then you can turn on this line profiler business.

43:12 Yeah.

43:13 Exactly.

43:14 Because you only want to have one or two functions.

43:16 You use a decorator, @profile; you know, you put a decorator called profile there, and then you run it through the line profiler's runner with some options.

43:24 It uses this line profiler, and then it makes it very, very slow.

43:27 So maybe a factor of a hundred slower or something like this.

43:30 It is very slow because it just has to go line by line and check everything it's doing, and it gives you a nice breakdown.

43:38 You see, okay, allocating this list or this NumPy array takes so much time, and then looping over it takes so much time, and doing this takes so much time.

43:46 So you can see pretty much where the time goes, which can be really useful educationally.

43:51 So you can get a feeling what these functions are doing, but also useful to make your program faster.
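
The workflow looks roughly like this (assuming `pip install line_profiler`; the file and function names here are made up):

```python
# slow_module.py
@profile                      # this name is injected by kernprof at run time
def simulate(n=1000000):
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total

if __name__ == "__main__":
    simulate()

# Run it from a shell; -l enables line-by-line mode, -v prints the report:
#   kernprof -l -v slow_module.py
```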

43:58 You also talked about some anti patterns, things that you should try to avoid or, you know.

44:03 Yeah.

44:04 Can you go through a couple of those?

44:05 Yeah.

44:06 One I just mentioned, maybe the most important one, is about the list.

44:10 So if you have a list, then you want to append at the end.

44:13 And maybe when you're done, you just reverse the list

44:16 if you want it the other way around, instead of inserting at position zero, which is what you actually want.

44:21 Inserting at the front would be a big anti-pattern.

44:23 It can be many orders of magnitude of difference in terms of performance

44:27 if you do it that way, and this is one of the biggest anti-patterns.
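
A quick timing sketch of that anti-pattern; the sizes and repeat counts are arbitrary.

```python
from timeit import timeit

setup = "items = list(range(10000))"

# Building the list by inserting at the front: every insert shifts all elements.
front = timeit("lst = []\nfor x in items:\n    lst.insert(0, x)",
               setup=setup, number=100)

# Appending and reversing once at the end stays cheap.
back = timeit("lst = []\nfor x in items:\n    lst.append(x)\nlst.reverse()",
              setup=setup, number=100)

print("insert(0, x): %.3f s   append + reverse: %.3f s" % (front, back))
```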

44:30 Another anti-pattern is string concatenation, which is an old one that's now kind of optimized away in CPython.

44:37 So instead of saying string plus equals new string, string plus equals new string in a loop, you just use a list and append to the list.

44:46 Take advantage of this list behavior.

44:49 And then at the end, you just join the list.

44:51 Yeah.

44:52 Because lists do this pre-allocation thing where strings don't.

44:54 Yeah.

44:55 So this is actually not really true anymore.

44:57 It used to be true at the beginning of Python.

44:58 So I started with Python 1.5.2.

45:00 This was my first Python version.

45:02 And it was pretty true then, I think.

45:03 And then pretty soon they changed it, they optimized it away.

45:06 And it works with CPython.

45:08 But if you use PyPy, which is supposed to be much faster, then this anti pattern just kicks you.

45:13 Then if you do this for a string of like say 100,000 characters, it just kills the performance.

45:20 And you have to wait an hour or so instead of a second.

45:23 Wow.

45:24 Yeah.

45:25 I don't remember exactly, I just read it somewhere.

45:29 There's a reason that they didn't put in this optimization.

45:32 So you never know if CPython can always optimize it.

45:35 Because for simple cases where you use string literals and stuff like this, it should be easy.

45:39 If the code is maybe a bit more complex, who knows if it still gets the optimization.

45:43 So it's better not to rely on this.

45:46 As long as the string is short, it's fine.

45:49 But it gets a bit longer than...

45:50 Yeah.

45:51 Yeah, it's a problem.

45:52 Maybe you're streaming stuff out of a CSV file and trying to build up a thing.

45:55 And maybe it can no longer optimize that.

45:57 Who knows, right?

45:58 Yeah.

45:59 Who knows?
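
A small sketch of the two styles he is comparing; on CPython the gap may be modest, on other interpreters the += version can be dramatically slower.

```python
from timeit import timeit

parts = ["x"] * 100000

def concat():
    s = ""
    for p in parts:
        s += p               # the anti-pattern: repeated string concatenation
    return s

def joined():
    chunks = []
    for p in parts:
        chunks.append(p)
    return "".join(chunks)   # append to a list, join once at the end

print("+=  :", timeit(concat, number=20))
print("join:", timeit(joined, number=20))
```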

46:00 Another interesting example you had was looking, sort of optimizing variable and function

46:07 lookups.

46:08 Like caching global things, like sqrt versus math.sqrt, for example, in a function.

46:14 Those kind of things.

46:15 So it depends what you're doing.

46:16 It can give you a little bit of a speedup.

46:19 So because that's just the basic thing how Python looks up names.

46:23 So first it looks in the local namespace in your function, then it goes to the global namespace, which is your module, and then it goes to the built-in namespace.

46:30 And every lookup is kind of a dictionary lookup.

46:33 And instead of going to the global or even the built-in namespace, like if you use sum, the built-in sum, then every time you use sum,

46:41 Python has to go and hunt for sum until it finds it.

46:44 And so if you say, okay, my_sum = sum in a function, for instance, then you make it a local variable, and you avoid like two lookups every time you use sum.

46:52 This can be useful.

46:53 Yeah, if that's in a tight loop, then maybe that becomes a problem, right?

46:56 Or a loop within a loop.

46:57 So if it's a tight loop, you don't do much else, that might be something.

47:00 If you do a lot of other heavy computations, or the sum itself takes a long time to sum something big, then you still get the speed up from the lookup avoidance.

47:10 But percentage-wise, it might be just half a percent faster.

47:14 It might not make a lot of difference.

47:16 If you don't do much else, you can get something like several tens of percent faster behavior.

47:21 So it depends very much on your use case, but at least you can try, and this would be one thing.

47:26 So if you call a function or some built-in or global variable in the inner loop again and again and again, making it local might speed up things a bit.

47:34 Sure.
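
A tiny, illustrative sketch of that trick (the function and data are invented): binding the module attribute to a local name saves the repeated lookups inside the loop.

```python
import math

def distances_global(points):
    # math.sqrt is re-resolved (global + attribute lookup) on every iteration
    return [math.sqrt(x * x + y * y) for x, y in points]

def distances_local(points, sqrt=math.sqrt):
    # sqrt is now a local name, so the loop skips the repeated lookups
    return [sqrt(x * x + y * y) for x, y in points]

points = [(i, i + 1.0) for i in range(100_000)]
assert distances_global(points) == distances_local(points)
```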

47:35 You had a cool example where one of the points you were making was, look, sometimes it's the algorithm, and you need to, you know, not just optimize a function a little bit, but rethink how you're doing something.

47:47 And you had created this contrived example that was computational, but you were basically computing pi using the Monte Carlo method.

47:54 And you said the naive way to do this would be to sort of generate a couple of random numbers, do some math, save some values, and do this in a loop.

48:01 And you said, well, look, if we do this with a different algorithm like using NumPy, we get dramatically different performance, right?

48:08 Yeah.

48:09 So, the Monte Carlo algorithm is maybe the worst algorithm to calculate pi.

48:13 There are much better algorithms out there that give you pi with 100 digits within 100 iterations or so.

48:18 With Monte Carlo, getting 100 digits would take the age of the universe to calculate, I guess.

48:23 So it's computationally expensive, but it makes a good example because it's very slow, you can use a lot of techniques to improve it, and it's also embarrassingly parallel.

48:34 So I use it later on, actually, to do some parallel calculations with multiprocessing and other means.

48:40 So it's a very good example because I use the same example with many different kind of approaches, which is useful.

48:47 So it is simple enough to understand.

48:49 So I use this and then you just run it in normal Python.

48:53 And as I said, one of the things where Python is slow is numerical computations where you use for x in range, for y in range loops, something like this.

49:02 This is pretty slow.

49:04 And if you just use NumPy, NumPy does this vectorization, so instead of writing a loop, you just call the function in NumPy.

49:13 In fact, when you program NumPy, you should pretty much never write a loop.

49:17 You should always vectorize things if you want to make it fast.

49:20 And then NumPy just runs it as a function which is running in C, so you get near-C speed for a lot of things.

49:27 So it makes it much faster.
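
As a rough sketch (not Mike's tutorial code), here is the same Monte Carlo estimate written once as a plain Python loop and once vectorized with NumPy.

```python
import random
import numpy as np

def pi_loop(n):
    inside = 0
    for _ in range(n):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:        # point lands inside the quarter circle
            inside += 1
    return 4.0 * inside / n

def pi_numpy(n):
    x = np.random.random(n)
    y = np.random.random(n)
    inside = np.count_nonzero(x * x + y * y <= 1.0)   # one vectorized pass, no Python loop
    return 4.0 * inside / n

print(pi_loop(1_000_000), pi_numpy(1_000_000))
```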

49:28 Yeah, yeah.

49:29 That was really cool.

49:30 I like that example.

49:31 What if people missed this, right?

49:34 This was actually done in June.

49:37 So if they weren't there, they've already missed it.

49:39 There's a couple of opportunities to still check out your presentation though, right?

49:43 Yeah, for the presentation you can go to YouTube.

49:45 If you go to PyVideo, it's on YouTube.

49:48 If you search for it, you will find it.

49:50 So if you go to the PyCon US site, then it should be there.

49:54 So just look for Faster Python Programs.

49:56 Yeah, and I'll link to it in the show notes as well.

49:57 Yeah, you can link to it.

49:59 And also I will give this tutorial again at EuroPython in just a few weeks in Bilbao,

50:05 in Spain.

50:06 So then there will be this tutorial again.

50:09 So I will talk about it.

50:10 Yeah, EuroPython is going to be very exciting.

50:13 That'll be a good conference.

50:15 Yeah, it's a nice conference.

50:16 I gave it last year there.

50:17 It seems like there's always demand for it and people like it.

50:22 I try to always make it hands-on so you can follow everything.

50:27 So there's nothing

50:28 that's too sophisticated, which is on purpose.

50:31 Because if you show code that's too sophisticated, people won't understand it.

50:34 So everything is pretty simple and you can use it right away.

50:37 And hopefully when you understand, you can apply it to your own use case.

50:40 That's the whole purpose of the thing.

50:42 Yeah, that's great.

50:43 So we just have a few minutes left in the show.

50:45 Before we call it quits, let's talk a little bit about what else you have going on in the Python space.

50:50 I've been in the Python space for quite a while, since 1999.

50:54 Actually, I've spent most of my professional life in the Python space.

50:58 I teach Python.

50:59 So I'm the founder of Python Academy, which is now more than 10 years old, actually.

51:03 And that's where I teach Python.

51:05 I gave my first Python course in 2004.

51:08 It's quite a while ago.

51:09 So I taught university-level courses in different fields, but scientific fields.

51:15 So I had some experience in teaching and gave a lot of talks at conferences.

51:20 And it was just a small step to Python courses.

51:23 So just how it developed.

51:24 I really like teaching.

51:26 It seems like it fits.

51:27 So people understand most of the time what I'm saying.

51:30 And so I started teaching Python.

51:33 And meanwhile, Python Academy grew.

51:34 Now we have about 11 teachers.

51:37 So not everybody's teaching all the time, but I have a lot of different teachers doing web development,

51:42 testing, Cython, database programming, and we also have teachers with different native languages.

51:48 So I myself can teach in German, which is my native language, and in English.

51:52 But I also have people teaching in Italian and Polish, and maybe in other languages later on.

51:57 So some people might be taught in their own native language, which might help to understand.

52:02 Yeah, absolutely.

52:03 And congratulations on the success there.

52:05 That's awesome.

52:06 I've been deeply involved in training and training companies for the last 10 years myself.

52:11 And I know what it takes to put those together and keep them running.

52:14 Yeah.

52:15 It's a lot of effort.

52:16 So at Python Academy, we focus on everything Python.

52:19 So very wide range in terms of Python, but deep in there.

52:23 So I teach beginners, people that barely programmed before, or other programmers,

52:28 so professional programmers that wanted to switch to Python, but also advanced Python.

52:32 I have an advanced Python course; that's the Python track.

52:35 I also teach scientific tools, which is a very wide field.

52:38 So I teach how to use IPython and the Jupyter Notebook, and NumPy and some SciPy tools, and Matplotlib and Pandas,

52:47 those kinds of things, which are in very high demand, because scientists need these tools all day long.

52:53 Yeah, absolutely.

52:54 What's your favorite one to teach?

52:55 It depends.

52:56 So I like to have variations.

52:57 I like the scientific thing.

53:00 And in a course, I typically like to include examples from the people and actually do a little bit of live programming.

53:08 So they have a problem and say, okay, I need this; scientists often have to read in this file A and convert it into file B, which is a very common thing.

53:17 And Python really shines in this.

53:18 So in the course you can actually develop a small routine ad hoc and read in the file.

53:23 There's different tools.

53:24 So you can do it.

53:25 It's just standard library.

53:26 You can use Pandas, which is great for CSV files and something like this.

53:30 And then you say, okay, now I have this file and I've assigned the data and then I write it back somewhere.

53:35 And this is very often very useful because then I can use a lot of things I taught in the course.

53:40 So how to do this, how to write a function, how to write a docstring in a function, and how to refactor a function because it's getting too big.

53:47 Everything, but on a small scale, so understandable for people.

53:50 And I see it's really useful if you do this.

53:52 So you can reuse this function again and again to do something.
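
As a hedged illustration of that file-A-to-file-B exercise, here is the kind of small, documented helper such a course example might end up with; the file names, column name, and scaling step are invented placeholders.

```python
import pandas as pd

def convert_measurements(in_path, out_path, scale=1.0):
    """Read a CSV of raw measurements, scale the 'value' column,
    and write the result to a new CSV.

    The column name and the scaling step stand in for whatever
    conversion the real data actually needs.
    """
    data = pd.read_csv(in_path)
    data["value"] = data["value"] * scale
    data.to_csv(out_path, index=False)
    return data

# converted = convert_measurements("file_a.csv", "file_b.csv", scale=2.5)
```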

53:55 That's mainly for scientists; very often scientists use programming as a tool, but they're not so deep into programming.

54:02 As compared to professional programmers, for whom there's nothing new.

54:05 They pretty much understand when I say something, they understand the concept and that's pretty easy.

54:10 You can go on.

54:11 For scientists, you have to give an example, because just explaining it with these very generic examples might not really be enough to understand what it can be useful for.

54:23 Right, because they spend most of their time working on physics or biology or something real, right?

54:28 Yeah, something real.

54:29 So it depends.

54:30 Some of them tend to become programmers.

54:32 Very often scientists turn programmers.

54:34 But many of them just use Python from time to time as a tool, as an important tool.

54:38 And as soon as they get their result, they just forget about it and go on.

54:42 Then I say, wait a minute: now the script works, but if you spend another half an hour, you can make it reusable.

54:48 And next time you can save a lot of time, because you put this in a function and you put in a docstring.

54:53 Maybe you put a test in there.

54:54 So make it work.

54:55 And it makes it much more useful.

54:57 It might even be useful for your colleague.

54:58 Now, if you write this small function or several functions that can read this file format, then your colleague can use it.

55:04 If you just have a script that's doing everything in one piece, then your colleague is probably not really able to use it without a lot of effort.
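
A tiny, hypothetical sketch of what "put a test in there" can look like: a docstring with a doctest doubles as documentation for the colleague who reuses the function.

```python
def normalize_header(name):
    """Normalize a CSV column header for comparison.

    >>> normalize_header("  Sample ID ")
    'sample_id'
    >>> normalize_header("Temperature")
    'temperature'
    """
    return name.strip().lower().replace(" ", "_")

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # runs the examples embedded in the docstring
```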

55:11 Right.

55:12 You end up starting from scratch every single time.

55:14 Yeah.

55:15 Yeah.

55:16 So it's kind of, it depends.

55:17 Yeah.

55:18 So that's what I like.

55:19 I like the variety.

55:20 So I like introductory courses, even though I've taught them many times, but I also like the advanced courses.

55:25 And lately I had a course where people really used metaclasses.

55:29 And I had to really dig deep into metaclasses.

55:32 There were some very hard metaclass questions.

55:34 I had to kind of think really hard, because with metaclasses as a topic, most of the time, if you don't know what they're doing, you don't need them.

55:42 And then you could stop the course.

55:44 But sometimes they can be useful for certain types of things.

55:47 And these people use them to a high extent, which is interesting.

55:51 Pretty challenging course though.

55:52 Yeah.

55:53 Cool.

55:54 Yeah.

55:55 It's great to have that spectrum.

55:56 Do you also do consulting?

55:57 So actually I do some scientific consulting.

55:59 I'm just helping to develop a lake model.

56:01 So we do some consulting in scientific area.

56:04 Of course not only me, so we have a team of a few people.

56:07 So we also do consulting and program something.

56:10 And actually pretty frequently, we also connect training to the consulting.

56:13 So we develop a solution, but instead of handing over a black-box solution, we do a workshop with it.

56:18 So we explain how things work and how people can extend what we did.

56:23 So you get a really solid solution, which people like scientists would probably not be able to write themselves, but they can build on it.

56:30 So they have these basics.

56:31 This is a good foundation.

56:32 They can build on it, use it, extend it, and make it more useful for their needs.

56:38 Yeah, that's great.

56:39 That's really helpful.

56:40 And then finally, you've done a lot with conferences.

56:42 You were involved with EuroSciPy, EuroPython, and so on, right?

56:47 So it seems like I'm pretty much into conferences.

56:51 So actually, I was pretty much the guy behind pushing EuroSciPy.

56:56 So I started here in Leipzig where I'm located.

56:59 And in 2008 and 2009, the first two EuroSciPys happened here, and I was the chair.

57:04 And now EuroSciPy is a great success.

57:06 Then we had two conferences in Paris, two conferences in Brussels, two in Cambridge.

57:10 And now we're back in Germany, in Erlangen, and it will be in August, so just a bit more than a month from now.

57:16 That will be the number nine EuroSciPy.

57:18 So it seems like a success.

57:19 People come to EuroSciPy and it's continuing.

57:22 So I've been involved in EuroSciPy.

57:24 And also I'm involved in PyConDE.

57:27 So 2011, 2012, PyConDE.

57:29 The German PyCon was here in Leipzig and I was a chair.

57:33 And also, of course, PyConDE is run by the Python Software Verband, which is a kind of a German Python Software Association.

57:41 And I happened to be the chair there also.

57:43 So I'm a bit involved in community work.

57:45 And then we also did EuroPython 2014 in Berlin and I was also the chair, which was a big conference, more than 1,200 people.

57:53 Yeah, that must have been a lot of work to put that together.

57:55 Yeah, so we had a very great team.

57:57 So, of course, you have the chair.

57:58 You just kind of oversee everything and try to delegate as much as possible, which is necessary because it's just impossible to do the major part of the work yourself.

58:07 You have to have a great team.

58:09 And we had a lot of people that are very enthusiastic about it and put a lot of effort in there.

58:14 And it was a great conference.

58:16 I had a lot of good feedback from people about the conference.

58:19 Yeah, cool.

58:20 Are you involved in any upcoming ones?

58:22 Yes, I'm involved with EuroSciPy.

58:24 So the Python Software Verband is also taking over the legal part of the EuroSciPy.

58:28 So we do the – somebody has to move the money and sign the contracts and stuff like this.

58:33 So I'm involved there also, and the conference web software that we use for PyCon.de and EuroPython we now use for EuroSciPy too.

58:41 We are involved in – I'm involved there.

58:44 But I'm also a little bit involved.

58:46 We have what's called the Python Unconference in September.

58:49 And the Python Software Verband is a sponsor, and we help and support a little bit.

58:53 So we are not the main organizer, but we help.

58:55 There will be likely – so we are very close actually – there will be PyCon.de in October in Munich.

59:02 And we are involved there, and next year there will be PyCon.de and EuroSciPy again.

59:08 So – and I'm always involved to some degree in these conferences.

59:12 I'm not doing the main work for sure, but I'm on the mailing list and I'm somehow involved in organizing.

59:18 Yeah, that's cool.

59:19 And it sounds like there's a lot of conferences coming up in Germany around Python.

59:22 That's cool.

59:23 Yeah, there's several of them.

59:24 They have this barcamp in Cologne, which is interesting.

59:27 I myself, I'm just a participant.

59:29 Other people do this, which is pretty relaxing.

59:31 So it's a kind of a – also like an unconference.

59:33 So we have this unconference in Hamburg.

59:35 So there's several events we have.

59:38 So now actually we have not this one big one, but multiple smaller ones, which are more regional, which can also be nice.

59:45 So people don't have to say, if I miss one, then that's it.

59:48 I have still two or three more alternatives.

59:50 Right.

59:51 A little bit of each is really great.

59:53 Final two questions, Mike, before I let you go.

59:56 When you write some Python code, what's your standard editor?

59:59 What do you open up to code?

01:00:01 Typically, I go with Sublime Text nowadays.

01:00:04 Sometimes I use Wing IDE if I have some projects, which of course has nice debugging and refactoring features.

01:00:11 For courses, I use Spyder, which is kind of a scientifically inclined thing.

01:00:17 So I use it for my courses.

01:00:18 That's the one that comes with the Anaconda distribution, right?

01:00:20 Yes.

01:00:21 So Anaconda has it out of the box.

01:00:23 If you just install Anaconda, you have it.

01:00:26 So Spyder, it's not perfect.

01:00:28 It sometimes crashes, but it's okay.

01:00:30 And then, I think, you don't have to tell people to install a different editor just to have one.

01:00:35 And they have a debugger, and

01:00:37 you can look at the objects, the Python objects, for debugging and stuff like this, and an interactive console, if you like.

01:00:43 But most of the time, of course, I actually spend in a notebook anyway.

01:00:46 So, right.

01:00:47 Yeah.

01:00:48 Jupyter notebooks are also really nice.

01:00:49 That's cool.

01:00:50 All right.

01:00:51 And of the 80,000-plus PyPI packages, which one would you recommend that people maybe don't know about?

01:00:56 The one that we talked about already, the Jupyter notebook, is the killer app.

01:00:59 So if you don't know the Jupyter notebook, you're missing out on something in the Python community.

01:01:04 That's something you should go check out.

01:01:05 So that's very interesting to work with Jupyter notebooks.

01:01:08 Jupyter notebooks.

01:01:09 I always have a notebook open to just try something out.

01:01:11 If you have an idea, you can use it.

01:01:13 It's a very nice mixture between an editor, a full editor and the interactive prompt.

01:01:17 And that makes it.

01:01:19 Maybe that's not the real secret.

01:01:21 The one I'm looking at right now is xonsh.

01:01:23 You know xonsh?

01:01:24 It's this interactive shell.

01:01:25 Yeah.

01:01:26 With an X.

01:01:27 Yeah.

01:01:28 There's an X, but it's pronounced conch.

01:01:30 And I know Anthony Scopatz pretty well.

01:01:33 We were hanging out at PyCon lately, and he's a funny person and very knowledgeable on top of that.

01:01:41 And he wrote a very interesting tool which gives you a kind of a shell that incorporates Python.

01:01:45 So in a nutshell.

01:01:46 It's very interesting.

01:01:47 So you get a shell that incorporates Python.

01:01:49 And then I think it works on Windows.

01:01:51 So you get a much more powerful shell on Windows because the shell on Windows is typically not that great.

01:01:56 So if you need to work on Windows sometimes, as I do for my courses, then this would be a good alternative.
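
For the curious, here is a rough sketch of what a xonsh session can look like; this is an illustration, not something from the show, and it assumes xonsh is installed (for example via pip install xonsh).

```xonsh
# A xonsh prompt accepts both shell commands and Python:
ls -l                        # a normal shell command
2 + 2                        # a normal Python expression
files = $(ls).splitlines()   # $() captures a command's output as a Python string
print(len(files), "entries here")
echo @(files[0])             # @() interpolates a Python value into a command
```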

01:02:03 Right.

01:02:04 Yeah.

01:02:05 That's cool.

01:02:06 Okay.

01:02:07 Great recommendation.

01:02:08 Yeah.

01:02:09 And you know, the story on Windows is getting better, right?

01:02:10 Like Steve Dower redid the installer.

01:02:11 So now the installer is not a complete challenge to like get Python on there.

01:02:15 The Windows 10 shell is a lot nicer.

01:02:17 They're bringing the Ubuntu binaries to Windows 10 starting in a month or two.

01:02:23 So it's getting better.

01:02:24 But yeah, it still is sometimes painful to work on Windows.

01:02:27 Yeah.

01:02:28 Yeah.

01:02:29 But compiling C extensions is still not the smoothest experience.

01:02:33 No.

01:02:34 No.

01:02:35 Let me just say, vcvarsall.bat was not found.

01:02:38 Right.

01:02:39 Yeah.

01:02:40 I've had this message about a hundred times.

01:02:41 Gosh.

01:02:42 Yes.

01:02:43 All right.

01:02:44 Well, Mike, it's been really fun to have you on the show.

01:02:45 Thanks for sharing your optimization experience.

01:02:48 Thanks for having me.

01:02:49 Thank you very much.

01:02:50 Yeah.

01:02:51 Talk to you later.

01:02:52 This has been another episode of Talk Python to Me.

01:02:55 Today's guest was Mike Mueller.

01:02:56 And this episode has been sponsored by Rollbar and SnapCI.

01:02:59 Thank you both for supporting the show.

01:03:02 Rollbar takes the pain out of errors.

01:03:04 They give you the context and insight you need to quickly locate and fix errors that might

01:03:09 have gone unnoticed until your users complain, of course.

01:03:12 As Talk Python to Me listeners, track a ridiculous number of errors for free at rollbar.com/talkpythontome.

01:03:19 SnapCI is modern continuous integration and delivery.

01:03:23 Build, test, and deploy your code directly from GitHub.

01:03:26 All in your browser with debugging, Docker, and parallelism included.

01:03:29 Try them for free at snap.ci/talkpython.

01:03:32 Are you or a colleague trying to learn Python?

01:03:35 Have you tried books and videos that just left you bored by covering topics point by point?

01:03:39 Well, check out my online course Python Jumpstart by building 10 apps at talkpython.fm/course

01:03:45 to experience a more engaging way to learn Python.

01:03:48 And if you're looking for something a little more advanced, try my Write Pythonic Code course at talkpython.fm/pythonic.

01:03:56 You can find the links from this episode at talkpython.fm/episodes/show/66.

01:04:03 Be sure to subscribe to the show.

01:04:04 Open your favorite podcatcher and search for Python.

01:04:06 We should be right at the top.

01:04:08 You can also find the iTunes feed at /itunes, Google Play feed at /play, and direct

01:04:14 RSS feed at /rss on talkpython.fm.

01:04:17 Our theme music is Developers, Developers, Developers by Corey Smith, who goes by Smix.

01:04:22 You can hear the entire song at talkpython.fm/music.

01:04:26 This is your host, Michael Kennedy.

01:04:28 Thanks so much for listening.

01:04:29 I really appreciate it.

01:04:30 Smix, let's get out of here.

01:04:32 *outro music*
