20 Python Libraries You Aren't Using (But Should)

Many of you write to me and tell me how you appreciate the way my guests and I highlight a particular Python package at the end of each episode. Well, if you enjoy that little segment, you're going to love this episode.

This week you'll meet Caleb Hattingh, who wrote a great book called 20 Python Libraries You Aren't Using (But Should). He and I spend an hour digging into all the very powerful and interesting packages that you probably haven't heard of but will be super excited to use after you learn about them.

Links from the show:

Caleb on twitter: @caleb_hattingh

Book: 20 Python Libraries You Aren't Using (But Should):

oreilly.com/programming/free/20-python-libraries-you-arent-using-but-should.csp

Learning Cython course: shop.oreilly.com/product/0636920046813.do

Python-specific Slack group online (~ 2.5k members): pythondevelopers.herokuapp.com

Episode Deep Dive

Guest Introduction and Background

Caleb Hattingh joined the show to discuss a free eBook he wrote for O’Reilly called 20 Python Libraries You Aren’t Using (But Should). Caleb has a deep background in chemical engineering, scientific computing, and Python software development. He’s worked with a range of programming languages, but Python has become central to his work, from building simulations with Cython for serious performance gains to writing data-processing code in organizations of all sizes. His perspective blends heavy computational tasks with practical everyday coding, which led him to compile underused but highly valuable Python libraries in his eBook.

What to Know If You're New to Python

If you’re relatively new to Python and want to get the most out of this discussion, here are a few essentials to keep in mind:

Python’s standard library already covers a wide array of use cases, so explore it before pulling in external packages.
Many of the libraries in this episode offer more streamlined or specialized functions than the built-in modules.
Don’t be afraid to experiment: libraries like Arrow, Colorama, and Cython are quite approachable and can drastically improve day-to-day development.
A strong foundation in basic concepts (like functions, modules, and data structures) will help you integrate these libraries more easily.

Key Points and Takeaways

Highlighting Under-the-Radar Libraries
Caleb’s main goal was to introduce powerful libraries that are both stable and not as widely known. He balanced showcasing third-party packages and shining light on hidden gems within Python’s standard library.
- Tools / Links:
  - Caleb’s eBook from O’Reilly
Taking Advantage of the Standard Library
Even though Python’s ecosystem is massive, the standard library can solve many tasks, from scheduling repetitive jobs to building concurrent applications. Modules like collections, logging, sched, and concurrent.futures often go overlooked by beginners.
- Tools / Links:
  - collections (docs.python.org)
  - concurrent.futures (docs.python.org)
  - logging (docs.python.org)
  - sched (docs.python.org)
CLI Tools: Colorama and Begins
Caleb stressed that command-line interfaces don’t have to be dull. Colorama makes it trivial to colorize terminal output, improving readability. Meanwhile, Begins provides decorators for rapidly building intuitive CLI commands and subcommands.
- Tools / Links:
  - Colorama GitHub
  - Begins GitHub
Live Visualization and Desktop Integration
For real-time data plotting, PyQtGraph stands out for speed and user interactivity, especially if you’re already using PyQt. PyWebView takes an intriguing approach by letting you write a desktop app in Python but present the UI via the system’s native web engine.
- Tools / Links:
  - PyQtGraph
  - PyWebView GitHub
Monitoring and File-Watching
Two system-level libraries that often solve DevOps-type headaches are psutil (process and resource monitoring) and watchdog (file system event handling). Both provide cross-platform abstractions so you don’t have to handle OS-specific quirks.
- Tools / Links:
  - psutil GitHub
  - watchdog GitHub
Enhanced Python REPL with ptpython
If you spend a lot of time experimenting interactively, ptpython is a must. It features multiline editing, syntax highlighting, and optional Vi/Emacs keybindings, making your REPL workflow smooth and productive.
- Tools / Links:
  - ptpython GitHub
Date and Time Handling with Arrow and parsedatetime
Arrow simplifies the pain of mixing naive vs. aware datetimes, automatically ensuring your objects carry timezone info. For flexible input formats, parsedatetime can interpret phrases like “two weeks and three days in the future” or “10 minutes from now” into actual datetime objects.
- Tools / Links:
  - Arrow
  - parsedatetime GitHub
General Utilities in Boltons
Boltons is a “batteries added” library offering helpful utilities for caching, iteration, debugging, and more. It integrates seamlessly alongside the standard library, filling in gaps with straightforward, well-tested functions.
- Tools / Links:
  - Boltons GitHub
Cython for Speed and Parallelism
One of the most discussed power tools was Cython, which compiles Python-like syntax into C extensions. Caleb emphasized you can drop the GIL for CPU-bound tasks, enabling true parallelism across multiple cores. You can see speed gains of 100x or more for certain math-intensive workloads.
- Tools / Links:
  - Cython
Awesome Python Collection
For those looking to discover even more libraries, the “Awesome Python” list on GitHub offers curated subsections (like OCR, e-commerce, etc.). It’s a continuously updated resource if you’re on the hunt for specialized libraries.
- Tools / Links:
  - Awesome Python GitHub
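
As a small illustration of the standard library point above, here is a minimal sketch of concurrent.futures running a few tasks on a thread pool. The function names and sample inputs are made up for this example and do not come from the book or the episode; `fetch_length` simply stands in for any I/O-bound call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_length(text):
    # Hypothetical stand-in for an I/O-bound task such as a network request.
    return len(text)


def lengths_in_parallel(items, max_workers=4):
    # executor.map yields results in the same order as the inputs,
    # even though the calls may finish in a different order.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch_length, items))


results = lengths_in_parallel(["collections", "logging", "sched"])
print(results)
```

Threads suit I/O-bound work here; for CPU-bound work under the GIL, ProcessPoolExecutor from the same module offers the same interface.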

Interesting Quotes and Stories

"The presence of a global interpreter lock, while it's interesting, is not really a bottleneck anymore in CPython because of Cython." -- Caleb

"For simpler kinds of applications, Hug makes it extremely easy to get a REST interface up." -- Caleb

"Sometimes, it’s easier to bring in a specialized library like PyWebView than to write a massive front-end in JavaScript." -- Caleb

Key Definitions and Terms

GIL (Global Interpreter Lock): A mechanism in CPython that allows only one thread to execute Python bytecode at a time.
Context Manager: A Python structure (with ... as ...) that handles setup and teardown logic automatically around a code block.
Time Zone Awareness: Whether a datetime object carries explicit time zone info or if it’s “naive.” Libraries like Arrow unify this.
Cython: A language and toolchain that compiles Python-like code to C for speedups and the possibility of releasing the GIL.
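
To make the naive versus aware distinction concrete, here is a minimal sketch using only the standard library datetime module (not Arrow's own API). The helper name `is_aware` is an assumption chosen for this example; the check it performs follows the rule given in the datetime documentation.

```python
from datetime import datetime, timezone

naive = datetime(2016, 9, 20, 12, 0)                       # no tzinfo attached
aware = datetime(2016, 9, 20, 12, 0, tzinfo=timezone.utc)  # carries UTC offset


def is_aware(dt):
    # A datetime is "aware" when it has a tzinfo that returns a real offset.
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None


# Mixing the two kinds in arithmetic raises TypeError.
try:
    aware - naive
    mixing_failed = False
except TypeError:
    mixing_failed = True

print(is_aware(naive), is_aware(aware), mixing_failed)
```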

Learning Resources

If you’re looking to sharpen your Python foundation or coding style while exploring these libraries, here are two excellent Talk Python Training courses:

Python for Absolute Beginners: A great starting place if you need a refresher on core Python programming concepts.
Write Pythonic Code Like a Seasoned Developer: Learn best practices and get deeper insights into idiomatic Python that align well with integrating these lesser-known libraries.

Overall Takeaway

This episode is a testament to Python’s versatility: Often there’s an off-the-shelf library to streamline your workflow, be it for concurrency, CLI creation, data plotting, scheduling, or time/date management. By exploring the libraries Caleb featured and staying open-minded about lesser-known tools, you’ll expand your Python “tool belt” and become significantly more efficient in your day-to-day coding.

Episode Transcript

00:00 Many of you write to me and tell me how you appreciate the way my guests and I highlight a particular Python package at the end of each episode.

00:06 Well, if you enjoy that little segment, you're going to love this episode.

00:10 This week, you'll meet Caleb Hattingh, who wrote a great book called 20 Python Libraries You Aren't Using But Should.

00:16 He and I spent an hour digging into all the very powerful and interesting packages you probably haven't heard of, but will be super excited to use after you learn about them.

00:25 This is Talk Python To Me, episode 77, recorded September 20th, 2016.

00:51 Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities.

01:03 This is your host, Michael Kennedy. Follow me on Twitter where I'm @mkennedy.

01:07 Keep up with the show and listen to past episodes at talkpython.fm and follow the show on Twitter via @talkpython.

01:14 This episode is brought to you by Capital One and Intel.

01:17 Thank them both for sponsoring the show by checking out what they're offering during their segments.

01:22 Hey, everyone. I have a quick message for you before we get to Caleb and his book.

01:26 In addition to writing this book for O'Reilly, Caleb also wrote a screencast course on Cython.

01:33 And it looks to be one of the better Cython courses out there.

01:36 So when he's talking about Cython, if you're really interested in what he's up to, be sure to check out his course, which is linked in the show notes.

01:43 And O'Reilly agreed to give away a free copy of his course.

01:46 All you have to do to be eligible is be a friend of the show.

01:49 So be sure to visit talkpython.fm, click on friends of the show, enter your email address, and you'll be eligible to win.

01:56 Now, let's talk to Caleb.

01:58 Caleb, welcome to the show.

02:00 Hi, Michael. It's great to be here.

02:02 Yeah, I'm super excited to talk to you.

02:04 We've got some really cool stuff around a book, a free e-book that you did.

02:09 And I found it super interesting.

02:11 So I think everyone else will.

02:12 Basically, we're going to take the last question I always put at the end of my podcast.

02:17 What's your favorite PyPI package?

02:18 And turn that into an entire episode and just go deep on that idea, right?

02:22 Okay.

02:22 So before we get to that, though, let's start at the beginning.

02:24 Where did you get into programming and Python, that sort of thing?

02:27 Yeah, great question.

02:29 So one of the other podcasts I listen to is the C++ podcast.

02:32 And just about every guest on that show says that they started programming assembler in grade school.

02:38 That was not my story.

02:39 I didn't get into programming at all in school or high school.

02:43 I started really as a hobby in university while I was studying chemical engineering,

02:47 which is kind of an odd thing to do as a hobby when you're doing something completely different.

02:51 But as the years went on, I kind of got more and more into programming.

02:55 And it turned out that I did my master's degree in process control, which is like a subset of chemical engineering.

03:01 And that was all in MATLAB.

03:03 So it was pretty much all a programmed course.

03:06 And that's really how I got into programming.

03:08 And I learned that I really did not like MATLAB very much at all.

03:12 Yeah, I spent some time there.

03:16 I can sympathize.

03:17 I don't love the .m files, no.

03:19 Yeah, I got to know it really, really well.

03:21 And yeah, I decided that I did not really want to carry that forward.

03:26 And it was really when I started working in the first or second year that I started working.

03:30 I started learning a couple of languages outside of work.

03:33 And the one that I really tried to focus on was Java.

03:36 And I signed up for a fairly expensive certification course.

03:40 I think Java was around 1.2, version 1.2 or 1.3 or something at the time.

03:45 And halfway through that course, I came across Andrew Kuchling's Python tutorial at the time,

03:51 which I think was for Python 1.5.

03:53 And it just blew my mind.

03:56 I kind of had the realization that I couldn't possibly use Java anymore to do the kind of work,

04:02 like data analysis work that I was doing, because it was so easy in Python.

04:06 It was just a complete waste of time to develop all of these object-oriented structures around

04:11 fairly simple data pipeline processing tasks.

04:14 Yeah, that makes a lot of sense.

04:16 I mean, Java has so much formality.

04:18 And maybe, let's say it's maybe good for large applications.

04:22 Maybe.

04:23 Exactly.

04:23 But it certainly doesn't make sense for small ones, right?

04:25 Like you're talking about.

04:26 Yeah, absolutely.

04:27 There's definitely a place for Java for very large programs.

04:31 But for the kind of things that I was doing, and especially for shorter programs involving data pipeline processing,

04:37 Java is just way more than what you need to get the job done.

04:41 And that was pretty much the end of Java for me.

04:43 I never finished that course.

04:45 I really got stuck into Python.

04:46 And this was around 2001.

04:48 So quite a few years ago.

04:49 And then I watched Python become Python 2.

04:52 And 2.4 was a big one for me.

04:55 I used that for quite a long time.

04:56 And so on.

04:57 So yeah, that's pretty much how I got into Python.

04:59 But along the way, I did use quite a few other languages fairly heavily.

05:03 Fortran I used quite a bit as well because large chunks of the scientific world still use Fortran.

05:10 I have written new Fortran 77 code.

05:12 I have added that to the world.

05:14 Oh my gosh.

05:15 That's awesome.

05:16 Yeah.

05:17 And Delphi I used for quite a few years.

05:20 My career has moved into and out of software development and into chemical engineering.

05:25 I've kind of straddled both worlds for the past 15 years or so.

05:30 And I did a stint as a software engineer in the hospitality industry for writing hotel administration software for about four years.

05:38 And that was heavily using Delphi, the IDE from what used to be Borland and then became Embarcadero.

05:45 So I got to know that language really, really well as well.

05:47 Pretty much as well as I know Python, I would say.

05:50 That's very interesting to me.

05:52 I kind of regard Delphi and Python as almost polar opposites in many ways.

05:55 A GUI is very easy in the one, not quite as easy in the other.

06:00 Deployment is extremely easy in the one.

06:01 Deployment is kind of difficult in the other.

06:03 And so on.

06:04 There are many parallels where I kind of see Delphi and Python as direct opposites.

06:10 Another good one is the GIL, the global interpreter lock, which I think is really fascinating.

06:13 In the Delphi world, for many years, one of the things that developers asked for over and over again was thread safety in the library.

06:21 Because it was one of the huge talking points.

06:25 They wanted the containers and the structures inside the library to be thread safe because it was too easy to get race conditions in threads because you could just spawn native threads and have them clobber each other's memory layer.

06:36 And it's really fascinating to me that the exact inverse argument gets made in the Python world.

06:42 That is funny.

06:43 The presence of thread safety is the problem because it slows your code down.

06:48 So, yeah, it's really interesting to be able to have a depth of knowledge in multiple programming communities because you kind of get a sense of maybe what is really important and what is spurious.

06:58 Everything is a set of trade-offs.

07:00 Yeah, exactly.

07:01 That's the thing.

07:01 That's what I was thinking.

07:02 Nothing is accidental.

07:04 Things are designed the way they are for good reasons.

07:06 And they may not always be the best fit for every particular situation.

07:10 Right.

07:11 You learn a language that maybe it's not obvious initially.

07:13 You don't know the history or whatever.

07:14 But there was probably some deep thought that went into at least all the popular programming languages.

07:19 Those things have evolved with lots of thought over time.

07:22 Absolutely.

07:23 Yeah.

07:24 Nice.

07:24 Okay.

07:25 Well, so a lot of Delphi, a lot of Python, a lot of scientific programming.

07:31 What are you doing with Python and programming today?

07:33 Like, what's your day job?

07:34 Good point.

07:35 So, I can give you a quick two-minute, well, one minute, let's say, run through of the things I've done.

07:41 So, I started in chemical engineering with doing a lot of simulation work using off-the-shelf tools for that.

07:48 And as time progressed, I started moving more towards the problems where there were no off-the-shelf tools.

07:52 And for those, you have to write code.

07:54 And I began using a lot of Fortran.

07:55 Then I started incorporating Python into that.

07:57 And then I took a break from engineering and went into software development where I used Delphi.

08:02 But I also used Python as well for web development.

08:04 Then I went back into engineering and I started writing simulation software for coal gasification, which is what brought me to Australia.

08:12 And what's interesting about that job was for the first time, I really decided to use Python for the entire system, which means all of the number crunching stuff as well.

08:24 And that's where Cython really came into the picture for me.

08:26 I made the decision to use a full Python stack for that simulation work because it seemed to me that Cython was mature enough, really, to be able to give you the speed that you need to solve these mathematical problems in the background.

08:39 And that definitely proved to be the case.

08:42 It was a good choice on my part.

08:43 Cython is not so much a break from Python, really, as an extension.

08:47 But it gives you all of the native speed and control over the memory layout that you need if you want to make fast code.

08:53 So I spent a couple of years doing that.

08:55 Yeah.

08:55 Would you say it's a little analogous to, like, inline assembler and C++ or something like that?

08:59 You're like, just if this one loop could be faster.

09:02 Let me just do this part fast.

09:04 I think I would disagree with that.

09:06 Okay.

09:06 I began with that in mind.

09:07 It's easy to look at it that way, but Cython is so much more than inline assembler.

09:11 There's another way to look at it, which is you can write.

09:15 You can get all the benefits of C by writing what looks like Python.

09:18 And in one or two places, you just add some types onto a couple of variables.

09:21 And you can get a hundredfold increase in speed.

09:24 So whereas inline assembler is much more of a niche application of a different technology.

09:31 Unless you can read assembler really easily, which I can't.

09:34 Yeah, neither can I.

09:35 Having inline assembler, yeah.

09:36 Where Cython is not like that, by and large, Cython is as easy to read as Python.

09:42 There are a couple of things that are different in the layout.

09:45 But overall, you would find Cython as easy to read as Python.

09:48 Okay.

09:49 Sounds good.

09:50 So back to what you're doing today.

09:51 Yeah.

09:52 So I've been working for the past couple of months on a contract for GPS tracking, which

09:56 has also been using a full Python stack.

09:57 And I was very lucky in that they were willing to go straight to Python 3.5.

10:01 So I've been writing asyncio code in Python 3.5 since February, which I feel very fortunate

10:07 to have had that opportunity.

10:09 And yeah, I'm starting a new job tomorrow from the date of this recording, working for a company

10:16 called Console, which was formerly called IIX.

10:20 And yeah, I believe my title has something to do with network orchestration, which is going

10:25 to be a whole new thing again for me to learn.

10:28 That is quite a shift.

10:29 But that's really cool.

10:30 And Python, of course, plays a super important role in that space.

10:33 So it makes a lot of sense.

10:34 Excellent.

10:36 Yeah, absolutely.

10:36 One of the big benefits of Python, just as a technology choice, is that you can use it

10:41 just about everywhere.

10:42 Yeah, that actually is really important.

10:44 It's really important.

10:46 Yeah, so I think that's a wide range of background and experiences.

10:51 And it gives you a nice overview of the ecosystem and the standard library and all the different

10:57 ways that Python makes you efficient, productive, and so on.

11:02 And so you wrote a really cool book.

11:04 You wrote it for O'Reilly, right?

11:05 Yeah, that's right.

11:07 Yeah, as a free e-book, I think.

11:09 And it's called 20 Python Libraries You Aren't Using But Should.

11:14 And I thought it was a really nice survey.

11:17 Very BuzzFeed-y title.

11:19 You know what?

11:21 Anytime it starts with 10 or 20 or the seven things you should never say, you know it's

11:26 a BuzzFeed.

11:26 But it really is succinct and it lives up to the name.

11:30 It's good.

11:31 The idea for the book came from O'Reilly.

11:33 Susan Conant at O'Reilly suggested it to me and asked if I would be willing to write

11:38 the book.

11:38 So the title was established before I got to the project and then I provided the content.

11:43 Yeah, but the original working title was 10 Python Libraries You Aren't Using But Should.

11:47 And I couldn't stop at 10.

11:50 And in fact, if you count the major featured packages, there are 20.

11:55 But you'll see throughout the text of the document that I refer to a whole bunch of other libraries

11:59 as well in footnotes.

12:01 Yeah, I thought that was interesting.

12:03 Yeah, we'll talk about the 20 major ones and maybe even touch on some of the ones that you pull in.

12:08 Like, for example, one of the web service bits, the implementation of the web service uses a few

12:14 other libraries that we can talk about later, but they're not actually part of the 20, right?

12:18 So there's, I feel like you get a really good, well-rounded view of what's out there.

12:24 I was quite cautious about writing the book because it's a fairly contentious thing.

12:28 The choices, the technology choices that people make, many people can be quite passionate about

12:32 those things.

12:32 And the brief for the book was, we want you to focus on libraries that other people don't know

12:37 about yet.

12:38 So that means I have to leave things out.

12:40 And it also means that I have to leave things out that may be fairly popular, which means

12:43 there might be quite a widespread degree of support for libraries that I'm not going to

12:48 be mentioning.

12:48 So I was somewhat apprehensive about that.

12:51 So the libraries that I tried to focus on were things specifically that may not have much

12:56 exposure, which is a very interesting idea for the book.

12:59 I can't include things that are too niche that they could not be used for much, you know,

13:04 very low applicability.

13:05 But at the same time, I did not want to include things that were very well known because that

13:10 defeats the purpose of the book.

13:11 So I found it quite challenging to pitch it.

13:14 It's the ones you aren't using, not the ones that you are, right?

13:16 Exactly.

13:17 Yeah.

13:18 Yeah.

13:18 It is a challenging one to go, okay, let's strip off that.

13:22 Like if you say, what is the most popular library to do?

13:25 You almost have to say, okay, well, except for that one, what else can we do?

13:29 But I...

13:29 Yeah, exactly.

13:30 And I thought about that when I read the title.

13:32 I'm like, oh, is this going to be a bunch of niche things that are like second fiddle

13:36 to the stuff that you actually should be using?

13:38 But no, I think it was really good.

13:39 So maybe we could start by talking about the ones that everybody has installed already.

13:44 That's the stuff that comes in the standard library.

13:46 So you have, in the first chapter, you said, hey, look, there's a bunch of stuff you're

13:50 not using that's built in.

13:51 Developers with experience do tend to look in the standard library first because they've

13:55 been burned by carrying extra dependencies, which after a couple of years may not be as

14:00 well maintained.

14:00 The impression that I have is that more experienced developers tend to lean more heavily on the

14:04 standard library when choosing technology.

14:05 Even if some of the time there may be other third-party packages that might be a better

14:10 fit.

14:10 That's a decision that gets made.

14:12 That's really a trade-off.

14:13 And I do the same.

14:14 When I have to deploy an application to production and I know that this is a core service, I tend

14:20 to lean more heavily on choosing things out of the standard library when possible, as opposed

14:24 to adding third-party dependencies.

14:26 Whereas newer developers tend to get whatever is the latest and greatest on PyPI and run

14:31 with that.

14:31 So what seems to me to be the case is that more experienced developers have a much better

14:37 and deeper knowledge of what is available in the standard library.

14:39 So even though the book was intended to be focused on third-party libraries only, I did

14:44 want to squeeze in some of the absolute must-have, must-know standard library options, like the

14:50 collections package, which is the first section in this chapter.

14:53 If you watch any of Raymond Hettinger's talks, he plugs the collections module heavily, as

14:59 well he should, because it's awesome.

15:01 Yeah, I do feel strongly about that, that people really should know more about what's in the

15:06 standard library.

15:06 And my original version of the book had more of it in, but we decided that we wanted to focus

15:12 the book more on the third-party stuff.

15:13 So it got trimmed down.

15:15 Sure.

15:15 I think that makes a lot of sense.

15:17 Yeah, that makes a lot of sense.

15:18 Partly, I mean, there's nobody who is the advocate for the thing in the standard library.

15:23 It's just built in.

15:24 But when somebody makes their open source library, they set up some GitHub Pages thing, and they've

15:28 got some cool logo.

15:30 And, you know, like it's, there's somebody promoting it in a sense.

15:33 And so I can see that.

15:35 Yeah.

15:35 So a couple of the things that you talked about in the collections library, one of them, which

15:39 I think is pretty interesting and timely, is OrderedDict.

15:43 Yeah.

15:43 So what's interesting about this section is that it may become redundant in December.

15:48 I don't know if you've been following the discussions on the python-dev mailing list, but the new

15:54 dictionary in Python 3.6, which I think the release date for the final version is December, the

16:00 dictionary is going to become ordered.

16:01 Whether that is going to be advertised as a requirement for the language spec, or whether

16:06 that's just going to be an implementation detail remains to be seen.

16:09 But yeah.

16:11 Sure.

16:12 Yeah.

16:12 And so every now and then you'll see people building special dictionaries for Python that

16:19 are ordered.

16:20 For example, the MongoDB library exchanges dictionaries at serialization, and the changing of order

16:25 causes more writes on the server for documents than if it doesn't.

16:29 And so they went and created their own, and we've got this OrderedDict.

16:32 So what's some of the problems you run into with like the regular dictionary?

16:36 Or like, why do you care about ordering, I guess?

16:38 The main use case for ordering that I've come across is usually when processing things that

16:43 are really mappings in a sequence, and they need to be processed in the sequence in which

16:47 they appear.

16:48 So I think the example I gave in the text is, yeah, and a common example is processing lines

16:52 in a file where the lines map to something else.

16:54 And you want to serialize them or persist them in some way, retaining the order that they appeared

17:00 in the original list.

17:02 Right, like maybe like a CSV you're going to load and you're going to say look up by

17:05 like some ID, which is a column.

17:07 If you write it back, you want to be able to save it the same order and not have to maintain

17:10 like two data structures or something.

17:12 Yeah.

17:12 Yeah, that's right.

17:13 Exactly.
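
The file-processing scenario described here can be sketched roughly as follows. This is a minimal illustration, not the example from the book; the sample lines and variable names are made up.

```python
from collections import OrderedDict

# Made-up sample input: each line maps an id to a value.
lines = ["id42,widget", "id07,gadget", "id19,gizmo"]

records = OrderedDict()
for line in lines:
    key, value = line.split(",")
    records[key] = value  # insertion order is remembered

# Look up by id, as with any mapping.
print(records["id07"])

# Write the data back out in the order it was originally read.
round_trip = ["{},{}".format(key, value) for key, value in records.items()]
print(round_trip)
```

One structure serves both purposes: keyed lookup and order-preserving output, so there is no need to keep a separate list of keys.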

17:14 Okay.

17:14 So there is right now in all the versions of Python, the collections.OrderedDict, which is

17:21 a specialized dictionary that solves this problem.

17:23 It just so happens if you live right out at the very edge of new Python, you might not need

17:29 that in December, but a lot of people don't live there.

17:32 Right.

17:32 So I think it's still totally relevant.

17:34 Yes.

17:34 And from the discussions that I've been seeing on the python-dev mailing list, it probably

17:38 is going to remain in the library.

17:40 What the latest that I've seen is that the order of keyword arguments in function calls is

17:46 going to be guaranteed to be maintained.

17:48 But the requirement for normal dictionaries to be ordered may not be a specification of the

17:54 language spec, which means that other implementations of Python may not need to maintain that.

17:59 Right.

17:59 Okay.

18:00 So that's quite interesting.

18:01 One of the caveats that I mentioned about OrderedDict towards the end of the section with

18:05 the big red triangle is beware creation with keyword arguments, which is exactly this problem.

18:10 When you create an OrderedDict and you supply keyword arguments as you would with maybe a regular

18:15 dictionary, the problem is that the order is not maintained with your specification because the

18:21 keyword arguments first get created as a regular dictionary before they get created as

18:24 an ordered dictionary.

18:25 And that's going to be changing for sure in 3.6.

18:27 Okay.

18:28 Oh, excellent.

18:29 That's good to know.

18:29 Yeah.

18:30 Because that happens at the call site before the OrderedDict class ever gets any information.

18:36 It's just given a dictionary and it can do what it can do, but it's too late.

18:39 The order has already changed, right?

18:41 Yeah, exactly.

18:42 That's right.
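
A minimal sketch of the keyword-argument caveat: building an OrderedDict from a sequence of pairs keeps the stated order on any Python version, while the keyword form only does so where keyword argument order is preserved (Python 3.6 and later, per PEP 468). The keys and values are made up for illustration.

```python
from collections import OrderedDict

# A list of (key, value) pairs is itself ordered, so this form is safe
# on every Python version.
pairs = [("zebra", 1), ("apple", 2), ("mango", 3)]
safe = OrderedDict(pairs)

# Keyword arguments are collected at the call site before OrderedDict
# sees them. On Python versions before 3.6 that intermediate mapping
# could scramble the order; from 3.6 on, the order is preserved.
risky = OrderedDict(zebra=1, apple=2, mango=3)

print(list(safe))
print(list(risky))
```

When code has to run on older interpreters, the list-of-pairs form is the one to rely on.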

18:42 And I think the other guarantee is that the dunder dict entry in classes is also going

18:49 to have a guarantee of the order being maintained.

18:51 Even though it's implemented as a regular dictionary, what the language spec requires and what actually

18:56 happens in practice are two different things.

18:58 So the developers of Python are trying to maintain the language spec as a spec even for other implementations

19:03 besides CPython, which is difficult to kind of keep in your head when all you work on is CPython,

19:08 which is largely the case for me.

19:10 But yeah, they're dealing with bigger problems than just whatever goes into CPython.

19:16 Yeah, that's an interesting thing to keep in mind because we often just think of CPython

19:20 equals Python language, but there's a lot of other implementations and extensions and forks

19:25 and whatnot.

19:26 Yeah, yeah, exactly.

19:27 Capital One has a special message for you.

19:43 They need Python pros who love to work with data.

19:46 Put your Python experience at work at Capital One and help them use data to make life better

19:50 for millions of customers.

19:51 Capital One is employing the latest tools and approaches to do data analytics and data science

19:56 from the ground up.

19:57 They're smart, creative professionals who love to explore new ways to interact with data.

20:01 They're interested in figuring out novel, advanced Python techniques and even more interested

20:06 in finding more people who will help them do that.

20:09 When you join their state-of-the-art Python community, you'll work with people you really

20:12 like, people who might be listening to this podcast right now.

20:15 Relentless innovation is their way of life.

20:17 Make it yours at Capital One.

20:19 Visit jobs.capitalone.com slash talkpython to learn more and apply today.

20:24 So another one I would say is also one of, if I had to pick the most useful thing to come

20:34 out of the collections module, I would say it's probably namedtuple, which you highlight

20:38 in your book as well.

20:38 Yeah, yeah.

20:39 namedtuple.

20:40 namedtuple is kind of interesting.

20:42 I have recently started using it directly when creating tuple structures.

20:46 But most of my experience with namedtuple really has been converting old code that used

20:50 regular tuples into using namedtuples just to improve the maintainability aspect of that.

20:55 And it is very powerful in that respect.

20:58 Yeah, it doesn't change the performance much.

21:00 And it's an easy thing you can do because namedtuples are compatible with the existing code,

21:07 but they definitely add a layer of maintainability, right?

21:09 So if you have a regular tuple and it's three things and you need to put some new item in the

21:14 middle to make it four, well, the code that was going, you know, t[2] now is not

21:20 true, not accurate anymore, right?

21:22 But if you could refer to them by the names, the property names, that's fantastic, which

21:27 is what namedtuple adds.

21:28 Great.

21:28 Yeah, that is a good one.
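The maintainability point above can be sketched in a few lines. This is only an illustration: the record names and fields (Reading, ReadingV2, sensor, value, timestamp, unit) are hypothetical and do not come from the book.

```python
from collections import namedtuple

# Hypothetical record: the field names are illustrative, not from the book.
Reading = namedtuple("Reading", ["sensor", "value", "unit"])
old = Reading("probe-1", 21.5, "C")

# namedtuples still behave like plain tuples, so existing code keeps working.
sensor, value, unit = old
unit_by_index = old[2]

# Insert a new field in the middle: index 2 now means something else,
# but access by name is unaffected.
ReadingV2 = namedtuple("ReadingV2", ["sensor", "value", "timestamp", "unit"])
new = ReadingV2("probe-1", 21.5, 1700000000, "C")

print(new[2])    # the timestamp, no longer the unit
print(new.unit)  # still the unit
```

Code that indexes by position silently picks up the wrong field after the change, while code that uses the field name keeps working.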

21:30 Yeah.

21:30 One that I've done a lot less with is contextlib.

21:33 What's the story of that?

21:35 Ah, so what did you think of my example?

21:37 Just for the listeners, the example that I gave, the code snippet under contextlib, is

21:42 creating a simple context manager that measures the time.

21:45 Well, it records the time before and after the execution of the body of the context manager

21:50 and then gives you a way to calculate the performance of that section.

21:54 I haven't gotten much feedback about the book yet because it is fairly new and I was curious

21:58 what your opinion was.

22:00 Well, I got to say, it did take me a moment of going back and let me look at this context

22:06 manager implementation.

22:07 It's only three lines of code.

22:11 I can just, you know, basically the idea is you create a context manager instance by calling

22:16 this method and it will, when it enters, capture the start time.

22:20 When you leave the with block or suite, it captures the end time and then it tells you

22:25 how much time had passed.

22:26 And so the implementation is: get the perf counter.

22:30 t0 equals perf_counter().

22:31 Yield a lambda, which does a computation and then compute the value that is actually used

22:41 in the lambda above.

22:42 And that, I was a little bit taken aback by that.

22:46 It was interesting.

22:46 Yeah.

22:47 I was worried that it was perhaps a little bit too complex.

22:50 And I didn't want to, the fact that the use of the lambda, I didn't want the use of the

22:53 lambda to overshadow the demonstration of how the context manager works.

22:57 But basically where the yield comes in is where the body of the context manager gets executed.

23:03 And if you return something from the yield, that's pretty much what you get at the end

23:07 of the line when you say with timing() as thing.

23:11 The thing is what gets yielded out of the context manager.

23:13 And the little bit of cleverness in this particular example is that the lambda is a closure over

23:19 the namespace inside the timing function.

23:22 So it captures the storage location of t1 and t0.

23:26 So only when you evaluate the lambda later, do the values of t1 and t0 actually

23:31 get used.
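For readers following along without the book, here is a minimal sketch of the timing context manager as it is described in the conversation. It is reconstructed from the spoken description, so the exact names (timing, elapsed) are assumptions rather than the book's listing.

```python
from contextlib import contextmanager
from time import perf_counter, sleep


@contextmanager
def timing():
    t0 = perf_counter()
    # The lambda closes over t0 and t1. t1 does not exist yet; it is only
    # looked up when the lambda is called, after the with block has ended.
    yield lambda: t1 - t0
    t1 = perf_counter()


with timing() as elapsed:
    sleep(0.01)  # stand-in for the work being measured

duration = elapsed()
print(f"body took {duration:.4f} seconds")
```

Calling elapsed() inside the with block would fail, because t1 is only assigned once the body has finished. The same is true if the body raises, since this sketch has no try/finally around the yield.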

23:31 Yeah.

23:31 It's quite clever.

23:33 Yeah.

23:33 This particular example is not imaginary.

23:36 I use it quite a lot.

23:37 Yeah.

23:37 It's nice.

23:38 I appreciated it because it made me think and stop and not just read.

23:42 Yep.

23:42 Okay.

23:42 Yep.

23:43 Okay.

23:43 Oh, wait a minute.

23:44 Not necessarily.

23:44 Okay.

23:45 What's going on?

23:46 And you know, that was cool.

23:47 Like it's nice when code makes you do that.

23:48 If it's not just because you're confused and it's too messy or whatever.

23:52 It's cool.

23:53 Yeah.

23:53 My editor at O'Reilly, Dawn Schanafelt, she was really good about making sure that each of

23:59 these steps was explained in more detail.

24:01 And the editors at O'Reilly are really good.

24:03 They can pick up based on the style of your writing, whether you think you've explained it

24:08 sufficiently or not.

24:09 And they can prod you to say, are you sure you've explained this, but it seems like you

24:13 were a bit terse, perhaps add a few more points.

24:15 So all these bullets and points on the side where everything is spelled out in great detail,

24:19 that wasn't driven by me.

24:21 That was driven by the editors.

24:22 They're really, really good at what they do.

24:24 Yeah.

24:25 You did a good job as a team of breaking down what the steps meant.

24:29 Yeah.

24:29 That's cool.

24:29 So the other thing that you, that was built in was the concurrent.futures module in Python

24:36 3.

24:36 And I thought that was a really interesting way to think about sort of a unifying API between

24:43 process-based parallelism and thread-based parallelism.

24:45 Yeah.

24:46 I wanted to push that point because I think that's, that is the underutilized aspect of

24:50 concurrent.futures is that it gives you this, this really easy lever to switch paradigms.

24:55 For some processes, thread-based programming is valuable.

24:58 And for others, process-based parallelism is, is equally valuable.

25:02 And you get the same interface really just about.

25:05 So you can switch between those two paradigms really quite easily after the fact, which is

25:09 really interesting.

25:09 Usually for complex code involving parallelism, you end up with a structure that is hard to

25:15 change to fit it in a different paradigm unless you do a rewrite.

25:17 And the fact that concurrent.futures gives you the same API for both thread-based work and

25:22 process-based work is a really cool superpower.

25:25 Yeah, it totally is.

25:26 And maybe, you know, it definitely is a simplification because when you start talking about threading,

25:32 there's so many edge cases and interesting variations.

25:34 But maybe the general rule of thumb is if you spend most of your time waiting on the network,

25:40 then thread-based parallelism is probably good, especially if you're sharing a lot of data as

25:45 well.

25:46 And if you're doing a lot of computational stuff because of the GIL and you're not using

25:50 Cython or something, then, you know, you can't really parallelize that very much.

25:54 So multiprocessing and multiple processes for that is maybe a much better way to go.

26:00 But yeah, with the ThreadPoolExecutor and what was the other one called?

26:06 The ProcessPoolExecutor.

26:09 Those two have exactly the same API.

26:11 And so if you write your code against those instead of directly against multiprocessing and

26:15 directly against the thread API, you literally change your import statement and it changes

26:20 where stuff runs and how, which makes it pretty easy to try it out.
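The "same API" point can be shown with a short sketch. The helper names square and run are hypothetical. The example uses threads by default so that it runs anywhere without extra setup.

```python
from concurrent.futures import ThreadPoolExecutor
# To move the work into separate processes instead, import
# ProcessPoolExecutor from the same module and pass it to run().


def square(n):
    return n * n


def run(executor_cls=ThreadPoolExecutor):
    # Both executor classes accept max_workers and provide map() and submit().
    with executor_cls(max_workers=4) as pool:
        return list(pool.map(square, range(5)))


results = run()
print(results)
```

One practical caveat when switching to ProcessPoolExecutor: the function and its arguments must be picklable, and on platforms that start workers by spawning a fresh interpreter the call should sit under an `if __name__ == "__main__":` guard.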

26:23 Yeah, that's really good.

26:25 One comment that I also want to make is if I make the choice between whether to use threads

26:30 or whether to use processes, it's not because of the GIL.

26:33 Because as you mentioned, Cython lets you drop the global interpreter lock.

26:38 That's not an issue for me.

26:40 I can write my number crunching code in Cython and use Python's normal threads and still access

26:46 all of the cores.

26:46 The distinction for me between whether to use process-based parallelism or thread-based is

26:50 really about whether I need to use, I need to be able to access the entire memory space

26:54 in the process.

26:55 So that is the main distinction about whether things are okay to be separated by process or whether

27:00 I really need the entire memory space to be accessible by all of the parallel parts of

27:05 execution.

27:05 So if that is the case, for example, if the batch of work that you need to operate on has

27:10 to all fit in the same memory space inside a process and you need to work on different sections

27:15 of memory concurrently, then I would use threads.

27:18 The presence of a global interpreter lock, while it's interesting, is not really a bottleneck

27:23 anymore in CPython because of Cython, because it makes it so easy to drop the GIL.

27:27 Right.

27:27 Awesome.

27:28 Okay.

27:28 Yeah.

27:28 That's a really interesting point.

27:30 And we will definitely be coming back to Cython.

27:32 But if you're working on some data structure that is really large and the threads are updating

27:37 multiple parts of it at the same time, then yeah, you want to keep that in the same process

27:41 space.

27:42 Yeah, absolutely.

27:43 It's really difficult to make that work with process-based parallelism.

27:46 I have been looking at ways of doing that, and I would like to find more about that,

27:50 about using memory mapped files to share memory between processes.

27:54 But I don't have much experience with that yet.

27:56 That's something that I would like to get into more.

27:58 Yeah, that would possibly be a solution.

28:00 But yeah, I don't know what the performance looks like.

28:03 It's interesting.

28:04 Okay.

28:05 Yeah, me neither.

28:05 All right.

28:07 So the next one that was built in was logging.

28:09 You said, you know, look, it's time to get over the print statement.

28:13 If you're trying to actually do debugging stuff, don't just print it out.

28:16 Like, it's almost the same effort to do logging, but you get a lot more.

28:18 Yeah.

28:19 So the experience that I have, this is quite a few years now.

28:23 The experience is I write out a new module or a new script using print statements.

28:28 And a couple of hours later or a couple of days later, it becomes something that I actually want to use and depend on.

28:32 And then I go back through the same code and I change all the print statements to logging statements.

28:36 And yeah, for the last couple of years, I've now just gotten into the habit of just beginning with logging.

28:40 You just put in the boilerplate, the setup line, and then creating your logger, and then you just run with that.

28:48 Yeah, it's pretty straightforward, right?

28:50 You import logging, you call logging.getLogger, and then you can say logger.debug, logger.info, warning.
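The boilerplate being described is roughly the following. This is a minimal sketch using only the standard library; the logger name "demo" and the format string are arbitrary choices for illustration.

```python
import logging

# One-time setup, usually done once at program start.
logging.basicConfig(
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

logger = logging.getLogger("demo")  # hypothetical logger name
logger.setLevel(logging.INFO)

logger.debug("not shown, because the logger level is INFO")
logger.info("starting up")
logger.warning("disk space is getting low")
```

Unlike print calls, these messages can later be silenced, redirected to a file, or reformatted by changing the configuration, without touching the call sites.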

28:57 And I agree with your sentiment.

28:59 You know, where I find it, I'll be totally happy with print for a while, and then I want to make the code that I was playing with a library and not an application.

29:08 Right.

29:09 Yeah.

29:10 And then all those print statements, it's like super hard to make them go away or to configure them.

29:14 It's just like, ah.

29:15 All right.

29:16 Just a removal.

29:17 Just a removal.

29:18 Yeah.

29:18 So logging, excellent.

29:19 Another one that I really like in this space, although this section is about the built-in ones, is Logbook.

29:25 I think logbook is a nice external one.

29:27 But like you said, having stuff built in is great.

29:30 Okay.

29:30 That's a good tip.

29:31 I didn't know that. I'm going to make a note of it.

29:32 Yeah, I think that's Armin Ronacher.

29:34 I can't entirely remember, but I have to look.

29:37 It's really good.

29:38 Okay.

29:39 Let's see.

29:40 So another thing that you might want to do is run something on a scheduled basis, right?

29:47 Like every five minutes, I want to do this thing, or exactly on the hour, I want something to happen.

29:53 And the OSes have built-in ways to do this.

29:57 And I mean, I guess you could like spawn a thread or something to watch.

30:01 But there's some cool stuff built in for that, right?

30:03 Yeah, that's right.

30:04 So you're talking about the sched module, S-C-H-E-D.

30:07 This is a really good example of how you have to be aware of your biases.

30:13 For people who only ever work on POSIX systems or Linux, for example, right?

30:17 Cron is always there.

30:18 It does what it does really well.

30:20 There's a wealth of information available on the internet for how to use Cron.

30:23 So it seems bizarre that there would be this thing in Python that does exactly the same job.

30:28 But the thing is, Cron doesn't run on Windows.

30:30 Windows uses a separate system.

30:32 However, because Python includes the sched module, you can get the same or very similar functionality

30:38 to what you might get in Cron or the Windows task scheduler.

30:41 With a cross-platform Python module.

30:43 And that's really, really powerful if you're writing some service or library that needs to do these jobs on a timer.

30:49 Or at a particular time of the day or so on.

30:52 I look at sched as a really great example of what Python provides in terms of cross-platform support

30:57 for getting this kind of functionality, but in a cross-platform way where you can use the same code base on multiple platforms.

31:02 Yeah, and it's really nice.

31:04 And you basically set up the scheduler and you give it a priority and a frequency.

31:09 And then you say, and call this function whenever it's time.

31:12 And you can do that in either an elapsed time, like 10 minutes from now or every five minutes from now.

31:18 Something like that.

31:19 Or you can do it on a more, like once a minute exactly at the minute.

31:25 Right?

31:25 Yeah.

31:26 Nice.

31:27 Yeah, you can control completely when the target time is, or happens to be.
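A minimal sketch of the two scheduling styles mentioned here, relative delays and absolute target times, using the standard library sched module. The job function and the very short delays are illustrative only.

```python
import sched
import time

# The scheduler needs a clock function and a way to wait.
scheduler = sched.scheduler(time.monotonic, time.sleep)
calls = []


def job(name):
    calls.append(name)


# enter(delay, priority, action, argument): run after a relative delay.
scheduler.enter(0.02, 1, job, argument=("second",))
scheduler.enter(0.01, 1, job, argument=("first",))

# enterabs(time, priority, action, argument): run at an absolute time,
# expressed in the units of the scheduler's own clock.
scheduler.enterabs(time.monotonic() + 0.03, 1, job, argument=("third",))

scheduler.run()  # blocks until the queue is empty
print(calls)
```

Each event fires once. For a repeating job such as "every five minutes", a common pattern is for the job to call scheduler.enter again before it returns.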

31:31 I can definitely see sched becoming a part of the robotization, I guess, of the internet in a big way.

31:41 Automating things and creating bots and timers and work queues and so on.

31:46 Yeah, it's beautiful if you've got some embedded device running your Python code and needs to get home every now and then.

31:53 Just set that up, right?

31:54 Yeah.

31:54 Yeah, absolutely.

31:55 Nice.

31:56 I see we've got to the end of the standard library section.

32:00 We have?

32:00 There was one.

32:01 Yeah.

32:01 There was an additional one that I had in an earlier draft of the book, but we dropped it because it was too short, I guess.

32:09 And that's shlex.

32:11 There's a module called shlex in the standard library, which I wanted to include for no other reason than it has a split function, which will split strings like the normal split, except that it will respect quotes around sections.

32:24 So you can group chunks of words with quotes, just like you might imagine shell processing would process your commands.

32:32 If you put quotes around sections of things, then it treats those as one thing.

32:35 So the shlex module in the standard library has a split function that does that for you as well.
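The difference from the ordinary string split is easiest to see side by side. The command string below is made up for illustration.

```python
import shlex

command = 'copy "my file.txt" /tmp/backup'

plain = command.split()        # splits on every run of whitespace
parts = shlex.split(command)   # keeps the quoted section together

print(plain)  # ['copy', '"my', 'file.txt"', '/tmp/backup']
print(parts)  # ['copy', 'my file.txt', '/tmp/backup']
```

Note that shlex.split groups the quoted words into one item and drops the quote characters themselves, which is how a POSIX shell would treat the same line.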

32:40 Oh, nice.

32:41 Yeah, you can almost escape the things you're putting on by putting in quotes.

32:47 Okay.

32:47 Yeah, awesome.

32:48 You don't have to do any quote processing yourself.

32:50 It's already in the standard library.

32:52 Yeah, excellent.

32:53 Okay, very nice.

32:55 So that was sort of the look inside of what's in the box if you just have Python.

33:00 And then you said, all right, let's look outside at external packages and why not start with a better way to install packages?

33:05 Yeah, absolutely.

33:07 So for anyone who doesn't know about Flit and has found the normal process for creating and publishing a Python package to be arduous, Flit absolutely is the thing that you need to look at because, for simple packages, it automates almost everything that you need to do.

33:25 It's by Thomas Kluyver.

33:27 He's very active in the Python scientific community.

33:30 And yeah, I think it's just awesome.

33:33 I'm using Flit at the moment for several of my own smaller projects.

33:37 Yeah, it's cool.

33:37 So if you want to submit something to PyPI, you have to create a setup.py with a lot of various settings, you know, set the license in the right way so people can discover it and who's the author and where's the documentation and what version it is, all those kinds of things.

33:53 And if you install Flit, you can basically say, I'd like to initialize this package and it just lets you, it basically takes you through a Q&A and then it generates the things it needs to upload your package, right?

34:03 That's right.

34:04 Yeah.

34:04 And the Q&A is pretty short.

34:05 I think it's four questions or something like that.

34:07 Another good tip is the Cookiecutter project by Audrey Roy Greenfeld.

34:13 And there's a Cookiecutter template for creating a skeleton for a Python package.

34:18 And it's quite eye-opening when you run Cookiecutter and you see how many files it creates in a folder.

34:24 There's a MANIFEST.in and there are several other extra files that are used just to create and publish your package.

34:30 Whereas Flit does away with all of that.

34:32 You've really just got the flit.ini.

34:33 Nice.

34:34 And you can get your package on PyPI.

34:36 And it's quite, quite simple stuff in the INI file.

34:40 It's not outrageous, right?

34:41 Yeah, exactly.

34:42 Nice.

34:43 And so then you can say things like Flit wheel upload, and it'll just take whatever active

34:48 package you happen to be in with the version specified in the files and just package it up and send it, right?

34:53 Yeah, exactly.

34:54 I haven't tried Flit yet for packages with extensions.

34:58 So, yeah, I don't want to say that it can do that as well, because I just haven't tried that myself.

35:03 But that's something that I do want to dig into as well.

35:05 Yeah, absolutely.

35:06 Okay.

35:07 Well, another thing that is very common is to create some kind of shell utility or app that has some kind of terminal output.

35:16 And there's not a lot of facilities in the standard library for, like, color output and nice sort of ANSI-style graphics, if you will.

35:27 So, one of the things you talked about is Colorama, which I thought was pretty cool.

35:31 I've looked at it a few times.

35:32 Yeah, I feel very strongly about Colorama.

35:35 And the reason is because we, generally speaking, we write software for people, for other people or for ourselves.

35:43 And you see the output from software, particularly in the terminal.

35:46 You have to deal with that a lot.

35:47 And I think that making that output friendlier, easy to read, and easy to understand in context, for example by using green for good and red for bad, makes it a lot easier to use programs, really.

35:59 If we have to write software that works in the terminal as opposed to writing graphical user interfaces, there's no reason why we can't make that output appear better.

36:08 The prompt toolkit is another good library for making interactive user interfaces in the terminal.

36:15 And I didn't cover that in the book, but I think later we come to the ptpython interpreter, an alternative interpreter.

36:22 So, we'll get to that later.

36:23 But that is part of this.

36:24 The use of color, I strongly believe, can help to make better user interfaces on the command line.

36:30 Oh, I totally agree with you.

36:31 Yeah.

36:32 What makes Colorama so great is that they completely abstract away, again, platform differences.

36:37 So, your code that uses Colorama will use the correct ANSI codes in a bash shell, but when you're running it in a Windows command prompt,

36:45 it will also use the correct color codes for that environment.
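To make "ANSI codes" concrete, here is a standard-library-only sketch of the raw escape sequences involved. It deliberately does not use Colorama. It only shows the kind of low-level detail that a library like Colorama wraps and makes portable, and the colorize helper is hypothetical.

```python
# Standard ANSI SGR escape sequences (not Colorama's API).
GREEN = "\033[32m"
RED = "\033[31m"
RESET = "\033[0m"


def colorize(text, ok):
    # Green for good, red for bad, then reset so later output is unaffected.
    color = GREEN if ok else RED
    return f"{color}{text}{RESET}"


good = colorize("tests passed", ok=True)
bad = colorize("build failed", ok=False)
print(good)
print(bad)
```

On a terminal that understands ANSI sequences this prints one green and one red line. On a console that does not, the raw codes show up as stray characters, which is the platform difference being discussed.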

36:47 So, I think that's really powerful.

36:49 You're not really committing to a particular platform by using Colorama.

36:52 And it is a well-maintained package that I think support goes back to 2.64 and they include 3.5 as well.

36:59 Nice.

36:59 And you said also that you recommended colorlog as a way to add coloring to your log messages.

37:05 So, like, warning is one color, error is another, and so on.

37:09 Yeah, exactly.

37:10 And it's two or three lines, and you get that functionality, and all your existing logging messages will just get those colors.

37:16 It's a really easy drop-in replacement just to make sure that you have colorization for all the different logging levels of your logging messages.

37:23 Yeah, I think it's great.

37:24 You see something red go by, you know, obviously.

37:26 Pay attention, right?

37:27 It's great.

37:28 Exactly, yeah.

37:29 Or bold red, I see, for critical.

37:31 Yeah, absolutely.

37:33 So, another thing that you talked about on the terminal, the CLI, is accepting arguments.

37:41 So, built-in, we have argparse, but there's maybe some better ways.

37:46 And one of the ways you recommended was the Begins library.

37:48 Yeah.

37:49 What's Begins?

37:50 Begins is a library that I first heard about at PyCon Australia in 2014.

37:54 The author, Aaron Iles, gave a very strong demonstration of Begins.

37:59 And it struck me at that time how much you can really do with Python if you exploit all the features of the language that are available to you.

38:08 So, the point that I made in the book was that Begins, just from the perspective of an API design, is extremely aggressive with exploiting everything that Python provides to you.

38:19 For example, Begins uses the parameter annotations in your function definitions as the help text for each of your parameters, so that you don't have to add that anywhere else.

38:30 And I really like the way the Begins API was designed to give you as much functionality as possible for as little input from you.

38:38 I like that trade-off very much.

38:40 Yeah.

38:41 It's really nice.

38:41 What I have heard from many people, though, is that they much prefer a slightly more rigorous specification format like what you can get now in the click library.

38:52 And docopt also gets a lot of love, which is another way of creating your command line interface by, not docopt, I forget the name now.

39:00 But there's another library where you can write out the help message of your CLI tool.

39:07 Yeah.

39:07 And it'll do it in reverse.

39:08 It's basically the reverse of Begins.

39:10 I think that is docopt.

39:11 Yeah.

39:12 Oh, okay.

39:12 It is docopt.

39:13 Yeah.

39:13 So, it's the reverse of Begins.

39:15 You write out your help message that will be printed when the user types help, and then it infers what all your parameters are.

39:20 That is also fairly popular.

39:22 Even so, I have found that for my own small scripts, Begins gets me going much, much faster.

39:27 And even subcommands are very, very easy to enable.

39:32 Right.

39:32 So, basically, you have some method.

39:34 You want to give some kind of CLI to it.

39:36 It takes some parameters, and you just give it a decorator or a subcommand decorator.

39:41 And now it is accessible, and it's part of the help text and all that.

39:44 That's correct.

39:45 Yeah.

39:45 Excellent.

39:46 We all love Python for its tremendous productivity benefits, but getting the best performance takes some work.

40:06 What if you could get out-of-the-box, easy access to high-performance Python?

40:11 Intel distribution for Python developers delivers just that.

40:14 Get close to 100 times better performance for certain functions when using NumPy, SciPy, scikit-learn, linked with optimized native libraries like Intel Math Kernel Library, access efficient multi-threading, and Python projects like Numba and Cython.

40:28 Try the Intel distribution for Python and experience performance today at talkpython.fm/Intel.

40:34 And profile your Python and native C, C++ applications for performance hotspots with Intel VTune amplifier.

40:42 With Intel, it's all about performance.

40:52 All right.

40:53 All right.

40:53 So let's move into the GUIs, the graphical interfaces.

40:57 And one of the first things that you talked about is creating interactive, dynamic graphs and things like that.

41:05 And while Matplotlib plays a big role there, you also talked about PyQtGraph.

41:12 Why did you pick that over, say, Matplotlib?

41:14 The primary reason why I have selected PyQtGraph over Matplotlib is for interactivity.

41:20 It's hard to imagine that you could have a highly performant charting library for Python, but that is exactly what PyQtGraph is.

41:27 It's based on Qt as the widget toolkit that runs in the back end.

41:31 But the interactivity is really good.

41:34 You can have graphs that draw spectra running at 50 frames a second quite easily.

41:39 And you can drag and zoom and pan all the while the animation is happening.

41:43 So you could have a data stream where you're plotting the data as it comes through live.

41:48 Oh, wow.

41:48 Whereas with Matplotlib, that degree of interactivity is not really there.

41:51 I see.

41:52 It's not because of a lack of ability.

41:54 It's because Matplotlib has been designed towards producing publication-ready type charts in a similar way to what Matlab's charting facilities were designed.

42:04 Whereas PyQtGraph has been approached with a whole different use case in mind.

42:09 So in my chemical engineering work, PyQtGraph has been very valuable for me to be able to plot live data and then examine it in real time, pan and zoom and move my moving data sets around.

42:21 That's awesome.

42:22 Yeah, if that was all that PyQtGraph provided, that would already be enough.

42:27 And that was my largest use case for it.

42:29 But it has a fairly feature-complete widget library in the background that lets you plot, not plot, but create widgets on the fly for arbitrary Python data structures.

42:40 So you can get input cells and sliders and so on that can manipulate your data.

42:44 And PyQtGraph provides all of that as well.

42:47 Yeah, that's excellent.

42:47 Yeah, if you want to embed some kind of like live data thing into your app, it sounds really cool for that.

42:52 Yeah, it definitely is a good choice.

42:54 And especially if you're already using PyQt, PyQtGraph is a drop-in replacement.

42:58 Yeah.

42:59 You can add its chart windows as a widget inside your existing PyQt app.

43:03 Yeah, that's really great.

43:04 And there's a lot of interesting talk around PySide coming back to the same company that does Qt.

43:12 And that's, yeah, it sounds like it's going to be a real, like this is a vibrant growing area.

43:18 So that's great.

43:19 Yeah, I've got my eye on the resurgence of PySide as well.

43:22 Yeah, cool.

43:23 Yeah, I'm totally, totally excited for that.

43:25 So then the next thing that you talked about was one way to build your apps is using these graphical frameworks.

43:32 But a very popular one, even using CSS front-end frameworks like Bootstrap and stuff, is web development.

43:38 So there's another interesting library that lets you have Python logic in a desktop application,

43:45 but actually presents the user interface through a GUI.

43:48 You want to tell us about that?

43:50 Yeah, so I also included PyWebView, which is something that I found while doing the research for the book.

43:55 It's not something that I had used before.

43:57 But I was blown away that such a powerful tool exists and is not better known.

44:03 Most people know about the Electron framework and the Chromium Embedded Framework,

44:07 which can also make these desktop apps that rely on the WebKit engine to provide the visualization layer.

44:14 The interesting thing about PyWebView is that it doesn't require you to bundle something like Electron with your app.

44:20 It just uses the native browser.

44:22 I see.

44:22 Which is really amazing.

44:23 That is amazing.

44:24 Is it cross-platform?

44:25 Yeah, it's cross-platform.

44:26 So it will use Internet Explorer or Edge on Windows, and it'll use the WebView widget on OSX, which is what powers Safari,

44:35 and then on Linux it will use whatever is native there.

44:37 Interesting.

44:38 Yeah, I may be pushing up to the edge of the things I've actually played with.

44:42 But the Electron stuff, in order to implement that and put the logic in it, that's JavaScript, right?

44:52 Correct.

44:53 So if you want to have, well, no, not necessarily.

44:56 So if you just imagine a normal web application where you have your front-end layer that runs what we would say as in the browser with your CSS and your HTML and your JavaScript,

45:08 and then you have your back-end layer, which provides an API that receives REST calls or whatever, which we would typically write in Python.

45:15 You do exactly the same thing, but you just package it in one bundle as a desktop application.

45:19 Nice.

45:19 So from the perspective of the viewer, with PyWebView in particular, you don't really know that you're in a browser because you don't have all the same features and trimmings and buttons and menus that you get in a browser.

45:29 All you really get is the WebView window, and then you get to put in there whatever you want.

45:33 But you can power that with all these same technologies.

45:35 So you can specify the layout of your screen using HTML and style it with CSS.

45:42 And if you want some interactivity in the graphical layer itself, then you would have to write that with JavaScript as you would with a normal web application.

45:49 But your Python, that can provide much of the logic and back-end processing of what your user interface is advertising, can also run alongside that application on your desktop.

46:01 So from the perspective of a user of such an application, they would be oblivious to the fact that Python was being used at all.

46:07 Yeah, it looks really interesting.

46:08 And I kind of prefer CSS and HTML for GUI design.

46:13 So I may have to try this out.

46:15 Definitely worth looking into.

46:17 I definitely recommend it.

46:18 In the example that I used, I also used another Python library called Dominate, which allows you to create the HTML DOM and structures within the DOM directly from inside Python code.

46:29 But that was just me being too cute, I guess.

46:32 Just because you can.

46:32 You can also.

46:33 Yeah, exactly.

46:34 You can just write your HTML and CSS out as you normally would and load that.

46:38 And that works fine.

46:38 Right.

46:39 So you would.

46:40 Could you do something like have a Chameleon or Jinja2 template or something like that and pull that in?

46:47 Yeah, absolutely.

46:48 Okay.

46:48 Definitely.

46:49 No question.

46:49 And the big benefit of PyWebView over using Electron is, again, that you don't have to distribute a fairly large browser engine alongside your app.

46:59 Yeah.

47:00 Excellent.

47:01 If you can find a way to bundle just the Python parts of your app, when you run it, it will use the native web widget of your target operating system.

47:10 Excellent.

47:11 Okay.

47:12 I really like that one.

47:13 And I definitely want to have a look at it as well.

47:16 So moving on to sort of the systems management, system tool stuff.

47:21 The first one you brought up was an example of something I was trying to do in one of my online classes that I was building.

47:27 And I'm like, oh, why is this so hard in the built-in process stuff?

47:34 And that's about managing processes with psutil.

47:36 Yeah.

47:37 So worry no more because there's a library called psutil that does everything you could possibly want in terms of accessing information about the system and more.

47:47 I have a feeling that psutil is going to be bad for business for many monitoring companies, server monitoring frameworks, because it's so easy to run psutil in a daemon on your server and get it to send information back to you about which process is consuming how much memory.

48:03 If your particular application is misbehaving in some way or if the system itself has started to behave differently from how it is supposed to operate, psutil makes all of that really easy to detect.

48:14 I reckon in a day or two, you could probably whip something up that can give you as good performance monitoring as what you could get from a cloud provider currently.

48:21 Okay.

48:22 Wow.

48:22 Yeah.

48:23 You can ask for things like, what's the CPU percent on the system?

48:27 And if you have like eight cores, it'll give you an array of eight floating point numbers that are percents.

48:32 And you can say, what is the current process that I'm in?

48:34 How much memory is it using?

48:35 And things like that.

48:36 It's great.

48:37 Yeah.

48:37 And you can access all the other processes as well.

48:40 You can get similar information from all of them, from everything that's operating on your system.

48:44 Yeah.

48:45 Nice.
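A minimal sketch of the psutil calls described above: per-core CPU percentages, memory used by the current process, and the same kind of information for every other process. The `snapshot` helper and its `top_n` parameter are hypothetical names for illustration, not something from the book.

```python
import psutil


def snapshot(top_n=5):
    """Collect a small system snapshot (hypothetical helper for illustration)."""
    # One float per logical core; psutil samples CPU usage over 0.1 seconds.
    per_core = psutil.cpu_percent(interval=0.1, percpu=True)

    # Memory used by the process we are running in (resident set size, bytes).
    own_rss = psutil.Process().memory_info().rss

    # The same kind of information for everything else running on the system.
    procs = [
        proc.info
        for proc in psutil.process_iter(["pid", "name", "memory_percent"])
    ]
    # memory_percent can be None when access is denied, so treat that as 0.
    procs.sort(key=lambda info: info["memory_percent"] or 0.0, reverse=True)

    return {"per_core": per_core, "own_rss": own_rss, "top": procs[:top_n]}


if __name__ == "__main__":
    snap = snapshot()
    print("CPU percent per core:", snap["per_core"])
    print("This process uses", snap["own_rss"], "bytes")
    for info in snap["top"]:
        print(info["pid"], info["name"], info["memory_percent"])
```

Running something like this on a timer inside a daemon is roughly the do-it-yourself monitoring idea mentioned in the conversation.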

48:46 So that was for watching processes and system stuff.

48:50 Another thing that people often have to do is they have to watch a directory for when a file either changes or a new file arrives.

48:56 Like somebody's uploaded some new CSV file.

48:59 We've got to ingest it and do work on that.

49:01 And you talked about a thing called watchdog for that.

49:05 Yeah.

49:05 Have you used that before?

49:06 I've not, but it sounds really cool.

49:08 And like the others you brought up, it's very nice that it's cross-platform, even though that implementation is quite different on the different OSs.

49:15 Yeah, that's right.

49:16 And just like psutil, it abstracts away some fairly complex work into a very nice, very easy to use API.

49:23 That is, again, cross-platform.

49:25 One of my requirements for selection in the book throughout is that every library had to work in a fairly easy way on all the three big target platforms.

49:35 So they should all be cross-platform.

49:38 And watchdog probably does the best job of hiding platform differences away because these notification systems are quite different on each of the target platforms.

49:48 And luckily, you don't have to worry about that whatsoever.

49:51 It completely hides away those differences.

49:53 And as you said, it gives you a way to monitor a particular directory for any changes.

49:57 Yeah, it's really nice.

49:58 So you just create a class deriving from some built-in monitor event handler type of thing.

50:03 And you say, call this function when a file is created or call this one when a file is modified.

50:08 And then you can just tell it to start observing.

50:11 And it actually does that in the background on a background thread, right?

50:14 Yeah, that's right.

50:14 Yeah, very cool.
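The pattern described here can be sketched roughly as follows: subclass watchdog's `FileSystemEventHandler`, override the callbacks you care about, and let an `Observer` watch the directory on a background thread. The class name `NewFileHandler`, the `watch` function, and the `incoming` directory are assumptions for illustration.

```python
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer


class NewFileHandler(FileSystemEventHandler):
    """Record created and modified files (hypothetical example handler)."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def on_created(self, event):
        # Called by the observer thread when something appears in the directory.
        if not event.is_directory:
            self.seen.append(("created", event.src_path))

    def on_modified(self, event):
        # Called by the observer thread when an existing file changes.
        if not event.is_directory:
            self.seen.append(("modified", event.src_path))


def watch(path, seconds=10.0):
    """Watch `path` for `seconds`, then return the events that were seen."""
    handler = NewFileHandler()
    observer = Observer()
    observer.schedule(handler, path, recursive=False)
    observer.start()  # observation runs on a background thread
    try:
        time.sleep(seconds)
    finally:
        observer.stop()
        observer.join()
    return handler.seen


if __name__ == "__main__":
    # Assumes a directory named "incoming" exists next to this script.
    for kind, src_path in watch("incoming"):
        print(kind, src_path)
```

In a real ingestion job the callbacks would hand the path to whatever processes the uploaded CSV file instead of appending to a list.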

50:15 So the other one you talked about, you alluded to before, is ptpython.

50:23 Which I've not played with this, but I'm thinking this is getting installed.

50:26 Because this is, I really don't love the REPL that much, the built-in one.

50:31 But this is cool.

50:32 I need to check this out.

50:33 So tell us about it.

50:34 Yeah, so ptpython is based on another Python library called Prompt Toolkit, which is a toolkit for making user interfaces in the shell or in a command line view.

50:45 And ptpython is a replacement Python interpreter, but it's supercharged for editing and editing history and bringing back previous functions and changing them.

50:55 And it has color support and a whole bunch of other features as well, which I could not get to in the discussion.

51:02 Pretty much the first thing that I install after updating pip and setuptools in a new virtualenv is ptpython.

51:08 And that's the interpreter that I use for doing any of that interactive kind of work.

51:12 Yeah, it makes a ton of sense.

51:13 Like, for example, if you, one of the things that drives me crazy in the REPL is I'll type out like a function or an if statement or a loop more likely.

51:22 And then I'll either make a mistake or I want to run it again slightly differently.

51:27 And then you've got to up arrow.

51:28 Like, okay, I know I'm going to up arrow five times and hit enter and then like sort of unroll the history so I can get back.

51:35 And then I got to remember the line I changed.

51:36 And like this one, if you say, I want to go back to some multi-line thing I worked on,

51:40 it actually pulls up the multi-line thing right there, which already makes it worthwhile.

51:46 Plus the color and the auto-completion and all that.

51:49 That's great.

51:49 Yeah, that's right.

51:50 And so when you press up arrow and you get that multi-line statement that you did earlier,

51:53 I use the VI key bindings.

51:56 And that all works.

51:57 I can go to the top of the line, go down.

51:58 I can DD to delete a line or yank and paste.

52:02 So if you're used to Emacs, they have Emacs key binding support.

52:05 And if you're used to VI, you can enable VI key binding support.

52:08 And you get the power or much of the power of those keystrokes and commands inside every single line that you edit and enter inside ptpython.

52:18 Which is so much better than the built-in.

52:21 Yeah, that's fantastic.

52:22 If you run in a split screen in your terminal where you have, for example, your editor in the top half and a command line on the bottom half.

52:28 If you run ptpython in the bottom half, what's really interesting is if the key bindings match the editor that you're using,

52:34 you almost begin to feel like you're working in one environment because the key bindings work in your editor.

52:40 And then when you jump to the REPL, it works there as well the same way.

52:43 So that's really nice.

52:44 I work like that almost continuously.

52:46 Oh, that's really nice.

52:48 Yeah, I like it.

52:49 I'm definitely going to install it and check it out.

52:50 The next thing is moving on to the web APIs and HTTP services and so on is something I had not heard of,

52:58 but it's very nice.

53:00 It's called Hug for building APIs.

53:02 Yeah.

53:03 So just like we discussed earlier with Begins, what I really, really liked about Hug is how they try to maximally exploit the features of Python

53:12 to make as simple an interface as possible for you as a programmer to implement an API.

53:17 I have had experience before with the Django REST framework, which is an awesome industrial strength, very well designed, very sturdy and robust REST framework.

53:28 So I recommend that one strongly.

53:30 Flask also has a good REST framework.

53:32 Those are not bad choices at all.

53:33 But I did have the impression that very few people knew about Hug.

53:37 And for simpler kinds of applications, I think Hug makes it extremely easy to get a REST interface up.

53:43 Yeah.

53:44 And it's very service oriented, right?

53:47 It's not looking like some web framework that also happens to allow machines to talk to it and return JSON.

53:56 Like, for example, to take a function and make it return JSON for GET requests, you just say @hug.get and you give it that decorator, or a post or whatever, right?

54:08 And you make it an API.

54:11 Yeah, exactly.

54:12 And you're done pretty much.

54:13 And you get the documentation because it auto generates that from your function declarations.

54:17 And versioning is also pretty easy to add, which I had in the later section.

54:22 Yeah.

54:22 So basically, if you make a request to the base URL for the host that's running the Hug service, it will actually describe all the services and how you talk to them and what's the inputs, the outputs, everything.

54:35 And like you said, you can put versioning on it.

54:37 So basically, you don't have to go and change everything about your methods and try to somehow bolt versioning on.

54:47 You can just say in your decorator, this is for version two of the API.

54:51 It also does argument conversion and stuff like that, right?

54:55 That's right.

54:56 Yeah.

54:56 Nice.

54:56 So, cool.

54:57 It helps in the documentation, I guess, as well.

55:00 If you say, here's an integer and its name is this, like the documentation can say, hey, it takes an integer called this.

55:04 Nice.

55:05 Yeah.

55:06 Yeah, absolutely.

55:07 Documentation, particularly for things like this, it's really a pain to write by hand.

55:12 And no one should ever do that.

55:14 Definitely, you want to use a tool that makes it really easy to produce documentation and to keep the documentation up to date.

55:20 That's the key part, right?

55:21 Keep it up to date because it's easy to create it and then just leave it.

55:25 You know what?

55:26 Well, I guess that changed.

55:28 Yeah.

55:28 Sorry, that documentation was wrong.

55:30 Yeah.

55:30 Nice.

55:32 Okay.

55:32 So, one of the things that's pretty challenging, I think, let me rephrase that, is more challenging than I think it should be, is working with dates in Python.

55:40 And so, you have some cool libraries to work with that that you've found.

55:44 That's right.

55:44 So, the first option that I had there is not that unknown, I guess.

55:48 Many people who have had to deal with dates and times have used Arrow for several years now.

55:53 And the key thing about Arrow, or at least the key thing for me, I guess, is that it does away with this idea of having naive date times and so-called aware date times.

56:03 Aware date times are date time objects that carry with them the time zone that they apply to.

56:08 And naive date time objects do not have the time zone information attached.

56:13 And, yeah, things get really out of hand if you start mixing and matching those without an awareness of what you're doing.

56:19 And just by using Arrow, because it uses aware date time objects everywhere, simply by using Arrow, it means that you can avoid a certain class of problems where you're mixing up dates and times incorrectly.

56:31 Yeah, and you run into weird problems.

56:32 Like, if you try to subtract two date times, normally you get a time delta.

56:37 But if one of them is time zone aware and one's not, then it will crash, right?

56:42 So, no, you can't subtract these.

56:43 Yeah, that's right.

56:45 And worse is when you don't get crashes and you do arithmetic operations and the results that you're getting are not what you think you're getting.

56:52 For example, in one part of your code base, you might call the now function.

56:56 So, datetime.now, and then you get a datetime object.

56:59 And in a different part of your code base, you call a very similarly named function called utcnow.

57:04 The problem is that the one gives you the time as it is in the UTC time zone, but without a time zone object attached.

57:12 And the first one gives you the time as it is now, but in your local time zone.

57:15 And the problem is that as a programmer, depending on the context of the code, you may perceive those two values to mean literally this moment in time right now.

57:24 But the values are vastly different.

57:26 They're obviously different by the extent of the time zone difference.

57:29 And so, when you do operations on them, you get very strange results.

57:32 Or results that seem strange to you because of the assumptions you've made about what now actually means.

57:37 Right.

57:38 So, by using aware datetimes, you don't have those problems anymore.

57:41 The time deltas that you obtain by doing operations between these are always correct.

57:45 Yeah.

57:46 So, even if one is from .now and others .utcnow, it knows to normalize those to some common time zone before it does math.

57:57 Right?

57:58 Like, if it wants to look how far apart they are, I would say, no, those are actually, you know, either the same or like one millisecond apart or something like that.

58:05 Yeah, absolutely.

58:06 Nice.
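The naive versus aware problem can be reproduced with the standard library alone. This sketch assumes a machine whose local time zone is UTC+2, modelled with a fixed offset so the result does not depend on where it runs. The naive values are built with `replace(tzinfo=None)`, which mimics what `datetime.utcnow()` and `datetime.now()` return.

```python
from datetime import datetime, timedelta, timezone

# One real instant, expressed as an aware datetime in UTC.
aware_utc = datetime.now(timezone.utc)

# The same instant in a pretend local zone two hours ahead of UTC.
plus_two = timezone(timedelta(hours=2))
aware_local = aware_utc.astimezone(plus_two)

# Naive versions: same clock readings, but the zone information is dropped.
naive_utc = aware_utc.replace(tzinfo=None)      # like datetime.utcnow()
naive_local = aware_local.replace(tzinfo=None)  # like datetime.now()

# No crash here, but the answer is wrong: both values mean "right now".
naive_gap = naive_local - naive_utc

# Aware values are normalized before the subtraction.
aware_gap = aware_local - aware_utc

# Mixing the two kinds is the case that raises.
try:
    aware_utc - naive_utc
    mixing_failed = False
except TypeError:
    mixing_failed = True

print(naive_gap)      # 2:00:00
print(aware_gap)      # 0:00:00
print(mixing_failed)  # True
```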

58:07 So, that's coming from the universe into now.

58:10 Into, you know, bringing in the time and working with it.

58:15 The next library you talked about is about pulling time that's been saved already into a bunch of different formats and processing that.

58:22 Because parsing time can be super challenging.

58:25 I was talking on the previous episode with Anna Schneider.

58:28 They were pulling together data sources from all these different utilities.

58:31 And they said they have many, many different formats for time.

58:35 That there are over 700 different formats for time out there.

58:38 So, trying to just like deal with all that stuff is super painful.

58:41 So, parsedatetime, which is what you talked about, really actually does an amazing job of that.

58:47 Even for human type stuff.

58:48 Yeah, that's right.

58:49 And this is another library that I discovered while doing the research for the book.

58:52 I had not used this one before.

58:53 I tried several libraries like this, but I was amazed.

58:56 In the section of this book, I give some examples where parsedatetime is used to parse fairly typical looking datetime strings.

59:05 But in the second half of the section, there's much more natural language type string sets that it also parses and does really well.

59:13 I had a lot of fun doing the section because I tried to find ways of writing my statement of what day it was in very different ways, in very strange ways.

59:20 And it seemed to get all of them.

59:22 Right.

59:23 Yeah.

59:24 The last option that I had in my list was a string that said two weeks and three days in the future.

59:29 And parsedatetime correctly parsed that.

59:31 I know.

59:32 It's so amazing.

59:33 Pretty amazing.

59:33 Like when I had in mind of what would work, you know, you'll say, look, you can give it like 2016-07-16 or 7-16-2016.

59:43 These types of things.

59:44 And then it'll actually parse those all correctly.

59:46 But then you started to get more interesting.

59:49 And you said like yesterday, 10 minutes from now, three days ago.

59:53 And it just totally got all this.

59:54 And then you got to the most outrageous one, like you said.

59:56 Two weeks and three days in the future.

59:58 That's awesome.

59:59 Yeah, it's pretty cool.

01:00:00 I would really want to use this in an upcoming project.

01:00:03 I just need to find the right project.

01:00:04 Yeah, it's really good.

01:00:07 I have my PyPI package and I'm looking for the project in which to use it.

01:00:10 Exactly.

01:00:11 And I think I confused Arrow with parsedatetime.

01:00:15 Arrow has the ability to give you like human relative time.

01:00:21 So you can say on any arrow time, you can say humanize and it'll say just now, ask it again a little bit later.

01:00:27 It'll say seconds ago or two hours ago or two hours in the future or something like that, which is really nice.

01:00:34 Yeah, that's awesome.

01:00:35 And the multilingual support is really good as well.

01:00:37 I see that as being hugely valuable in web services and web development.

01:00:42 No, I totally agree.
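A brief sketch of the two Arrow behaviours covered in this part of the conversation: values are always time zone aware, so arithmetic across zones is safe, and `humanize` produces relative phrases. The variable names and the choice of `US/Pacific` are arbitrary.

```python
import arrow

# Arrow objects always carry a time zone.
utc_now = arrow.utcnow()
pacific_now = utc_now.to("US/Pacific")

# Same instant in two zones, so the difference is zero.
gap = pacific_now - utc_now

# Relative, human-readable descriptions.
two_hours_back = utc_now.shift(hours=-2)
relative_to_reference = two_hours_back.humanize(utc_now)  # measured from utc_now
relative_to_clock = two_hours_back.humanize()             # measured from the real clock

# The same phrase in another locale, here German.
german = two_hours_back.humanize(utc_now, locale="de")

print(gap)
print(relative_to_reference)
print(relative_to_clock)
print(german)
```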

01:00:43 Then the last part that you looked at, you said, okay, let's, these are all very purpose focused packages that we've talked about.

01:00:51 Parsing date, time or scheduling something to recur every so often.

01:00:55 But there's a couple of general purpose libraries that you talked about.

01:00:59 And the first one, Boltons, is from a multi-time guest on the show, Mahmoud Hashemi.

01:01:05 And he put that out there from the guys at PayPal, which was great.

01:01:09 So you want to talk a little bit about what's good with Boltons?

01:01:11 Yeah, for sure.

01:01:12 The first thing, though, is that what's quite interesting to me, if you compare Python to some other languages,

01:01:17 is because the standard library is so big and it covers a lot of ground,

01:01:22 it's quite rare to find general purpose libraries in the Python ecosystem.

01:01:25 I thought that was quite interesting.

01:01:27 Boltons is one of the few ones that I did manage to find, where the intention is literally just to be a general purpose library for use in very, very different spheres.

01:01:37 Most of the packages that you get on the package index are dedicated towards a singular purpose, usually,

01:01:43 to perform some function, like all the other libraries that we've looked at.

01:01:47 So I really thought that was interesting in doing the research, that there are not very many general purpose libraries.

01:01:52 And my conclusion is that that must be because the standard library covers already most of the general purpose type of things that you need to do.

01:01:59 I agree that that's probably true.

01:02:00 I wonder if another reason, a secondary reason, is that it's pretty easy to bring in a bunch of small libraries.

01:02:09 It's not like you've got to download and get the header files and the lib files and the right, you know,

01:02:14 just statically linking and all that kind of stuff that you have to normally deal with.

01:02:17 It's if you just pip install and import a bunch of stuff, you're good to go.

01:02:21 So maybe it's also easier to have small libraries possibly.

01:02:25 But yeah, I think you're right that because a lot of stuff is built in, a lot of people maybe put their energies towards fixing built-in stuff.

01:02:33 And it's just, you know, it's a 25-year-old standard library, right?

01:02:36 It's pretty polished at this point.

01:02:38 I think you make a lot of sense.

01:02:39 It also makes a nice parallel with the Node ecosystem where, similarly, there aren't too many general purpose libraries.

01:02:45 And that's probably because it's so easy to bring in a lot of smaller libraries to make up the feature set you require.

01:02:50 Yeah, absolutely.

01:02:52 So maybe we could just really quickly touch on a few of the things.

01:02:54 So one of the things that's in there that's pretty nice is the caching functionality.

01:02:59 So you start out with cacheutils.

01:03:01 That's right.

01:03:02 So the killer feature of the cache functionality in Boltons is the way that you can share a cache among multiple function calls.

01:03:09 It's not that easy to do with the LRU cache that you get in the standard library.

01:03:13 It's in the functools module.

01:03:16 So you have to import functools.lru_cache.

01:03:19 The one that you get in Boltons is very easy to share amongst many different function calls as a decorator.

01:03:25 That is the main attraction for me to use the cache in Boltons versus the one in the standard library.

01:03:29 The LRI cache is also kind of interesting.

01:03:32 I had to stretch a little bit to come up with an application to use both caches in the same code base.

01:03:38 It's definitely good to keep an eye on the LRI cache.

01:03:42 Yeah.

01:03:42 Okay.

01:03:43 And let's see.

01:03:44 There was some other stuff that I thought was pretty interesting in there.

01:03:46 One of them you had talked about was the atexit function, which I'd never used.

01:03:51 Yeah.

01:03:52 The atexit function is pretty neat.

01:03:54 You can basically set something up to run when it exits.

01:03:58 Right.

01:03:58 Yeah.

01:03:58 So if you want to save some data structures and reload them at startup, you can just register these.

01:04:03 Here's the shutdown functions to make sure you run.

01:04:05 That's cool.

01:04:05 Yeah, exactly.
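The save-at-shutdown, reload-at-startup idea can be sketched with the standard library's `atexit` module. The function names, the JSON format, and the `state.json` file name are assumptions for illustration.

```python
import atexit
import json
import os


def save_state(state, path):
    """Write the state dictionary to disk as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle)


def load_state(path):
    """Return the previously saved state, or an empty dict on first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def install(path):
    """Load the state and register a shutdown function that saves it again."""
    state = load_state(path)
    # atexit calls save_state(state, path) when the interpreter exits normally.
    atexit.register(save_state, state, path)
    return state


if __name__ == "__main__":
    state = install("state.json")
    state["runs"] = state.get("runs", 0) + 1
    print("This program has run", state["runs"], "time(s)")
```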

01:04:06 Some other stuff that was in there that was nice was the iterutils that would give you, like, windowed or chunked data.

01:04:12 So for example, you talked about displaying that data interactively using PyQtGraph.

01:04:20 You could maybe combine that with windowed_iter.

01:04:23 Use the windowed_iter behavior and take some sort of stream of data and always show the last 50 pieces of the data,

01:04:30 like in a few lines of code, right?

01:04:31 Yeah, absolutely.

01:04:32 For sure.

01:04:33 The chunking, chunked_iter and windowed_iter, in my opinion, they're much better than the recipes that the standard library gives in its itertools documentation,

01:04:41 which I see as a fairly clunky way of piecing together building blocks from itertools to get these same effects.

01:04:48 I think it would make a pretty good addition to the standard library to have these chunked_iter and windowed_iter functions.

01:04:54 Yeah, it's very possible that eventually some of these just become consumed into the standard library over time.

01:04:59 It's great.

01:04:59 You also have, there's also some nice debugging tools.

01:05:02 That was cool.

01:05:04 So you can say, yeah, pdb_on_signal, for example.

01:05:09 How would you use that?

01:05:10 Yeah, so you can attach this pdb_on_signal function inside your running application.

01:05:16 And then by default, a keyboard interrupt handler will automatically be added to your program so that when a crash does occur,

01:05:23 or when you, not a crash, but when you control C to stop your program, your program can stop at that point in a debug session.

01:05:30 So for example, with a long running loop, if it's taking too long and you're wondering whether the program is doing the correct thing,

01:05:36 or perhaps you suspect that it is no longer doing the correct thing, you can control C and you can get a debug prompt inside the loop wherever you sent the signal.

01:05:45 Yeah, that's awesome.

01:05:46 Just, that could be pretty handy in the right situation.

01:05:48 Yeah, if you're wondering what the heck is this process doing?

01:05:50 Is it stuck talking to the database, stuck talking to the web service?

01:05:54 Is it just broken?

01:05:55 Yeah, let's have a look, right?

01:05:57 Yeah.

01:05:58 Yeah, very nice.

01:05:59 Okay, so we're almost at the end, so we should wrap it up.

01:06:03 But the last major piece that you talk about in the general libraries is Cython.

01:06:07 Yeah, so this 20 Libraries book actually came about as a follow on from an earlier video screencast series

01:06:15 that I did for O'Reilly, which was on Cython.

01:06:17 It's a huge five and a half hour long set of 75 videos covering how to get into Cython and how to start using it.

01:06:25 That sounds great.

01:06:26 I'll be sure to link to it from the show notes for everyone.

01:06:28 Yeah, sure.

01:06:29 I'll give you the link.

01:06:30 And yeah, Cython, I think, has started now to gain some, some mindshare in the Python community, but not that many people are still using it yet, because it does introduce some things that are more complex than what you usually have to deal with in a Python package.

01:06:45 For example, compiling with C extensions.

01:06:46 However, Cython, among many other things, Cython can give the average Python programmer two key things that have, that have long been desired.

01:06:57 The first one is Cython can speed up hotspots inside your source code, easily by a factor of a hundred or more if we're talking about basic math computation.

01:07:06 A hundred times is not something to take lightly.

01:07:09 It's the difference between, you know, running for a hundred days or running for one day for a very big long running process.

01:07:15 And the second thing that Cython can give you, again, for the right kind of situation, which might be mathematical computation, is an easy way to run your threads on different CPUs without the global interpreter lock interfering in any way whatsoever.

01:07:28 Something that I tweeted just yesterday was, there seems to be a misconception in some circles that you need OpenMP support to use parallelization in Cython.

01:07:36 And that's not the case.

01:07:38 You can get pretty good parallelization just with normal Python threads, as long as inside your Cython functions where you want to enable that, you release the GIL.

01:07:46 Right.

01:07:46 And there's a way you can even do that with context managers, right?

01:07:49 You can say with nogil or something like that, right?

01:07:51 That's exactly right.

01:07:52 Yeah.

01:07:52 So for me as a Python programmer now in the situation that I'm in, the GIL is not really that big a problem for me.

01:07:59 It depends on the details of the situation.

01:08:02 But for heavy math computation, where I want to be able to access all the cores, and I simultaneously want my code to run faster than it normally might just with a plain CPython interpreter, Cython gives me both of those things in the same package.

01:08:14 It sounds really great.

01:08:15 I've not had a chance to do enough scientific work, but I can see it even being useful outside of scientific computational stuff.

01:08:24 For example, if you're writing, let's just say I'm writing some kind of ORM or something, and I'm spending a ton of time taking objects off the stream and the data layer and actually turning those into objects.

01:08:37 And just that processing there is like some big hotspot if I do a query that returns 100,000 records.

01:08:43 Maybe that loop could be written in Cython.

01:08:46 Is that right?

01:08:47 Yeah, absolutely.

01:08:47 So a good example is something that I've been doing at work for the past week, which is converting our protocol buffer code away from using Google's protocol buffer implementation to using a new tool called pyrobuf.

01:08:59 pyrobuf is itself written in Cython, and it generates a pyx file for you to use as your object implementation of the protocol buffer rather than just using your normal Python-based implementation of the protocol buffer.

01:09:12 I'm not going to go into what the protocol buffers are, but basically it's exactly what you were saying, object serialization that you shuffle between two places.

01:09:19 And Cython has been used to create the protocol buffer using pyrobuf.

01:09:23 But in addition to that, I use the object, the particular object that that process generates inside another Cython file, which I can then use directly with no overhead from the Python interpreter.

01:09:32 Yeah, that's great.

01:09:33 That's really cool.

01:09:34 All right.

01:09:35 So you've definitely given me a broader view of where Cython is applicable, and that's cool.

01:09:40 So let's round it out with one final awesome thing that you point out.

01:09:45 And that's a GitHub repository or project that is just a huge collection of stuff like this that people have found awesome.

01:09:55 Like here's all the awesome Python packages for OCR.

01:09:58 Here's the awesome ones for e-commerce and so on.

01:10:01 And that's awesome Python on GitHub.

01:10:02 Yeah.

01:10:03 Awesome Python is so awesome that many of the other language communities have now begun to copy it.

01:10:08 So you can also find awesome Go and awesome Ruby and many of the other variations.

01:10:15 Very cool.

01:10:16 Very cool.

01:10:17 So, Caleb, this has been really interesting.

01:10:20 I learned a lot from your book and not necessarily many of the pieces that we talked about, but maybe even the ones that we didn't get a chance to cover.

01:10:30 There's a bunch of other interesting packages that were used in conjunction with your demos that I'll leave just vague.

01:10:38 And people can go check out the book, which at least a little while ago you could get as a free e-book from O'Reilly.

01:10:44 I'll link to it.

01:10:45 But highly recommended.

01:10:46 I think it was time well spent to go through it.

01:10:49 So thanks.

01:10:50 Okay.

01:10:50 Yeah, my pleasure.

01:10:51 Yeah.

01:10:51 So let me ask you, as I always do everyone, but it's a bit of a bigger list to pick from.

01:10:58 What's your favorite PyPI package?

01:11:00 If you have one out of all the ones in the book that you would say, like, okay, this is the thing people should take away if they're not going to get the book.

01:11:06 My pick for the PyPI package is pretty much anything inside the BeeWare project.

01:11:11 I very strongly feel that the contributions that Russell is making are very positive for the Python community and they're forward looking.

01:11:18 The things that he's working on in the BeeWare project are things that we need to have happen in our community and our space.

01:11:24 So for anyone who's thinking about finding a project online that they want to contribute to, maybe get a little bit of experience, that is a great place to go.

01:11:32 There's the Toga framework, which is intended to be used as a way of writing platform-native graphical user applications in Python.

01:11:40 But there are a whole bunch of other smaller projects that you can adopt and get into and dive in and play with the details.

01:11:45 There are projects for running Python on iOS, projects for running Python on Android, and a bunch of different other features.

01:11:52 He also has a project for packaging up Python for deployment to target machines, which is another issue that many people in Python feel has been, I guess, under-addressed.

01:12:01 The issue of deployment.

01:12:02 Yeah, I definitely think it's under-addressed for desktop spaces, absolutely.

01:12:06 Absolutely.

01:12:07 For sure.

01:12:07 Or mobile, for that matter.

01:12:09 But anywhere but the web or just your shell, I guess.

01:12:12 Yeah.

01:12:13 Yeah, okay, very cool.

01:12:14 That's great.

01:12:15 Check that one out.

01:12:15 So, BeeWare.

01:12:16 The name of the project is BeeWare.

01:12:18 If you're looking for something to work on, you could do much worse than that project.

01:12:22 If you're writing some code, what editor do you use?

01:12:23 Yeah, that's a great question.

01:12:24 So, I've been doing this a while now.

01:12:26 And for most of those years, I've been using Vim.

01:12:28 And since January, I've started using PyCharm because the scales have tipped the balance for me.

01:12:35 And the features that PyCharm now provides outweigh what I can do to the best of my knowledge with Vim configuration.

01:12:42 So, yeah, I'm now writing my Python code in PyCharm.

01:12:45 All right.

01:12:46 You and me both.

01:12:47 I love that one as well.

01:12:48 That's great.

01:12:48 Yep.

01:12:49 All right.

01:12:49 Final call to action for listeners out there.

01:12:52 First of all, check out the BeeWare project and contribute to that if you're looking to write some code.

01:12:56 Anything else?

01:12:57 Get your book.

01:12:57 Where do they get it?

01:12:58 Oh, you can get the book at O'Reilly.

01:12:59 I think if you search for 20 Python libraries you aren't using, that should be enough.

01:13:02 Google will find it for you.

01:13:04 And of course, then there's my Cython course as well.

01:13:06 If you do want to get more into Cython, you can check out my course.

01:13:09 As far as I know, I think it's the only video course currently available for Cython.

01:13:13 But I might be wrong about that.

01:13:15 But that's something else to check out.

01:13:16 And then maybe the last thing I would mention is just as a general comment, sometimes you see on forums like Reddit and other places, there's a lot of dissatisfaction with some of the decisions that the core Python development team make regarding certain features in the language and what gets included and what gets excluded and so on.

01:13:32 And I would encourage people to follow the newsletters and the mailing lists to see a bit more about the discussions that go into these decisions.

01:13:40 The core team has many, many difficult and complex issues to deal with regarding the features that they include or exclude.

01:13:46 And before I started following the python-dev mailing list, I had the same thoughts about, you know, why was this designed that way?

01:13:52 Why didn't they include that?

01:13:53 Why is that not done?

01:13:53 But once you begin to follow the mailing list and you start to see the discussions and the complexities that they have to deal with, for example, in the pip project, that's another good example.

01:14:01 Once you begin to see the complexities that these teams are dealing with, you begin to understand why the decisions get made in the way that they do.

01:14:07 So I just want to make a point there that if anyone feels dissatisfied with what the core Python team has been doing, get involved and find out more about why the decisions are getting made in particular ways.

01:14:18 Yeah, I think that's great advice.

01:14:19 Certainly looking at how the trade-offs are being chosen is definitely important.

01:14:24 Thank you so much for sharing your book and all this research you did.

01:14:28 It's really helpful.

01:14:29 Sure, my pleasure.

01:14:30 I think people should, they should check out the book.

01:14:32 They'll definitely enjoy it.

01:14:32 Thanks for being on the show.

01:14:34 Yeah, thanks, Michael.

01:14:35 Great.

01:14:35 You bet.

01:14:35 Bye.

01:14:35 Okay, bye.

01:14:36 This has been another episode of Talk Python To Me.

01:14:41 Today's guest has been Caleb Hattingh.

01:14:43 And this episode has been sponsored by Capital One and Intel.

01:14:47 Thank you both for supporting the show.

01:14:48 Are you a data scientist or Python developer who loves data?

01:14:53 If you're looking for a place to work on data science with truly big data that can affect millions of lives, then head on over to jobs.capitalone.com.

01:15:01 And check out the wide range of jobs that Capital One is trying to fill right now.

01:15:07 The Intel distribution for Python delivers the high-performance Intel C libraries built right into Python.

01:15:13 Get close to 100 times better performance for certain functions when using NumPy, SciPy, and scikit-learn.

01:15:19 Check them out at talkpython.fm/Intel.

01:15:22 Are you or a colleague trying to learn Python?

01:15:25 Have you tried books and videos that just left you bored by covering topics point by point?

01:15:30 Well, check out my online course, Python Jumpstart by Building 10 Apps, at talkpython.fm/course to experience a more engaging way to learn Python.

01:15:39 And if you're looking for something a little more advanced, try my Write Pythonic Code course at talkpython.fm/pythonic.

01:15:46 You'll find the show notes and links from this episode at talkpython.fm/episodes/show/77.

01:15:53 Be sure to subscribe to the show.

01:15:56 Open your favorite podcatcher and search for Python.

01:15:58 We should be right at the top.

01:15:59 You can also find the iTunes feed at /itunes, Google Play feed at /play, and direct RSS feed at /rss on talkpython.fm.

01:16:09 Our theme music is Developers, Developers, Developers by Corey Smith, who goes by Smix.

01:16:14 Corey just recently started selling his tracks on iTunes, so I recommend you check it out at talkpython.fm/music.

01:16:20 You can browse his tracks he has for sale on iTunes and listen to the full-length version of the theme song.

01:16:26 This is your host, Michael Kennedy.

01:16:28 Thanks so much for listening.

01:16:29 I really appreciate it.

01:16:30 Smix, let's get out of here.

01:16:32 Stay tuned.

01:16:54 Don't put it on the ground.

01:16:55 Thank you.