
Awesome Python Applications

Episode #234, published Tue, Oct 15, 2019, recorded Tue, Sep 24, 2019

Have you heard of awesome lists? They are, well, pretty awesome, gathering up the most loved libraries and packages for a given topic.

While most lists cover awesome developer tools and libraries, we don't have many examples of awesome *applications*, both for use and as examples to draw from.

That's why Mahmoud Hashemi decided to create Awesome Python Applications, and you're about to dive headfirst into them!

Episode Deep Dive

Guest Introduction and Background

Mahmoud Hashemi is a seasoned Python developer currently serving as a principal engineer. In the past, he worked on large-scale Python deployments at PayPal, moved to a Series C startup, and then joined Simple Legal in Mountain View, California. Mahmoud is deeply involved in Python's open-source community, frequently giving talks at conferences and meetups. He is the creator of the Awesome Python Applications list, a collection of over 300 maintained and significant Python-based applications.

What to Know If You're New to Python

Here are some quick notes to help you get more from this conversation:

  • You’ll hear about the difference between Python applications and libraries, and why that distinction matters.
  • Look out for references to core Python packaging (e.g., setup.py and distribution challenges).
  • The conversation highlights Python 3 migration and how open-source projects handle major version upgrades.
  • GitHub and open-source etiquette (pull requests, commits, etc.) are also discussed, providing insight into how the community collaborates.

Key Points and Takeaways

  1. Why “Awesome Python Applications” Was Created: Mahmoud saw a need to showcase not just Python libraries but actual applications to inspire developers and serve as real-world references. He found that while “awesome lists” typically highlight packages and tools, there was a gap around fully shipped software. The curated list lets developers discover best practices and architectures at scale.

  2. 5-Point Rubric for Inclusion: Mahmoud applies a consistent checklist before including an application. It must be open source, actively maintained, use Python significantly, have an online repo, and be notable or prominent in its niche. These criteria ensure the projects remain relevant and valuable for the Python community.

    • Criteria: Open-source license, modern repo, active maintenance, well-known or stable usage, actual application vs. library.
  3. Distinguishing Applications from Libraries: The curated list focuses on shipped solutions rather than frameworks or libraries. For example, you might see a fully functioning CMS, but not just a templating library. This distinction helps developers see how larger teams solve packaging, deployment, versioning, and user-facing design.

  4. Huge Scale of Analysis: 19 Million Lines of Code. To understand trends, Mahmoud cloned every repo and analyzed 19 million lines of code and over 2 million commits. This dataset revealed patterns like Python 3 adoption rates, lines of code per project, and how many maintainers collaborate on each application.

    • Interesting Stat: ~30 GB of repository data from 250+ active apps.
  5. Python 3 Migration and Adoption: The research showed that around two-thirds of the listed applications already support Python 3, even though many began in the Python 2 era. Projects like Mailman and Reddit made significant transitions, demonstrating how major codebases can modernize. This also underscores that the Python ecosystem is steadily moving forward.

  6. Notable and Surprising Examples: Each highlights unique architecture decisions, from monolithic repos to multi-language strategies.

    • OpenEDX: A massive Django-based platform with over 300 committers and 51K+ commits.
    • Sentry: Over 26K commits and a million lines of Python, featuring BSD-3 licensing from a for-profit company.
    • Ganeti: A lesser-known cluster manager from Google mixing Python and Haskell.
  7. Multi-Lingual Codebases and Vendoring: Several apps blend Python with other languages like Java or Haskell. Sentry vendors about 120K lines of external libraries to maintain internal consistency, while Ganeti splits logic across Python and Haskell. Studying these approaches shows how teams handle performance, compatibility, and advanced domain requirements.

  8. Leveraging Real Apps for Architecture Guidance: Rather than starting from scratch or sifting through scattered tutorials, Mahmoud encourages developers to explore real, production-level open-source apps. Learning from a single, fully integrated codebase often surpasses reading multiple, disjointed blog posts or partial examples.

  9. Automated Curation and Community Help: Mahmoud wrote scripts to check repos for 404s, update frequency, and license status. However, the list still relies on community pull requests and updates. Active user involvement ensures the list remains fresh and high-quality.

    • How to Contribute: Submit new projects or fixes via GitHub issues or pull requests on the main repo.
  10. Examples That Span Many Domains: The conversation pointed to a wide range of categories, from desktop GUI (Flowblade, FreeCAD) to server apps (Pagure, OpenEDX), communication platforms (Zulip, SecureDrop), and data science tools (Orange, SageMath). These different categories highlight Python’s versatility and real-world impact.
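
The inclusion rubric and the automated checks described in the takeaways above can be pictured as a small filter over project metadata. The sketch below is only an illustration and is not Mahmoud's actual tooling: the `Candidate` record, the `failed_checks` function, the sample projects, and the thresholds (one year without commits, at least 25% Python) are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical metadata for one project under review."""
    name: str
    license: Optional[str]    # None when no open-source license was found
    repo_url: Optional[str]   # None when there is no public source repository
    python_share: float       # fraction of the codebase written in Python (0.0 to 1.0)
    last_commit: date
    is_application: bool      # False for libraries and frameworks
    notable_in_niche: bool    # a human judgment call, not something a script decides


def failed_checks(c: Candidate, today: date,
                  max_idle_days: int = 365,          # assumed threshold
                  min_python_share: float = 0.25     # assumed threshold
                  ) -> List[str]:
    """Return the rubric points the candidate fails (an empty list means it passes)."""
    failures = []
    if not c.license:
        failures.append("open source")
    if not c.repo_url:
        failures.append("online repo")
    if c.python_share < min_python_share:
        failures.append("significant Python")
    if (today - c.last_commit).days > max_idle_days:
        failures.append("maintained")
    if not c.notable_in_niche:
        failures.append("notable")
    if not c.is_application:
        failures.append("application, not library")
    return failures


today = date(2019, 9, 24)
chat_app = Candidate("example-chat", "Apache-2.0", "https://example.org/chat",
                     0.6, date(2019, 9, 1), True, True)
stale_lib = Candidate("example-lib", "MIT", "https://example.org/lib",
                      0.9, date(2016, 1, 1), False, True)

print(failed_checks(chat_app, today))   # []
print(failed_checks(stale_lib, today))  # ['maintained', 'application, not library']
```

Returning the list of failed points, rather than a single yes or no, keeps the human curator in the loop: a script can flag a stale repo, but whether a project is notable in its niche remains a judgment call.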


Interesting Quotes and Stories

"We have the biggest language in the world right now, but how do you personally benefit from that?" - Mahmoud Hashemi

"One sure shot way to ruin it is to miss a dynamic library in your package, especially in something like a video editor." - Mahmoud Hashemi

Key Definitions and Terms

  • Vendoring: Including a library’s source code directly into your repository rather than depending on it externally, often to apply custom patches or ensure stability.
  • TUI (Text User Interface): Command-line interface that goes beyond simple print statements, often using libraries like curses to display menus or real-time data in a terminal.
  • Mono Repo: A single repository containing multiple components or micro-services, often used by large teams to simplify collaboration and versioning.
  • Entity-Attribute-Value (EAV): A schema design pattern storing data in a flexible structure, used notably by Reddit.
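
To make the Entity-Attribute-Value idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `entity_attr`, the helper functions, and the sample data are assumptions made up for illustration; this is not Reddit's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (entity, attribute) pair instead of one column per attribute.
conn.execute(
    "CREATE TABLE entity_attr ("
    " entity_id INTEGER NOT NULL,"
    " attribute TEXT NOT NULL,"
    " value TEXT,"
    " PRIMARY KEY (entity_id, attribute))"
)


def set_attr(entity_id, attribute, value):
    """Store or overwrite a single attribute; values are kept as text."""
    conn.execute(
        "INSERT OR REPLACE INTO entity_attr VALUES (?, ?, ?)",
        (entity_id, attribute, str(value)),
    )


def load_entity(entity_id):
    """Reassemble one entity's attributes into a plain dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM entity_attr WHERE entity_id = ?",
        (entity_id,),
    )
    return dict(rows.fetchall())


set_attr(1, "title", "Hello")
set_attr(1, "score", 42)
set_attr(2, "title", "Second post")
set_attr(2, "flair", "news")  # a new attribute needs no schema change

print(load_entity(1))
print(load_entity(2))
```

The flexibility has a cost: every value is stored as text, so types have to be restored by the application, and filtering on several attributes at once requires joining the table against itself.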

Overall Takeaway

Python’s open-source world is vast, and beyond the well-known web frameworks and libraries lies an incredible collection of real, production-ready software. Mahmoud’s Awesome Python Applications list shows just how diverse and sophisticated these projects can be, from cluster managers at Google to video editors and data science platforms. By learning from live applications, rather than isolated code snippets, developers can discover proven best practices, tackle real-world architecture decisions, and contribute more effectively to the community.

Mahmoud on Twitter: @mhashemi
Launch announcement for project: sedimental.org
Awesome Python Applications site: awesome-python-applications
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

Episode Transcript


00:00 Have you heard of awesome lists? They're, well, pretty awesome, gathering up the most loved

00:05 libraries and packages for a given topic. While most lists cover awesome developer tools and

00:10 libraries, we don't have many examples of awesome applications, both for use and as examples to

00:16 draw from. That's why Mahmoud Hashemi decided to create Awesome Python Applications, and you're

00:21 about to dive headfirst into them. This is Talk Python to Me, episode 234, recorded September 24th,

00:27 2019.

00:28 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the

00:46 ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter,

00:51 where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm,

00:55 and follow the show on Twitter via @talkpython. This episode is brought to you by Linode and

01:01 Tidelift. Please check out what they're offering during their segments. It really helps support the

01:05 show. Mahmoud, welcome back to Talk Python to Me.

01:07 It's good to be back.

01:08 Man, it's great to be back. It wasn't that long ago that you were on Python Bytes,

01:11 and we still didn't get a chance to catch up because you were covering for me.

01:16 Yeah, but it was a great time. Yeah, happy to fill in any time. But really,

01:19 who could fill in your shoes?

01:20 Oh, man.

01:21 Thanks for that. You've been on the show before. I think the first time was quite early in the show's

01:27 history. You came to talk about a really interesting topic, Enterprise Python, right? Python being used

01:34 within the enterprise. We talked about a bunch of examples of that. And I feel like this is the

01:39 open source equivalent of that story just a little bit.

01:42 I was thinking about that today. Yeah, you're right. It is kind of similar,

01:46 different in a lot of ways. I think a lot more fun, but we'll get to that part. But yeah,

01:50 that was back in 2016. Time flies.

01:52 Yeah, time does fly. We've been at this stuff for a while.

01:54 I was at PayPal.

01:55 Yeah, that's right. And yeah, I mean, I guess that's probably a good time to just ask you,

01:59 what have you been up to these days? What's going on in your world?

02:02 I sort of had my fill of moving around different teams at PayPal, seeing how that whole business worked.

02:08 And it was cool to work in the enterprise. And I wanted to sort of like stay with that. So

02:12 I wanted to like kind of grow my own autonomy, as well as like just sort of see how startups worked.

02:19 I am in Silicon Valley, after all, you know, and PayPal is a startup in many ways, but it started

02:24 in like, what, 1998. So it's mostly started.

02:27 Yeah, it's mostly started at this point. You'd be surprised in some ways. But yeah.

02:31 Anyways, but basically, yeah, so I went to like a Series C company, did that for a while. It was

02:37 pretty cool. And then there's not too many teams, you can move around there. And like,

02:41 you know, so once you've learned it, you kind of learned it. And then I was like, okay, well,

02:45 let's come to a Series A company. And so for the past couple years, I've been at Simple Legal

02:50 here in Mountain View. And yeah, it's been really good. When I joined, the team was like four people

02:57 or rather, we had like four engineers. And now I think we're at like a dozen,

03:01 got a small team. I'm like principal engineer, do a lot of like code review,

03:05 architecture review, but also a fair amount of coding. And yeah, just having a blast.

03:10 That's cool. That looks like a fun product or service to work on. Is there a lot of Python

03:14 happening?

03:15 Best part, right? Like it's, it's not like PayPal where you're jockeying, like, you know,

03:19 for different technologies and so forth. It's like what we say goes, right? And we pick the

03:23 best technology, Python, Postgres, everything we love. So we got that autonomy down.

03:29 But like, yeah, just to be clear, it's not like the most exciting company on the outside.

03:34 But I always tell people this, like, try to go for a boring company and then make it fun. Do it in

03:40 like, choose a boring company so that you can do things in a fun way. If you choose something

03:46 that's too exciting on the outside, then your day to day is just going to be a blur of boring stuff.

03:51 So yeah, we've done a lot of open source here. And I could go on about it for hours. But we got

03:57 other things to talk about.

03:58 Okay, some awesome stuff to talk about for sure. Yeah, I think there's a lot to that,

04:02 you know, people see sort of the company as the exciting thing, right? Like working for Tesla

04:08 or something like that. But, you know, if what you end up doing is writing C code to do a bunch of

04:14 internal boring stuff, well, then like, it doesn't matter how exciting the company is. That's not that

04:17 much fun.

04:18 No, exactly. Just bash scripts to shovel around logs. That's not exciting to me.

04:22 It doesn't matter if the logs are...

04:23 You know, I don't care how exciting the brand is.

04:25 Yes, exactly. Exactly. Cool. All right. Well, let's talk about some awesome things. Now,

04:32 there's been this history of awesome lists. Do you want to maybe just tell people about this,

04:37 this idea of awesome lists? I don't know how much of the history you know, but I know that you

04:41 must be involved because you've created one.

04:42 I'm actually not that great of an expert on these things because sort of when they started popping

04:48 up, I was already pretty well versed in, for instance, awesome Python. It's a huge repo,

04:53 like tons of contributors, been around a long time, got tons of content, and it's quite awesome in many

04:59 ways. But I rarely like refer to it or find much new there because I was sort of like, you know,

05:06 already along my Python path. But I'm sure it's a great resource for people out there. I refer

05:10 people to it all the time. So yeah, but this awesome list thing is a phenomenon. It's like a

05:16 meme on GitHub where it's like, I'm going to make a list of links, which are awesome for some definition

05:21 of awesome.

05:22 Yeah, for some definition of awesome. Exactly. And I don't even think it originated with Python. I feel

05:26 like there was some PHP stuff going on.

05:28 No, definitely not.

05:29 Yeah, awesome Python is really good. People should check that out. That's interesting. There's a bunch

05:34 of good libraries there. And there's also more focused ones. Like there's, I recently ran across

05:38 awesome ASGI for async server, like basically for async web frameworks and other AIO type of things,

05:47 asyncio things.

05:48 Yeah, it's just the default architecture for like linked content on GitHub, you know, if you want it

05:54 to be approachable. But yeah, there's this guy, Sindre Sorhus, and basically, yeah, I'm pretty sure

06:00 that he's like a Node guy. Yeah. Anyways, he like has this sort of awesome authority. And it's like the

06:06 meta list. It's the list of lists that point to all the other awesome lists.

06:10 Oh, how cool. I didn't know there was an awesome list of awesome lists. That's super meta.

06:14 Oh, well, that's why they all have this badge with the cool sunglasses on it. And you know,

06:19 anyways, that's got like 120,000 stars, which is like, you know, it's good. Like, like, frankly,

06:24 so much of how we learn as a community is driven through streams of content that has to be manually

06:31 republished out. I'm thankful for anything out there that sort of constitutes a reference,

06:36 you know, that like a sort of institution I can go and look at rather than refreshing a Twitter stream

06:41 and hoping that somebody like has said something of value to me. Awesome lists are, in the end, on

06:48 net, pretty awesome.

06:49 I would say, yeah, they're aptly named. And I do like them. I feel like they're a little bit

06:53 vetted, right? It's not just a Yahoo of all Python things, right? It's not like PyPI or whatever.

06:59 It's these are the things people found to be extra good in these categories. I guess it's kind of like

07:04 Yahoo a little bit anyway, early, you know, 1996 Yahoo. So maybe tell us about your awesome list and like

07:10 how you came to come up with it and things like that.

07:13 Sure. So yeah, I didn't really set out to create an awesome list. I sort of backed my way into it.

07:20 So I started a couple of years back, like in I think 2017 or so, I started like giving conference

07:24 talks. I was already going to meetups and it was sort of the natural next step. Start giving talks.

07:29 I recommend everyone do the same. It's a good way to branch out.

07:32 Giving talks is a great step in like raising your profile and just breaking out of the mold of I'm

07:38 sort of an anonymous programmer, right?

07:40 Yeah. And just like writing blog posts, for instance, like, you know, once you actually sit down to

07:44 cover a topic, it exposes a lot of your own gaps. And so you end up filling in a lot of your

07:49 own knowledge. Yeah.

07:50 Just for personal benefit, I recommend it. So basically I was giving these talks and I was

07:54 covering topics like performance and packaging and testing and plugins and a bunch of architecture

08:00 stuff. And the thing is that like, like I just sort of alluded to, oftentimes when I'm starting out with

08:06 this talk, like I have some degree of like expertise in the area, but also I'm doing a lot of learning

08:11 myself. That's one of the reasons I like doing them. And so afterwards, people would have all these

08:15 questions, you know, I'd take it as a good sign. They thought the talk was good. They thought I know what I'm

08:18 talking about. That's great. But the fact is that like, when they're asking me about packaging

08:23 or plugins or, you know, performance, it's like, well, I only have my 10 years of experience to draw

08:29 from and I won't know every single answer, you know, to apply to their situation. I want to,

08:36 I desperately want to, but that's just not possible.

08:38 You can't always go through it, right? Like if you're packaging up an app and you know,

08:41 well, I got cx_Freeze to work for me. So now I can make it.

08:44 It's like, well, why are you going to go try all these other ones when you have one?

08:49 Exactly.

08:49 You've got a life to live and other apps to build, right? But other people can share their

08:53 experience and say, Oh, did you know about PyOxidizer? You're like, wait, what's that?

08:57 I certainly want that a hundred percent completion like stat, but it's just not going to happen. So

09:01 one thing I sort of wished I had was basically a way to refer them to a known working example,

09:09 just because it was available at the, at the conference at PyBay a couple of years back,

09:13 like the Zulip team was there and Zulip is like this chat application written in Python,

09:18 sort of like, it's sort of like a Slack, but kind of like blends in some email features. It's a pretty

09:24 cool design and they have a fully working application with a great community, a good onboarding process,

09:28 good docs, all this stuff. And so I'm like, look, this is a great exemplar. If you're writing a Django

09:34 server application and you're interested in say, introducing typing, they recently did that process and

09:40 you can go look at how they did it and you can get a lot better answer from looking at exemplars than

09:45 you can, you know, from me, just spur of the moment. Right. Or even just some kind of talk where people

09:50 just talk about the concept of type annotations or type hints, but here's actually the GitHub

09:55 conversation and the issues and the PRs. And then this is the before and after and the trade-offs. And

10:01 it's just like a white paper, real world example of that. Right. Yeah. And the living maintainers who

10:08 like could answer those questions too. There's so much to draw from there. And so I was like,

10:13 okay, I'm going to make a list of Python applications just for me so that I can easily

10:19 refer to them. Like just off the top of my head, I was able to get together around 20 that I considered

10:25 pretty awesome. Can you think of like, I mean, Python is the biggest language in the world right

10:29 now, basically. So like, can you think of a few?

10:32 You would think that I would be able to start naming them. The real tricky part is these have

10:37 to be the open source ones, not just things written in Python. So Wagtail, for example,

10:43 is a CMS in Django, which is pretty nice. Reddit, Reddit is one. Trying to not cheat and think about

10:51 what is on your list, but these types of things.

10:54 How about you, the listener? Can you think of a application that is written in Python?

11:00 Okay. Time's up. Well, it turns out that there are more than the 20 I could come up with.

11:07 And when I started looking, it was sort of hard to find them. But once I found them,

11:12 like I sort of found myself in this sort of addictive cycle, like just always looking for more

11:17 because there was so much creativity there and often sort of underappreciated. So something I became

11:23 aware of pretty early on when I was developing my own, like, you know, open source Python application,

11:26 there's this contest that's actually happening now. It's called Wiki Loves Monuments. And

11:31 Wiki Loves Monuments is like a photography contest around the world. It's like the Olympics of

11:36 photography for free culture. And they needed a tool that would allow them to judge entries. So

11:42 I made one of those and, you know, had a nice team. There was some stuff we had to figure out on our

11:48 own. But as I was like building this, I was finding some other applications and they

11:52 had like a couple dozen stars, you know, and it's because GitHub is a place for developers.

11:58 It's for people who want to build software, not necessarily use software. Like, I don't know if

12:03 you remember the old days with like Tucows and CNET and Download.com.

12:07 Tucows was awesome. You were like, I need a thing. I mean, it was sort of the free and open app store,

12:14 right?

12:15 Right. It was freeware. It was shareware. It wasn't really like free software in that,

12:19 you know, it's the beer or whatever instead of the, you know, speech. But yeah, anyway, so basically,

12:25 I sort of like got a little bit addicted to finding these awesome applications, because

12:30 a lot of them were sort of diamonds in the rough, not getting as much appreciation as they probably

12:34 deserved. I mean, their users loved them, but not necessarily GitHub users.

12:38 Right. Well, GitHub's not exactly. Yeah.

12:41 It doesn't have the right incentives or the right sort of connective structure there, because GitHub is

12:46 all about connect. I feel like it's all about connecting you with libraries, either code you work

12:51 on or libraries you want to put into your application, but not just end user things that you could study or

12:57 make copies of, right?

12:58 Exactly. And so I ended up with like this sort of five point rubric for the things that I was looking for.

13:04 Basically, like it had to be free software. It had to have an online source repository. It's not much of a

13:08 reference if you can't see the up to date code that is actually shipped. It had to be using Python for a

13:14 significant part of its functionality, not necessarily pure Python. Let's be real, like realistic,

13:19 pragmatic applications are going to have a mix, you know, JavaScript, HTML, CSS, going to have to have

13:25 them. So until browsers run Python anyways. So yeah, then it had to be well known, or at least

13:31 prominent in its identifiable niche. There's some cool applications out there for like neuroscience and

13:37 maybe neuroscience people love them, or something like that doesn't have to be world famous. But

13:44 basically, it has to be maintained. This is another like really important thing. A lot of old wiki pages,

13:50 they sort of get stale, no one's checking the links, and it just ends up being sort of a sad graveyard

13:54 after a few years. So these projects actually have to be maintained, the links have to be up,

13:59 they have to be functional on relevant platforms at the very least.

14:03 I've seen some of these types of applications you're talking about. And I'm like, found it on

14:06 GitHub, like, Oh, why is no one talking about this? This is great. And it says click here for a live

14:10 demo. And you click it. And it's 500. Crash. You're like, exactly. This cannot be getting that much love.

14:16 I'm setting out to sort of like make a list that doesn't do that. I'll get to that in a second. But

14:21 most importantly, and you alluded to it earlier, it's just that they have to be shipped applications,

14:26 not libraries or frameworks. So one of the first things I did was I went on the awesome Python list.

14:31 I'm like, surely there's some applications on this list I can use as exemplars. And I think I found like

14:35 three, right? Like Jupyter notebook. And it's not a huge surprise that Jupyter notebook's written in

14:40 Python. But awesome Python is all about the libraries. And I wanted a list of applications,

14:45 hence, awesome Python applications.

14:48 It's really good. And there was not really anything out there like this. And I think it's great.

14:53 When I was looking through it, you know, I expected a lot of web applications, and I'm sure we'll find

14:59 them and we'll talk about some of them. But there's also desktop GUIs, terminal, sort of ASCII type of

15:06 app, you know, ASCII, what, curses-type applications, a lot of variety there.

15:11 There's so many ways to taxonomize it. And I could get into that if you'd like, because I've spent hours

15:15 mulling over it. But basically, we have the basic topic, like, you know, breakdown. But then we also got

15:21 sort of server versus like, if it's a server, like software, or if it's sort of like client or GUI

15:27 software. And then as far as the console is concerned, right, you have CLIs, which is just a

15:32 command line. And then you have a TUI, which is a text user interface, somewhat lesser known term.

15:38 And then you have sort of the interactive console style. Those are the three that work within the

15:43 terminal. And then you have like sort of some other interesting architectures out there too,

15:47 which I'm like, yeah, we'll cover it later. But the good news is that when I launched it on accident

15:54 on Brian Okken's Test & Code, is that I basically had found 180. Like I was turning over rocks. I was

16:02 like looking on Bitbucket, Launchpad, you know, this isn't just GitHub and Git, it's like Bazaar,

16:08 Mercurial, there's like Kallithea and Pagure, which I hadn't really heard of, but I found

16:14 applications hiding on there. And of course, like all these GitLabs out there that people self-host

16:19 too, because there are all these great like sort of sub communities, the Debian sub community,

16:23 the Fedora, Red Hat, Gnome, and they all have their own cool organizational support and community. And

16:29 it's really interesting to just not just find the application, but also that community. So yeah,

16:34 we got 180 when I accidentally launched it sort of at the beginning of this year, or maybe like a little

16:38 bit before, like Merry Christmas to the Python community, I guess. But yeah, so I published a

16:44 blog post once that sort of started picking up, and that's still up on the site. So if you go to my blog,

16:49 like sedimental.org, it's just up there. I'm sure if you search awesome Python applications, it'll come

16:54 up. For sure. And I'll link to it in the show notes. I haven't blogged since then, because pretty much all

16:58 of my like, free time for content creation has just been going toward curating this. And I've been

17:04 learning a ton. I need to find a better way to share it. I guess that's sort of like why I'm on

17:07 the show. But the point is that at the beginning of this month, I was at something like 250 ish. And

17:13 I guess I should say at the beginning of September 2019, I was at the top of around 250 applications.

17:20 And now I'm at like 312. I'm aiming to get to 350 by end of September 2019. I say 350, not because it's a

17:29 particular goal, but because I have like a list of candidates that I need to evaluate.

17:34 A lot are community submitted. It's really, like, very time-consuming to find these.

17:39 So if people want to, you know, help me out, that'd be great.

17:42 Yeah, that's cool. So yeah, I think I'm going to do a PR for docassemble for you, which is a legal

17:48 interviewing software based, I think it's based on Django.

17:51 I just looked it up and it actually looks very impressive. It's exactly the sort of polished

17:55 application that I think needs some more open source love. Anyways, I mean, one might think besides

18:00 my personal addiction, like why, why do this? And I have sort of like three goals that I'm trying to

18:07 hit, or at least trying to fulfill. I'm not sure if they'll ever really be hittable, then maybe not

18:13 like measurable enough. But I really want to like goal number one is I really want a better development

18:20 cycle. You know, like someone ideates an idea for an app, you know, they go basically do maybe Django

18:29 startproject, or they open up a blank editor, and they just start writing it and they start searching

18:35 on Stack Overflow almost immediately. You know, maybe they find a tutorial that gives them a to do or

18:39 something to get just barely get off the ground. But beyond that, it's all first principles.

18:45 And I just really want people to sort of benefit from all of the other discovery that these like

18:52 application developers have created. The way I put it in my PyBay talk, like last month was that

18:57 Python is the biggest language in the world right now. But how do you personally benefit from that?

19:01 Yeah, you know, there's all this stuff out there. I mean, there's a bunch of great libraries we

19:05 benefit from. Absolutely. I do think one of the differentiators between a beginning programmer or a

19:12 beginner in an ecosystem. And somebody who's very experienced, some form of expert, I guess,

19:19 is that a lot of times the beginner will start coding and think everything has to be created from

19:25 scratch. Right? I need to, you know, load CSVs. So let me just like read the text of the file and start

19:33 splitting on, you know, commas or like weird stuff like that. Right? Whereas the more experienced person's

19:39 like, well, you know, pandas will read it, or there's also a csv module in the standard library, like,

19:44 right, just use that. And I feel like this, it kind of helps the beginners close that gap to say,

19:51 I'm looking like I want to build something like that. Let me see what they did, right? Let me see

19:55 how they structured their file system, and they organize their code. And are they even using celery?

20:00 I heard I have to use celery. Do I have to use celery? I don't know. That's sort of the thing. If

20:03 you go on awesome Python right now, sure, you're going to find like hundreds of libraries. But

20:07 are you supposed to use them all? Like what ingredients go into making sort of a complete

20:12 version of the app that you're sort of trying to build of that architecture? Yeah, I think that

20:17 basically, like my longer term goal here is like to have sort of a decision tree type interface.

20:23 So you can say like, well, look, I want to build a web application, you know, and then like,

20:27 you can sort of say, oh, I wanted to support this many users, or I wanted to basically use,

20:33 say, SQLAlchemy, or have a Docker image or something like that. And you can find your way to,

20:40 like an application that looks enough like the thing that you want, that you can just pull things

20:45 wholesale from it. And basically, not only get an application sooner, but also learn the best

20:52 practices without having to go through dozens of hours of conference talks and blog posts.
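
The CSV point from earlier in this answer can be sketched in a few lines. This is only an illustration: the sample data and the function names (naive_parse, csv_parse) are made up for the example, and the only library used is the standard library's csv module.

```python
import csv
import io

# Hypothetical sample data: the second field contains a comma inside quotes.
SAMPLE = 'name,description\nDeluge,"Lightweight, cross-platform BitTorrent client"\n'


def naive_parse(text):
    """Split on commas by hand; this breaks on quoted fields."""
    return [line.split(",") for line in text.splitlines()]


def csv_parse(text):
    """Let the standard library handle the quoting rules."""
    return list(csv.reader(io.StringIO(text)))


naive_rows = naive_parse(SAMPLE)
csv_rows = csv_parse(SAMPLE)

print(len(naive_rows[1]))  # 3 fields: the quoted comma was split
print(len(csv_rows[1]))    # 2 fields: the quoted comma was preserved
```

The hand-rolled version looks fine until a field contains a comma, which is exactly the kind of edge case an existing module has already dealt with.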

21:21 Where your users are, there's a data center for you. Whether you want to run a Python web app,

21:26 host a private Git server, or just a file server, you'll get native SSDs on all the machines,

21:31 a newly upgraded 200 gigabit network, 24 seven friendly support, even on holidays,

21:36 and a seven day money back guarantee. Need a little help with your infrastructure?

21:40 They even offer professional services to help you with architecture, migrations and more.

21:44 Do you want a dedicated server for free for the next four months?

21:47 Just visit talkpython.fm/Linode.

21:51 Compare this to the cookie cutter library of templates for us.

21:57 Yeah. I mean, that's actually, I hadn't really thought of that, but that's a very good point.

22:01 In a way, cookie cutter is trying to codify these architectures as well. I guess not to put too

22:07 fine of a point on it, right? But I mean, I have looked through 300 of these. I mean,

22:10 I've looked through over a thousand applications and very few of them still bear the marks of

22:15 having started as a cookie cutter application. I think that cookie cutter certainly has its place

22:20 for getting a kickstart, especially for people who maybe don't want to write their own setup.py or

22:26 something like that. But yeah, I'm not sure that it's going to be enough architecture to really get

22:31 you completely off the ground. Like there are emerging technologies in containerization, like

22:37 Flatpak and, say, AppImage and Snap and so forth, especially for these GUI applications on Linux,

22:45 just as an example. And maybe there's a cookie cutter for something like that. I don't think it's been

22:51 done enough times for it to like show up on that radar. And so basically you'll have to look for the

22:58 existing community within the existing, I mean, it's very hard to call the community actually,

23:02 within the existing ecosystem, like who has actually adopted this. And so that's like,

23:09 you know, I think I'll get to that in a minute here, but I did clone every single repository,

23:13 clocked in at around 30 gigabytes for 250 repos. I sort of ran a bunch of analysis and technology

23:20 discovery within it to figure out who's doing what. That's cool. Yeah. Like, in fact, let's just like

23:24 jump right into it because I'm, I've got these numbers and I just got to sort of get them out

23:28 there. Sure. Let me throw one quick comment though, on the cookie cutter bits while you're pulling up

23:32 the numbers. I feel like cookie cutter is great to help people jumpstart. And the idea is like,

23:37 you kind of get this prepackaged app, which is great. Like here's a way to do flask that already

23:41 talks to a database and a queue, or it already sends email or something. Right. That is a long ways away

23:47 from here's an actual application, right? Even when, as a developer, when you're building an

23:52 application, you think, Oh man, I'm almost done. I'm 80% done. You're like less than half done,

23:58 right? There's all these little edges you have to smooth off and these like edge cases that you don't

24:02 think about. And in these, these are real applications that shipped, which means they're way more polished.

24:09 And they're just, I don't even know that how comparable it is. They're both good for people

24:13 starting, but I feel like this is a much different set of things in the catalog.

24:18 And to be clear, they're polished for the end user. What I think is really interesting about them too,

24:23 is like all the rough edges they have. That can be really good for a programmer's ego to be like,

24:27 look, this worked. Okay. Like, I don't need to overthink this. Like, I'm just going to say,

24:32 this is good enough. If it's good enough for them, it's good enough for me. Ship it.

24:36 Yeah. It's easy to get hung up on like trying to make it perfect or trying to set up like infinite

24:41 scalability. So it's like Google and you're like, you have no users yet. Forget scalability.

24:45 Just get it out. Right? Exactly. And so I got my numbers up. Let's hear them. So yeah,

24:50 the quick methodology here is I just cloned every repo that's Mercurial, Bazaar, Git,

24:55 pulled them all. I have, you know, ran some sloccount and like other analytics over the version

25:01 control history. We're looking at 19 million lines of code, 2 million commits with around 50,000

25:09 committers. That's an insane amount of effort. Like that is huge. And this is Python code.

25:14 I think I added it up and it was like hundreds of years of like maintainership or whatever you want

25:20 to call it, you know, like sometimes you go on maybe steam or something like that. And it sort of

25:24 aggregates for you how many hours was played on the game. And you sort of feel like you feel a little

25:28 bit terrible about humanity perhaps, but we all got to have fun anyways. But this, like I had like sort

25:35 of the opposite feeling, right? Like this, I think I actually experienced the true meaning of an awesome

25:41 list. Like I was in awe at the amount of like, you know, stuff I was looking at. And I sort of like

25:48 broke it down. So real quick thing, like about a third of them are server software with 64% being

25:54 like sort of desktop or CLI and just sort of like a quick breakdown there. I think the scariest thing for

26:01 me was basically Python compatibility, because we hear a lot of like doom and gloom at times,

26:08 but like contrasted with like a lot of excitement about Python 3. And I was worried that like, you

26:14 know, these maintainers are stretched so thin, they have not had time to add Python 3 support. And here

26:19 we are with something existential facing the Python application community. And I was very pleasantly

26:27 surprised to find that two thirds of these applications already support Python 3, like they run on Python 3.

26:33 That's great. It was a huge relief. This isn't the same thing as a library supporting Python 3. This is

26:37 them having converted a code base that is often quite large to Python 3, many of them. They didn't start

26:43 this way, because I think the average or sorry, the median application on the list is something like 10

26:48 years old. And some of them much older than that, like Mailman, the thing that Python uses and many other

26:54 organizations use for managing their email lists, that's written in Python, and it's made the jump,

27:00 you know, it supports Python 3. And that's pretty impressive. And if you look at sort of the

27:06 compatibility over time, you do see the Python 2 applications kind of tapering off, you haven't

27:11 had another Python, like I think the most recent Python 2 application was started in something like 2017,

27:16 versus, you know, Python 3 applications, which are being started today. Right. There's probably a bunch more Python 3 apps starting now than there are Python 2,

27:25 I would expect.

27:26 You see a nice healthy trend there. Thankfully, I was super relieved.

27:29 I'm sure you'd either have to be crazy or depend on a library that is Python 2 only deeply to start

27:38 a Python 2 app these days.

27:39 And those are rarer and rarer. So it's a little bit harder to justify. There's still some interesting

27:43 case studies in here. So for instance, I think OilShell actually converted to Python 3 and then back to

27:49 Python 2. And he has a whole like, you know, blog post about why he did that, the lead maintainer

27:56 there. So that one's an interesting case study. And I think that's sort of what I hope to like,

28:01 one of the deliverables I hope comes out of the list is just like all the interesting case studies.

28:06 I think we'll get to those too.

28:07 Yeah, there could be a bunch of stuff coming out of there. I think, you know, another thing I think would be valuable, not that you don't already have enough going on,

28:14 but would be some kind of newsletter where just the stuff that gets added that month or something

28:19 comes in and just gets thrown out. That'd be great.

28:21 I'm too much of an engineer for that, Michael. That's why I've added an RSS feed.

28:26 You know, we just pull the data ourselves.

28:28 Yeah, exactly. Just pull the data yourself, right? If you want to do a newsletter and all that,

28:31 I fully support that. But yeah, I'm going to take a technical approach to this social problem.

28:37 Okay. It would be good to get there. It would be good to get there once the list sort of maybe

28:41 starts like plateauing a little bit more. And also once, yeah, once I get some more stuff in place to

28:46 keep the quality high, you know, because some of these projects might go like into an unmaintained

28:51 status and I'm not going to be the one vetting 350 projects every month, you know? So I need to

28:56 automate some things first.

28:58 How much automation could you do? Like, for example, every one of these has a link to their repo.

29:03 Could you look for unresponded to open issues or the lack of a commit for like one year puts it onto

29:12 like a warning list and then some contributor could come in and go, this looks suspicious.

29:17 That's almost exactly what I have planned. And I sort of have, I have a little command line

29:20 application I built for managing awesome lists. It's surprising how many awesome lists are just

29:25 manually maintained via PRs. But yeah, all my stuff is in a YAML file. I think I have something like

29:30 a thousand links in there. All of them are going to need to be checked for 404s and stuff like that.

29:34 But yeah, I do sort of have an informal bar of you need to have a commit in the last year.
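
A minimal sketch of that "commit in the last year" check might look like the following. It is not the actual tooling behind the list: the project names, dates, and the stale_projects helper are hypothetical, and a real version would read the last commit dates from the cloned repositories rather than from a hard-coded dictionary.

```python
from datetime import date, timedelta


def stale_projects(last_commits, today, max_age_days=365, exempt=()):
    """Return names of projects whose last commit is older than the cutoff.

    `exempt` holds projects kept on the list despite being inactive.
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name
        for name, last_commit in last_commits.items()
        if last_commit < cutoff and name not in exempt
    )


# Hypothetical data, purely for illustration.
last_commits = {
    "active-app": date(2019, 9, 1),
    "quiet-app": date(2018, 3, 15),
    "archived-app": date(2017, 11, 2),
}

warning_list = stale_projects(
    last_commits,
    today=date(2019, 9, 24),
    exempt={"archived-app"},
)
print(warning_list)  # ['quiet-app']
```

The exempt set is there so that a deliberate exception can stay on the list without showing up in every report.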

29:39 There are exceptions to that. So for instance, you mentioned Reddit earlier. Reddit is a massively

29:45 important Python application for the internet. And they only have an archival version of the site up

29:51 from around 2017. But still, I mean, like, it was still a very big site in 2017 with a lot of

29:57 like real lessons. It is such an important site that I think it clearly deserves to be on there.

30:01 It actually surprised me that it was there. There are a couple other surprises in there too,

30:05 I think. I'll just sort of refer the audience maybe to the repo where we have a Jupyter notebook that has

30:11 all of the graphs in them. Like some of the like numbers I'm going to cite to you now are going to

30:16 be out of date in just a week or two when I give a PyGotham talk because I'm going to run it on 350

30:21 applications instead of 250. But we'll keep the Jupyter notebook in the repo up to date. But suffice

30:28 to say for now, like there's some really interesting findings in there.

30:31 Oh, I'm looking at it now. This is super well done. I'm so glad you put this up as a notebook so it can

30:35 just like live. That's great.

30:37 The most surprising trend, it doesn't affect me very much, but it was surprising to me just how like stark

30:44 it was the increase of Qt-based GUI applications compared to GTK-based GUI applications. When you

30:52 look at the graphs in there, remember all of these applications have had a commit, like 95% of them

30:57 have had a commit in 2019. It may look like, oh, this is an old application, but it's maintained,

31:03 it's used, it is awesome and deserves some respect there. But like people are starting Qt applications

31:09 these days, not GTK.

31:10 Yeah, yeah. I'm looking at that list. It's almost 50% Qt and then 2% Pygame, 3% Kivy,

31:15 17% WX and 30% GTK. Yeah, pretty interesting. There's a bunch of graphs like that. This is a,

31:21 I don't know how long this is, but the scroll bar is very small.

31:24 That's great.

31:26 I'll do credit there. Like I was on a huge time crunch coming off of speaking tour in Africa.

31:31 Let's just say I gave a keynote in Tunisia, nothing too mysterious, but basically I was super time

31:36 crunched. And so that notebook right there is basically all my wife's work. She's in the commit

31:42 log and so forth, but credit where credit is due. I did not have time to do that stuff. And she really

31:47 saved my ass.

31:48 Yeah, she came through. That's awesome. Congrats to her. That's good work.

31:51 Yeah. Anyways, I could go on about the stats, but I think that probably at this point, people are,

31:56 people have got to be curious what other case studies are hiding on this list.

32:00 Yeah, they've got to be. So maybe pull out some highlights that stand out for you. And then I grabbed a

32:05 couple that maybe we can go more quickly through like more of them, just skip across to give people

32:09 a flavor of what we're talking about.

32:11 Absolutely. So one of the ones that I highlighted was Open edX. I know that Ned Batchelder is a big

32:18 deal in the Python community, and this is where he works, last I checked. But the

32:23 edX platform has something like 51,000 commits at the time of writing. And it's really interesting to me

32:30 because it is a mono repo. And it represents like a whole team's work. And so there are all these

32:36 dynamics that you can see happening there. And it's one of only three projects where no one developer

32:43 has more than 10% of the commit history.
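
For anyone wanting to reproduce that kind of statistic, the share of commits held by the most active committer is straightforward to compute once you have one author name per commit. This is a sketch under assumptions: the top_committer_share helper and the sample histories are invented for illustration, not taken from the analysis discussed here.

```python
from collections import Counter


def top_committer_share(commit_authors):
    """Fraction of commits made by the single most active author."""
    counts = Counter(commit_authors)
    if not counts:
        return 0.0
    _, top_count = counts.most_common(1)[0]
    return top_count / sum(counts.values())


# Hypothetical histories: one author name per commit.
solo_history = ["alice"] * 9 + ["bob"]
team_history = [f"dev{i}" for i in range(20)]

print(top_committer_share(solo_history))  # 0.9
print(top_committer_share(team_history))  # 0.05
```

In practice the list of authors could come from something like the author field of each commit in the version control log, after normalizing duplicate identities.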

32:45 Wow.

32:46 Yeah. It's the third largest Django project on the list with 300 committers. And it powers all of edX.org.

32:53 Yeah. And I think MIT's OpenCourseWare and a bunch of others as well.

32:57 Yeah. And just for contrast, 41% of applications are mostly written by one committer. How's that bizarre for you?

33:05 It is bizarre. But, you know, thinking about it, it makes sense to me, right? It seems like

33:09 somebody just wants this app to exist, so they're going to go create it. But then you see the things like Reddit or

33:14 Open edX or some of these others, you're like, well, these are built by large groups of people.

33:18 I sort of want to find a way to highlight these organizationally supported, foundation supported,

33:23 corporate supported applications, because they definitely follow a different sort of path

33:28 compared to your average just sort of side project.

33:32 What do you think about using folks out there listening, maybe they're PhD computer science folks

33:37 or other types of researchers? Do you think that this list might put together some interesting set

33:44 of data for people to go look at how apps are built?

33:46 I think that if there's someone who's doing some sort of combination, computer science, ethnography,

33:51 like, you know, studies, social science studies, there's absolutely a ton to learn from here.

33:56 I mean, I didn't sort of pick and choose applications based on these findings. These findings were

34:01 like, you can't exactly call it randomly selected, especially not with the Python qualifier in there,

34:07 right? But it is still a very interesting cross section. And if somebody wants to do that sort

34:11 of thing, I'm all for it. Happy to support that.

34:13 Yeah, that'd be cool. All right. So tell us some of the other ones.

34:15 Yeah, absolutely. So another one that I highlighted was Sentry. And so Sentry is very interesting because

34:23 it's so big and often promoted on podcasts and conferences and so forth. A lot of people are surprised

34:31 that their entire code base is just sitting on GitHub. It's got like it's 26,000 commits since

34:36 2008. For those who don't know, it's a web service and front end for cross platform application monitoring.

34:42 So it's sort of like New Relic and stuff like that. A little bit different than Datadog, though,

34:46 because it focuses on error reporting. But we're looking at a million lines of Python. I think it's one of the

34:52 very few projects that actually crossed that million line threshold. Because, you know, a million lines of

34:56 Python, it's like 10 million lines, 100 million lines of less efficient languages.

35:01 Yeah, exactly. A million lines of Python. That's a lot of code.

35:04 That does include about 120,000 vendored lines. Just so you know that I went looking for that

35:09 possibility. Like I went looking for libraries that maybe got vendored and tweaked and just copied in.

35:14 Maybe define that for folks, because not everyone will know exactly what you're talking about.

35:17 Yeah, of course. So vendoring is what you do when, you know, a library does something you don't want,

35:21 like drop support for intercompatibility with something else. And you're like, okay, well,

35:26 I'm just going to create my own little mini fork of this inside of my repo.

35:30 Right. Like if I didn't want to depend upon requests, theoretically, I could just jam the

35:35 source code for requests into my app.

35:36 Yeah. Like you probably didn't pay for it, but it's called vendoring because the person who like,

35:42 you know, publishes the library is a vendor. It's kind of like that.

35:46 You've taken responsibility for it, right?

35:48 Right.

35:48 Okay. But still, that's still a lot of code that is like, it's 880,000 lines. That's not

35:54 vendored.

35:54 Yeah. The largest Flask app I found in comparison is Pagure, P-A-G-U-R-E. And that's what's called a

36:01 forge. It's like a GitLab or a GitHub, but written in Python and it's written in Flask. So it's almost

36:07 like 10X smaller. So Sentry, its code is sitting all out there. And it's also BSD 3-Clause licensed, which is

36:13 very permissive for a for-profit application. So that's kind of interesting.

36:17 Yeah. That's super interesting.

36:18 The quirkiest thing I found though, the quirkiest like case study was this application called Ganeti.

36:23 And that's G-A-N-E-T-I. And it had like, I don't know, 100 or 200 like stars on GitHub or

36:31 something like that. And it was just like, what even is this thing? Because it was written in like

36:40 half Python, half Haskell. So like the very pure functional thing mixes with the very pragmatic

36:49 sort of systems thing. And it had 16,000 commits since 2007. So very mature. It's a cluster management

36:56 tool focused on long lived VMs used for workloads that don't have built in redundancy. So like web

37:01 servers, like, you know, web services, you can shut down the worker and start up a worker. The other

37:06 workers will take over. Right. But if you need like a job to not fail, and it's going to be very long

37:11 lived, right, you might use this. And so it was actually developed at Google. And that was also

37:17 very strange because Python and Haskell, especially Haskell, like these are not like super common

37:23 languages to find at Google for, you know, something that appears to be this important. And it's pretty

37:28 widely deployed. I found like a nice discussion at Wikimedia, like just talking, like, should we use OpenStack for

37:34 this? Or should we use Ganeti for this? And I think they ended up going with Ganeti. So it's still

37:39 pretty commonly used. That one was an oddity. Another one along those lines is a thing called

37:44 LocalStack. And that's a developer tool. It's useful because it sort of does like a mock service that

37:51 you can run locally of like AWS. And so if you're doing DevOps code, you can run a mini AWS locally and

37:57 like write tests against it. And that one is like, I think, a third Java is one of these blended

38:02 dead stories, right? Yeah, yeah. So like hybridizing with JavaScript and HTML and CSS, everyone expects

38:08 for a web service, right? But like to see things mixing with like Java and Haskell, these are some

38:12 pretty interesting case studies that have a whole story of their own, I'm sure.

38:17 This portion of Talk Python to Me is brought to you by Tidelift. Tidelift is the first managed open

38:22 source subscription, giving you commercial support and maintenance for the open source dependencies you

38:28 use to build your applications. And with Tidelift, you not only get more dependable software, but you pay

38:33 the maintainers of the exact packages you're using, which means your software will keep getting better.

38:38 The Tidelift subscription covers millions of open source projects across Python, JavaScript, Java, PHP,

38:44 Ruby, .NET, and more. And the subscription includes security updates, licensing verification

38:49 and indemnification, maintenance and code improvements, package selection and version

38:54 guidance, roadmap input, and tooling and cloud integration. The bottom line is you get the

38:59 capabilities you'd expect and require from commercial software. But now for all the key open source software

39:05 you depend upon, just visit talkpython.fm/tidelift to get started today.

39:12 So which ones did you like?

39:13 Well, you know, I went through and I just wanted to pull like a couple that stood out to me on each

39:18 of the categories. So if we go to the awesome Python application list, there's a ton of categories.

39:24 You sort of put it into major categories and then the developer space just became a meta category,

39:30 right? So you've got internet and audio and graphics and productivity and education and science.

39:35 I pulled out the developer stuff compared to the non developer stuff because I think that the developer

39:40 stuff in general is a little bit easier to find. Like you're going to find blog posts and higher GitHub

39:45 stars for developer focused applications in many cases. It's not a super, super stark difference when

39:50 I ran the numbers, but definitely like, you know, most of the listeners have heard of Ansible,

39:55 you know?

39:56 Yeah, right, right. Or OpenStack or something like that. Yeah. Exactly, exactly.

40:00 Yeah, cool. So there's a bunch of stuff under there. And I didn't, I didn't pull them out quite

40:03 like that. But maybe just to give people a sense of what's out there. Some of these are really small

40:07 and niche. And you can tell they're like for somebody and others, they've been like the top 10 sites on

40:13 the internet.

40:13 Yeah, let's just go. I'll just go through the list. And we can just maybe give me some real quick

40:17 thoughts. So under the internet category, we have Deluge, which is a popular lightweight

40:23 cross platform BitTorrent client.

40:25 Deluge, which is how I pronounce it. I mean, I don't know how to pronounce it.

40:29 But Deluge, okay.

40:30 But well, I don't know. It doesn't come with the pronunciation guide last I checked. But basically,

40:34 it's a BitTorrent client. And it has a few very interesting things about it. One,

40:38 it's a GUI application, it uses Twisted. If you just look like for best BitTorrent client,

40:44 you'll find it making those rankings. And people aren't saying like, this one's great,

40:48 because it's written in Python. You know, it's just a good application, very solid,

40:52 like the UI, etc. And it happens to be written in Python. And so that was one of the ones that

40:58 I got off the top of my head when I made my list of 20 because I am a Deluge user proudly backing up

41:05 my Linux images. The Deluge thing also has a little like hidden gem, which is that they managed to have

41:11 a web UI that is almost identical to their GUI UI. But it just like sort of like allows for remote

41:19 administration, it's still classified as a desktop application, because it's mainly written for

41:23 single user use. But it does have a web UI. And I'm not sure how they did it. But like,

41:29 the experience is the same. So like in the browser as using, I guess, like I'm not sure if they're Qt or

41:36 GTK, I can check. Sure, sure.

41:37 But yeah, I'm a big fan of Deluge. I highly recommend checking out their repo.

41:41 Yeah, all these have links to the repo, a demo, their homepage, things like that. Another one is

41:46 Elixire, a featureful file host and link shortener with API. So kind of like Bitly, maybe.

41:53 This one's an interesting one. Like it seems to be used by like, a community I don't fully

41:57 understand. But like, it gets a lot of use. So yeah, this is a recent addition. I'll probably

42:03 explore it a little bit more later. But yeah, it is like sort of like an Imgur combined with

42:09 like a Bitly, which sort of makes sense. Like if you're, you know, pasting things into an IRC chat

42:14 or something like that, it's got everything you need.

42:16 Yeah, super. The next one people might have heard of it's called Reddit.

42:19 Yeah. Yeah, we don't need to say a lot more about that other than this is sort of a snapshot from

42:24 2017. But I think it's still super cool that this is here because here's a high scale application. I

42:29 don't know exactly where it ranked, but it was in the top 20 sites at some point for destinations,

42:35 right? Definitely. I'm not sure where it's at now either. But yeah, definitely a major destination,

42:40 especially I'm sure for many listeners, whether you like it or not, you're going to like probably

42:44 find some useful stuff through Google search or something.

42:46 Yeah. But one interesting thing about Reddit is that they have a very interesting approach to their

42:51 like schema design. If you dig in, they're using what's called an OAV pattern or an EAV pattern. It's

42:57 entity-attribute-value. And it's kind of like, it's sort of like object storage or document storage built

43:03 on top of a relational database. And so some might consider it an anti-pattern to use it to this

43:09 extent, because you're losing out on the constraints of a relational database. But it managed to work for

43:15 Reddit. So it can't be all bad. Anyways, theory meets reality, right?

43:21 Yeah, exactly. And some might say like, oh, it's very slow compared to, you know, an optimized schema

43:26 with better indexing and so forth. But one, like databases have done some degree of optimization for

43:33 this pattern. It has a Wikipedia page. It's a pretty well known pattern. But also Reddit is fast because of

43:39 caching. That's the like other main learning there. If you look in there, it has layers and layers of caches.

43:44 A very interesting, like, you know, exemplar to learn from.
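
To make the entity-attribute-value idea concrete, here is a small sketch using an in-memory SQLite database. It does not reproduce Reddit's real schema: the thing_data table, its column names, and the load_thing helper are hypothetical, chosen only to show the shape of the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One generic table instead of one column per attribute (hypothetical schema).
conn.execute(
    """
    CREATE TABLE thing_data (
        thing_id INTEGER NOT NULL,
        attr     TEXT    NOT NULL,
        value    TEXT,
        PRIMARY KEY (thing_id, attr)
    )
    """
)

rows = [
    (1, "title", "Hello"),
    (1, "score", "42"),
    (2, "title", "Other"),
    (2, "url", "https://example.com"),
]
conn.executemany("INSERT INTO thing_data VALUES (?, ?, ?)", rows)


def load_thing(conn, thing_id):
    """Reassemble one entity from its attribute rows."""
    cur = conn.execute(
        "SELECT attr, value FROM thing_data WHERE thing_id = ?", (thing_id,)
    )
    return dict(cur.fetchall())


thing = load_thing(conn, 1)
print(thing)  # {'title': 'Hello', 'score': '42'}
```

Note that the score comes back as the string '42'. Adding a new attribute needs no schema change, but the database can no longer enforce types or per-attribute constraints, which is the trade-off described above.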

43:47 Yeah, I would definitely think so. All right, next category is audio. And I grabbed two of those out

43:51 of there. One's called Exaile, something like this, which is a cross platform audio player and library

43:57 organizer tag editor type thing that looks pretty interesting.

44:00 I found this because I went on, everyone's familiar with Wikipedia. There's also this thing called

44:06 Wikidata. And Wikidata allows you to use a SPARQL query, which is sort of like a, I don't know how to

44:11 describe it exactly. But it's sort of like SQL, but extensible for like graph networks, kind of.

44:17 So I ran a query for all of the software that had Wikipedia pages that was known via like, you know,

44:23 Wikipedia or Wikidata to have been written in Python. And so I actually found it through there. I'm not

44:28 a user myself, but yeah, it's actively developed. It had a release come out just a few months ago. And

44:34 it's sort of like, kind of like Amarok, if anyone remembers that, but it's got like Last.fm support and

44:39 plugins and so forth. Seems pretty cool.
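
A query along those lines might look roughly like the sketch below. It is not the query that was actually run. The Wikidata identifiers are assumptions stated from memory (P277 for "programmed in", Q28865 for Python) and should be checked against Wikidata before use. The code only builds the request URL for the public query endpoint and does not send it.

```python
from urllib.parse import urlencode

# Assumed identifiers: wdt:P277 = "programmed in", wd:Q28865 = Python.
QUERY = """
SELECT ?app ?appLabel ?article WHERE {
  ?app wdt:P277 wd:Q28865 .
  ?article schema:about ?app ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def build_query_url(query, endpoint="https://query.wikidata.org/sparql"):
    """Build a GET URL for the Wikidata SPARQL endpoint (no request is made)."""
    return endpoint + "?" + urlencode({"query": query.strip(), "format": "json"})


url = build_query_url(QUERY)
print(url)
```

The second pattern restricts results to items that have an English Wikipedia article, which matches the "software that had Wikipedia pages" filter described here.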

44:41 Nice. Another one, MusicBrainz Picard, which automatically identifies tags and organizes

44:47 musical albums, which sounds pretty cool.

44:48 It's a funny name, but just like amazing software. Let me just say, I don't use it as much as I used

44:54 to back when I used to like DJ and stuff, but basically the way that Picard works.

44:57 So there's a MusicBrainz Foundation that develops it. And they also have a sort of like, it's kind of

45:03 like a Wikipedia, but for album information, it's kind of like Discogs or something like that,

45:09 but open. And what it'll do is you can like take one of your CDs, make a backup, right? And that

45:15 comes through and it's just as track one, track two, track three, track four. But it'll take the

45:19 number of tracks combined with the length of each track to generate kind of a fingerprint,

45:25 match that against an internet database of like album listings and so forth, and then automatically

45:31 bring in all the fully filled out like ID3 tags. So everything shows up very searchable.
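
The idea of fingerprinting a disc from its track count and track lengths can be sketched as follows. This is a toy illustration of the concept as described here, not the identifier format the real lookup uses. The disc_fingerprint and lookup helpers and the catalog entry are all hypothetical.

```python
import hashlib


def disc_fingerprint(track_lengths_seconds):
    """Hash the track count plus each track length into one identifier."""
    parts = [str(len(track_lengths_seconds))]
    parts += [str(length) for length in track_lengths_seconds]
    return hashlib.sha1(" ".join(parts).encode("ascii")).hexdigest()


# Hypothetical stand-in for the online album database.
CATALOG = {
    disc_fingerprint([215, 187, 242]): {
        "album": "Example Album",
        "tracks": ["First Song", "Second Song", "Third Song"],
    }
}


def lookup(track_lengths_seconds):
    """Return album metadata for a ripped disc, or None if unknown."""
    return CATALOG.get(disc_fingerprint(track_lengths_seconds))


match = lookup([215, 187, 242])
print(match["album"])           # Example Album
print(lookup([215, 187, 243]))  # None
```

An exact hash like this only matches when every length agrees, so a real service would also need some way to cope with discs whose measurements differ slightly.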

45:36 Oh, that's awesome. Yeah. With like album art and all that. Yeah.

45:39 Man, Michael, I used to spend, I mean, you're in the audio world now too. Like I'm sure you know,

45:43 I used to spend so many hours, like, you know, getting those tags just right and still have typos.

45:49 And then this sweeps through and I press one button, 20 seconds later, it's just all perfectly

45:56 organizable and tagged and so forth. It made me feel like a fool, but also made me feel pretty magical.

46:00 Yeah. Well, I'm sure you appreciated it more than if you hadn't done that stuff by hand.

46:04 And then one day I find out it's written in Python and I'm like, that's going on the list.

46:07 Yeah, it's definitely on the list. All right. Switching from a music audio to video,

46:11 we have two editors, Flowblade and OpenShot. Flowblade does multi-track, non-linear video

46:17 editing for Linux and OpenShot is a cross-platform video editor for like many of the platforms.

46:22 Yeah, that one like supports FreeBSD and Windows as well. So with these ones, like, I think what's

46:27 interesting is that a video editor is a very large application. It's going to have to have a lot of

46:32 codecs and other things. And those things are very touchy based on what platform they're running on.

46:37 They're incredibly finicky. It's super painful to develop that stuff. I've done it.

46:42 Exactly. Talk Python training. So very finicky stuff. And one sure shot way to like, you know,

46:50 ruin it too, is like to miss a dynamic library in your package. So being able to go in there and look

46:57 at how they do their packaging. Sure, it's a little bit like, you know, dirty, but I'm sure that you're

47:02 already deep in the dirty if you're looking for how to do this kind of application freezing. So it's

47:07 really useful to have those to refer to. Those are great examples. For the graphics world,

47:11 we have FreeCAD for a general purpose CAD editor. That sounds pretty intense. And it's

47:16 awesome that that's in Python. So with FreeCAD, part of me was a little bit like sort of skeptical,

47:21 right? Like, could an open source community come up with something like CAD? Because I mean,

47:26 when you're dealing with the BIM world, right, like, you know, it's like construction and design and so

47:32 forth. It's, it's just so much. And who could possibly have the side resources to do this? But you know,

47:39 they're doing it, they have a wiki, and they don't seem to be giving up. It's pretty impressive. I read

47:45 some sort of third party reviews. And it basically says, like, look, this may not be like top tier,

47:52 right? Like it may not be ready for like full professional shop usage and so forth. But it's

47:57 surprisingly usable. I'm not much of a CAD user myself. So I'll just have to take their word for it.

48:02 Right? If somebody if somebody listening is a CAD user, let me know, let me know if I should expand my

48:06 description here, write a review, support Python CAD. Speaking of supporting, some of these have

48:10 fund next to them. Exactly. So the sorts of links that I collect for each ones, basically, I'll do

48:16 repo, home, Wikipedia, if they have it, GitHub, if they're not on GitHub, but they do have a GitHub

48:22 mirror, demo, if there's a version of it running that you can like just try out docs, which is like

48:28 usually a Read the Docs link or something similar. And I added fund, which is basically going to be like

48:33 a Patreon link or a PayPal link. If I can find a way to show that these people are making into this

48:41 current generation of like approach to open source, which is like, yes, please do pay me. So I can spend

48:47 more time on this, then I will absolutely link it here. Because they deserve all the help they can get.

48:53 Yeah, absolutely. I think it's great that you're highlighting that. The other one in graphics is

48:57 Quru Image Server. So I guess you give it like a large image and you can say I would like it at

49:02 right now I want it at 250 by 250 or whatever. And it just, you don't have to keep redoing it. It just

49:09 automatically regenerates and probably caches that. Yeah, exactly. You upload an image and then it can

49:13 scale it to all different sizes. They say they have all sorts of tweaks and optimizations to do that sort

49:18 of stuff efficiently and performantly. And yeah, it has caching and stuff too. That's a pretty tough

49:24 task. You know, like that stuff can be pretty expensive. So I was like, maybe if someone's

49:28 interested in performance tweaks for images and so forth, this would be a good project to go see

49:32 how they achieve that. All right, let's go through the games real quick. There's a game section. I'll

49:36 just quickly go through them. You can just give me maybe some general thoughts. Sure. There's Frets on

49:40 Fire X, which I'm guessing is like one of those sort of music things come down and you hit them and

49:44 then you've got the streaming platform is like a homemade Twitch, which is pretty cool. You got Lucas

49:50 Chess and Unknown Horizons, which is a cool strategy game. The game section is one that I definitely

49:54 wish was larger. A game is again, a very finicky, large application and one that's tough to undertake.

50:00 A lot of also there aren't a ton of open source and free ones. Like I know that Python was used in

50:05 one or two of the Civilization games for scripting, but like that's not open source. So, but Unknown

50:10 Horizons is pretty similar to that sort of thing. It is an RTS. It makes extensive use of Python. So I'm

50:15 happy to see that there. Frets on Fire is a little bit older, but from what I can tell,

50:19 it still works for some people. So I guess like Dance Dance Revolution never gets old or maybe,

50:24 I don't know. Is it a Rock Band type thing? It says frets. I don't know. It's something like that.

50:28 Something like that. Yeah. EVE, EVE Online is another good one to throw in there,

50:31 but obviously it's not open source. So, you know, you can't, can't really do that. Put it in there.

50:35 What else was Second Life? If you consider it a game, maybe, maybe you consider it life.

50:41 That's right. Maybe you do. All right. Under productivity, the top one here that I grabbed

50:45 is actually something I use to manage all of my servers and my infrastructure. And I love it.

50:49 Just love it. Glances. Yeah. It looks really good. I guess I don't manage enough servers to really

50:54 like me. I'm just like a, an htop user, but it's like htop, but better. Yeah. You get all sorts of

50:59 cool stuff. Like if I just run glances real quick, I guess I don't, there you go. It gives you graphs for

51:05 CPU, memory, swap, how much RAM you're using, like server load over one minute, five minute,

51:12 15 minutes. You can sort by memory usage, by CPU usage. It shows you like your disk I/O rates,

51:18 your network rate, all just like all sorts of stuff and great little hot keys for it. Yeah. It's,

51:23 it's like htop, but yeah, like all the stuff you want.
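
For readers wondering where figures like the one-, five-, and fifteen-minute server load come from, here is a minimal sketch using only Python's standard library. It is not how Glances itself is implemented; `read_load_averages` and `format_load` are hypothetical helper names, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def read_load_averages():
    """Return the (1 min, 5 min, 15 min) load averages, or None if unsupported."""
    try:
        return os.getloadavg()
    except (AttributeError, OSError):
        # Some platforms (for example Windows) do not provide load averages.
        return None


def format_load(load):
    """Render a three-value load tuple as a single status line."""
    one, five, fifteen = load
    return f"load 1m={one:.2f} 5m={five:.2f} 15m={fifteen:.2f}"


if __name__ == "__main__":
    load = read_load_averages()
    print(format_load(load) if load else "load averages not available here")
```

A monitoring tool layers CPU, memory, disk, and network readings on top of values like these and refreshes them on a timer.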

51:27 Yeah. I'll have to give it a shot.

51:28 It's pretty good. Let's see. We also have BleachBit for like privacy. So it cleans up some stuff out

51:33 of your system, but also I'm guessing it like rewrites every empty bit of space or something like

51:38 that. Just given the bleach part. I don't know. I used to work way back in the day, like in high

51:43 school for a brief time at like a computer repair shop in South Dakota, no less, but basically people,

51:49 they would come in with malware, spyware, all this stuff. And, you know, my computer's slow and I

51:55 have all these pop-ups. Yeah, exactly. I only installed nine toolbars on my IE 5.5 or whatever.

52:02 Like a 50 pixel bar in the middle where the actual content is.

52:06 Exactly. Oh man. You're putting too much of a, of an age on us here, Michael. Keep this content

52:13 evergreen. Anyways. All right. But basically what BleachBit is, I didn't realize it was written in

52:17 Python, but it is one of these cleaners that will go through and like, not just empty out your temp

52:22 directories and so forth, but also like clean up your registry because a lot of this software will make

52:26 excess registry writes. And I guess, especially back in the day that would slow down Windows.

52:31 So yeah, it removes like tracking things like increasingly that's become the focus to like

52:37 sort of clean out your browsers of like, you know, any weird Flash cookies and that sort of thing.

52:42 Cool. Yeah. That's great. Yeah. Last one on the productivity space is Gmvault, Gmail backup.

52:46 I guess we've sort of foregone, I mean, I personally have foregone some of this stuff in my Gmail,

52:51 but probably I should back it up, you know, I never know when big G is going to like, you know,

52:55 do something uncouth. Yeah. You never know. It is kind of nice to have that. I actually went and found

53:00 us, found out that if you go to the, the Google doc, just the Google data export, you can say export my

53:09 docs. And one of the problems, you can run Google Drive and stuff and it'll have like your docs from

53:15 Gmail in there, but they're just a hyperlink back to Google Docs. Right. But if you run the export,

53:19 you can say export it as Word documents, Excel spreadsheets and a PowerPoint and actually get

53:26 the content of it, not just links to it back in Drive. It's pretty cool. Yeah. This sounds like

53:32 this kind of sort of in that realm organization, a lot of archiving stuff in this world and library

53:38 bits. There's a funny one as well. Archivematica, digital preservation and then ArchiveBox,

53:44 which is like self-hosted sort of Wayback Machine. Exactly. Archivematica. It's an interesting one

53:49 because it's sort of like, I think it's sort of targeted at like libraries and actual like archives,

53:56 whereas ArchiveBox is a little bit more like guerrilla, like Yahoo says they're going to delete

54:01 GeoCities. Okay. Let's like, you know, pull it all down. Like it's sort of those two schools,

54:06 which are definitely adjacent, but a little bit different. Yeah. Similar to that, I guess we have

54:09 Open Library, which is a web application for like a library catalog. So if for some reason you,

54:14 you have a small little library, you know, you don't have to like start from scratch.

54:18 And you'd be surprised like, yeah, libraries, I was at a library recently, like it only was open a few

54:22 hours a week, but they had a tremendous collection and they could definitely use something like this

54:27 because they didn't have anything to search through. You know, you just had to walk the stacks.

54:31 That's crazy. And the last one, the one that I said was funny, it's called I Hate Money.

54:34 There are some people who are very into sort of like personal finance management,

54:39 like I guess people who like sort of picked up wanting to balance the checkbooks epigenetically

54:45 somehow passed down to them. And, but no, there's like this whole movement of people who do plain text

54:50 accounting and Git version their, like, you know, their actual financial books.

54:55 I think I Hate Money sort of comes from that domain combined with the self hosting domain.

55:01 Yeah. Interesting. They're like, we're not doing Mint, forget Mint.

55:04 This one seems to be about like sort of shared budgeting and stuff too. So this is like Fava

55:09 is the one that I was thinking of that I added recently, but I Hate Money is like basically

55:12 like you have roommates and you just want to keep track of who bought what when,

55:16 and it's like a little bit better than like a Google sheet.
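
The roommate scenario described here comes down to a small calculation, sketched below under stated assumptions. This is not I Hate Money's code; `balances` is a hypothetical helper, every roommate is assumed to owe an equal share, and `Fraction` is used so that amounts that do not divide evenly stay exact.

```python
from collections import defaultdict
from fractions import Fraction


def balances(purchases, roommates):
    """Return how far each roommate is ahead (+) or behind (-) an equal split.

    purchases is a list of (buyer, amount) pairs; roommates lists everyone
    sharing the costs, including people who have not bought anything yet.
    """
    paid = defaultdict(Fraction)
    for buyer, amount in purchases:
        paid[buyer] += Fraction(amount)
    share = sum(paid.values(), Fraction(0)) / len(roommates)
    return {name: paid[name] - share for name in roommates}


if __name__ == "__main__":
    result = balances([("ana", 60), ("ben", 30)], ["ana", "ben", "cy"])
    for name, amount in result.items():
        print(name, amount)
```

A positive balance means the group owes that person money, and a negative one means they owe the group.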

55:18 Yeah. Yeah. That's pretty cool. All right. So I'll, I'll speed through a few more here real quick.

55:22 So communication with Askbot, which is very similar to Stack Overflow.

55:25 Also quite interesting and SecureDrop, which is like a whistleblower submission system

55:30 for media organizations. These are cool.

55:32 If you're into self hosting or you have a team that you don't want to pay for Stack Overflow or

55:36 something like that, then you can like run your own Askbot. And SecureDrop is a really important

55:40 one. That one was originally written by like Aaron Swartz and it's managed by Freedom of the Press

55:44 Foundation. They just, I think it came out with a new release and yeah, that's huge for journalism in our

55:50 time.

55:51 Yeah, absolutely. One that I think is probably going to be really a welcome one for,

55:55 a lot of teachers and professors out there would be nbgrader. This is under the education section,

56:00 which is Jupyter based notebook, basically create assignments in there and it will grade them

56:05 automatically for you. That sounds wonderful.

56:06 Not quite that level of teacher myself, but I think that it's pretty cool idea. Like instead of having

56:11 just workbooks, you can actually give a live notebook, have them fill in some blanks, do some things,

56:17 and then like have them submit the .ipynb and grade that. Yeah.

56:22 Another one that I like to point out, I don't know if it's under education, but a lot of people

56:25 seem to know about it, even if they're not developers, but sort of like education related

56:29 is called Anki, A-N-K-I. And it's basically like a flashcard program. And I meet all these lawyers

56:35 and doctors. They're like, oh man, I would not have made it through school without Anki, right?

56:40 It's like as important to them as Wikipedia or something like that, because it's sort of like

56:43 spaced repetition memorization tool written in Python.
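
To make "spaced repetition" concrete, here is a deliberately simplified sketch: each time a card is remembered, the gap before the next review doubles, and a miss resets it to one day. This is an illustrative rule only, not Anki's actual scheduling algorithm, and `next_interval` and `next_review` are hypothetical names.

```python
from datetime import date, timedelta


def next_interval(interval_days, remembered):
    """Double the gap after a correct answer; restart at one day after a miss."""
    return interval_days * 2 if remembered else 1


def next_review(today, interval_days, remembered):
    """Return the next review date together with the updated interval."""
    new_interval = next_interval(interval_days, remembered)
    return today + timedelta(days=new_interval), new_interval


if __name__ == "__main__":
    when, interval = next_review(date.today(), 2, remembered=True)
    print(f"review again on {when} (interval now {interval} days)")
```

Real schedulers also track how hard each card felt, but the growing gap between reviews is the core idea.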

56:46 If you're going to do anatomy or something like that, right?

56:48 You got to memorize it all.

56:49 There's no rhyme or reason. It's just that's part of the bone. It's called that. So we're going to

56:53 learn that. Yeah.

56:55 There's probably some reason, but yeah, it's a lot of memorization.

56:58 You can't Wikipedia things during a surgery, I imagine.

57:00 Just hold real still. I'm just, I'm researching. Yeah. So the last one, speaking of this kind of stuff

57:07 is science that I want to dig into. And then I think we're going to be probably out of time for

57:11 touching on them. But I felt when I looked at the science area, I felt like there's about equal

57:16 amount as some of the other categories, but this is like really polished, really serious stuff. So we

57:22 have ASCEND, which is mathematical chemical process modeling. We have CellProfiler, which is

57:28 interactive data exploration of biological image sets. We have SageMath, which is a competitor to

57:34 MATLAB and Mathematica. Like these are, especially SageMath, these are real things.

57:38 No, SageMath is kind of a triumph, right? Like if you're maybe a MATLAB user or something like that,

57:44 you should definitely check it out if you haven't already. But yeah, all of these science applications,

57:49 they get a lot of usage from their like academic counterparts, student counterparts and so forth.

57:53 But we don't often think to go find them as exemplars for like applications we might want to build.

58:00 But yeah, there's some really interesting ones in here. Like CKAN was one that I like

58:04 jumped out at me because it is a data management system. So it's sort of like data hub that you can

58:09 host yourself. And so if you are running an organization like university or government,

58:15 you know, and you want to do anything with open data, you're going to need some sort of data hub,

58:18 some sort of portal for people to find that data and you need to manage your data on there.

58:22 And there's an open source one written in Python.

58:24 Yeah.

58:24 And I've done there's another one that's called Orange. It's like kind of like a component based

58:29 data mining software for graphical interactive data analysis and visualization, but you can train

58:34 machine learning models graphically. And it sort of has like a signal processing type metaphor. So

58:40 like you have this thing, and then you drag like a little curve into that thing and make like a little

58:45 flowchart. And then once it's working, you can export a Python program from that. And I've been using

58:50 that since I think 2012, 2013. So it's like from somewhere in Eastern Europe. And I think the professor

58:57 who like leads the lab that writes it, like I think I have his book as well, itself, like kind of a triumph

59:02 too. It also uses Qt 4 and Qt 5. So if you're undergoing a Qt transition with your application, might be an

59:09 interesting one to look at too.

59:10 Oh, wow, that is a super interesting angle.

59:11 Yeah.

59:12 Yeah. Okay, great. So I think we're probably out of time for diving into any more. But there's so much more

59:16 to cover. So there's the CMS category, the ERP category.

59:19 Business software.

59:20 Yeah, oh my goodness. SaaS and all that. So static sites. And then there's the dev super category,

59:26 I'm calling it because there's 129 items in there and a bunch of subcategories like source control

59:31 and stuff. And people can just go through the list and find it. I think this is great.

59:35 Yeah, I could go on for hours. I really could. Each one's more exciting than the last.

59:39 These are so good. All right. I think we should sort of leave it there and people can go and they can

59:45 explore the cover. There's probably about 250 we haven't even touched on at least. And then people

59:50 also out there who are listening, maybe they want to, they maintain one of these projects or they use

59:55 one of these projects. They want to recommend it to you. What's the story there?

59:59 I have a GitHub issue template. Jump in there, like, you know, make sure it actually fulfills the criteria

01:00:04 of being like maintained and so forth. Again, I'm not super, super strict on that. But like,

01:00:10 if one particular category is like really overpopulated and it's like the seventh link

01:00:15 shortener or something like that, I got to be a little bit decisive there. If it's something that

01:00:20 is pretty like undernourished category, right? Like, you know, more is better, the more merrier. So if you

01:00:25 have like a game that you know of written in Python, it'll probably make its way in.

01:00:28 Yeah, super. Maybe someday the game category will break up into like, tower defense and strategy or whatever. Who knows?

01:00:35 I really want to do a taxonomy, like sort of refactoring because some of these categories are

01:00:39 bursting at the seams. I recently added the storage category because I found all of these like,

01:00:43 database related things written in Python. I don't know, there's just so much.

01:00:48 Yeah, super cool. All right. Now before you get out here, the last two questions real quick.

01:00:53 Sure.

01:00:53 For you. When you write some Python code, what editor are you using these days?

01:00:57 Yeah, I'm still getting it done with Emacs. But you know, I'm open to experimentation. And one of

01:01:02 these days, maybe one of these other editors will get its hooks in me.

01:01:04 Yeah, cool, cool. Well, I'll keep asking you each time you come on the show.

01:01:07 And then notable PyPI package. You've got some good ones out there.

01:01:11 The thing that still dominates for me is Glom, right? So I have this Python package is called Glom.

01:01:17 You use it for deep getting into a dictionary, but also a variety of other things. It's sort of like

01:01:22 a data templating system. And more and more, it's becoming like a higher level programming

01:01:26 language almost. So right now we're building streaming support into it because a lot of

01:01:31 people have brought up like sort of the size of the data that they want to manipulate with Glom.

01:01:36 And so we got to have some sort of streaming metaphor in there. But I'll also point out that

01:01:41 a lot of these awesome Python applications are distributed through PyPI in some way. So I have

01:01:46 PyPI URLs on a bunch of them too. So shout out to them.
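
The "deep getting into a dictionary" idea can be illustrated with a small standard-library stand-in. With the real library the simple case is written `glom(target, "a.b.c")`; the `deep_get` function below is a hypothetical sketch that covers only dotted paths through nested dictionaries and is not glom's implementation.

```python
def deep_get(target, path, default=None):
    """Follow a dotted path such as "a.b.c" through nested dictionaries.

    Returns default as soon as a key is missing or a non-dict is reached.
    """
    current = target
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


if __name__ == "__main__":
    data = {"a": {"b": {"c": 42}}}
    print(deep_get(data, "a.b.c"))
    print(deep_get(data, "a.x.c", default="missing"))
```

The appeal of a dedicated library over a helper like this is in the other things mentioned in the conversation: restructuring data declaratively rather than only reading one value out.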

01:01:50 Yeah, yeah. Awesome. Awesome. All right. Well, final call to action. People want to get involved

01:01:54 in your project. What are the ways? I already touched on some of it for submitting stuff,

01:01:58 but what are some of the ways people can get started or get involved?

01:02:01 Yeah, definitely check out the repo on GitHub. Check out the listing in the readme. Check out that

01:02:06 Jupyter notebook that's in there. Look for the RSS link as well. If you're an RSS reader user like me,

01:02:12 a bunch of interesting Python RSS readers. If you're not, you can just download one on that page too.

01:02:17 And yeah, then submit some more applications and also keep your eye out for the PyBay talk and probably

01:02:24 the PyGotham talk. All right. Excellent. Well, this was so much fun to talk to you about all this stuff.

01:02:28 And, you know, nice work on having this angle because I don't feel like it was covered and you've done a

01:02:34 really good job. It seems like it's taken off. It's got almost 10,000 stars on GitHub.

01:02:39 Yeah. I mean, and it's not really about that, but I am happy to see that it sort of gets out there

01:02:44 more. Speaking of giving it more coverage, I did technically start a YouTube channel. You can like

01:02:48 and subscribe if you'd like. It's yak.party, Y-A-K dot party. Maybe I'll post some APA stuff to it

01:02:55 in the coming future once I'm done with these talks.

01:02:57 That sounds great. All right. Well, thanks for being here as always.

01:03:00 Absolutely. Anytime, Mike.

01:03:01 You bet. Bye.

01:03:02 This has been another episode of Talk Python to Me. Our guest on this episode was Mahmoud Hashemi,

01:03:08 and it's been brought to you by Linode and Tidelift. Linode is your go-to hosting for whatever you're

01:03:14 building with Python. Get four months free at talkpython.fm/linode. That's L-I-N-O-D-E.

01:03:20 If you run an open source project, Tidelift wants to help you get paid for keeping it going strong.

01:03:25 Just visit talkpython.fm/Tidelift, search for your package, and get started today.

01:03:32 Want to level up your Python? If you're just getting started, try my Python Jumpstart by

01:03:36 Building 10 Apps course. Or if you're looking for something more advanced, check out our new

01:03:41 async course that digs into all the different types of async programming you can do in Python.

01:03:46 And of course, if you're interested in more than one of these, be sure to check out our

01:03:50 Everything Bundle. It's like a subscription that never expires. Be sure to subscribe to the show.

01:03:55 Open your favorite podcatcher and search for Python. We should be right at the top.

01:03:59 You can also find the iTunes feed at /itunes, the Google Play feed at /play,

01:04:03 and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much

01:04:11 for listening. I really appreciate it. Now get out there and write some Python code.

