#234: Awesome Python Applications Transcript
00:00 Have you heard of awesome lists? They're, well, pretty awesome, gathering up the most loved
00:05 libraries and packages for a given topic. While most lists cover awesome developer tools and
00:10 libraries, we don't have many examples of awesome applications, both for use and as examples to
00:16 draw from. That's why Mahmoud Hashemi decided to create awesome Python applications, and you're
00:21 about to dive headfirst into them. This is Talk Python to Me, episode 234, recorded September 24th,
00:27 2019.
00:28 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the
00:46 ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter,
00:51 where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm,
00:55 and follow the show on Twitter via @talkpython. This episode is brought to you by Linode and
01:01 Tidelift. Please check out what they're offering during their segments. It really helps support the
01:05 show. Mahmoud, welcome back to Talk Python to Me.
01:07 It's good to be back.
01:08 Man, it's great to have you back. It wasn't that long ago that you were on Python Bytes,
01:11 and we still didn't get a chance to catch up because you were covering for me.
01:16 Yeah, but it was a great time. Yeah, happy to fill in any time. But really,
01:19 who could fill in your shoes?
01:20 Oh, man.
01:21 Thanks for that. You've been on the show before. I think the first time was quite early in the show's
01:27 history. You came to talk about a really interesting topic, Enterprise Python, right? Python being used
01:34 within the enterprise. We talked about a bunch of examples of that. And I feel like this is the
01:39 open source equivalent of that story just a little bit.
01:42 I was thinking about that today. Yeah, you're right. It is kind of similar,
01:46 different in a lot of ways. I think a lot more fun, but we'll get to that part. But yeah,
01:50 that was back in 2016. Time flies.
01:52 Yeah, time does fly. We've been at this stuff for a while.
01:54 I was at PayPal.
01:55 Yeah, that's right. And yeah, I mean, I guess that's probably a good time to just ask you,
01:59 what have you been up to these days? What's going on in your world?
02:02 I sort of had my fill of moving around different teams at PayPal, seeing how that whole business worked.
02:08 And it was cool to work in the enterprise. And I wanted to sort of like stay with that. So
02:12 I wanted to like kind of grow my own autonomy, as well as like just sort of see how startups worked.
02:19 I am in Silicon Valley, after all, you know, and PayPal is a startup in many ways, but it started
02:24 in like, what, 1998. So it's mostly started.
02:27 Yeah, it's mostly started at this point. You'd be surprised in some ways. But yeah.
02:31 Anyways, but basically, yeah, so I went to like a Series C company, did that for a while. It was
02:37 pretty cool. And then there's not too many teams, you can move around there. And like,
02:41 you know, so once you've learned it, you kind of learned it. And then I was like, okay, well,
02:45 let's come to a Series A company. And so for the past couple years, I've been at SimpleLegal
02:50 here in Mountain View. And yeah, it's been really good. When I joined, the team was like four people
02:57 and, or rather, we had like four engineers. And then now I think we're at like a dozen,
03:01 got a small team. I'm like principal engineer, do a lot of like code review,
03:05 architecture review, but also a fair amount of coding. And yeah, just having a blast.
03:10 That's cool. That looks like a fun product or service to work on. Is there a lot of Python
03:14 happening?
03:15 Best part, right? Like it's, it's not like PayPal where you're jockeying, like, you know,
03:19 for different technologies and so forth. It's like what we say goes, right? And we pick the
03:23 best technology, Python, Postgres, everything we love. So we got that autonomy down.
03:29 But like, yeah, just to be clear, it's not like the most exciting company on the outside.
03:34 But I always tell people this, like, try to go for a boring company and then make it fun. Do it in
03:40 like, choose a boring company so that you can do things in a fun way. If you choose something
03:46 that's too exciting on the outside, then your day to day is just going to be a blur of boring stuff.
03:51 So yeah, we've done a lot of open source here. And I could go on about it for hours. But we got
03:57 other things to talk about.
03:58 Okay, some awesome stuff to talk about for sure. Yeah, I think there's a lot to that,
04:02 you know, people see sort of the company as the exciting thing, right? Like working for Tesla
04:08 or something like that. But, you know, if what you end up doing is writing C code to do a bunch of
04:14 internal boring stuff, well, then like, it doesn't matter how exciting the company is. That's not that
04:17 much fun.
04:18 No, exactly. Just bash scripts to shovel around logs. That's not exciting to me.
04:22 It doesn't matter if the logs are...
04:23 You know, I don't care how exciting the brand is.
04:25 Yes, exactly. Exactly. Cool. All right. Well, let's talk about some awesome things. Now,
04:32 there's been this history of awesome lists. Do you want to maybe just tell people about this,
04:37 this idea of awesome lists? I don't know how much of the history you know, but I know that you
04:41 must be involved because you've created one.
04:42 I'm actually not that great of an expert on these things because sort of when they started popping
04:48 up, I was already pretty well versed in, for instance, awesome Python. It's a huge repo,
04:53 like tons of contributors, been around a long time, got tons of content, and it's quite awesome in many
04:59 ways. But I rarely like refer to it or find much new there because I was sort of like, you know,
05:06 already along my Python path. But I'm sure it's a great resource for people out there. I refer
05:10 people to it all the time. So yeah, but this awesome list thing is a phenomenon. It's like a
05:16 meme on GitHub where it's like, I'm going to make a list of links, which are awesome for some definition
05:21 of awesome.
05:22 Yeah, for some definition of awesome. Exactly. And I don't even think it originated with Python. I feel
05:26 like there was some PHP stuff going on.
05:28 No, definitely not.
05:29 Yeah, awesome Python is really good. People should check that out. That's interesting. There's a bunch
05:34 of good libraries there. And there's also more focused ones. Like there's, I recently ran across
05:38 awesome ASGI for async server, like basically for async web frameworks and other AIO type of things,
05:47 asyncio things.
05:48 Yeah, it's just the default architecture for like linked content on GitHub, you know, if you want it
05:54 to be approachable. But yeah, there's this guy, like, Sindre Sorhus, and basically, yeah, I'm pretty sure
06:00 that he's like a Node guy. Yeah. Anyways, he like has this sort of awesome authority. And it's like the
06:06 meta list. It's the list of lists that point to all the other awesome lists.
06:10 Oh, how cool. I didn't know there was an awesome list of awesome lists. That's super meta.
06:14 Oh, well, that's why they all have this badge with the cool sunglasses on it. And you know,
06:19 anyways, that's got like 120,000 stars, which is like, you know, it's good. Like, like, frankly,
06:24 so much of how we learn as a community is driven through streams of content that has to be manually
06:31 republished out. I'm thankful for anything out there that sort of constitutes a reference,
06:36 you know, that like a sort of institution I can go and look at rather than refreshing a Twitter stream
06:41 and hoping that somebody like has said something of value to me. Awesome lists are, in the end, on
06:48 net, pretty awesome.
06:49 I would say, yeah, they're aptly named. And I do like them. I feel like they're a little bit
06:53 vetted, right? It's not just a Yahoo of all Python things, right? It's not like PyPI or whatever.
06:59 It's these are the things people found to be extra good in these categories. I guess it's kind of like
07:04 Yahoo a little bit anyway, early, you know, 1996 Yahoo. So maybe tell us about your awesome list and like
07:10 how you came to come up with it and things like that.
07:13 Sure. So yeah, I didn't really set out to create an awesome list. I sort of backed my way into it.
07:20 So I started a couple of years back, like in I think 2017 or so, I started like giving conference
07:24 talks. I was already going to meetups and it was sort of the natural next step. Start giving talks.
07:29 I recommend everyone do the same. It's a good way to branch out.
07:32 Giving talks is a great step in like raising your profile and just breaking out of the mold of I'm
07:38 sort of an anonymous programmer, right?
07:40 Yeah. And just like writing blog posts, for instance, like, you know, once you actually sit down to
07:44 cover a topic, it exposes a lot of your own gaps. And so you end up filling in a lot of your
07:49 own knowledge. Yeah.
07:50 Just for personal benefit, I recommend it. So basically I was giving these talks and I was
07:54 covering topics like performance and packaging and testing and plugins and a bunch of architecture
08:00 stuff. And the thing is that like, like I just sort of alluded to, oftentimes when I'm starting out with
08:06 this talk, like I have some degree of like expertise in the area, but also I'm doing a lot of learning
08:11 myself. That's one of the reasons I like doing them. And so afterwards, people would have all these
08:15 questions, you know, I'd take it as a good sign. They thought the talk was good. They thought I know what I'm
08:18 talking about. That's great. But the fact is that like, when they're asking me about packaging
08:23 or plugins or, you know, performance, it's like, well, I only have my 10 years of experience to draw
08:29 from and I won't know every single answer, you know, to apply to their situation. I want to,
08:36 I desperately want to, but that's just not possible.
08:38 You can't always go through it, right? Like if you're packaging up an app and you know,
08:41 well, I got cx_Freeze to work for me. So now I can make it.
08:44 It's like, well, why are you going to go try all these other ones when you have one?
08:49 Exactly.
08:49 You've got a life to live and other apps to build, right? But other people can share their
08:53 experience and say, Oh, did you know about PyOxidizer? You're like, wait, what's that?
08:57 I certainly want that a hundred percent completion like stat, but it's just not going to happen. So
09:01 one thing I sort of wished I had was basically a way to refer them to a known working example,
09:09 just because it was available at the, at the conference at PyBay a couple of years back,
09:13 like the Zulip team was there and Zulip is like this chat application written in Python,
09:18 sort of like, it's sort of like a Slack, but kind of like blends in some email features. It's a pretty
09:24 cool design and they have a fully working application with a great community, a good onboarding process,
09:28 good docs, all this stuff. And so I'm like, look, this is a great exemplar. If you're writing a Django
09:34 server application and you're interested in say, introducing typing, they recently did that process and
09:40 you can go look at how they did it and you can get a lot better answer from looking at exemplars than
09:45 you can, you know, from me, just spur of the moment. Right. Or even just some kind of talk where people
09:50 just talk about the concept of type annotations or type hints, but here's actually the GitHub
09:55 conversation and the issues and the PRs. And then this is the before and after and the trade-offs. And
10:01 it's just like a white paper, real world example of that. Right. Yeah. And the living maintainers who
10:08 like could answer those questions too. There's so much to draw from there. And so I was like,
10:13 okay, I'm going to make a list of Python applications just for me so that I can easily
10:19 refer to them. Like just off the top of my head, I was able to get together around 20 that I considered
10:25 pretty awesome. Can you think of like, I mean, Python is the biggest language in the world right
10:29 now, basically. So like, can you think of a few?
10:32 You would think that I would be able to start naming them. The real tricky part is these have
10:37 to be the open source ones, not just things written in Python. So Wagtail, for example,
10:43 is a CMS in Django, which is pretty nice. Reddit, Reddit is one. Trying to not cheat and think about
10:51 what is on your list, but these types of things.
10:54 How about you, the listener? Can you think of an application that is written in Python?
11:00 Okay. Time's up. Well, it turns out that there are more than the 20 I could come up with.
11:07 And when I started looking, it was sort of hard to find them. But once I found them,
11:12 like I sort of found myself in this sort of addictive cycle, like just always looking for more
11:17 because there was so much creativity there and often sort of underappreciated. So something I became
11:23 aware of pretty early on when I was developing my own, like, you know, open source Python application,
11:26 there's this contest that's actually happening now. It's called Wiki Loves Monuments. And
11:31 Wiki Loves Monuments is like a photography contest around the world. It's like the Olympics of
11:36 photography for free culture. And they needed a tool that would allow them to judge entries. So
11:42 I made one of those and, you know, had a nice team. There was some stuff we had to figure out on our
11:48 own. But as I was like building this, I was finding some other applications and they
11:52 had like a couple dozen stars, you know, and it's because GitHub is a place for developers.
11:58 It's for people who want to build software, not necessarily use software. Like, I don't know if
12:03 you remember the old days with like Tucows and CNET and Download.com.
12:07 Tucows was awesome. You were like, I need a thing. I mean, it was sort of the free and open app store,
12:14 right?
12:15 Right. It was freeware. It was shareware. It wasn't really like free software in that,
12:19 you know, it's the beer or whatever instead of the, you know, speech. But yeah, anyway, so basically,
12:25 I sort of like got a little bit addicted to finding these awesome applications, because
12:30 a lot of them were sort of diamonds in the rough, not getting as much appreciation as they probably
12:34 deserved. I mean, their users loved them, but not necessarily GitHub users.
12:38 Right. Well, GitHub's not exactly. Yeah.
12:41 It doesn't have the right incentives or the right sort of connective structure there, because GitHub is
12:46 all about connect. I feel like it's all about connecting you with libraries, either code you work
12:51 on or libraries you want to put into your application, but not just end user things that you could study or
12:57 make copies of, right?
12:58 Exactly. And so I ended up with like this sort of five point rubric for the things that I was looking for.
13:04 Basically, like it had to be free software. It had to have an online source repository. It's not much of a
13:08 reference if you can't see the up to date code that is actually shipped. It had to be using Python for a
13:14 significant part of its functionality, not necessarily pure Python. Let's be real, like realistic,
13:19 pragmatic applications are going to have a mix, you know, JavaScript, HTML, CSS, going to have to have
13:25 them. So until browsers run Python anyways. So yeah, then it had to be well known, or at least
13:31 prominent in its identifiable niche. There's some cool applications out there for like neuroscience and
13:37 maybe neuroscience people love them, or something like that doesn't have to be world famous. But
13:44 basically, it has to be maintained. This is another like really important thing. A lot of old wiki pages,
13:50 they sort of get stale, no one's checking the links, and it just ends up being sort of a sad graveyard
13:54 after a few years. So these projects actually have to be maintained, the links have to be up,
13:59 they have to be functional on relevant platforms at the very least.
14:03 I've seen some of these types of applications you're talking about. And I'm like, found it on
14:06 GitHub, like, Oh, why is no one talking about this? This is great. And it says click here for a live
14:10 demo. And you click it. And it's 500. Crash. You're like, exactly. This cannot be getting that much love.
14:16 I'm setting out to sort of like make a list that doesn't do that. I'll get to that in a second. But
14:21 most importantly, and you alluded to it earlier, it's just that they have to be shipped applications,
14:26 not libraries or frameworks. So one of the first things I did was I went on the awesome Python list.
14:31 I'm like, surely there's some applications on this list I can use as exemplars. And I think I found like
14:35 three, right? Like Jupyter notebook. And it's not a huge surprise that Jupyter notebook's written in
14:40 Python. But awesome Python is all about the libraries. And I wanted a list of applications,
14:45 hence, awesome Python applications.
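As a rough illustration of the criteria described above, the rubric could be written down as a simple checklist. This is only a sketch: the `Candidate` class, its field names, and the `failed_criteria` helper are hypothetical names chosen for this example and are not taken from the actual Awesome Python Applications tooling.

```python
from dataclasses import dataclass, fields


@dataclass
class Candidate:
    """Hypothetical record for one application being considered for the list."""
    name: str
    is_free_software: bool
    has_online_source_repo: bool
    uses_python_significantly: bool
    is_known_in_its_niche: bool
    is_maintained: bool
    is_shipped_application: bool  # an application, not a library or framework


def failed_criteria(candidate: Candidate) -> list[str]:
    """Return the names of the rubric checks this candidate does not pass."""
    return [
        f.name
        for f in fields(candidate)
        # Only the boolean checks can be False; the name field is a string.
        if getattr(candidate, f.name) is False
    ]
```

A candidate that passes everything returns an empty list, while a library with no recent activity would come back with something like `is_maintained` and `is_shipped_application` flagged for a human to review.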
14:48 It's really good. And there was not really anything out there like this. And I think it's great.
14:53 When I was looking through it, you know, I expected a lot of web applications, and I'm sure we'll find
14:59 them and we'll talk about some of them. But there's also desktop GUIs, terminal, sort of ASCII type of
15:06 app, you know, ASCII, what are they, curses-type applications, a lot of variety there.
15:11 There's so many ways to taxonomize it. And I could get into that if you'd like, because I've spent hours
15:15 mulling over it. But basically, we have the basic topic, like, you know, breakdown. But then we also got
15:21 sort of server versus like, if it's a server, like software, or if it's sort of like client or GUI
15:27 software. And then as far as the console is concerned, right, you have CLIs, which is just a
15:32 command line. And then you have a TUI, which is a text user interface, somewhat lesser known term.
15:38 And then you have sort of the interactive console style. Those are the three that work within the
15:43 terminal. And then you have like sort of some other interesting architectures out there too,
15:47 which I'm like, yeah, we'll cover it later. But the good news is that when I launched it on accident
15:54 on Brian Okken's Test & Code, is that I basically had found 180. Like I was turning over rocks. I was
16:02 like looking on Bitbucket, Launchpad, you know, this isn't just GitHub and Git, it's like Bazaar,
16:08 Mercurial, there's like Kallithea and Pagure, and like, which I hadn't really heard of, but I found
16:14 applications hiding on there. And of course, like all these GitLabs out there that people self host
16:19 too, because there are all these great like sort of sub communities, the Debian sub community,
16:23 the Fedora, Red Hat, Gnome, and they all have their own cool organizational support and community. And
16:29 it's really interesting to just not just find the application, but also that community. So yeah,
16:34 we got 180 when I accidentally launched it sort of at the beginning of this year, or maybe like a little
16:38 bit before, like Merry Christmas to the Python community, I guess. But yeah, so I published a
16:44 blog post once that sort of started picking up, and that's still up on the site. So if you go to my blog,
16:49 like sedimental.org, it's just up there. I'm sure if you search awesome Python applications, it'll come
16:54 up. For sure. And I'll link to it in the show notes. I haven't blogged since then, because pretty much all
16:58 of my like, free time for content creation has just been going toward curating this. And I've been
17:04 learning a ton. I need to find a better way to share it. I guess that's sort of like why I'm on
17:07 the show. But the point is that at the beginning of this month, I was at something like 250 ish. And
17:13 I guess I should say at the beginning of September 2019, I was at the top of around 250 applications.
17:20 And now I'm at like 312. I'm aiming to get to 350 by end of September 2019. I say 350, not because it's a
17:29 particular goal, but because I have like a list of candidates that I need to evaluate.
17:34 A lot of them are community, like, sort of submitted. It's really, like, very time consumptive to find these.
17:39 So if people want to, you know, help me out, that'd be great.
17:42 Yeah, that's cool. So yeah, I think I'm going to do a PR for docassemble for you, which is a legal
17:48 interviewing software based, I think it's based on Django.
17:51 I just looked it up and it actually looks very impressive. It's exactly the sort of polished
17:55 application that I think needs some more open source love. Anyways, I mean, one might think besides
18:00 my personal addiction, like why, why do this? And I have sort of like three goals that I'm trying to
18:07 hit, or at least trying to fulfill. I'm not sure if they'll ever really be hittable, they're maybe not
18:13 like measurable enough. But I really want to like goal number one is I really want a better development
18:20 cycle. You know, like someone ideates idea for an app, you know, they go basically do maybe django-admin
18:29 startproject, or they open up a blank editor, and they just start writing it and they start searching
18:35 on Stack Overflow almost immediately. You know, maybe they find a tutorial that gives them a to do or
18:39 something to get just barely get off the ground. But beyond that, it's all first principles.
18:45 And I just really want people to sort of benefit from all of the other discovery that these like
18:52 application developers have created. The way I put it in my PyBay talk, like last month, was that
18:57 Python is the biggest language in the world right now. But how do you personally benefit from that?
19:01 Yeah, you know, there's all this stuff out there. I mean, there's a bunch of great libraries we
19:05 benefit from. Absolutely. I do think one of the differentiators between a beginning programmer or a
19:12 beginner in an ecosystem. And somebody who's very experienced, some form of expert, I guess,
19:19 is that a lot of times the beginner will start coding and think everything has to be created from
19:25 scratch. Right? I need to, you know, load CSVs. So let me just like read the text of the file and start
19:33 splitting on, you know, commas or like weird stuff like that. Right? Whereas the more experienced person's
19:39 like, well, you know, pandas will read it, or there's also a CSV module in the library, like,
19:44 right, just use that. And I feel like this, it kind of helps the beginners close that gap to say,
19:51 I'm looking like I want to build something like that. Let me see what they did, right? Let me see
19:55 how they structured their file system, and they organize their code. And are they even using celery?
20:00 I heard I have to use celery. Do I have to use celery? I don't know. That's sort of the thing. If
20:03 you go on awesome Python right now, sure, you're going to find like hundreds of libraries. But
20:07 are you supposed to use them all? Like, what ingredients go into making sort of a complete
20:12 version of the app that you're sort of trying to build of that architecture? Yeah, I think that
20:17 basically, like my longer term goal here is like to have sort of a decision tree type interface.
20:23 So you can say like, well, look, I want to build a web application, you know, and then like,
20:27 you can sort of say, oh, I wanted to support this many users, or I wanted to basically use,
20:33 say, SQLAlchemy, or have a Docker image or something like that. And you can find your way to,
20:40 like an application that looks enough like the thing that you want, that you can just pull things
20:45 wholesale from it. And basically, not only get an application sooner, but also learn the best
20:52 practices without having to go through dozens of hours of conference talks and blog posts.
21:21 Where your users are, there's a data center for you. Whether you want to run a Python web app,
21:26 host a private Git server, or just a file server, you'll get native SSDs on all the machines,
21:31 a newly upgraded 200 gigabit network, 24 seven friendly support, even on holidays,
21:36 and a seven day money back guarantee. Need a little help with your infrastructure?
21:40 They even offer professional services to help you with architecture, migrations and more.
21:44 Do you want a dedicated server for free for the next four months?
21:47 Just visit talkpython.fm/Linode.
21:51 Compare this to the cookie cutter library of templates for us.
21:57 Yeah. I mean, that's actually, I hadn't really thought of that, but that's a very good point.
22:01 In a way, cookie cutter is trying to codify these architectures as well. I guess not to put too
22:07 fine of a point on it, right? But I mean, I have looked through 300 of these. I mean,
22:10 I've looked through over a thousand applications and very few of them, like, still bear the marks of
22:15 having started as a cookie cutter application. I think that cookie cutter certainly has its place
22:20 for getting a kickstart, especially for people who maybe don't want to write their own setup.py or
22:26 something like that. But yeah, I'm not sure that it's going to be enough architecture to really get
22:31 you completely off the ground. Like there are emerging technologies and containerization around, like,
22:37 Flatpak and, say, AppImage and Snap and so forth, especially for these GUI applications on Linux,
22:45 just as an example. And maybe there's a cookie cutter for something like that. I don't think it's been
22:51 done enough times for it to like show up on that radar. And so basically you'll have to look for the
22:58 existing community within the existing, I mean, it's very hard to call the community actually,
23:02 within the existing ecosystem, like who has actually adopted this. And so that's like,
23:09 you know, I think I'll get to that in a minute here, but I did clone every single repository,
23:13 clocked in at around 30 gigabytes for 250 repos. I sort of ran a bunch of analysis and technology
23:20 discovery within it to figure out who's doing what. That's cool. Yeah. Like, in fact, let's just like
23:24 jump right into it because I'm, I've got these numbers and I just got to sort of get them out
23:28 there. Sure. Let me throw one quick comment though, on the cookie cutter bits while you're pulling up
23:32 the numbers. I feel like cookie cutter is great to help people jumpstart. And the idea is like,
23:37 you kind of get this prepackaged app, which is great. Like here's a way to do flask that already
23:41 talks to a database and a queue, or it already sends email or something. Right. That is a long ways away
23:47 from here's an actual application, right? Even when, as a developer, when you're building an
23:52 application, you think, Oh man, I'm almost done. I'm 80% done. You're like less than half done,
23:58 right? There's all these little edges you have to smooth off and these like edge cases that you don't
24:02 think about. And in these, these are real applications that shipped, which means they're way more polished.
24:09 And they're just, I don't even know that how comparable it is. They're both good for people
24:13 starting, but I feel like this is a much different set of things in the catalog.
24:18 And to be clear, they're polished for the end user. What I think is really interesting about them too,
24:23 is like all the rough edges they have. That can be really good for a programmer's ego to be like,
24:27 look, this worked. Okay. Like, I don't need to overthink this. Like, I'm just going to say,
24:32 this is good enough. If it's good enough for them, it's good enough for me. Ship it.
24:36 Yeah. It's easy to get hung up on like trying to make it perfect or trying to set up like infinite
24:41 scalability. So it's like Google and you're like, you have no users yet. Forget scalability.
24:45 Just get it out. Right? Exactly. And so I got my numbers up. Let's hear them. So yeah,
24:50 the quick methodology here is I just cloned every repo that's Mercurial, Bazaar, Git,
24:55 pulled them all. I, you know, ran some SLOCCount and like other analytics over the version
25:01 control history. We're looking at 19 million lines of code, 2 million commits with around 50,000
25:09 committers. That's an insane amount of effort. Like that is huge. And this is Python code.
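To give a feel for the line-counting part of that methodology, here is a minimal sketch in Python. It is an assumption-laden stand-in, not the tooling actually used for the list: the function name `count_python_lines` is made up for this example, and it simply counts non-blank lines in `.py` files, which is much cruder than a dedicated source-line counter.

```python
from pathlib import Path


def count_python_lines(root: str) -> int:
    """Count non-blank lines in every .py file under root.

    A crude approximation of a source-lines-of-code count: comments and
    docstrings are included, and only Python files are considered.
    """
    total = 0
    for path in Path(root).rglob("*.py"):
        # errors="ignore" keeps the sketch from crashing on oddly encoded files.
        with open(path, encoding="utf-8", errors="ignore") as handle:
            total += sum(1 for line in handle if line.strip())
    return total
```

Run over a directory of cloned repositories, this yields one rough total per tree. Commit counts would need the version control tool itself; for Git, `git rev-list --count HEAD` reports the number of commits reachable from the current branch.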
25:14 I think I added it up and it was like hundreds of years of like maintainership or whatever you want
25:20 to call it, you know, like sometimes you go on maybe steam or something like that. And it sort of
25:24 aggregates for you how many hours was played on the game. And you sort of feel like you feel a little
25:28 bit terrible about humanity perhaps, but we all got to have fun anyways. But this, like I had like sort
25:35 of the opposite feeling, right? Like this, I think I actually experienced the true meaning of an awesome
25:41 list. Like I was in awe at the amount of like, you know, stuff I was looking at. And I sort of like
25:48 broke it down. So real quick thing, like about a third of them are server software with 64% being
25:54 like sort of desktop or CLI and just sort of like a quick breakdown there. I think the scariest thing for
26:01 me was basically Python compatibility, because we hear a lot of like doom and gloom at times,
26:08 but like contrasted with like a lot of excitement about Python 3. And I was worried that like, you
26:14 know, these maintainers are stretched so thin, they have not had time to add Python 3 support. And here
26:19 we are with something existential facing the Python application community. And I was very pleasantly
26:27 surprised to find that two thirds of these applications already support Python 3, like they run on Python 3.
26:33 That's great. It was a huge relief. This isn't the same thing as a library supporting Python 3. This is
26:37 them having converted a code base that is often quite large to Python 3, many of them. They didn't start
26:43 this way, because I think the average or sorry, the median application on the list is something like 10
26:48 years old. And some of them much older than that, like Mailman, the thing that Python uses and many other
26:54 organizations use for managing their email lists, that's written in Python, and it's made the jump,
27:00 you know, it supports Python 3. And that's pretty impressive. And if you look at sort of the
27:06 compatibility over time, you do see the, like, Python 2 applications kind of tapering off. You haven't
27:11 had another Python, like I think the most recent Python 2 application was started in something like 2017,
27:16 versus, you know, Python 3 applications, which are being started today. Right. There's probably a bunch more Python 3 apps starting now than there are Python 2,
27:25 I would expect.
27:26 You see a nice healthy trend there. Thankfully, I was super relieved.
27:29 I'm sure you'd either have to be crazy or depend on a library that is Python 2 only deeply to start
27:38 a Python 2 app these days.
27:39 And those are rarer and rarer. So it's a little bit harder to justify. There's still some interesting
27:43 case studies in here. So for instance, I think OilShell actually converted to Python 3 and then back to
27:49 Python 2. And he has a whole like, you know, blog post about why he did that, the lead maintainer
27:56 there. So that one's an interesting case study. And I think that's sort of what I hope to like,
28:01 one of the deliverables I hope comes out of the list is just like all the interesting case studies.
28:06 I think we'll get to those too.
28:07 Yeah, there could be a bunch of stuff coming out of there. I think, you know, another thing I think would be valuable, not that you don't already have enough going on,
28:14 but would be some kind of newsletter where just the stuff that gets added that month or something
28:19 comes in and just gets thrown out. That'd be great.
28:21 I'm too much of an engineer for that, Michael. That's why I've added an RSS feed.
28:26 You know, we just pull the data ourselves.
28:28 Yeah, exactly. Just pull the data yourself, right? If you want to do a newsletter and all that,
28:31 I fully support that. But yeah, I'm going to take a technical approach to this social problem.
28:37 Okay. It would be good to get there. It would be good to get there once the list sort of maybe
28:41 starts like plateauing a little bit more. And also once, yeah, once I get some more stuff in place to
28:46 keep the quality high, you know, because some of these projects might go like into an unmaintained
28:51 status and I'm not going to be the one vetting 350 projects every month, you know? So I need to
28:56 automate some things first.
28:58 How much automation could you do? Like, for example, every one of these has a link to their repo.
29:03 Could you look for unresponded to open issues or the lack of a commit for like one year puts it onto
29:12 like a warning list and then some contributor could come in and go, this looks suspicious.
29:17 That's almost exactly what I have planned. And I sort of have, I have a little command line
29:20 application I built for managing awesome lists. It's surprising how many awesome lists are just
29:25 manually maintained via PRs. But yeah, all my stuff is in a YAML file. I think I have something like
29:30 a thousand links in there. All of them are going to need to be checked for 404s and stuff like that.
29:34 But yeah, I do sort of have an informal bar of you need to have a commit in the last year.
29:39 There are exceptions to that. So for instance, you mentioned Reddit earlier. Reddit is a massively
29:45 important Python application for the internet. And they only have an archival version of the site up
29:51 from around 2017. But still, I mean, like, it was still a very big site in 2017 with a lot of
29:57 like real lessons. It is such an important site that I think it clearly deserves to be on there.
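The "commit in the last year" bar, with exceptions, can be sketched roughly as follows. This is a minimal illustration, not the actual tool discussed here: the project entries, the dates, and the `EXEMPT` set are all made-up placeholders, and the real list's YAML schema is not shown in this conversation.

```python
from datetime import date, timedelta

# Hypothetical entries; names and dates are placeholders, not real data.
PROJECTS = [
    {"name": "active-app", "last_commit": date(2019, 9, 1)},
    {"name": "quiet-app", "last_commit": date(2018, 3, 15)},
    {"name": "archived-but-important", "last_commit": date(2017, 10, 1)},
]

# Projects that stay listed even without recent commits (assumed mechanism).
EXEMPT = {"archived-but-important"}


def find_stale(projects, today, max_age_days=365, exempt=EXEMPT):
    """Return names of non-exempt projects with no commit in the last max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        p["name"]
        for p in projects
        if p["name"] not in exempt and p["last_commit"] < cutoff
    ]


# Candidates for a "warning list" that a contributor could then review by hand.
warning_list = find_stale(PROJECTS, today=date(2019, 9, 24))
print(warning_list)
```

Keeping the exemptions as explicit data rather than special cases in code means a human still decides which archived projects deserve to stay.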
30:01 It actually surprised me that it was there. There are a couple other surprises in there too,
30:05 I think. I'll just sort of refer the audience maybe to the repo where we have a Jupyter notebook that has
30:11 all of the graphs in them. Like some of the like numbers I'm going to cite to you now are going to
30:16 be out of date in just a week or two when I give a PyGotham talk because I'm going to run it on 350
30:21 applications instead of 250. But we'll keep the Jupyter notebook in the repo up to date. But suffice
30:28 to say for now, like there's some really interesting findings in there.
30:31 Oh, I'm looking at it now. This is super well done. I'm so glad you put this up as a notebook so it can
30:35 just like live. That's great.
30:37 The most surprising trend, it doesn't affect me very much, but it was surprising to me just how like stark
30:44 it was, the increase of Qt-based GUI applications compared to GTK-based GUI applications. When you
30:52 look at the graphs in there, remember all of these applications have had a commit, like 95% of them
30:57 have had a commit in 2019. It may look like, oh, this is an old application, but it's maintained,
31:03 it's used, it is awesome and deserves some respect there. But like people are starting Qt applications
31:09 these days, not GTK.
31:10 Yeah, yeah. I'm looking at that list. It's almost 50% Qt and then 2% Pygame, 3% Kivy,
31:15 17% wx and 30% GTK. Yeah, pretty interesting. There's a bunch of graphs like that. This is a,
31:21 I don't know how long this is, but the scroll bar is very small.
31:24 That's great.
31:26 Due credit there. Like I was on a huge time crunch coming off of a speaking tour in Africa.
31:31 Let's just say I gave a keynote in Tunisia, nothing too mysterious, but basically I was super time
31:36 crunched. And so that notebook right there is basically all my wife's work. She's in the commit
31:42 log and so forth, but credit where credit is due. I did not have time to do that stuff. And she really
31:47 saved my ass.
31:48 Yeah, she came through. That's awesome. Congrats to her. That's good work.
31:51 Yeah. Anyways, I could go on about the stats, but I think that probably at this point, people are,
31:56 people have got to be curious what other case studies are hiding on this list.
32:00 Yeah, they've got to be. So maybe pull out some highlights that stand out for you. And then I grabbed a
32:05 couple that maybe we can go more quickly through like more of them, just skip across to give people
32:09 a flavor of what we're talking about.
32:11 Absolutely. So one of the ones that I highlighted was Open edX. I know that Ned Batchelder is a big
32:18 deal in the Python community, and this is where he works, last I checked. But the
32:23 edX platform has something like 51,000 commits at the time of writing. And it's really interesting to me
32:30 because it is a monorepo. And it represents like a whole team's work. And so there are all these
32:36 dynamics that you can see happening there. And it's one of only three projects where no one developer
32:43 has more than 10% of the commit history.
32:45 Wow.
32:46 Yeah. It's the third largest Django project on the list with 300 committers. And it powers all of edX.org.
32:53 Yeah. And I think MIT's OpenCourseWare and a bunch of others as well.
32:57 Yeah. And just for contrast, 41% of applications are mostly written by one committer. How's that for bizarre?
33:05 It is bizarre. But, you know, thinking about it, it makes sense to me, right? It seems like
33:09 somebody just wants this app to exist, so they're going to go create it. But then you see the things like Reddit or
33:14 Open edX or some of these others, you're like, well, these are built by large groups of people.
33:18 I sort of want to find a way to highlight these organizationally supported, foundation supported,
33:23 corporate supported applications, because they definitely follow a different sort of path
33:28 compared to your average just sort of side project.
33:32 What do you think about using folks out there listening, maybe they're PhD computer science folks
33:37 or other types of researchers? Do you think that this list might put together some interesting set
33:44 of data for people to go look at how apps are built?
33:46 I think that if there's someone who's doing some sort of combination, computer science, ethnography,
33:51 like, you know, studies, social science studies, there's absolutely a ton to learn from here.
33:56 I mean, I didn't sort of pick and choose applications based on these findings. These findings were
34:01 like, you can't exactly call it randomly selected, especially not with the Python qualifier in there,
34:07 right? But it is still a very interesting cross section. And if somebody wants to do that sort
34:11 of thing, I'm all for it. Happy to support that.
34:13 Yeah, that'd be cool. All right. So tell us some of the other ones.
34:15 Yeah, absolutely. So another one that I highlighted was Sentry. And so Sentry is very interesting because
34:23 it's so big and often promoted on podcasts and conferences and so forth. A lot of people are surprised
34:31 that their entire code base is just sitting on GitHub. It's got like 26,000 commits since
34:36 2008. For those who don't know, it's a web service and front end for cross platform application monitoring.
34:42 So it's sort of like New Relic and stuff like that. A little bit different than Datadog, though,
34:46 because it focuses on error reporting. But we're looking at a million lines of Python. I think it's one of the
34:52 very few projects that actually crossed that million line threshold. Because, you know, a million lines of
34:56 Python, it's like 10 million lines, 100 million lines of less efficient languages.
35:01 Yeah, exactly. A million lines of Python. That's a lot of code.
35:04 That does include about 120,000 vendored lines. Just so you know that I went looking for that
35:09 possibility. Like I went looking for libraries that maybe got vendored and tweaked and just copied in.
35:14 Maybe define that for folks, because not everyone will know exactly what you're talking about.
35:17 Yeah, of course. So vendoring is what you do when, you know, a library does something you don't want,
35:21 like drop support for intercompatibility with something else. And you're like, okay, well,
35:26 I'm just going to create my own little mini fork of this inside of my repo.
35:30 Right. Like if I didn't want to depend upon requests, theoretically, I could just jam the
35:35 source code for requests into my app.
35:36 Yeah. Like you probably didn't pay for it, but it's called vendoring because the person who like,
35:42 you know, publishes the library is a vendor. It's kind of like that.
35:46 You've taken responsibility for it, right?
35:48 Right.
35:48 Okay. But still, that's still a lot of code. It's like 880,000 lines that's not
35:54 vendored.
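One rough way to separate vendored lines from a project's own code is to walk the repository and bucket files by directory name. This is only a sketch: the directory names treated as "vendored" below are assumptions, and nothing here reflects how the numbers quoted in this conversation were actually produced.

```python
from pathlib import Path


def count_python_lines(root, vendor_dir_names=("vendor", "_vendor", "vendored")):
    """Count lines in .py files under root, split into own and vendored code."""
    totals = {"own": 0, "vendored": 0}
    root = Path(root)
    for path in root.rglob("*.py"):
        # A file counts as vendored if any directory above it has a vendor-like name.
        parent_dirs = path.relative_to(root).parts[:-1]
        kind = "vendored" if any(d in vendor_dir_names for d in parent_dirs) else "own"
        with path.open(encoding="utf-8", errors="replace") as f:
            totals[kind] += sum(1 for _ in f)
    return totals
```

Calling `count_python_lines("path/to/checkout")` on a local clone returns both totals, so a figure like "a million lines, of which about 120,000 are vendored" can be checked by subtraction.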
35:54 Yeah. The largest Flask app I found in comparison is Pagure, P-A-G-U-R-E. And that's what's called a
36:01 forge. It's like a GitLab or a GitHub, but written in Python and it's written in Flask. So it's almost
36:07 like 10X smaller. So Sentry, its code is sitting all out there. And it's also BSD 3-Clause licensed, which is
36:13 very permissive for a for-profit application. So that's kind of interesting.
36:17 Yeah. That's super interesting.
36:18 The quirkiest thing I found though, the quirkiest like case study was this application called Ganeti.
36:23 And that's G-A-N-E-T-I. And it had like, I don't know, 100 or 200 like stars on GitHub or
36:31 something like that. And it was just like, what even is this thing? Because it was written in like
36:40 half Python, half Haskell. So like the very pure functional thing mixes with the very pragmatic
36:49 sort of systems thing. And it had 16,000 commits since 2007. So very mature. It's a cluster management
36:56 tool focused on long lived VMs used for workloads that don't have built in redundancy. So like web
37:01 servers, like, you know, web services, you can shut down the worker and start up a worker. The other
37:06 workers will take over. Right. But if you need like a job to not fail, and it's going to be very long
37:11 lived, right, you might use this. And so it was actually developed at Google. And that was also
37:17 very strange because Python and Haskell, especially Haskell, like these are not like super common
37:23 languages to find at Google for, you know, something that appears to be this important. And it's pretty
37:28 widely deployed. I found like a nice discussion of Wikimedia, like just talking, like, should we use OpenStack for
37:34 this? Or should we use Ganeti for this? And I think they ended up going with Ganeti. So it's still
37:39 pretty commonly used. That one was an oddity. Another one along those lines is a thing called
37:44 LocalStack. And that's a developer tool. It's useful because it sort of does like a mock service that
37:51 you can run locally of like AWS. And so if you're doing DevOps code, you can run a mini AWS locally and
37:57 like write tests against it. And that one is like, I think, a third Java. It's one of these blended
38:02 tech stories, right? Yeah, yeah. So like hybridizing with JavaScript and HTML and CSS, everyone expects
38:08 for a web service, right? But like to see things mixing with like Java and Haskell, these are some
38:12 pretty interesting case studies that have a whole story of their own, I'm sure.
38:17 This portion of Talk Python to Me is brought to you by Tidelift. Tidelift is the first managed open
38:22 source subscription, giving you commercial support and maintenance for the open source dependencies you
38:28 use to build your applications. And with Tidelift, you not only get more dependable software, but you pay
38:33 the maintainers of the exact packages you're using, which means your software will keep getting better.
38:38 The Tidelift subscription covers millions of open source projects across Python, JavaScript, Java, PHP,
38:44 Ruby, .NET, and more. And the subscription includes security updates, licensing verification
38:49 and indemnification, maintenance and code improvements, package selection and version
38:54 guidance, roadmap input, and tooling and cloud integration. The bottom line is you get the
38:59 capabilities you'd expect and require from commercial software. But now for all the key open source software
39:05 you depend upon. Just visit talkpython.fm/tidelift to get started today.
39:12 So which ones did you like?
39:13 Well, you know, I went through and I just wanted to pull like a couple that stood out to me on each
39:18 of the categories. So if we go to the awesome Python application list, there's a ton of categories.
39:24 You sort of put it into major categories and then the developer space just became a meta category,
39:30 right? So you've got internet and audio and graphics and productivity and education and science.
39:35 I pulled out the developer stuff compared to the non developer stuff because I think that the developer
39:40 stuff in general is a little bit easier to find. Like you're going to find blog posts and higher GitHub
39:45 stars for developer focused applications in many cases. It's not a super, super stark difference when
39:50 I ran the numbers, but definitely like, you know, most of the listeners have heard of Ansible,
39:55 you know?
39:56 Yeah, right, right. Or OpenStack or something like that. Yeah, exactly, exactly.
40:00 Yeah, cool. So there's a bunch of stuff under there. And I didn't, I didn't pull them out quite
40:03 like that. But maybe just to give people a sense of what's out there. Some of these are really small
40:07 and niche. And you can tell they're like for somebody, and others, they've been like the top 10 sites on
40:13 the internet.
40:13 Yeah, let's just go. I'll just go through the list. And we can just maybe give me some real quick
40:17 thoughts. So under the internet category, we have Deluge, which is a popular lightweight
40:23 cross platform BitTorrent client.
40:25 Deluge, which is how I pronounce it. I mean, I don't know how to pronounce it.
40:29 But Deluge, okay.
40:30 But well, I don't know. It doesn't come with a pronunciation guide last I checked. But basically,
40:34 it's a BitTorrent client. And it has a few very interesting things about it. One,
40:38 it's a GUI application, it uses Twisted. If you just look like for best BitTorrent client,
40:44 you'll find it making those rankings. And people aren't saying like, this one's great,
40:48 because it's written in Python. You know, it's just a good application, very solid,
40:52 like the UI, etc. And it happens to be written in Python. And so that was one of the ones that
40:58 I got off the top of my head when I made my list of 20 because I am a Deluge user proudly backing up
41:05 my Linux images. The Deluge thing also has a little like hidden gem, which is that they managed to have
41:11 a web UI that is almost identical to their GUI UI. But it just like sort of like allows for remote
41:19 administration, it's still classified as a desktop application, because it's mainly written for
41:23 single user use. But it does have a web UI. And I'm not sure how they did it. But like,
41:29 the experience is the same. So like in the browser as using, I guess, like I'm not sure if they're Qt or
41:36 GTK, I can check. Sure, sure.
41:37 But yeah, I'm a big fan of Deluge. I highly recommend checking out their repo.
41:41 Yeah, all these have links to the repo, a demo, their homepage, things like that. Another one is
41:46 Elixire, a featureful file host and link shortener with API. So kind of like Bitly, maybe.
41:53 This one's an interesting one. Like it seems to be used by like, a community I don't fully
41:57 understand. But like, it gets a lot of use. So yeah, this is a recent addition. I'll probably
42:03 explore it a little bit more later. But yeah, it is like sort of like an Imgur combined with
42:09 like a Bitly, which sort of makes sense. Like if you're, you know, pasting things into an IRC chat
42:14 or something like that, it's got everything you need.
42:16 Yeah, super. The next one people might have heard of it's called Reddit.
42:19 Yeah. Yeah, we don't need to say a lot more about that other than this is sort of a snapshot from
42:24 2017. But I think it's still super cool that this is here because here's a high scale application. I
42:29 don't know exactly where it ranked, but it was in the top 20 sites at some point for destinations,
42:35 right? Definitely. I'm not sure where it's at now either. But yeah, definitely a major destination,
42:40 especially I'm sure for many listeners, whether you like it or not, you're going to like probably
42:44 find some useful stuff through Google search or something.
42:46 Yeah. But one interesting thing about Reddit is that they have a very interesting approach to their
42:51 like schema design. If you dig in, they're using what's called an OAV pattern or an EAV pattern. It's
42:57 entity-attribute-value. And it's kind of like, it's sort of like object storage or document storage built
43:03 on top of a relational database. And so some might consider it an anti-pattern to use it to this
43:09 extent, because you're losing out on the constraints of a relational database. But it managed to work for
43:15 Reddit. So it can't be all bad. Anyways, theory meets reality, right?
43:21 Yeah, exactly. And some might say like, oh, it's very slow compared to, you know, an optimized schema
43:26 with better indexing and so forth. But one, like databases have done some degree of optimization for
43:33 this pattern. It has a Wikipedia page. It's a pretty well known pattern. But also Reddit is fast because of
43:39 caching. That's the like other main learning there. If you look in there, it has layers and layers of caches.
43:44 A very interesting, like, you know, exemplar to learn from.
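For readers who have not seen entity-attribute-value before, here is a minimal sketch of the idea using SQLite, with a small read-through cache in front of it. The table name `thing_data`, the helper functions, and the cache are all hypothetical and chosen for illustration; this is not Reddit's actual schema or caching code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One generic table holds every attribute of every "thing" as text.
conn.execute(
    "CREATE TABLE thing_data ("
    " thing_id INTEGER NOT NULL,"
    " attribute TEXT NOT NULL,"
    " value TEXT,"
    " PRIMARY KEY (thing_id, attribute))"
)

_cache = {}  # read-through cache: thing_id -> dict of attributes


def set_attr(thing_id, attribute, value):
    """Upsert one attribute and drop the cached copy of that thing."""
    conn.execute(
        "INSERT OR REPLACE INTO thing_data VALUES (?, ?, ?)",
        (thing_id, attribute, str(value)),
    )
    _cache.pop(thing_id, None)


def get_thing(thing_id):
    """Reassemble a thing from its rows, querying only on a cache miss."""
    if thing_id not in _cache:
        rows = conn.execute(
            "SELECT attribute, value FROM thing_data WHERE thing_id = ?",
            (thing_id,),
        ).fetchall()
        _cache[thing_id] = dict(rows)
    return _cache[thing_id]


set_attr(1, "title", "Hello world")
set_attr(1, "ups", 42)
print(get_thing(1))
```

Note that `ups` comes back as the string `"42"`: new attributes need no schema change, but the database no longer enforces types or constraints per attribute, which is the trade-off described above.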
43:47 Yeah, I would definitely think so. All right, next category is audio. And I grabbed two of those out
43:51 of there. One's called Exaile, something like this, which is a cross platform audio player and library
43:57 organizer tag editor type thing that looks pretty interesting.
44:00 I found this because I went on, everyone's familiar with Wikipedia. There's also this thing called
44:06 Wikidata. And Wikidata allows you to use a SPARQL query, which is sort of like a, I don't know how to
44:11 describe it exactly. But it's sort of like SQL, but extensible for like graph networks, kind of.
44:17 So I ran a query for all of the software that had Wikipedia pages that was known via like, you know,
44:23 Wikipedia or Wikidata to have been written in Python. And so I actually found it through there. I'm not
44:28 a user myself, but yeah, it's actively developed. It had a release come out just a few months ago. And
44:34 it's sort of like, kind of like Amarok, if anyone remembers that, but it's got like Last.fm support and
44:39 plugins and so forth. Seems pretty cool.
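A query along those lines might look like the sketch below. It is not the query that was actually run: the property and item identifiers (`P277` for "programmed in", `Q28865` for Python) are quoted from memory and worth verifying on Wikidata, and `build_query_url` is a hypothetical helper that only builds the request URL without sending anything.

```python
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

# Software programmed in Python that also has an English Wikipedia article.
QUERY = """
SELECT ?software ?softwareLabel ?article WHERE {
  ?software wdt:P277 wd:Q28865 .
  ?article schema:about ?software ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""


def build_query_url(query=QUERY, endpoint=WIKIDATA_ENDPOINT):
    """Return a GET URL asking the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


print(build_query_url())
```

The resulting URL can be fetched with any HTTP client, or the query text can be pasted into the Wikidata Query Service web interface.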
44:41 Nice. Another one, MusicBrainz Picard, which automatically identifies tags and organizes
44:47 musical albums, which sounds pretty cool.
44:48 It's a funny name, but just like amazing software. Let me just say, I don't use it as much as I used
44:54 to back when I used to like DJ and stuff, but basically the way that Picard works.
44:57 So there's a MusicBrainz Foundation that develops it. And they also have a sort of like, it's kind of
45:03 like a Wikipedia, but for album information, it's kind of like Discogs or something like that,
45:09 but open. And what it'll do is you can like take one of your CDs, make a backup, right? And that
45:15 comes through and it's just as track one, track two, track three, track four. But it'll take the
45:19 number of tracks combined with the length of each track to generate kind of a fingerprint,
45:25 match that against an internet database of like album listings and so forth, and then automatically
45:31 bring in all the fully filled out like ID3 tags. So everything shows up very searchable.
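The "number of tracks plus the length of each track" idea can be shown with a toy fingerprint. This is a deliberately simplified stand-in and not the real MusicBrainz disc ID algorithm, which is computed from the disc's table of contents; the function name, the catalog, and the album data are all invented for illustration.

```python
import hashlib


def toy_disc_fingerprint(track_lengths_seconds):
    """Hash the track count and per-track lengths into a short lookup key."""
    count = len(track_lengths_seconds)
    payload = f"{count}:" + ",".join(str(int(s)) for s in track_lengths_seconds)
    return hashlib.sha1(payload.encode("ascii")).hexdigest()[:16]


# Hypothetical "internet database" mapping fingerprints to album metadata.
catalog = {
    toy_disc_fingerprint([215, 187, 242, 301]): {
        "album": "Example Album",
        "tracks": ["Intro", "Second Song", "Third Song", "Closer"],
    }
}

# A freshly ripped disc with anonymous tracks of the same lengths finds its metadata.
ripped = [215, 187, 242, 301]
match = catalog.get(toy_disc_fingerprint(ripped))
print(match["album"] if match else "no match")
```

Two different albums rarely share the same track count and the same sequence of track lengths, which is why such a cheap fingerprint works well as a first lookup key.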
45:36 Oh, that's awesome. Yeah. With like album art and all that. Yeah.
45:39 Man, Michael, I used to spend, I mean, you're in the audio world now too. Like I'm sure you know,
45:43 I used to spend so many hours, like, you know, getting those tags just right and still have typos.
45:49 And then this sweeps through and I press one button, 20 seconds later, it's just all perfectly
45:56 organizable and tagged and so forth. It made me feel like a fool, but also made me feel pretty magical.
46:00 Yeah. Well, I'm sure you appreciated it more than if you hadn't done that stuff by hand.
46:04 And then one day I find out it's written in Python and I'm like, that's going on the list.
46:07 Yeah, it's definitely on the list. All right. Switching from music and audio to video,
46:11 we have two editors, Flowblade and OpenShot. Flowblade does multi-track, non-linear video
46:17 editing for Linux and OpenShot is a cross-platform video editor for like many of the platforms.
46:22 Yeah, that one like supports FreeBSD and Windows as well. So with these ones, like, I think what's
46:27 interesting is that a video editor is a very large application. It's going to have to have a lot of
46:32 codecs and other things. And those things are very touchy based on what platform they're running on.
46:37 They're incredibly finicky. It's super painful to develop that stuff. I've done it.
46:42 Exactly. Talk Python training. So very finicky stuff. And one sure shot way to like, you know,
46:50 ruin it too, is like to miss a dynamic library in your package. So being able to go in there and look
46:57 at how they do their packaging. Sure, it's a little bit like, you know, dirty, but I'm sure that you're
47:02 already deep in the dirty if you're looking for how to do this kind of application freezing. So it's
47:07 really useful to have those to refer to. Those are great examples. For the graphics world,
47:11 we have FreeCAD, a general purpose CAD editor. That sounds pretty intense. And it's
47:16 awesome that that's in Python. So with FreeCAD, part of me was a little bit like sort of skeptical,
47:21 right? Like, could an open source community come up with something like CAD? Because I mean,
47:26 when you're dealing with the BIM world, right, like, you know, it's like construction and design and so
47:32 forth. It's, it's just so much. And who could possibly have the side resources to do this? But you know,
47:39 they're doing it, they have a wiki, and they don't seem to be giving up. It's pretty impressive. I read
47:45 some sort of third party reviews. And it basically says, like, look, this may not be like top tier,
47:52 right? Like it may not be ready for like full professional shop usage and so forth. But it's
47:57 surprisingly usable. I'm not much of a CAD user myself. So I'll just have to take their word for it.
48:02 Right? If somebody if somebody listening is a CAD user, let me know, let me know if I should expand my
48:06 description here, write a review, support Python CAD. Speaking of supporting, some of these have
48:10 fund next to them. Exactly. So the sorts of links that I collect for each one, basically, I'll do
48:16 repo, home, Wikipedia, if they have it, GitHub, if they're not on GitHub, but they do have a GitHub
48:22 mirror, demo, if there's a version of it running that you can like just try out docs, which is like
48:28 usually a read the docs link or something similar. And I added fund, which is basically going to be like
48:33 a Patreon link or a PayPal link. If I can find a way to show that these people are making into this
48:41 current generation of like approach to open source, which is like, yes, please do pay me. So I can spend
48:47 more time on this, then I will absolutely link it here. Because they deserve all the help they can get.
48:53 Yeah, absolutely. I think it's great that you're highlighting that. The other one in graphics is
48:57 Quru Image Server. So I guess you give it like a large image and you can say I would like it at
49:02 right now I wanted a 250 by 250 or whatever. And it just, you don't have to keep redoing it. It just
49:09 automatically regenerates and probably caches that. Yeah, exactly. You upload an image and then it can
49:13 scale it to all different sizes. They say they have all sorts of tweaks and optimizations to do that sort
49:18 of stuff efficiently and performantly. And yeah, it has caching and stuff too. That's a pretty tough
49:24 task. You know, like that stuff can be pretty expensive. So I was like, maybe if someone's
49:28 interested in performance tweaks for images and so forth, this would be a good project to go see
49:32 how they achieve that. All right, let's go through the games real quick. There's a game section. I'll
49:36 just quickly go through them. You can just give me maybe some general thoughts. Sure. There's Frets on
49:40 Fire X, which I'm guessing is like one of those sort of music things where notes come down and you hit them. And
49:44 then you've got the streaming platform that's like a homemade Twitch, which is pretty cool. You've got Lucas
49:50 Chess and Unknown Horizons, which is a cool strategy game. The game section is one that I definitely
49:54 wish was larger. A game is again, a very finicky, large application and one that's tough to undertake.
50:00 Also, there aren't a ton of open source and free ones. Like I know that Python was used in
50:05 one or two of the Civilization games for scripting, but like that's not open source. So, but Unknown
50:10 Horizons is pretty similar to that sort of thing. It is an RTS. It makes extensive use of Python. So I'm
50:15 happy to see that there. Frets on Fire is a little bit older, but from what I can tell,
50:19 it still works for some people. So I guess like Dance Dance Revolution never gets old or maybe,
50:24 I don't know. Is it a Rock Band type thing? It says frets. I don't know. It's something like that.
50:28 Something like that. Yeah. EVE Online is another good one to throw in there,
50:31 but obviously it's not open source. So, you know, you can't really put it in there.
50:35 What else? There was Second Life, if you consider it a game. Maybe, maybe you consider it life.
50:41 That's right. Maybe you do. All right. Under productivity, the top one here that I grabbed
50:45 is actually something I use to manage all of my servers and my infrastructure. And I love it.
50:49 Just love it. Glances. Yeah. It looks really good. I guess I don't manage enough servers to really,
50:54 like, me, I'm just like an htop user, but it's like htop, but better. Yeah. You get all sorts of
50:59 cool stuff. Like if I just run Glances real quick, I guess I don't, there you go. It gives you graphs for
51:05 CPU, memory, swap, how much RAM you're using, like server load over one minute, five minutes,
51:12 15 minutes. You can sort by memory usage, by CPU usage. It shows you like your disk I/O rates,
51:18 your network rate, just like all sorts of stuff and great little hot keys for it. Yeah. It's,
51:23 it's like htop, but yeah, like all the stuff you want.
51:27 Yeah. I'll have to give it a shot.
51:28 It's pretty good. Let's see. We also have BleachBit for like privacy. So it cleans up some stuff out
51:33 of your system, but also I'm guessing it like rewrites every empty bit of space or something like
51:38 that. Just given the bleach part. I don't know. I used to work way back in the day, like in high
51:43 school for a brief time at like a computer repair shop in South Dakota, no less, but basically people,
51:49 they would come in with malware, spyware, all this stuff. And, you know, my computer's slow and I
51:55 have all these pop-ups. Yeah, exactly. I only installed nine toolbars on my IE 5.5 or whatever.
52:02 Like a 50 pixel bar in the middle where the actual content is.
52:06 Exactly. Oh man. You're putting too much of a, of an age on us here, Michael. Keep this content
52:13 evergreen. Anyways. All right. But basically what BleachBit is, I didn't realize it was written in
52:17 Python, but it is one of these cleaners that will go through and like, not just empty out your temp
52:22 directories and so forth, but also like clean up your registry because a lot of this software will make
52:26 excess registry writes. And I guess, especially back in the day that would slow down Windows.
52:31 So yeah, it removes like tracking things. Like increasingly that's become the focus, to like
52:37 sort of clean out your browsers of like, you know, any weird Flash cookies and that sort of thing.
52:42 Cool. Yeah. That's great. Yeah. Last one on the productivity space is Gmvault, Gmail backup.
52:46 I guess we've sort of foregone, I mean, I personally have foregone some of this stuff in my Gmail,
52:51 but probably I should back it up, you know, I never know when big G is going to like, you know,
52:55 do something uncouth. Yeah. You never know. It is kind of nice to have that. I actually went and found
53:00 out that if you go to the Google data export, you can say export my
53:09 docs. And one of the problems is, you can run Google Drive and stuff and it'll have like your docs from
53:15 Gmail in there, but they're just a hyperlink back to Google Docs. Right. But if you run the export,
53:19 you can say export it as Word documents, Excel spreadsheets and PowerPoint and actually get
53:26 the content of it, not just links to it back in Drive. It's pretty cool. Yeah. This sounds like
53:32 it's kind of in that realm. Under organization, there's a lot of archiving stuff and library
53:38 bits. There's a funny one as well. Archivematica, digital preservation, and then ArchiveBox,
53:44 which is like a self-hosted sort of Wayback Machine. Exactly. Archivematica, it's an interesting one
53:49 because it's sort of like, I think it's sort of targeted at like libraries and actual like archives,
53:56 whereas ArchiveBox is a little bit more like guerrilla, like Yahoo says they're going to delete
54:01 GeoCities. Okay. Let's like, you know, pull it all down. Like it's sort of those two schools,
54:06 which are definitely adjacent, but a little bit different. Yeah. Similar to that, I guess we have
54:09 Open Library, which is a web application for like a library catalog. So if for some reason you,
54:14 you have a small little library, you know, you don't have to like start from scratch.
54:18 And you'd be surprised like, yeah, libraries, I was at a library recently, like it only was open a few
54:22 hours a week, but they had a tremendous collection and they could definitely use something like this
54:27 because they didn't have anything to search through. You know, you just had to walk the stacks.
54:31 That's crazy. And the last one, the one that I said was funny, it's called I hate money.
54:34 There are some people who are very into sort of like personal finance management,
54:39 like I guess people who sort of picked up wanting to balance the checkbooks epigenetically,
54:45 somehow passed down to them. But no, there's like this whole movement of people who do plain text
54:50 accounting and Git version their, like, you know, their actual financial books.
54:55 I think ihatemoney sort of comes from that domain combined with the self-hosting domain.
55:01 Yeah. Interesting. They're like, we're not doing Mint. Forget Mint.
55:04 This one seems to be about like sort of shared budgeting and stuff too. So this is like Fava
55:09 is the one that I was thinking of that I added recently, but ihatemoney is like basically
55:12 like you have roommates and you just want to keep track of who bought what when,
55:16 and it's like a little bit better than like a Google Sheet.
55:18 Yeah. Yeah. That's pretty cool. All right. So I'll, I'll speed through a few more here real quick.
55:22 So communication, with Askbot, which is very similar to Stack Overflow.
55:25 Also quite interesting, and SecureDrop, which is like a whistleblower submission system
55:30 for media organizations. These are cool.
55:32 If you're into self-hosting or you have a team that you don't want to pay for Stack Overflow or
55:36 something like that, then you can like run your own Askbot. And SecureDrop is a really important
55:40 one. That one was originally written by, like, Aaron Swartz, and it's managed by the Freedom of the Press
55:44 Foundation. They just, I think, came out with a new release, and yeah, that's huge for journalism in our
55:50 time.
55:51 Yeah, absolutely. One that I think is probably going to be a really welcome one for
55:55 a lot of teachers and professors out there would be nbgrader. This is under the education section,
56:00 which is Jupyter notebook based. Basically, you create assignments in there and it will grade them
56:05 automatically for you. That sounds wonderful.
56:06 Not quite that level of teacher myself, but I think that it's a pretty cool idea. Like instead of having
56:11 just workbooks, you can actually give a live notebook, have them fill in some blanks, do some things,
56:17 and then like have them submit the .ipynb and grade that. Yeah.
56:22 Another one that I like to point out, I don't know if it's under education, but a lot of people
56:25 seem to know about it, even if they're not developers, but sort of like education related
56:29 is called Anki, A-N-K-I. And it's basically like a flashcard program. And I meet all these lawyers
56:35 and doctors. They're like, oh man, I would not have made it through school without Anki, right?
56:40 It's like as important to them as Wikipedia or something like that, because it's sort of like a
56:43 spaced repetition memorization tool written in Python.
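To make the "spaced repetition" idea mentioned above concrete, here is a minimal sketch of a box-based flashcard schedule, where cards answered correctly are shown less and less often. This is not Anki's actual scheduler: the `INTERVALS` table and the `review` and `next_due` helpers are hypothetical names invented for this illustration.

```python
from datetime import date, timedelta

# Hypothetical review intervals in days, one per box. Not Anki's real schedule.
INTERVALS = [1, 3, 7, 14, 30]


def review(box, correct):
    """Return the card's new box after a review.

    A correct answer moves the card up one box (capped at the last box);
    a wrong answer sends it back to the first box.
    """
    if correct:
        return min(box + 1, len(INTERVALS) - 1)
    return 0


def next_due(box, today):
    """Return the date on which a card in the given box should be shown again."""
    return today + timedelta(days=INTERVALS[box])


if __name__ == "__main__":
    today = date(2019, 9, 24)
    box = review(0, correct=True)   # a new card answered correctly
    print(box, next_due(box, today))
```

The design choice in this sketch is simply that the gap between reviews grows with each correct answer, so well-known cards take up less study time than new or forgotten ones.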
56:46 If you're going to do anatomy or something like that, right?
56:48 You got to memorize it all.
56:49 There's no rhyme or reason. It's just, that's part of the bone. It's called that. So we're going to
56:53 learn that. Yeah.
56:55 There's probably some reason, but yeah, it's a lot of memorization.
56:58 You can't Wikipedia things during a surgery, I imagine.
57:00 Just hold real still. I'm just, I'm researching. Yeah. So the last one, speaking of this kind of stuff
57:07 is science that I want to dig into. And then I think we're going to be probably out of time for
57:11 touching on them. But when I looked at the science area, I felt like there's about an equal
57:16 amount as some of the other categories, but this is like really polished, really serious stuff. So we
57:22 have ASCEND, which is mathematical chemical process modeling. We have CellProfiler, which is
57:28 interactive data exploration of biological image sets. We have SageMath, which is a competitor to
57:34 MATLAB and Mathematica. Like these are, especially SageMath, these are real things.
57:38 No, SageMath is kind of a triumph, right? Like if you're maybe a MATLAB user or something like that,
57:44 you should definitely check it out if you haven't already. But yeah, all of these science applications,
57:49 they get a lot of usage from their like academic counterparts, student counterparts and so forth.
57:53 But we don't often think to go find them as exemplars for like applications we might want to build.
58:00 But yeah, there's some really interesting ones in here. Like CKAN was one that, like,
58:04 jumped out at me because it is a data management system. So it's sort of like a data hub that you can
58:09 host yourself. And so if you are running an organization like a university or government,
58:15 you know, and you want to do anything with open data, you're going to need some sort of data hub,
58:18 some sort of portal for people to find that data and you need to manage your data on there.
58:22 And there's an open source one written in Python.
58:24 Yeah.
58:24 And there's another one that's called Orange. It's like kind of a component-based
58:29 data mining software for graphical interactive data analysis and visualization, but you can train
58:34 machine learning models graphically. And it sort of has like a signal processing type metaphor. So
58:40 like you have this thing, and then you drag like a little curve into that thing and make like a little
58:45 flowchart. And then once it's working, you can export a Python program from that. And I've been using
58:50 that since I think 2012, 2013. So it's like from somewhere in Eastern Europe. And I think the professor
58:57 who like leads the lab that writes it, like I think I have his book as well. It's itself like kind of a triumph
59:02 too. It also uses Qt 4 and Qt 5. So if you're undergoing a Qt transition with your application, it might be an
59:09 interesting one to look at too.
59:10 Oh, wow, that is a super interesting angle.
59:11 Yeah.
59:12 Yeah. Okay, great. So I think we're probably out of time for diving into any more. But there's so much more
59:16 to cover. So there's the CMS category, the ERP category.
59:19 Business software.
59:20 Yeah, oh my goodness. SaaS and all that. So static sites. And then there's the dev super category,
59:26 I'm calling it because there's 129 items in there and a bunch of subcategories like source control
59:31 and stuff. And people can just go through the list and find it. I think this is great.
59:35 Yeah, I could go on for hours. I really could. Each one's more exciting than the last.
59:39 These are so good. All right. I think we should sort of leave it there and people can go and they can
59:45 explore the cover. There's probably about 250 we haven't even touched on at least. And then people
59:50 also out there who are listening, maybe they want to, they maintain one of these projects or they use
59:55 one of these projects. They want to recommend it to you. What's the story there?
59:59 I have a GitHub issue template. Jump in there, like, you know, make sure it actually fulfills the criteria
01:00:04 of being like maintained and so forth. Again, I'm not super, super strict on that. But like,
01:00:10 if one particular category is like really overpopulated and it's like the seventh link
01:00:15 shortener or something like that, I got to be a little bit decisive there. If it's something that
01:00:20 is a pretty, like, undernourished category, right? Like, you know, more is better, the more the merrier. So if you
01:00:25 have like a game that you know of written in Python, it'll probably make its way in.
01:00:28 Yeah, super. Maybe someday the game category will break up into like, tower defense and strategy or whatever. Who knows?
01:00:35 I really want to do a taxonomy, like sort of refactoring because some of these categories are
01:00:39 bursting at the seams. I recently added the storage category because I found all of these like,
01:00:43 database related things written in Python. I don't know, there's just so much.
01:00:48 Yeah, super cool. All right. Now before you get out here, the last two questions real quick.
01:00:53 Sure.
01:00:53 For you. When you write some Python code, what editor are you using these days?
01:00:57 Yeah, I'm still getting it done with Emacs. But you know, I'm open to experimentation. And one of
01:01:02 these days, maybe one of these other editors will get its hooks in me.
01:01:04 Yeah, cool, cool. Well, I'll keep asking you each time you come on the show.
01:01:07 And then notable PyPI package. You've got some good ones out there.
01:01:11 The thing that still dominates for me is glom, right? So I have this Python package called glom.
01:01:17 You use it for deep getting into a dictionary, but also a variety of other things. It's sort of like
01:01:22 a data templating system. And more and more, it's becoming like a higher level programming
01:01:26 language almost. So right now we're building streaming support into it because a lot of
01:01:31 people have brought up like sort of the size of the data that they want to manipulate with glom.
01:01:36 And so we got to have some sort of streaming metaphor in there. But I'll also point out that
01:01:41 a lot of these awesome Python applications are distributed through PyPI in some way. So I have
01:01:46 PyPI URLs on a bunch of them too. So shout out to them.
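As a rough illustration of the "deep getting into a dictionary" idea described above, here is a minimal standard-library sketch. It is not glom's implementation or API: `deep_get` is a hypothetical helper written only for this example, and it assumes plain nested dictionaries addressed by a dotted path.

```python
def deep_get(target, path, default=None):
    """Hypothetical helper: follow a dotted path through nested dicts.

    Returns `default` as soon as any level is missing or is not a dict,
    instead of raising KeyError or TypeError partway down.
    """
    current = target
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


if __name__ == "__main__":
    data = {"a": {"b": {"c": "d"}}}
    print(deep_get(data, "a.b.c"))
    print(deep_get(data, "a.x.c", default="n/a"))
```

The point of the sketch is only the access pattern: one path string replaces a chain of nested lookups and their error handling. The package discussed in the episode covers much more than this, as the conversation notes.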
01:01:50 Yeah, yeah. Awesome. Awesome. All right. Well, final call to action. People want to get involved
01:01:54 in your project. What are the ways? I already touched on some of it for submitting stuff,
01:01:58 but what are some of the ways people can get started or get involved?
01:02:01 Yeah, definitely check out the repo on GitHub. Check out the listing in the readme. Check out that
01:02:06 Jupyter notebook that's in there. Look for the RSS link as well. If you're an RSS reader user like me,
01:02:12 there are a bunch of interesting Python RSS readers. If you're not, you can just download one on that page too.
01:02:17 And yeah, then submit some more applications and also keep your eye out for the PyBay talk and probably
01:02:24 the PyGotham talk. All right. Excellent. Well, this was so much fun to talk to you about all this stuff.
01:02:28 And, you know, nice work on having this angle because I don't feel like it was covered and you've done a
01:02:34 really good job. It seems like it's taken off. It's got almost 10,000 stars on GitHub.
01:02:39 Yeah. I mean, and it's not really about that, but I am happy to see that it sort of gets out there
01:02:44 more. Speaking of giving it more coverage, I did technically start a YouTube channel. You can like
01:02:48 and subscribe if you'd like. It's yak.party, Y-A-K dot party. Maybe I'll post some APA stuff to it
01:02:55 in the coming future once I'm done with these talks.
01:02:57 That sounds great. All right. Well, thanks for being here as always.
01:03:00 Absolutely. Anytime, Mike.
01:03:01 You bet. Bye.
01:03:02 This has been another episode of Talk Python to Me. Our guest on this episode was Mahmoud Hashemi,
01:03:08 and it's been brought to you by Linode and Tidelift. Linode is your go-to hosting for whatever you're
01:03:14 building with Python. Get four months free at talkpython.fm/linode. That's L-I-N-O-D-E.
01:03:20 If you run an open source project, Tidelift wants to help you get paid for keeping it going strong.
01:03:25 Just visit talkpython.fm/Tidelift, search for your package, and get started today.
01:03:32 Want to level up your Python? If you're just getting started, try my Python Jumpstart by
01:03:36 Building 10 Apps course. Or if you're looking for something more advanced, check out our new
01:03:41 async course that digs into all the different types of async programming you can do in Python.
01:03:46 And of course, if you're interested in more than one of these, be sure to check out our
01:03:50 Everything Bundle. It's like a subscription that never expires. Be sure to subscribe to the show.
01:03:55 Open your favorite podcatcher and search for Python. We should be right at the top.
01:03:59 You can also find the iTunes feed at /itunes, the Google Play feed at /play,
01:04:03 and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much
01:04:11 for listening. I really appreciate it. Now get out there and write some Python code.