#234: Awesome Python Applications Transcript

Recorded on Tuesday, Sep 24, 2019.

00:00 Have you heard of awesome lists? They're, well, pretty awesome, gathering up the most loved

00:05 libraries and packages for a given topic. While most lists cover awesome developer tools and

00:10 libraries, we don't have many examples of awesome applications, both for use and as examples to

00:16 draw from. That's why Mahmoud Hashemi decided to create awesome Python applications, and you're

00:21 about to dive headfirst into them. This is Talk Python to Me, episode 234, recorded September 24th,

00:27 2019.

00:28 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the

00:46 ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter,

00:51 where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm,

00:55 and follow the show on Twitter via @talkpython. This episode is brought to you by Linode and

01:01 Tidelift. Please check out what they're offering during their segments. It really helps support the

01:05 show. Mahmoud, welcome back to Talk Python to Me.

01:07 It's good to be back.

01:08 Man, it's great to be back. It wasn't that long ago that you were on Python Bytes,

01:11 and we still didn't get a chance to catch up because you were covering for me.

01:16 Yeah, but it was a great time. Yeah, happy to fill in any time. But really,

01:19 who could fill in your shoes?

01:20 Oh, man.

01:21 Thanks for that. You've been on the show before. I think the first time was quite early in the show's

01:27 history. You came to talk about a really interesting topic, Enterprise Python, right? Python being used

01:34 within the enterprise. We talked about a bunch of examples of that. And I feel like this is the

01:39 open source equivalent of that story just a little bit.

01:42 I was thinking about that today. Yeah, you're right. It is kind of similar,

01:46 different in a lot of ways. I think a lot more fun, but we'll get to that part. But yeah,

01:50 that was back in 2016. Time flies.

01:52 Yeah, time does fly. We've been at this stuff for a while.

01:54 I was at PayPal.

01:55 Yeah, that's right. And yeah, I mean, I guess that's probably a good time to just ask you,

01:59 what have you been up to these days? What's going on in your world?

02:02 I sort of had my fill of moving around different teams at PayPal, seeing how that whole business worked.

02:08 And it was cool to work in the enterprise. And I wanted to sort of like stay with that. So

02:12 I wanted to like kind of grow my own autonomy, as well as like just sort of see how startups worked.

02:19 I am in Silicon Valley, after all, you know, and PayPal is a startup in many ways, but it started

02:24 in like, what, 1998. So it's mostly started.

02:27 Yeah, it's mostly started at this point. You'd be surprised in some ways. But yeah.

02:31 Anyways, but basically, yeah, so I went to like a Series C company, did that for a while. It was

02:37 pretty cool. And then there's not too many teams, you can move around there. And like,

02:41 you know, so once you've learned it, you kind of learned it. And then I was like, okay, well,

02:45 let's come to a Series A company. And so for the past couple years, I've been at Simple Legal

02:50 here in Mountain View. And yeah, it's been really good. When I joined, the team was like four people

02:57 or rather, we had like four engineers. And then now I think we're at like a dozen,

03:01 got a small team. I'm like principal engineer, do a lot of like code review,

03:05 architecture review, but also a fair amount of coding. And yeah, just having a blast.

03:10 That's cool. That looks like a fun product or service to work on. Is there a lot of Python

03:14 happening?

03:15 Best part, right? Like it's, it's not like PayPal where you're jockeying, like, you know,

03:19 for different technologies and so forth. It's like what we say goes, right? And we pick the

03:23 best technology, Python, Postgres, everything we love. So we got that autonomy down.

03:29 But like, yeah, just to be clear, it's not like the most exciting company on the outside.

03:34 But I always tell people this, like, try to go for a boring company and then make it fun. Do it in

03:40 like, choose a boring company so that you can do things in a fun way. If you choose something

03:46 that's too exciting on the outside, then your day to day is just going to be a blur of boring stuff.

03:51 So yeah, we've done a lot of open source here. And I could go on about it for hours. But we got

03:57 other things to talk about.

03:58 Okay, some awesome stuff to talk about for sure. Yeah, I think there's a lot to that,

04:02 you know, people see sort of the company as the exciting thing, right? Like working for Tesla

04:08 or something like that. But, you know, if what you end up doing is writing C code to do a bunch of

04:14 internal boring stuff, well, then like, it doesn't matter how exciting the company is. That's not that

04:17 much fun.

04:18 No, exactly. Just bash scripts to shovel around logs. That's not exciting to me.

04:22 It doesn't matter if the logs are...

04:23 You know, I don't care how exciting the brand is.

04:25 Yes, exactly. Exactly. Cool. All right. Well, let's talk about some awesome things. Now,

04:32 there's been this history of awesome lists. Do you want to maybe just tell people about this,

04:37 this idea of awesome lists? I don't know how much of the history you know, but I know that you

04:41 must be involved because you've created one.

04:42 I'm actually not that great of an expert on these things because sort of when they started popping

04:48 up, I was already pretty well versed in, for instance, awesome Python. It's a huge repo,

04:53 like tons of contributors, been around a long time, got tons of content, and it's quite awesome in many

04:59 ways. But I rarely like refer to it or find much new there because I was sort of like, you know,

05:06 already along my Python path. But I'm sure it's a great resource for people out there. I refer

05:10 people to it all the time. So yeah, but this awesome list thing is a phenomenon. It's like a

05:16 meme on GitHub where it's like, I'm going to make a list of links, which are awesome for some definition

05:21 of awesome.

05:22 Yeah, for some definition of awesome. Exactly. And I don't even think it originated with Python. I feel

05:26 like there was some PHP stuff going on.

05:28 No, definitely not.

05:29 Yeah, awesome Python is really good. People should check that out. That's interesting. There's a bunch

05:34 of good libraries there. And there's also more focused ones. Like there's, I recently ran across

05:38 awesome ASGI for async server, like basically for async web frameworks and other AIO type of things,

05:47 asyncio things.

05:48 Yeah, it's just the default architecture for like linked content on GitHub, you know, if you want it

05:54 to be approachable. But yeah, there's this guy, like, Sindre Sorhus, and basically, yeah, I'm pretty sure

06:00 that he's like a Node guy. Yeah. Anyways, he like has this sort of awesome authority. And it's like the

06:06 meta list. It's the list of lists that point to all the other awesome lists.

06:10 Oh, how cool. I didn't know there was an awesome list of awesome lists. That's super meta.

06:14 Oh, well, that's why they all have this badge with the cool sunglasses on it. And you know,

06:19 anyways, that's got like 120,000 stars, which is like, you know, it's good. Like, like, frankly,

06:24 so much of how we learn as a community is driven through streams of content that has to be manually

06:31 republished out. I'm thankful for anything out there that sort of constitutes a reference,

06:36 you know, that like a sort of institution I can go and look at rather than refreshing a Twitter stream

06:41 and hoping that somebody like has said something of value to me. Awesome lists are, in the end, on like

06:48 net, pretty awesome.

06:49 I would say, yeah, they're aptly named. And I do like them. I feel like they're a little bit

06:53 vetted, right? It's not just a Yahoo of all Python things, right? It's not like PyPI or whatever.

06:59 It's these are the things people found to be extra good in these categories. I guess it's kind of like

07:04 Yahoo a little bit anyway, early, you know, 1996 Yahoo. So maybe tell us about your awesome list and like

07:10 how you came to come up with it and things like that.

07:13 Sure. So yeah, I didn't really set out to create an awesome list. I sort of backed my way into it.

07:20 So I started a couple of years back, like in I think 2017 or so, I started like giving conference

07:24 talks. I was already going to meetups and it was sort of the natural next step. Start giving talks.

07:29 I recommend everyone do the same. It's a good way to branch out.

07:32 Giving talks is a great step in like raising your profile and just breaking out of the mold of I'm

07:38 sort of an anonymous programmer, right?

07:40 Yeah. And just like writing blog posts, for instance, like, you know, once you actually sit down to

07:44 cover a topic, it exposes a lot of your own gaps. And so you end up filling in a lot of your

07:49 own knowledge. Yeah.

07:50 Just for personal benefit, I recommend it. So basically I was giving these talks and I was

07:54 covering topics like performance and packaging and testing and plugins and a bunch of architecture

08:00 stuff. And the thing is that like, like I just sort of alluded to, oftentimes when I'm starting out with

08:06 this talk, like I have some degree of like expertise in the area, but also I'm doing a lot of learning

08:11 myself. That's one of the reasons I like doing them. And so afterwards, people would have all these

08:15 questions, you know, I'd take it as a good sign. They thought the talk was good. They thought I know what I'm

08:18 talking about. That's great. But the fact is that like, when they're asking me about packaging

08:23 or plugins or, you know, performance, it's like, well, I only have my 10 years of experience to draw

08:29 from and I won't know every single answer, you know, to apply to their situation. I want to,

08:36 I desperately want to, but that's just not possible.

08:38 You can't always go through it, right? Like if you're packaging up an app and you know,

08:41 well, I got cx_Freeze to work for me. So now I can make it.

08:44 It's like, well, why are you going to go try all these other ones when you have one?

08:49 Exactly.

08:49 You've got a life to live and other apps to build, right? But other people can share their

08:53 experience and say, Oh, did you know about PyOxidizer? You're like, wait, what's that?

08:57 I certainly want that a hundred percent completion like stat, but it's just not going to happen. So

09:01 one thing I sort of wished I had was basically a way to refer them to a known working example,

09:09 just because it was available at the, at the conference at PyBay a couple of years back,

09:13 like the Zulip team was there and Zulip is like this chat application written in Python,

09:18 sort of like, it's sort of like a Slack, but kind of like blends in some email features. It's a pretty

09:24 cool design and they have a fully working application with a great community, a good onboarding process,

09:28 good docs, all this stuff. And so I'm like, look, this is a great exemplar. If you're writing a Django

09:34 server application and you're interested in say, introducing typing, they recently did that process and

09:40 you can go look at how they did it and you can get a lot better answer from looking at exemplars than

09:45 you can, you know, from me, just spur of the moment. Right. Or even just some kind of talk where people

09:50 just talk about the concept of type annotations or type hints, but here's actually the GitHub

09:55 conversation and the issues and the PRs. And then this is the before and after and the trade-offs. And

10:01 it's just like a white paper, real world example of that. Right. Yeah. And the living maintainers who

10:08 like could answer those questions too. There's so much to draw from there. And so I was like,

10:13 okay, I'm going to make a list of Python applications just for me so that I can easily

10:19 refer to them. Like just off the top of my head, I was able to get together around 20 that I considered

10:25 pretty awesome. Can you think of like, I mean, Python is the biggest language in the world right

10:29 now, basically. So like, can you think of a few?

10:32 You would think that I would be able to start naming them. The real tricky part is these have

10:37 to be the open source ones, not just things written in Python. So Wagtail, for example,

10:43 is a CMS in Django, which is pretty nice. Reddit, Reddit is one. Trying to not cheat and think about

10:51 what is on your list, but these types of things.

10:54 How about you, the listener? Can you think of a application that is written in Python?

11:00 Okay. Time's up. Well, it turns out that there are more than the 20 I could come up with.

11:07 And when I started looking, it was sort of hard to find them. But once I found them,

11:12 like I sort of found myself in this sort of addictive cycle, like just always looking for more

11:17 because there was so much creativity there and often sort of underappreciated. So something I became

11:23 aware of pretty early on when I was developing my own, like, you know, open source Python application,

11:26 there's this contest that's actually happening now. It's called Wiki Loves Monuments. And

11:31 Wiki Loves Monuments is like a photography contest around the world. It's like the Olympics of

11:36 photography for free culture. And they needed a tool that would allow them to judge entries. So

11:42 I made one of those and, you know, had a nice team. There was some stuff we had to figure out on our

11:48 own. But as I was like building this, I was finding some other applications and they

11:52 had like a couple dozen stars, you know, and it's because GitHub is a place for developers.

11:58 It's for people who want to build software, not necessarily use software. Like, I don't know if

12:03 you remember the old days with like Tucows and CNET and Download.com.

12:07 Tucows was awesome. You were like, I need a thing. I mean, it was sort of the free and open app store,

12:14 right?

12:15 Right. It was freeware. It was shareware. It wasn't really like free software in that,

12:19 you know, it's the beer or whatever instead of the, you know, speech. But yeah, anyway, so basically,

12:25 I sort of like got a little bit addicted to finding these awesome applications, because

12:30 a lot of them were sort of diamonds in the rough, not getting as much appreciation as they probably

12:34 deserved. I mean, their users loved them, but not necessarily GitHub users.

12:38 Right. Well, GitHub's not exactly. Yeah.

12:41 It doesn't have the right incentives or the right sort of connective structure there, because GitHub is

12:46 all about connect. I feel like it's all about connecting you with libraries, either code you work

12:51 on or libraries you want to put into your application, but not just end user things that you could study or

12:57 make copies of, right?

12:58 Exactly. And so I ended up with like this sort of five point rubric for the things that I was looking for.

13:04 Basically, like it had to be free software. It had to have an online source repository. It's not much of a

13:08 reference if you can't see the up to date code that is actually shipped. It had to be using Python for a

13:14 significant part of its functionality, not necessarily pure Python. Let's be real, like realistic,

13:19 pragmatic applications are going to have a mix, you know, JavaScript, HTML, CSS, going to have to have

13:25 them. So until browsers run Python anyways. So yeah, then it had to be well known, or at least

13:31 prominent in its identifiable niche. There's some cool applications out there for like neuroscience and

13:37 maybe neuroscience people love them, or something like that doesn't have to be world famous. But

13:44 basically, it has to be maintained. This is another like really important thing. A lot of old wiki pages,

13:50 they sort of get stale, no one's checking the links, and it just ends up being sort of a sad graveyard

13:54 after a few years. So these projects actually have to be maintained, the links have to be up,

13:59 they have to be functional on relevant platforms at the very least.

14:03 I've seen some of these types of applications you're talking about. And I'm like, found it on

14:06 GitHub, like, Oh, why is no one talking about this? This is great. And it says click here for a live

14:10 demo. And you click it. And it's 500. Crash. You're like, exactly. This cannot be getting that much love.

14:16 I'm setting out to sort of like make a list that doesn't do that. I'll get to that in a second. But

14:21 most importantly, and you alluded to it earlier, it's just that they have to be shipped applications,

14:26 not libraries or frameworks. So one of the first things I did was I went on the awesome Python list.

14:31 I'm like, surely there's some applications on this list I can use as exemplars. And I think I found like

14:35 three, right? Like Jupyter notebook. And it's not a huge surprise that Jupyter notebook's written in

14:40 Python. But awesome Python is all about the libraries. And I wanted a list of applications,

14:45 hence, awesome Python applications.

14:48 It's really good. And there was not really anything out there like this. And I think it's great.

14:53 When I was looking through it, you know, I expected a lot of web applications, and I'm sure we'll find

14:59 them and we'll talk about some of them. But there's also desktop GUIs, terminal, sort of ASCII type of

15:06 app, you know, ASCII, or curses-type applications, a lot of variety there.

15:11 There's so many ways to taxonomize it. And I could get into that if you'd like, because I've spent hours

15:15 mulling over it. But basically, we have the basic topic, like, you know, breakdown. But then we also got

15:21 sort of server versus like, if it's a server, like software, or if it's sort of like client or GUI

15:27 software. And then as far as the console is concerned, right, you have CLIs, which is just a

15:32 command line. And then you have a TUI, which is a text user interface, somewhat lesser known term.

15:38 And then you have sort of the interactive console style. Those are the three that work within the

15:43 terminal. And then you have like sort of some other interesting architectures out there too,

15:47 which I'm like, yeah, we'll cover it later. But the good news is that when I launched it on accident

15:54 on Brian Okken's Test & Code, is that I basically had found 180. Like I was turning over rocks. I was

16:02 like looking on Bitbucket, Launchpad, you know, this isn't just GitHub and Git, it's like Bazaar,

16:08 Mercurial, there's like Kallithea and Pagure, and like, which I hadn't really heard of, but I found

16:14 applications hiding on there. And of course, like all these GitLabs out there that people self host

16:19 too, because there are all these great like sort of sub communities, the Debian sub community,

16:23 the Fedora, Red Hat, GNOME, and they all have their own cool organizational support and community. And

16:29 it's really interesting to just not just find the application, but also that community. So yeah,

16:34 we got 180 when I accidentally launched it sort of at the beginning of this year, or maybe like a little

16:38 bit before, like Merry Christmas to the Python community, I guess. But yeah, so I published a

16:44 blog post once that sort of started picking up, and that's still up on the site. So if you go to my blog,

16:49 like sedimental.org, it's just up there. I'm sure if you search awesome Python applications, it'll come

16:54 up. For sure. And I'll link to it in the show notes. I haven't blogged since then, because pretty much all

16:58 of my like, free time for content creation has just been going toward curating this. And I've been

17:04 learning a ton. I need to find a better way to share it. I guess that's sort of like why I'm on

17:07 the show. But the point is that at the beginning of this month, I was at something like 250 ish. And

17:13 I guess I should say at the beginning of September 2019, I was at the top of around 250 applications.

17:20 And now I'm at like 312. I'm aiming to get to 350 by end of September 2019. I say 350, not because it's a

17:29 particular goal, but because I have like a list of candidates that I need to evaluate.

17:34 A lot are community, like, sort of submitted. It's really, like, very time consuming to find these.

17:39 So if people want to, you know, help me out, that'd be great.

17:42 Yeah, that's cool. So yeah, I think I'm going to do a PR for docassemble for you, which is a legal

17:48 interviewing software based, I think it's based on Django.

17:51 I just looked it up and it actually looks very impressive. It's exactly the sort of polished

17:55 application that I think needs some more open source love. Anyways, I mean, one might think besides

18:00 my personal addiction, like why, why do this? And I have sort of like three goals that I'm trying to

18:07 hit, or at least trying to fulfill. I'm not sure if they'll ever really be hittable, they're maybe not

18:13 like measurable enough. But I really want to like goal number one is I really want a better development

18:20 cycle. You know, like someone has an idea for an app, you know, they go basically do maybe django-admin

18:29 startproject, or they open up a blank editor, and they just start writing it and they start searching

18:35 on Stack Overflow almost immediately. You know, maybe they find a tutorial that gives them a to-do app or

18:39 something to just barely get off the ground. But beyond that, it's all first principles.

18:45 And I just really want people to sort of benefit from all of the other discovery that these like

18:52 application developers have created. The way I put it in my PyBay talk, like last month was that

18:57 Python is the biggest language in the world right now. But how do you personally benefit from that?

19:01 Yeah, you know, there's all this stuff out there. I mean, there's a bunch of great libraries we

19:05 benefit from. Absolutely. I do think one of the differentiators between a beginning programmer or a

19:12 beginner in an ecosystem. And somebody who's very experienced, some form of expert, I guess,

19:19 is that a lot of times the beginner will start coding and think everything has to be created from

19:25 scratch. Right? I need to, you know, load CSVs. So let me just like read the text of the file and start

19:33 splitting on, you know, commas or like weird stuff like that. Right? Whereas the more experienced person's

19:39 like, well, you know, pandas will read it, or there's also a csv module in the standard library, like,

19:44 right, just use that. And I feel like this, it kind of helps the beginners close that gap to say,

19:51 I'm looking like I want to build something like that. Let me see what they did, right? Let me see

19:55 how they structured their file system and organized their code. And are they even using Celery?

20:00 I heard I have to use Celery. Do I have to use Celery? I don't know. That's sort of the thing. If

20:03 you go on awesome Python right now, sure, you're going to find like hundreds of libraries. But

20:07 are you supposed to use them all? Like, what ingredients go into making sort of a complete

20:12 version of the app that you're sort of trying to build of that architecture? Yeah, I think that

20:17 basically, like my longer term goal here is like to have sort of a decision tree type interface.

20:23 So you can say like, well, look, I want to build a web application, you know, and then like,

20:27 you can sort of say, oh, I wanted to support this many users, or I wanted to basically use,

20:33 say, SQLAlchemy, or have a Docker image or something like that. And you can find your way to,

20:40 like an application that looks enough like the thing that you want, that you can just pull things

20:45 wholesale from it. And basically, not only get an application sooner, but also learn the best

20:52 practices without having to go through dozens of hours of conference talks and blog posts.
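
As a rough illustration of the CSV point made a few moments earlier (not something shown in the episode), the sketch below contrasts hand-splitting on commas with the standard library's csv module. The sample text and the function names naive_parse and csv_parse are made up for this example.

```python
import csv
import io

# Hypothetical sample data: the second field of the data row contains a comma
# inside quotes, which is exactly where hand-rolled splitting goes wrong.
SAMPLE = 'name,description\nZulip,"Chat, with email-style threading"\n'


def naive_parse(text):
    """Split lines and commas by hand, ignoring CSV quoting rules."""
    return [line.split(",") for line in text.splitlines()]


def csv_parse(text):
    """Let the standard library's csv module handle quoting."""
    return list(csv.reader(io.StringIO(text)))


naive_rows = naive_parse(SAMPLE)
proper_rows = csv_parse(SAMPLE)

# The naive version breaks the quoted field into two pieces,
# so the data row ends up with more columns than the header.
print(len(naive_rows[1]), "fields from naive splitting")
print(len(proper_rows[1]), "fields from csv.reader")
```

The csv version keeps the quoted description as a single field, which is the kind of detail a beginner tends to discover only after the hand-written parser fails on real data.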

21:21 Where your users are, there's a data center for you. Whether you want to run a Python web app,

21:26 host a private Git server, or just a file server, you'll get native SSDs on all the machines,

21:31 a newly upgraded 200 gigabit network, 24/7 friendly support, even on holidays,

21:36 and a seven day money back guarantee. Need a little help with your infrastructure?

21:40 They even offer professional services to help you with architecture, migrations and more.

21:44 Do you want a dedicated server for free for the next four months?

21:47 Just visit talkpython.fm/Linode.

21:51 Compare this to the Cookiecutter library of templates for us.

21:57 Yeah. I mean, that's actually, I hadn't really thought of that, but that's a very good point.

22:01 In a way, Cookiecutter is trying to codify these architectures as well. I guess not to put too

22:07 fine of a point on it, right? But I mean, I have looked through 300 of these. I mean,

22:10 I've looked through over a thousand applications and very few of them, like, still bear the marks of

22:15 having started as a Cookiecutter application. I think that Cookiecutter certainly has its place

22:20 for getting a kickstart, especially for people who maybe don't want to write their own setup.py or

22:26 something like that. But yeah, I'm not sure that it's going to be enough architecture to really get

22:31 you completely off the ground. Like there are emerging technologies and containerization around like

22:37 Flatpak and, say, AppImage and Snap and so forth, especially for these GUI applications on Linux,

22:45 just as an example. And maybe there's a Cookiecutter for something like that. I don't think it's been

22:51 done enough times for it to like show up on that radar. And so basically you'll have to look for the

22:58 existing community within the existing, I mean, it's very hard to call the community actually,

23:02 within the existing ecosystem, like who has actually adopted this. And so that's like,

23:09 you know, I think I'll get to that in a minute here, but I did clone every single repository,

23:13 clocked in at around 30 gigabytes for 250 repos. I sort of ran a bunch of analysis and technology

23:20 discovery within it to figure out who's doing what. That's cool. Yeah. Like, in fact, let's just like

23:24 jump right into it because I'm, I've got these numbers and I just got to sort of get them out

23:28 there. Sure. Let me throw one quick comment though, on the Cookiecutter bits while you're pulling up

23:32 the numbers. I feel like Cookiecutter is great to help people jumpstart. And the idea is like,

23:37 you kind of get this prepackaged app, which is great. Like here's a way to do Flask that already

23:41 talks to a database and a queue, or it already sends email or something. Right. That is a long ways away

23:47 from here's an actual application, right? Even when, as a developer, when you're building an

23:52 application, you think, Oh man, I'm almost done. I'm 80% done. You're like less than half done,

23:58 right? There's all these little edges you have to smooth off and these like edge cases that you don't

24:02 think about. And in these, these are real applications that shipped, which means they're way more polished.

24:09 And they're just, I don't even know that how comparable it is. They're both good for people

24:13 starting, but I feel like this is a much different set of things in the catalog.

24:18 And to be clear, they're polished for the end user. What I think is really interesting about them too,

24:23 is like all the rough edges they have. That can be really good for a programmer's ego to be like,

24:27 look, this worked. Okay. Like, I don't need to overthink this. Like, I'm just going to say,

24:32 this is good enough. If it's good enough for them, it's good enough for me. Ship it.

24:36 Yeah. It's easy to get hung up on like trying to make it perfect or trying to set up like infinite

24:41 scalability. So it's like Google and you're like, you have no users yet. Forget scalability.

24:45 Just get it out. Right? Exactly. And so I got my numbers up. Let's hear them. So yeah,

24:50 the quick methodology here is I just cloned every repo that's Mercurial, Bazaar, Git,

24:55 pulled them all. I have, you know, run some sloccount and like other analytics over the version

25:01 control history. We're looking at 19 million lines of code, 2 million commits with around 50,000

25:09 committers. That's an insane amount of effort. Like that is huge. And this is Python code.
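
The methodology described here (clone each repository, count lines, then look at the version control history) could be sketched roughly as follows. This is not the tooling used for the list: count_python_lines is a crude stand-in for a real line-counting tool, git_history_stats assumes the git command is installed and only covers Git checkouts, and both function names are hypothetical.

```python
import subprocess
from pathlib import Path


def count_python_lines(repo_dir):
    """Count non-blank lines in .py files under repo_dir.

    A crude stand-in for a dedicated SLOC tool: comments and
    docstrings are counted like any other non-blank line.
    """
    total = 0
    for path in Path(repo_dir).rglob("*.py"):
        if not path.is_file():
            continue
        with open(path, encoding="utf-8", errors="ignore") as source:
            total += sum(1 for line in source if line.strip())
    return total


def git_history_stats(repo_dir):
    """Return (commit_count, distinct_author_emails) for a Git checkout.

    Assumes `git` is on PATH. Mercurial and Bazaar repositories would
    need their own commands, which are not sketched here.
    """
    result = subprocess.run(
        ["git", "-C", str(repo_dir), "log", "--format=%ae"],
        capture_output=True,
        text=True,
        check=True,
    )
    emails = result.stdout.splitlines()  # one author email per commit
    return len(emails), len(set(emails))
```

Summing these per-repository numbers across every clone would give totals of the kind quoted above, though counting distinct email addresses only approximates the number of people, since one committer can use several addresses.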

25:14 I think I added it up and it was like hundreds of years of like maintainership or whatever you want

25:20 to call it, you know, like sometimes you go on maybe Steam or something like that. And it sort of

25:24 aggregates for you how many hours were played on the game. And you sort of feel like you feel a little

25:28 bit terrible about humanity perhaps, but we all got to have fun anyways. But this, like I had like sort

25:35 of the opposite feeling, right? Like this, I think I actually experienced the true meaning of an awesome

25:41 list. Like I was in awe at the amount of like, you know, stuff I was looking at. And I sort of like

25:48 broke it down. So real quick thing, like about a third of them are server software with 64% being

25:54 like sort of desktop or CLI and just sort of like a quick breakdown there. I think the scariest thing for

26:01 me was basically Python compatibility, because we hear a lot of like doom and gloom at times,

26:08 but like contrasted with like a lot of excitement about Python 3. And I was worried that like, you

26:14 know, these maintainers are stretched so thin, they have not had time to add Python 3 support. And here

26:19 we are with something existential facing the Python application community. And I was very pleasantly

26:27 surprised to find that two thirds of these applications already support Python 3, like they run on Python 3.

26:33 That's great. It was a huge relief. This isn't the same thing as a library supporting Python 3. This is

26:37 them having converted a code base that is often quite large to Python 3, many of them. They didn't start

26:43 this way, because I think the average or sorry, the median application on the list is something like 10

26:48 years old. And some of them much older than that, like Mailman, the thing that Python uses and many other

26:54 organizations use for managing their email lists, that's written in Python, and it's made the jump,

27:00 you know, it supports Python 3. And that's pretty impressive. And if you look at sort of the

27:06 compatibility over time, you do see the, like, Python 2 applications kind of tapering off. You haven't

27:11 had another Python, like, I think the most recent Python 2 application was started in something like 2017,

27:16 versus, you know, Python 3 applications, which are being started today. Right. There's probably a bunch more Python 3 apps starting now than there are Python 2,

27:25 I would expect.

27:26 You see a nice healthy trend there. Thankfully, I was super relieved.

27:29 I'm sure you'd either have to be crazy or depend on a library that is Python 2 only deeply to start

27:38 a Python 2 app these days.

27:39 And those are rarer and rarer. So it's a little bit harder to justify. There's still some interesting

27:43 case studies in here. So for instance, I think OilShell actually converted to Python 3 and then back to

27:49 Python 2. And he has a whole like, you know, blog post about why he did that, the lead maintainer

27:56 there. So that one's an interesting case study. And I think that's sort of what I hope to like,

28:01 one of the deliverables I hope comes out of the list is just like all the interesting case studies.

28:06 I think we'll get to those too.

28:07 Yeah, there could be a bunch of stuff coming out of there. I think, you know, another thing I think would be valuable, not that you don't already have enough going on,

28:14 but would be some kind of newsletter where just the stuff that gets added that month or something

28:19 comes in and just gets thrown out. That'd be great.

28:21 I'm too much of an engineer for that, Michael. That's why I've added an RSS feed.

28:26 You know, we just pull the data ourselves.

28:28 Yeah, exactly. Just pull the data yourself, right? If you want to do a newsletter and all that,

28:31 I fully support that. But yeah, I'm going to take a technical approach to this social problem.

28:37 Okay. It would be good to get there. It would be good to get there once the list sort of maybe

28:41 starts like plateauing a little bit more. And also once, yeah, once I get some more stuff in place to

28:46 keep the quality high, you know, because some of these projects might go like into an unmaintained

28:51 status and I'm not going to be the one vetting 350 projects every month, you know? So I need to

28:56 automate some things first.

28:58 How much automation could you do? Like, for example, every one of these has a link to their repo.

29:03 Could you look for unresponded to open issues or the lack of a commit for like one year puts it onto

29:12 like a warning list and then some contributor could come in and go, this looks suspicious.

29:17 That's almost exactly what I have planned. And I sort of have, I have a little command line

29:20 application I built for managing awesome lists. It's surprising how many awesome lists are just

29:25 manually maintained via PRs. But yeah, all my stuff is in a YAML file. I think I have something like

29:30 a thousand links in there. All of them are going to need to be checked for 404s and stuff like that.

29:34 But yeah, I do sort of have a informal bar of you need to have a commit in the last year.

29:39 There are exceptions to that. So for instance, you mentioned Reddit earlier. Reddit is a massively

29:45 important Python application for the internet. And they only have an archival version of the site up

29:51 from around 2017. But still, I mean, like, it was still a very big site in 2017 with a lot of

29:57 like real lessons. It is such an important site that I think it clearly deserves to be on there.
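
The staleness check discussed above (flag anything without a commit in the last year, with hand-picked exceptions such as the archived Reddit code base) could be sketched roughly as follows. This is a minimal illustration, not the actual tool: the project records, the dates, and the `EXEMPT` set are all hypothetical placeholders, and a real version would read the YAML file and query each repo.

```python
from datetime import date, timedelta

# Hypothetical project records with placeholder dates; the real list lives in a YAML file.
PROJECTS = [
    {"name": "mailman", "last_commit": date(2019, 9, 1)},
    {"name": "reddit", "last_commit": date(2017, 11, 1)},
    {"name": "example-abandoned", "last_commit": date(2018, 3, 15)},
]

# Assumed exception mechanism: projects kept on the list even though they are archived.
EXEMPT = {"reddit"}


def find_stale(projects, today, max_age_days=365, exempt=frozenset()):
    """Return names of non-exempt projects with no commit in the last max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        p["name"]
        for p in projects
        if p["name"] not in exempt and p["last_commit"] < cutoff
    ]


# Build the "warning list" a contributor could then review by hand.
warning_list = find_stale(PROJECTS, today=date(2019, 9, 24), exempt=EXEMPT)
print(warning_list)
```

Keeping the exemptions as explicit data rather than special cases in the code means a reviewer can see at a glance which projects are on the list for historical reasons.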

30:01 It actually surprised me that it was there. There are a couple other surprises in there too,

30:05 I think. I'll just sort of refer the audience maybe to the repo where we have a Jupyter notebook that has

30:11 all of the graphs in them. Like some of the like numbers I'm going to cite to you now are going to

30:16 be out of date in just a week or two when I give a PyGotham talk because I'm going to run it on 350

30:21 applications instead of 250. But we'll keep the Jupyter notebook in the repo up to date. But suffice

30:28 to say for now, like there's some really interesting findings in there.

30:31 Oh, I'm looking at it now. This is super well done. I'm so glad you put this up as a notebook so it can

30:35 just like live. That's great.

30:37 The most surprising trend, it doesn't affect me very much, but it was surprising to me just how like stark

30:44 it was the increase of Qt-based GUI applications compared to GTK-based GUI applications. When you

30:52 look at the graphs in there, remember all of these applications have had a commit, like 95% of them

30:57 have had a commit in 2019. It may look like, oh, this is an old application, but it's maintained,

31:03 it's used, it is awesome and deserves some respect there. But like people are starting Qt applications

31:09 these days, not GTK.

31:10 Yeah, yeah. I'm looking at that list. It's almost 50% Qt and then 2% Pygame, 3% Kivy,

31:15 17% wx and 30% GTK. Yeah, pretty interesting. There's a bunch of graphs like that. This is a,

31:21 I don't know how long this is, but the scroll bar is very small.

31:24 That's great.

31:26 I'll give credit there. Like I was on a huge time crunch coming off of a speaking tour in Africa.

31:31 Let's just say I gave a keynote in Tunisia, nothing too mysterious, but basically I was super time

31:36 crunched. And so that notebook right there is basically all my wife's work. She's in the commit

31:42 log and so forth, but credit where credit is due. I did not have time to do that stuff. And she really

31:47 saved my ass.

31:48 Yeah, she came through. That's awesome. Congrats to her. That's good work.

31:51 Yeah. Anyways, I could go on about the stats, but I think that probably at this point, people are,

31:56 people have got to be curious what other case studies are hiding on this list.

32:00 Yeah, they've got to be. So maybe pull out some highlights that stand out for you. And then I grabbed a

32:05 couple that maybe we can go more quickly through like more of them, just skip across to give people

32:09 a flavor of what we're talking about.

32:11 Absolutely. So one of the ones that I highlighted was Open edX. I know that Ned Batchelder is a big

32:18 deal in the Python community, and this is where he works, last I checked. But the

32:23 edX platform has something like 51,000 commits at the time of writing. And it's really interesting to me

32:30 because it is a mono repo. And it represents like a whole team's work. And so there are all these

32:36 dynamics that you can see happening there. And it's one of only three projects where no one developer

32:43 has more than 10% of the commit history.

32:45 Wow.

32:46 Yeah. It's the third largest Django project on the list with 300 committers. And it powers all of edX.org.

32:53 Yeah. And I think MIT's OpenCourseWare and a bunch of others as well.

32:57 Yeah. And just for contrast, 41% of applications are mostly written by one committer. How's that for bizarre?

33:05 It is bizarre. But, you know, thinking about it, it makes sense to me, right? It seems like

33:09 somebody just wants this app to exist, so they're going to go create it. But then you see the things like Reddit or

33:14 Open edX or some of these others, you're like, well, these are built by large groups of people.

33:18 I sort of want to find a way to highlight these organizationally supported, foundation supported,

33:23 corporate supported applications, because they definitely follow a different sort of path

33:28 compared to your average just sort of side project.

33:32 What do you think about using folks out there listening, maybe they're PhD computer science folks

33:37 or other types of researchers? Do you think that this list might put together some interesting set

33:44 of data for people to go look at how apps are built?

33:46 I think that if there's someone who's doing some sort of combination, computer science, ethnography,

33:51 like, you know, studies, social science studies, there's absolutely a ton to learn from here.

33:56 I mean, I didn't sort of pick and choose applications based on these findings. These findings were

34:01 like, you can't exactly call it randomly selected, especially not with the Python qualifier in there,

34:07 right? But it is still a very interesting cross section. And if somebody wants to do that sort

34:11 of thing, I'm all for it. Happy to support that.

34:13 Yeah, that'd be cool. All right. So tell us some of the other ones.

34:15 Yeah, absolutely. So another one that I highlighted was Sentry. And so Sentry is very interesting because

34:23 it's so big and often promoted on podcasts and conferences and so forth. A lot of people are surprised

34:31 that their entire code base is just sitting on GitHub. It's got like it's 26,000 commits since

34:36 2008. For those who don't know, it's a web service and front end for cross platform application monitoring.

34:42 So it's sort of like New Relic and stuff like that. A little bit different than Datadog, though,

34:46 because it focuses on error reporting. But we're looking at a million lines of Python. I think it's one of the

34:52 very few projects that actually crossed that million line threshold. Because, you know, a million lines of

34:56 Python, it's like 10 million lines, 100 million lines of less efficient languages.

35:01 Yeah, exactly. A million lines of Python. That's a lot of code.

35:04 That does include about 120,000 vendored lines. Just so you know that I went looking for that

35:09 possibility. Like I went looking for libraries that maybe got vendored and tweaked and just copied in.

35:14 Maybe define that for folks, because not everyone will know exactly what you're talking about.

35:17 Yeah, of course. So vendoring is what you do when, you know, a library does something you don't want,

35:21 like drop support for intercompatibility with something else. And you're like, okay, well,

35:26 I'm just going to create my own little mini fork of this inside of my repo.

35:30 Right. Like if I didn't want to depend upon requests, theoretically, I could just jam the

35:35 source code for requests into my app.

35:36 Yeah. Like you probably didn't pay for it, but it's called vendoring because the person who like,

35:42 you know, publishes the library is a vendor. It's kind of like that.

35:46 You've taken responsibility for it, right?

35:48 Right.

35:48 Okay. But still, that's still a lot of code that is like, it's 880,000 lines. That's not

35:54 vendored.

35:54 Yeah. The largest Flask app I found in comparison is Pagure, P-A-G-U-R-E. And that's what's called a

36:01 forge. It's like a GitLab or a GitHub, but written in Python and it's written in Flask. So it's almost

36:07 like 10X smaller. So Sentry, its code is sitting all out there. And it's also BSD3 license, which is

36:13 very permissive for a for-profit application. So that's kind of interesting.

36:17 Yeah. That's super interesting.

36:18 The quirkiest thing I found though, the quirkiest like case study was this application called Ganeti.

36:23 And that's G-A-N-E-T-I. And it had like, I don't know, 100 or 200 like stars on GitHub or

36:31 something like that. And it was just like, what even is this thing? Because it was written in like

36:40 half Python, half Haskell. So like the very pure functional thing mixes with the very pragmatic

36:49 sort of systems thing. And it had 16,000 commits since 2007. So very mature. It's a cluster management

36:56 tool focused on long lived VMs used for workloads that don't have built in redundancy. So like web

37:01 servers, like, you know, web services, you can shut down the worker and start up a worker. The other

37:06 workers will take over. Right. But if you need like a job to not fail, and it's going to be very long

37:11 lived, right, you might use this. And so it was actually developed at Google. And that was also

37:17 very strange because Python and Haskell, especially Haskell, like these are not like super common

37:23 languages to find at Google for, you know, something that appears to be this important. And it's pretty

37:28 widely deployed. I found like a nice discussion at Wikimedia, like just talking, like, should we use OpenStack for

37:34 this? Or should we use Ganeti for this? And I think they ended up going with Ganeti. So it's still

37:39 pretty commonly used. That one was an oddity. Another one along those lines is a thing called

37:44 LocalStack. And that's a developer tool. It's useful because it sort of does like a mock service that

37:51 you can run locally of like AWS. And so if you're doing DevOps code, you can run a mini AWS locally and

37:57 like write tests against it. And that one is like, I think, a third Java. It's one of these blended

38:02 dev stories, right? Yeah, yeah. So like hybridizing with JavaScript and HTML and CSS, everyone expects

38:08 for a web service, right? But like to see things mixing with like Java and Haskell, these are some

38:12 pretty interesting case studies that have a whole story of their own, I'm sure.

38:17 This portion of Talk Python to Me is brought to you by Tidelift. Tidelift is the first managed open

38:22 source subscription, giving you commercial support and maintenance for the open source dependencies you

38:28 use to build your applications. And with Tidelift, you not only get more dependable software, but you pay

38:33 the maintainers of the exact packages you're using, which means your software will keep getting better.

38:38 The Tidelift subscription covers millions of open source projects across Python, JavaScript, Java, PHP,

38:44 Ruby, .NET, and more. And the subscription includes security updates, licensing verification

38:49 and indemnification, maintenance and code improvements, package selection and version

38:54 guidance, roadmap input, and tooling and cloud integration. The bottom line is you get the

38:59 capabilities you'd expect and require from commercial software. But now for all the key open source software

39:05 you depend upon, just visit talkpython.fm/tidelift to get started today.

39:12 So which ones did you like?

39:13 Well, you know, I went through and I just wanted to pull like a couple that stood out to me on each

39:18 of the categories. So if we go to the awesome Python application list, there's a ton of categories.

39:24 You sort of put it into major categories and then the developer space just became a meta category,

39:30 right? So you've got internet and audio and graphics and productivity and education and science.

39:35 I pulled out the developer stuff compared to the non developer stuff because I think that the developer

39:40 stuff in general is a little bit easier to find. Like you're going to find blog posts and higher GitHub

39:45 stars for developer focused applications in many cases. It's not a super, super stark difference when

39:50 I ran the numbers, but definitely like, you know, most of the listeners have heard of Ansible,

39:55 you know?

39:56 Yeah, right, right. Or OpenStack or something like that. Yeah, exactly, exactly.

40:00 Yeah, cool. So there's a bunch of stuff under there. And I didn't, I didn't pull them out quite

40:03 like that. But maybe just to give people a sense of what's out there. Some of these are really small

40:07 and niche. And you can tell they're like for somebody and others, they've been like the top 10 sites on

40:13 the internet.

40:13 Yeah, let's just go. I'll just go through the list. And we can just maybe give me some real quick

40:17 thoughts. So under the internet category, we have Deluge, which is a popular lightweight

40:23 cross platform BitTorrent client.

40:25 Deluge, which is how I pronounce it. I mean, I don't know how to pronounce it.

40:29 But Deluge, okay.

40:30 But well, I don't know. It doesn't come with the pronunciation guide last I checked. But basically,

40:34 it's a BitTorrent client. And it has a few very interesting things about it. One,

40:38 it's a GUI application, it uses Twisted. If you just look like for best BitTorrent client,

40:44 you'll find it making those rankings. And people aren't saying like, this one's great,

40:48 because it's written in Python. You know, it's just a good application, very solid,

40:52 like the UI, etc. And it happens to be written in Python. And so that was one of the ones that

40:58 I got off the top of my head when I made my list of 20 because I am a Deluge user proudly backing up

41:05 my Linux images. The Deluge thing also has a little like hidden gem, which is that they managed to have

41:11 a web UI that is almost identical to their GUI UI. But it just like sort of like allows for remote

41:19 administration, it's still classified as a desktop application, because it's mainly written for

41:23 single user use. But it does have a web UI. And I'm not sure how they did it. But like,

41:29 the experience is the same. So like in the browser as using, I guess, like I'm not sure if they're Qt or

41:36 GTK, I can check. Sure, sure.

41:37 But yeah, I'm a big fan of Deluge. I highly recommend checking out their repo.

41:41 Yeah, all these have links to the repo, a demo, their homepage, things like that. Another one is

41:46 Elixire, a featureful file host and link shortener with API. So kind of like Bitly, maybe.

41:53 This one's an interesting one. Like it seems to be used by like, a community I don't fully

41:57 understand. But like, it gets a lot of use. So yeah, this is a recent addition. I'll probably

42:03 explore it a little bit more later. But yeah, it is like sort of like an Imgur combined with

42:09 like a Bitly, which sort of makes sense. Like if you're, you know, pasting things into an IRC chat

42:14 or something like that, it's got everything you need.

42:16 Yeah, super. The next one people might have heard of it's called Reddit.

42:19 Yeah. Yeah, we don't need to say a lot more about that other than this is sort of a snapshot from

42:24 2017. But I think it's still super cool that this is here because here's a high scale application. I

42:29 don't know exactly where it ranked, but it was in the top 20 sites at some point for destinations,

42:35 right? Definitely. I'm not sure where it's at now either. But yeah, definitely a major destination,

42:40 especially I'm sure for many listeners, whether you like it or not, you're going to like probably

42:44 find some useful stuff through Google search or something.

42:46 Yeah. But one interesting thing about Reddit is that they have a very interesting approach to their

42:51 like schema design. If you dig in, they're using what's called an OAV pattern or an EAV pattern. It's

42:57 an entity attribute value. And it's kind of like, it's sort of like object storage or document storage built

43:03 on top of a relational database. And so some might consider it an anti-pattern to use it to this

43:09 extent, because you're losing out on the constraints of a relational database. But it managed to work for

43:15 Reddit. So it can't be all bad. Anyways, theory meets reality, right?
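
To make the entity-attribute-value idea concrete, here is a minimal sketch using SQLite from the standard library. The table and function names (`thing_data`, `set_attr`, `load_thing`) are invented for illustration and are not Reddit's actual schema. Each attribute of an entity becomes its own row, which is what gives the document-store flexibility and also what gives up per-column types and constraints.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")

# One generic table: every attribute of every entity is a (thing_id, key, value) row.
conn.execute(
    "CREATE TABLE thing_data ("
    "  thing_id INTEGER NOT NULL,"
    "  key TEXT NOT NULL,"
    "  value TEXT,"
    "  PRIMARY KEY (thing_id, key)"
    ")"
)


def set_attr(thing_id, key, value):
    """Store one attribute; values are flattened to text, so type checks are lost."""
    conn.execute(
        "INSERT OR REPLACE INTO thing_data (thing_id, key, value) VALUES (?, ?, ?)",
        (thing_id, key, str(value)),
    )


def load_thing(thing_id):
    """Reassemble an entity as a dict from its attribute rows."""
    rows = conn.execute(
        "SELECT key, value FROM thing_data WHERE thing_id = ?", (thing_id,)
    )
    return dict(rows.fetchall())


# New attributes need no schema change, only new rows.
set_attr(1, "title", "Hello")
set_attr(1, "ups", 42)
print(load_thing(1))
```

Note that `ups` comes back as the string `"42"`: the database can no longer enforce that it is an integer, which is the trade-off described above.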

43:21 Yeah, exactly. And some might say like, oh, it's very slow compared to, you know, an optimized schema

43:26 with better indexing and so forth. But one, like databases have done some degree of optimization for

43:33 this pattern. It has a Wikipedia page. It's a pretty well known pattern. But also Reddit is fast because of

43:39 caching. That's the like other main learning there. If you look in there, it has layers and layers of caches.

43:44 A very interesting, like, you know, exemplar to learn from.

43:47 Yeah, I would definitely think so. All right, next category is audio. And I grabbed two of those out

43:51 of there. One's called Exaile, something like this, which is a cross platform audio player and library

43:57 organizer tag editor type thing that looks pretty interesting.

44:00 I found this because I went on, everyone's familiar with Wikipedia. There's also this thing called

44:06 Wikidata. And Wikidata allows you to use a SPARQL query, which is sort of like a, I don't know how to

44:11 describe it exactly. But it's sort of like SQL, but extensible for like graph networks, kind of.

44:17 So I ran a query for all of the software that had Wikipedia pages that was known via like, you know,

44:23 Wikipedia or Wikidata to have been written in Python. And so I actually found it through there. I'm not

44:28 a user myself, but yeah, it's actively developed. It had a release come out just a few months ago. And

44:34 it's sort of like, kind of like Amarok, if anyone remembers that, but it's got like Last.fm support and

44:39 plugins and so forth. Seems pretty cool.
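
A query along the lines described above might look like the sketch below. It is a guess at the shape of such a query, not the one actually run for the list: the identifiers `P277` ("programmed in") and `Q28865` (Python) are assumptions that should be verified on Wikidata before use, and `build_request_url` is a hypothetical helper. The code only builds the request URL for the public endpoint and does not send it.

```python
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

# Items programmed in Python that also have an English Wikipedia article.
# The property and item IDs are assumptions to double-check on Wikidata.
QUERY = """
SELECT ?software ?softwareLabel ?article WHERE {
  ?software wdt:P277 wd:Q28865 .
  ?article schema:about ?software ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def build_request_url(query, endpoint=WIKIDATA_ENDPOINT):
    """Return a GET URL asking the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


url = build_request_url(QUERY)
print(url[:60] + "...")
```

The triple patterns are what make it feel like "SQL for graphs": each line constrains an edge between nodes rather than a column in a table.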

44:41 Nice. Another one, MusicBrainz Picard, which automatically identifies, tags, and organizes

44:47 musical albums, which sounds pretty cool.

44:48 It's a funny name, but just like amazing software. Let me just say, I don't use it as much as I used

44:54 to back when I used to like DJ and stuff, but basically the way that Picard works.

44:57 So there's a MusicBrainz Foundation that develops it. And they also have a sort of like, it's kind of

45:03 like a Wikipedia, but for album information, it's kind of like Discogs or something like that,

45:09 but open. And what it'll do is you can like take one of your CDs, make a backup, right? And that

45:15 comes through and it's just as track one, track two, track three, track four. But it'll take the

45:19 number of tracks combined with the length of each track to generate kind of a fingerprint,

45:25 match that against an internet database of like album listings and so forth, and then automatically

45:31 bring in all the fully filled out like ID3 tags. So everything shows up very searchable.
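
The lookup described above, where the track count plus each track's length acts as a fingerprint for the album, can be illustrated with a toy version. This is a simplified sketch of the idea only and is not how Picard or MusicBrainz actually compute their identifiers (the real lookup works from the disc's table of contents and differs in detail). `ALBUM_DB`, `disc_fingerprint`, and `lookup` are hypothetical names, and the album data is made up.

```python
import hashlib


def disc_fingerprint(track_lengths_seconds):
    """Hash the track count and per-track lengths into a stable identifier."""
    parts = [str(len(track_lengths_seconds))]
    parts += [str(length) for length in track_lengths_seconds]
    return hashlib.sha1(",".join(parts).encode("ascii")).hexdigest()


# Stand-in for the online album database, keyed by fingerprint (made-up data).
ALBUM_DB = {
    disc_fingerprint([215, 187, 242]): {
        "album": "Example Album",
        "tracks": ["First Song", "Second Song", "Third Song"],
    }
}


def lookup(track_lengths_seconds):
    """Return album metadata for a ripped disc, or None if it is unknown."""
    return ALBUM_DB.get(disc_fingerprint(track_lengths_seconds))


# A rip that arrives as "track one, track two, track three" with these lengths
# matches the stored album, so the tags can be filled in automatically.
print(lookup([215, 187, 242]))
```

An exact hash like this breaks if a length is off by a second, so a real system needs a more tolerant scheme; the sketch only shows why track structure alone is often enough to identify a disc.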

45:36 Oh, that's awesome. Yeah. With like album art and all that. Yeah.

45:39 Man, Michael, I used to spend, I mean, you're in the audio world now too. Like I'm sure you know,

45:43 I used to spend so many hours, like, you know, getting those tags just right and still have typos.

45:49 And then this sweeps through and I press one button, 20 seconds later, it's just all perfectly

45:56 organizable and tagged and so forth. It made me feel like a fool, but also made me feel pretty magical.

46:00 Yeah. Well, I'm sure you appreciated it more than if you hadn't done that stuff by hand.

46:04 And then one day I find out it's written in Python and I'm like, that's going on the list.

46:07 Yeah, it's definitely on the list. All right. Switching from a music audio to video,

46:11 we have two editors, Flowblade and OpenShot. Flowblade does multi-track, non-linear video

46:17 editing for Linux and OpenShot is a cross-platform video editor for like many of the platforms.

46:22 Yeah, that one like supports FreeBSD and Windows as well. So with these ones, like, I think what's

46:27 interesting is that a video editor is a very large application. It's going to have to have a lot of

46:32 codecs and other things. And those things are very touchy based on what platform they're running on.

46:37 They're incredibly finicky. It's super painful to develop that stuff. I've done it.

46:42 Exactly. Talk Python training. So very finicky stuff. And one sure shot way to like, you know,

46:50 ruin it too, is like to miss a dynamic library in your package. So being able to go in there and look

46:57 at how they do their packaging. Sure, it's a little bit like, you know, dirty, but I'm sure that you're

47:02 already deep in the dirty if you're looking for how to do this kind of application freezing. So it's

47:07 really useful to have those to refer to. Those are great examples. For the graphics world,

47:11 we have FreeCAD for a general purpose CAD editor. That sounds pretty intense. And it's

47:16 awesome that that's in Python. So with FreeCAD, part of me was a little bit like sort of skeptical,

47:21 right? Like, could an open source community come up with something like CAD? Because I mean,

47:26 when you're dealing with the BIM world, right, like, you know, it's like construction and design and so

47:32 forth. It's, it's just so much. And who could possibly have the side resources to do this? But you know,

47:39 they're doing it, they have a wiki, and they don't seem to be giving up. It's pretty impressive. I read

47:45 some sort of third party reviews. And it basically says, like, look, this may not be like top tier,

47:52 right? Like it may not be ready for like full professional shop usage and so forth. But it's

47:57 surprisingly usable. I'm not much of a CAD user myself. So I'll just have to take their word for it.

48:02 Right? If somebody if somebody listening is a CAD user, let me know, let me know if I should expand my

48:06 description here, write a review, support Python CAD. Speaking of supporting, some of these have

48:10 fund next to them. Exactly. So the sorts of links that I collect for each ones, basically, I'll do

48:16 repo, home, Wikipedia, if they have it, GitHub, if they're not on GitHub, but they do have a GitHub

48:22 mirror, demo, if there's a version of it running that you can like just try out docs, which is like

48:28 usually a read the docs link or something similar. And I added fund, which is basically going to be like

48:33 a Patreon link or a PayPal link. If I can find a way to show that these people are making it into this

48:41 current generation of like approach to open source, which is like, yes, please do pay me. So I can spend

48:47 more time on this, then I will absolutely link it here. Because they deserve all the help they can get.

48:53 Yeah, absolutely. I think it's great that you're highlighting that. The other one in graphics is

48:57 Quru Image Server. So I guess you give it like a large image and you can say I would like it at

49:02 right now I wanted a 250 by 250 or whatever. And it just, you don't have to keep redoing it. It just

49:09 automatically regenerates and probably caches that. Yeah, exactly. You upload an image and then it can

49:13 scale it to all different sizes. They say they have all sorts of tweaks and optimizations to do that sort

49:18 of stuff efficiently and performantly. And yeah, it has caching and stuff too. That's a pretty tough

49:24 task. You know, like that stuff can be pretty expensive. So I was like, maybe if someone's

49:28 interested in performance tweaks for images and so forth, this would be a good project to go see

49:32 how they achieve that. All right, let's go through the games real quick. There's a game section. I'll

49:36 just quickly go through them. You can just give me maybe some general thoughts. Sure. There's Frets on

49:40 Fire X, which I'm guessing is like one of those sort of music things come down and you hit them and

49:44 then you've got the streaming platform is like a homemade Twitch, which is pretty cool. You got Lucas

49:50 Chess and Unknown Horizons, which is a cool strategy game. The game section is one that I definitely

49:54 wish was larger. A game is again, a very finicky, large application and one that's tough to undertake.

50:00 Also, there aren't a ton of open source and free ones. Like I know that Python was used in

50:05 one or two of the Civilization games for scripting, but like that's not open source. So, but Unknown

50:10 Horizons is pretty similar to that sort of thing. It is an RTS. It makes extensive use of Python. So I'm

50:15 happy to see that there. Frets on Fire is a little bit older, but from what I can tell,

50:19 it still works for some people. So I guess like Dance Dance Revolution never gets old or maybe,

50:24 I don't know. Is it a Rock Band type thing? It says frets. I don't know. It's something like that.

50:28 Something like that. Yeah. EVE Online is another good one to throw in there,

50:31 but obviously it's not open source. So, you know, you can't, can't really do that. Put it in there.

50:35 What else was Second Life? If you consider it a game, maybe, maybe you consider it life.

50:41 That's right. Maybe you do. All right. Under productivity, the top one here that I grabbed

50:45 is actually something I use to manage all of my servers and my infrastructure. And I love it.

50:49 Just love it. Glances. Yeah. It looks really good. I guess I don't manage enough servers to really

50:54 like me. I'm just like a, an htop user, but it's like htop, but better. Yeah. You get all sorts of

50:59 cool stuff. Like if I just run Glances real quick, I guess I don't, there you go. It gives you graphs for

51:05 CPU, memory, swap, how much RAM you're using, like server load over one minute, five minute,

51:12 15 minutes. You can sort by memory usage, by CPU usage. It shows you like your disk I/O rates,

51:18 your network rate, all just like all sorts of stuff and great little hot keys for it. Yeah. It's,

51:23 it's like htop, but yeah, like all the stuff you want.

51:27 Yeah. I'll have to give it a shot.

51:28 It's pretty good. Let's see. We also have BleachBit for like privacy. So it cleans up some stuff out

51:33 of your system, but also I'm guessing it like rewrites every empty bit of space or something like

51:38 that. Just given the bleach part. I don't know. I used to work way back in the day, like in high

51:43 school for a brief time at like a computer repair shop in South Dakota, no less, but basically people,

51:49 they would come in with malware, spyware, all this stuff. And, you know, my computer's slow and I

51:55 have all these pop-ups. Yeah, exactly. I only installed nine toolbars on my IE 5.5 or whatever.

52:02 Like a 50 pixel bar in the middle where the actual content is.

52:06 Exactly. Oh man. You're putting too much of a, of an age on us here, Michael. Keep this content

52:13 evergreen. Anyways. All right. But basically what BleachBit is, I didn't realize it was written in

52:17 Python, but it is one of these cleaners that will go through and like, not just empty out your temp

52:22 directories and so forth, but also like clean up your registry because a lot of this software will make

52:26 excess registry writes. And I guess, especially back in the day that would slow down Windows.

52:31 So yeah, it removes like tracking things like increasingly that's become the focus to like

52:37 sort of clean out your browsers of like, you know, any weird Flash cookies and that sort of thing.

52:42 Cool. Yeah. That's great. Yeah. Last one on the productivity space is Gmvault, Gmail backup.

52:46 I guess we've sort of foregone, I mean, I personally have foregone some of this stuff in my Gmail,

52:51 but probably I should back it up, you know, I never know when big G is going to like, you know,

52:55 do something uncouth. Yeah. You never know. It is kind of nice to have that. I actually went and found

53:00 out that if you go to the Google data export, you can say export my

53:09 docs. And one of the problems, you can run Google Drive and stuff and it'll have like your docs from

53:15 Gmail in there, but they're just a hyperlink back to Google Docs. Right. But if you run the export,

53:19 you can say export it as word documents, Excel spreadsheets and a PowerPoint and actually get

53:26 the content of it, not just links to it back in drive. It's pretty cool. Yeah. This sounds like

53:32 this kind of sort of in that realm organization, a lot of archiving stuff in this world and library

53:38 bits. There's a funny one as well. Archivematica, digital preservation, and then ArchiveBox,

53:44 which is like a self-hosted sort of Wayback Machine. Exactly. Archivematica. It's an interesting one

53:49 because it's sort of like, I think it's sort of targeted at like libraries and actual like archives,

53:56 whereas ArchiveBox is a little bit more like guerrilla, like Yahoo says they're going to delete

54:01 GeoCities. Okay. Let's like, you know, pull it all down. Like it's sort of those two schools,

54:06 which are definitely adjacent, but a little bit different. Yeah. Similar to that, I guess we have

54:09 Open Library, which is a web application for like a library catalog. So if for some reason you,

54:14 you have a small little library, you know, you don't have to like start from scratch.

54:18 And you'd be surprised like, yeah, libraries, I was at a library recently, like it only was open a few

54:22 hours a week, but they had a tremendous collection and they could definitely use something like this

54:27 because they didn't have anything to search through. You know, you just had to walk the stacks.

54:31 That's crazy. And the last one, the one that I said was funny, it's called I Hate Money.

54:34 There are some people who are very into sort of like personal finance management,

54:39 like I guess people who like sort of picked up wanting to balance the checkbooks, epigenetically

54:45 somehow passed down to them. But no, there's like this whole movement of people who do plain text

54:50 accounting and Git-version their, you know, their actual financial books.

54:55 I think I Hate Money sort of comes from that domain combined with the self-hosting domain.

55:01 Yeah. Interesting. They're like, we're not doing Mint. Forget Mint.

55:04 This one seems to be about like sort of shared budgeting and stuff too. So this is like Fava

55:09 is the one that I was thinking of that I added recently, but I Hate Money is like basically

55:12 like you have roommates and you just want to keep track of who bought what when,

55:16 and it's like a little bit better than like a Google Sheet.

55:18 Yeah. Yeah. That's pretty cool. All right. So I'll, I'll speed through a few more here real quick.

55:22 So communication, with Askbot, which is very similar to Stack Overflow.

55:25 Also quite interesting. And SecureDrop, which is like a whistleblower submission system

55:30 for media organizations. These are cool.

55:32 If you're into self-hosting or you have a team that you don't want to pay for Stack Overflow or

55:36 something like that, then you can like run your own Askbot. And SecureDrop is a really important

55:40 one. That one was originally written by like Aaron Swartz and it's managed by the Freedom of the Press

55:44 Foundation. They just, I think, came out with a new release and yeah, that's huge for journalism in our

55:50 time.

55:51 Yeah, absolutely. One that I think is probably going to be a really welcome one for

55:55 a lot of teachers and professors out there would be nbgrader. This is under the education section,

56:00 which is Jupyter notebook based: basically, create assignments in there and it will grade them

56:05 automatically for you. That sounds wonderful.

56:06 Not quite that level of teacher myself, but I think that it's pretty cool idea. Like instead of having

56:11 just workbooks, you can actually give a live notebook, have them fill in some blanks, do some things,

56:17 and then like have them submit the .ipynb and grade that. Yeah.

56:22 Another one that I like to point out, I don't know if it's under education, but a lot of people

56:25 seem to know about it, even if they're not developers, but sort of like education related

56:29 is called Anki, A-N-K-I. And it's basically like a flashcard program. And I meet all these lawyers

56:35 and doctors. They're like, oh man, I would not have made it through school without Anki, right?

56:40 It's like as important to them as Wikipedia or something like that, because it's sort of like a

56:43 spaced repetition memorization tool written in Python.

56:46 If you're going to do anatomy or something like that, right?

56:48 You got to memorize it all.

56:49 There's no rhyme or reason. It's just, that's part of the bone. It's called that. So we're going to

56:53 learn that. Yeah.

56:55 There's probably some reason, but yeah, it's a lot of memorization.

56:58 You can't Wikipedia a thing during a surgery, I imagine.

57:00 Just hold real still. I'm just, I'm researching. Yeah. So the last one, speaking of this kind of stuff

57:07 is science that I want to dig into. And then I think we're going to be probably out of time for

57:11 touching on them. But I felt when I looked at the science area, I felt like there's about equal

57:16 amount as some of the other categories, but this is like really polished, really serious stuff. So we

57:22 have ASCEND, which is mathematical chemical process modeling. We have CellProfiler, which is

57:28 interactive data exploration of biological image sets. We have SageMath, which is a competitor to

57:34 MATLAB and Mathematica. Like these are, especially SageMath, these are real things.

57:38 No, SageMath is kind of a triumph, right? Like if you're maybe a MATLAB user or something like that,

57:44 you should definitely check it out if you haven't already. But yeah, all of these science applications,

57:49 they get a lot of usage from their like academic counterparts, student counterparts and so forth.

57:53 But we don't often think to go find them as exemplars for like applications we might want to build.

58:00 But yeah, there's some really interesting ones in here. Like CKAN was one that

58:04 jumped out at me because it is a data management system. So it's sort of like a data hub that you can

58:09 host yourself. And so if you are running an organization like a university or government,

58:15 you know, and you want to do anything with open data, you're going to need some sort of data hub,

58:18 some sort of portal for people to find that data and you need to manage your data on there.

58:22 And there's an open source one written in Python.

58:24 Yeah.

58:24 And then there's another one that's called Orange. It's kind of like a component-based

58:29 data mining software for graphical interactive data analysis and visualization, but you can train

58:34 machine learning models graphically. And it sort of has like a signal processing type metaphor. So

58:40 like you have this thing, and then you drag like a little curve into that thing and make like a little

58:45 flowchart. And then once it's working, you can export a Python program from that. And I've been using

58:50 that since I think 2012, 2013. So it's like from somewhere in Eastern Europe. And I think the professor

58:57 who like leads the lab that writes it, like I think I have his book as well. It's itself kind of a triumph

59:02 too. It also uses Qt 4 and Qt 5. So if you're undergoing a Qt transition with your application, might be an

59:09 interesting one to look at too.

59:10 Oh, wow, that is a super interesting angle.

59:11 Yeah.

59:12 Yeah. Okay, great. So I think we're probably out of time for diving into any more. But there's so much more

59:16 to cover. So there's the CMS category, the ERP category.

59:19 Business software.

59:20 Yeah, oh my goodness. SaaS and all that. So static sites. And then there's the dev super category,

59:26 I'm calling it because there's 129 items in there and a bunch of subcategories like source control

59:31 and stuff. And people can just go through the list and find it. I think this is great.

59:35 Yeah, I could go on for hours. I really could. Each one's more exciting than the last.

59:39 These are so good. All right. I think we should sort of leave it there and people can go and they can

59:45 explore the cover. There's probably about 250 we haven't even touched on at least. And then people

59:50 also out there who are listening, maybe they want to, they maintain one of these projects or they use

59:55 one of these projects. They want to recommend it to you. What's the story there?

59:59 I have a GitHub issue template. Jump in there, like, you know, make sure it actually fulfills the criteria

01:00:04 of being like maintained and so forth. Again, I'm not super, super strict on that. But like,

01:00:10 if one particular category is like really overpopulated and it's like the seventh link

01:00:15 shortener or something like that, I got to be a little bit decisive there. If it's something that

01:00:20 is a pretty, like, undernourished category, right? Like, you know, more is better, the more the merrier. So if you

01:00:25 have like a game that you know of written in Python, it'll probably make its way in.

01:00:28 Yeah, super. Maybe someday the game category will break up into like, tower defense and strategy or whatever. Who knows?

01:00:35 I really want to do a taxonomy, like sort of refactoring because some of these categories are

01:00:39 bursting at the seams. I recently added the storage category because I found all of these like,

01:00:43 database related things written in Python. I don't know, there's just so much.

01:00:48 Yeah, super cool. All right. Now before you get out here, the last two questions real quick.

01:00:53 Sure.

01:00:53 For you. When you write some Python code, what editor are you using these days?

01:00:57 Yeah, I'm still getting it done with Emacs. But you know, I'm open to experimentation. And one of

01:01:02 these days, maybe one of these other editors will get its hooks in me.

01:01:04 Yeah, cool, cool. Well, I'll keep asking you each time you come on the show.

01:01:07 And then notable PyPI package. You've got some good ones out there.

01:01:11 The thing that still dominates for me is Glom, right? So I have this Python package called Glom.

01:01:17 You use it for deep getting into a dictionary, but also a variety of other things. It's sort of like

01:01:22 a data templating system. And more and more, it's becoming like a higher level programming

01:01:26 language almost. So right now we're building streaming support into it because a lot of

01:01:31 people have brought up like sort of the size of the data that they want to manipulate with Glom.

01:01:36 And so we got to have some sort of streaming metaphor in there. But I'll also point out that

01:01:41 a lot of these awesome Python applications are distributed through PyPI in some way. So I have

01:01:46 PyPI URLs on a bunch of them too. So shout out to them.
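
For readers who have not met the "deep getting into a dictionary" idea mentioned above, here is a minimal plain-Python sketch of it. This is not Glom's implementation or API: `deep_get` and the sample `record` are hypothetical names made up for illustration, and the sketch only covers dotted-path lookup through nested dicts, none of the templating or streaming features discussed in the episode.

```python
def deep_get(data, path, default=None):
    """Follow a dotted path such as "a.b.c" through nested dicts.

    Hypothetical helper for illustration only; returns `default`
    as soon as any step of the path is missing.
    """
    current = data
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return default
    return current


# Made-up sample data, shaped like a nested API response.
record = {"project": {"name": "example-app", "links": {"docs": "docs/index.html"}}}

name = deep_get(record, "project.name")
docs = deep_get(record, "project.links.docs")
missing = deep_get(record, "project.license.spdx", default="unknown")
```

With the real library installed, the comparable basic call is `glom(target, 'a.b.c')`, which also takes a target and a dotted path.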

01:01:50 Yeah, yeah. Awesome. Awesome. All right. Well, final call to action. People want to get involved

01:01:54 in your project. What are the ways? I already touched on some of it for submitting stuff,

01:01:58 but what are some of the ways people can get started or get involved?

01:02:01 Yeah, definitely check out the repo on GitHub. Check out the listing in the readme. Check out that

01:02:06 Jupyter notebook that's in there. Look for the RSS link as well. If you're an RSS reader user like me,

01:02:12 a bunch of interesting Python RSS readers. If you're not, you can just download one on that page too.

01:02:17 And yeah, then submit some more applications and also keep your eye out for the PyBay talk and probably

01:02:24 the PyGotham talk. All right. Excellent. Well, this was so much fun to talk to you about all this stuff.

01:02:28 And, you know, nice work on having this angle because I don't feel like it was covered and you've done a

01:02:34 really good job. It seems like it's taken off. It's got almost 10,000 stars on GitHub.

01:02:39 Yeah. I mean, and it's not really about that, but I am happy to see that it sort of gets out there

01:02:44 more. Speaking of giving it more coverage, I did technically start a YouTube channel. You can like

01:02:48 and subscribe if you'd like. It's yak.party, Y-A-K dot party. Maybe I'll post some APA stuff to it

01:02:55 in the coming future once I'm done with these talks.

01:02:57 That sounds great. All right. Well, thanks for being here as always.

01:03:00 Absolutely. Anytime, Mike.

01:03:01 You bet. Bye.

01:03:02 This has been another episode of Talk Python to Me. Our guest on this episode was Mahmoud Hashemi,

01:03:08 and it's been brought to you by Linode and Tidelift. Linode is your go-to hosting for whatever you're

01:03:14 building with Python. Get four months free at talkpython.fm/linode. That's L-I-N-O-D-E.

01:03:20 If you run an open source project, Tidelift wants to help you get paid for keeping it going strong.

01:03:25 Just visit talkpython.fm/Tidelift, search for your package, and get started today.

01:03:32 Want to level up your Python? If you're just getting started, try my Python Jumpstart by

01:03:36 Building 10 Apps course. Or if you're looking for something more advanced, check out our new

01:03:41 async course that digs into all the different types of async programming you can do in Python.

01:03:46 And of course, if you're interested in more than one of these, be sure to check out our

01:03:50 Everything Bundle. It's like a subscription that never expires. Be sure to subscribe to the show.

01:03:55 Open your favorite podcatcher and search for Python. We should be right at the top.

01:03:59 You can also find the iTunes feed at /itunes, the Google Play feed at /play,

01:04:03 and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much

01:04:11 for listening. I really appreciate it. Now get out there and write some Python code.
