#234: Awesome Python Applications Transcript

Recorded on Tuesday, Sep 24, 2019.

00:00 Michael Kennedy: Have you heard of awesome lists? Well, they're pretty awesome, gathering up the most loved libraries and packages for a given topic. While most lists cover awesome developer tools and libraries, we don't have many examples of awesome applications, both for use and as examples to draw from. That's why Mahmoud Hashemi decided to create Awesome Python Applications. And you're about to dive head first into them. This is Talk Python to Me, Episode 234, recorded September 24th, 2019. Welcome to Talk Python to Me, a weekly podcast on Python: the language, the libraries, the ecosystem and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I'm @MKennedy. Keep up with the show and listen to past episodes at talkpython.fm and follow the show on Twitter via @talkpython. This episode is brought to you by Linode and Tidelift. Please check out what they're offering during their segments. It really helps support the show. Mahmoud, welcome back to Talk Python to Me.

01:07 Mahmoud Hashemi: It's good to be back.

01:08 Michael Kennedy: It's great to have you back. It wasn't that long ago that you were on Python Bytes and we still didn't get a chance to catch up because you were covering for me.

01:16 Mahmoud Hashemi: Yeah but it was a great time. Yeah, happy to fill in anytime but really who could fill your shoes?

01:20 Michael Kennedy: Ah man.

01:21 Mahmoud Hashemi: You know it's...

01:22 Michael Kennedy: Thanks for that. You've been on the show before. I think the first time was quite early in the show's history. You came to talk about a really interesting topic, enterprise Python right. Python being used within the enterprise. We talked about a bunch of examples of that. And I feel like this is the open source equivalent of that story just a little bit.

01:42 Mahmoud Hashemi: I was thinking about that today. Yeah, you're right. It is kind of similar, different in a lot of ways. I think a lot more fun but we'll get to that part. But yeah that was back in 2016, time flies.

01:52 Michael Kennedy: Yeah, time does fly. We've been at this stuff for a while.

01:54 Mahmoud Hashemi: I was at PayPal.

01:55 Michael Kennedy: Yeah, that's right. Yeah, I mean I guess that's probably a good time to ask you what have you been up to these days? What's going on in your world?

02:02 Mahmoud Hashemi: I sort of like had my fill of moving around different teams with PayPal, seeing how that whole business worked. And it was cool to work in the enterprise. I wanted to sort of like stay with that so I wanted to kind of grow my own autonomy, as well as just sort of see how startups worked. I am in Silicon Valley after all, you know. And PayPal is a startup in many ways but it started in what, 1998. So...

02:26 Michael Kennedy: It's mostly started.

02:27 Mahmoud Hashemi: Yeah, it's mostly started at this point. You'd be surprised in some ways but yeah. Anyways but basically yeah. I went to a Series C company, did that for a while. It was pretty cool and then there's not too many teams you can move around there. So once you've learned it, you kind of learned it. And then I was like okay. Well let's come to a Series A company and so for the past couple of years I've been at SimpleLegal, here in Mountain View. And yeah, it's been really good. When I joined, the team was like four people. We had four engineers and then now I think we're at a dozen. Got a small team, I'm like Principal Engineer. Do a lot of code review, architecture review. But also a fair amount of coding and yeah. Just having a blast.

03:10 Michael Kennedy: It's cool, that looks like a fun product service to work on. Is there a lot of Python happening?

03:15 Mahmoud Hashemi: Best part, right. It's not like PayPal where you're jockeying for different technologies and so forth. It's like what we say goes, right. And we pick the best technology, Python, Postgres. Everything we love. So we got that autonomy down but like yeah. Just to be clear, it's not like the most exciting company on the outside. But I always tell people this. Try to go for a boring company and then make it fun. Do it in like... Choose a boring company so that you can do things in a fun way. If you choose something that's too exciting on the outside, then your day to day is just going to be a blur of boring stuff. So yeah, we've done a lot of open source here. And I could go on about it for hours but we've got other things to talk about.

03:58 Michael Kennedy: Alright some awesome stuff to talk about for sure.

04:00 Mahmoud Hashemi: Yep.

04:01 Michael Kennedy: I think there's a lot to that. People see sort of the company as the exciting thing, right. Like working for Tesla or something like that but if what you end up doing is writing C code to do a bunch of internal boring stuff. Well then it doesn't matter how exciting the company is. That's not that much fun.

04:18 Mahmoud Hashemi: No, exactly. Just batch scripts to shovel around logs. That's not exciting to me.

04:22 Michael Kennedy: It doesn't matter if the logs are...

04:23 Mahmoud Hashemi: I don't care how exciting the brand is.

04:25 Michael Kennedy: Yes exactly, exactly, cool. Alright well let's talk about some awesome things. Now, there's been this history of awesome lists. Do you want to maybe just tell people about this idea of awesome lists? I don't know how much of the history you know but I know that you must be involved because you created one.

04:43 Mahmoud Hashemi: I'm actually not that good of an expert on these things because sort of when they started popping up, I was already pretty well versed in, for instance, Awesome Python. It's a huge repo, tons of contributors. Been around a long time. Got tons of content and it's quite awesome in many ways. But I rarely refer to it or find much new there because I was sort of already along my Python path. But I'm sure it's a great resource for people out there, I refer people to it all the time. So yeah, this awesome lists thing is a phenomenon. It's like a meme on GitHub where it's like I'm going to make a list of links which are awesome. For some definition of awesome.

05:22 Michael Kennedy: Yeah for some definition of awesome, exactly. I don't even think it originated with Python. I feel like there were some PHP stuff going on.

05:28 Mahmoud Hashemi: No, definitely not.

05:29 Michael Kennedy: Yeah, Awesome Python is really good. People should check that out, that's interesting. There's a bunch of good libraries there and there's also more focused ones. I recently ran across Awesome ASGI for async. Basically for async web frameworks and other AIO type of things, asyncio things.

05:48 Mahmoud Hashemi: Yeah, it's just a default architecture for linked contents on GitHub. You know if you want it to be approachable. But yeah, there's this guy like Sindre Sorhus and basically yeah. I'm pretty sure that he's a node guy. Anyways, he has this sort of awesome authority and it's like the meta list. It's a list of lists that points to all the other awesome lists.

06:10 Michael Kennedy: Oh, how cool. I didn't know there was an awesome list of awesome lists. That's super meta.

06:15 Mahmoud Hashemi: Well that's why they all have this badge with the cool sunglasses on it and you know. Anyways, that's got like 120,000 stars which is like, you know. It's good, frankly so much of how we learn as a community is driven through streams of content that has to be manually republished out. I'm thankful for anything out there that sort of constitutes a reference, you know. But an institution I can go and look at rather than refreshing the Twitter stream and hoping that somebody has said something of value to me. Awesome lists are in the end on net pretty awesome.

06:49 Michael Kennedy: I would say yeah, they're aptly named. I do like them, I feel like they're a little bit vetted. It's not just Yahoo of all Python things, right. It's not like PyPI or whatever. These are the things people found to be extra good in these categories. I guess it's kind of like Yahoo a little bit, anyway. 1996 Yahoo. So maybe tell us about your awesome list and how you came to come up with it and things like that.

07:13 Mahmoud Hashemi: Sure, so I didn't really set out to create an awesome list. I sort of backed my way into it. So I started a couple years back, I think in 2017 or so. I started giving conference talks. I was already going to meetups and it was sort of the natural next step, started giving talks. I recommend everyone do the same. It's a good way to branch out.

07:32 Michael Kennedy: Giving talks is a great step in raising your profile and just breaking out of the mold of I'm sort of an anonymous programmer right?

07:40 Mahmoud Hashemi: Yeah and just like writing blog posts for instance. Once you actually sit down to cover a topic. It exposes a lot of your own gaps and so you end up filling a lot of your own knowledge.

07:49 Michael Kennedy: Yeah.

07:50 Mahmoud Hashemi: Just for personal benefit I recommend it. So basically I was giving these talks and I was covering topics like performance and packaging and testing and plugins and a bunch of architecture stuff. And the thing is, like I just sort of alluded to, oftentimes when I'm starting out with this talk, I have some degree of expertise in the area but also I'm doing a lot of learning myself. That's one of the reasons I like doing them. And so afterwards people would have all these questions, you know. I'd take it as a good sign. They thought the talk was good. They thought I knew what I was talking about, that's great. But the fact is that when they're asking me about packaging or plugins or performance, it's like well, I only have my 10 years of experience to draw from. And I won't know every single answer to apply to their situation. I want to, I desperately want to but that's just not possible.

08:38 Michael Kennedy: You can always get through it right? If you're packaging up an app and you know well, I got cx_Freeze to work for me. So now I got...

08:44 Mahmoud Hashemi: Right, exactly.

08:45 Michael Kennedy: It's like well why are you going to go try all these other ones when you have one that works?

08:49 Mahmoud Hashemi: Exactly.

08:50 Michael Kennedy: You've got a life to live and other apps to build right. But other people can share their experience and say oh, did you know about Pyops dodging? Like wait, what's that?

08:57 Mahmoud Hashemi: I certainly want that 100% completion stat but it's just not going to happen. So one thing I sort of wished I had was basically a way to refer them to a known working example. Just because it was available at that conference at PyBay a couple years back. The Zulip team was there and Zulip is like this chat application written in Python. Sort of like a Slack but kind of blends in some email features, it's a pretty cool design. And they have a fully working application with a great community, a good onboarding process, good docs, all this stuff. And so I'm like look, this is a great exemplar if you're writing a Django server application and you're interested in say introducing typing. They recently did that process and you can go look at how they did it. And you can get a lot better answer from looking at an exemplar than from me, just spur of the moment.

09:48 Michael Kennedy: Right or even just some kind of talk where people just talk about the concept of type annotations or type hints. But here's actually the GitHub conversation and the issues and the PRs and this is the before and after and the trade-offs. It's just like a white paper or real world example of that, right.

10:06 Mahmoud Hashemi: Yeah and the living maintainers who could answer those questions too. There's so much to draw from there. And so I was like, okay. I'm going to make a list of Python applications just for me so that I can easily refer to them. Just off the top of my head I was able to get together around 20 that I considered pretty awesome. Can you think of. I mean Python is the biggest language in the world right now basically. So can you think of a few?

10:32 Michael Kennedy: You would think that I would be able to start naming them. The real tricky part is these have to be the open sourced ones. Not just things written in Python. So Wagtail for example is a CMS in Django which is pretty nice. Reddit, Reddit is one. Trying to not cheat and think about what is on your list. But these types of things.

10:54 Mahmoud Hashemi: How about you, the listener. Can you think of an application that is written in Python? Okay, time's up. Well, it turns out that there are more than the 20 I could come up with. And when I started looking it was sort of hard to find them but once I found them, I sort of found myself in this sort of addictive cycle, just always looking for more. Because there was so much creativity there and often sort of underappreciated. So something I became aware of pretty early on when I was developing my own open source Python application. There's this contest, it's actually happening now. It's called Wiki Loves Monuments and Wiki Loves Monuments is like a photography contest around the world. It's like the Olympics of photography for free culture. And they needed a tool that would allow them to judge entries, so I made one of those. You know, I had a nice team. There was some stuff we had to figure out on our own. But as I was building this I was finding some other applications and they had like a couple dozen stars. And it's because GitHub is a place for developers. It's for people who want to build software, not necessarily use software. Like I don't know if you remember the old days with Tucows and CNET and download.com.

12:06 Michael Kennedy: Tucows was awesome. It was sort of the free and open app store, right.

12:15 Mahmoud Hashemi: It was freeware, it was shareware. It wasn't really free software in that it's the beer or whatever, instead of the speech. But yeah anyways, so basically I sort of got a little bit addicted to finding these awesome applications. Because a lot of them were sort of diamonds in the rough. Not getting as much appreciation as they probably deserved. I mean their users loved them but not necessarily GitHub users.

12:38 Michael Kennedy: Right, well GitHub's not exactly. It doesn't have the right incentive for the right sort of connective structure there. Because GitHub is all about, I feel like it's all about connecting with libraries.

12:50 Mahmoud Hashemi: Absolutely.

12:51 Michael Kennedy: Either code you work on or libraries you want to put in your application. But not just end user things that you could study or make copies of right.

12:58 Mahmoud Hashemi: Exactly and so I ended up with this sort of five point rubric for the things that I was looking for. Basically it had to be free software. It had to have an online source repository. It's not much of a reference if you can't see the up to date code that is actually shipped. It had to be using Python for a significant part of its functionality. Not necessarily pure Python. I like to be real, like realistic pragmatic applications are going to have a mix. You know JavaScript, HTML, CSS. Going to have to have them.

13:25 Michael Kennedy: So, until browsers run Python

13:26 Mahmoud Hashemi: Anyways, so yeah. Then it had to be well known or at least prominent in its identifiable niche. There's some cool applications out there for neuroscience and maybe neuroscience people love them or something like that. Doesn't have to be world famous. But basically it has to be maintained. This is another really important thing. A lot of old Wiki pages, they sort of get stale. No one's checking the links and it just ends up being sort of a sad graveyard after a few years. So these projects actually have to be maintained, the links have to be up. They have to be functional on relevant platforms at the very least.

14:03 Michael Kennedy: I've seen some of these types of applications that you're talking about. I found it on GitHub link, oh. Why is no one talking about this? This is great and it says click here for a live demo and you click it and it's 500, crash.

14:13 Mahmoud Hashemi: Exactly.

14:14 Michael Kennedy: You're like, this cannot be getting that much love.

14:17 Mahmoud Hashemi: I'm setting out to sort of make a list that doesn't do that. I'll get to that in a second but most importantly, and you alluded to it earlier, is just that they have to be shipped applications, not libraries or frameworks. So one of the first things I did was I went on the Awesome Python list. I'm like surely there's some applications on this list I can use as exemplars and I think I found three, right. Like Jupyter Notebook, and it's not a huge surprise that Jupyter Notebook's written in Python. But often Python is all about the libraries and I want a list of applications. Hence, Awesome Python Applications.

14:48 Michael Kennedy: It's really good and there was not really anything out there like this and I think it's great. When I was looking through it, you know, I expected a lot of web applications and I'm sure we'll find them and we'll talk about some of them. But there's also desktop GUIs, terminal. Sort of ASCII type of app, ASCII curses type applications. A lot of variety there.

15:11 Mahmoud Hashemi: There's so many ways to taxonomize it and I could get into that if you'd like 'cause I spent hours mulling over it. But basically we have the basic topic breakdown. But then we've also got sort of server versus, if it's server-like software, or if it's sort of like client or GUI software. And then as far as the console is concerned, right, you have CLIs, which is just a command line. And then you have a TUI, text user interface, a somewhat lesser known term. And then you have sort of the interactive console style. Those are the three that work within the terminal. And then you have sort of some other interesting architectures out there too. Which I'm like eh, we'll cover it later. But the good news is that when I launched it by accident, on Brian Okken's Test & Code, I basically had found 180. I was turning over rocks. I was looking on Bitbucket, Launchpad. This isn't just GitHub and Git, it's like Bazaar, Mercurial. There's like Kallithea and Pagure, which I hadn't really heard of, but I found applications hiding on there. And of course all these GitLabs out there that people self host too because there are all these great sub communities. The Debian sub community, the Fedora, Red Hat, GNOME. And they all have their own cool organizational support and community and it's really interesting to not just find the application but also that community. So yeah, we had 180 when I accidentally launched it sort of at the beginning of this year or maybe like a little bit before. Like Merry Christmas to the Python community I guess.

16:43 Michael Kennedy: Yeah.

16:43 Mahmoud Hashemi: But yeah, so I published a blog post once that sort of started picking up, and that's still up on the site. So if you go to my blog, sedimental.org, it's just up there. I'm sure if you search Awesome Python Applications, it'll come up.

16:55 Michael Kennedy: For sure and I'll link to it in the show notes.

16:56 Mahmoud Hashemi: I haven't blogged since then because pretty much all of my free time for content creation has just been going towards curating this. And I've been learning a better way to share it. I guess that sort of like why am I on the show? But the point is that at the beginning of this month I was at something like 250ish. And I guess I should say at the beginning of September 2019 I was at the top of around 250 applications. And now I'm at like 312, I'm aiming to get to 350 by end of September 2019. I say 350 not because it's a particular goal but because I have a list of candidates that I need to evaluate.

17:34 Michael Kennedy: Sure.

17:34 Mahmoud Hashemi: A lot of them are sort of community submitted. It's really very time consuming to find these. So, if people want to help me out that'd be great.

17:43 Michael Kennedy: Yeah, there you go. That's cool, so yeah. I think I'm going to do a PR for Docassemble for you, which is legal interviewing software based, I think it's based on Django.

17:51 Mahmoud Hashemi: I just looked it up and it actually looks very impressive. It's exactly the sort of polished application that I think needs some more open source love. Anyways I mean one might think besides my personal addiction, why do this? And I have sort of three goals that I'm trying to hit or at least trying to fulfill. I'm not sure if they'll ever really be hittable. Maybe not like measurable enough but I really want to. Goal number one is I really want a better development cycle. Like someone ideates an idea for an app. They go basically do maybe Django startproject or they open up a blank editor and just start writing it. And they start searching on Stack Overflow almost immediately, you know. Maybe they find a tutorial that gives them a to-do or something to just barely get off the ground. But beyond that it's all first principles. And I just really want people to sort of benefit from all of the other discovery that these application developers have created. The way I put it in my PyBay talk last month was that Python is the biggest language in the world right now. But how do you personally benefit from that?

19:01 Michael Kennedy: Yeah.

19:02 Mahmoud Hashemi: You know.

19:03 Michael Kennedy: There's all this stuff out there. There's a bunch of great libraries we benefit from.

19:05 Mahmoud Hashemi: Absolutely.

19:06 Michael Kennedy: I do think one of the differentiators between a beginning programmer or a beginner in an ecosystem and someone who's very experienced, some form of expert I guess, is that a lot of times the beginner will start coding and think everything has to be created from scratch, right. I need to load CSVs so let me just read the text of the file and start splitting on, like, commas. Or like weird stuff like that, right. Whereas the more experienced person is like well, you know, pandas will read it or there's also a csv module in the standard library.
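
A minimal sketch of the contrast described above, assuming a small made-up CSV string (the sample data and the function names `naive_parse` and `csv_parse` are illustrative, not from the episode). It shows how splitting on commas breaks once a field contains a quoted comma, while the standard library's `csv` module handles the quoting:

```python
import csv
import io

# Hypothetical sample data: the description field contains a comma inside quotes.
SAMPLE = 'name,description\nZulip,"Chat server, written in Python"\n'


def naive_parse(text):
    """Split each line on commas, the from-scratch approach."""
    return [line.split(",") for line in text.splitlines()]


def csv_parse(text):
    """Let the standard library's csv module deal with quoting rules."""
    return list(csv.reader(io.StringIO(text)))


naive_rows = naive_parse(SAMPLE)
csv_rows = csv_parse(SAMPLE)

# The naive version cuts the quoted field in two; csv keeps it as one field.
print(len(naive_rows[1]), naive_rows[1])
print(len(csv_rows[1]), csv_rows[1])
```

The same idea applies to pandas, which is left out here only to keep the sketch free of third-party dependencies.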

19:44 Mahmoud Hashemi: Right.

19:45 Michael Kennedy: Just use that. I feel like this, it kind of helps the beginners close that gap to say I'm looking like I want to build something like that. Let me see what they did, right. Let me see how they structure their file system and organize their code. Are they even using Celery? I heard I have to use Celery. Do I have to use Celery? I don't know.

20:03 Mahmoud Hashemi: That's sort of the thing. If you go on Awesome Python right now, sure, you're going to find hundreds of libraries but what, are you supposed to use them all? What all ingredients go into making sort of a complete version of the app that you're sort of trying to build, of that architecture? Yeah, I think that basically my longer term goal here is to have sort of a decision tree type interface. So you can say like, look. I want to build a web application. And then you can sort of say, oh. I want to support this many users or I wanted to basically use say SQLAlchemy. Or have a Docker image or something like that and you can find your way to an application that looks enough like the thing you want that you can just pull things wholesale from it. And basically not only get an application sooner but also learn the best practices without having to go through dozens of hours of conference talks and blog posts.
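
One way to picture the decision-tree idea is as successive filtering over tagged catalog entries. The sketch below is only an illustration under stated assumptions: the `App` record, the `narrow` function, the tag vocabulary, and the placeholder entries are all hypothetical and do not reflect how the actual list is stored or queried.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    """Hypothetical catalog entry: a name, an architecture kind, and technology tags."""
    name: str
    kind: str  # e.g. "server", "gui", "cli"
    tags: frozenset = field(default_factory=frozenset)


def narrow(apps, kind=None, required_tags=()):
    """Keep apps of the requested kind that carry every required tag."""
    required = set(required_tags)
    return [
        app for app in apps
        if (kind is None or app.kind == kind) and required <= app.tags
    ]


# Placeholder entries, not real projects from the list.
catalog = [
    App("example-wiki", "server", frozenset({"sqlalchemy", "docker"})),
    App("example-shop", "server", frozenset({"django"})),
    App("example-player", "gui", frozenset({"gtk"})),
]

# "I want a web application, using SQLAlchemy, with a Docker image."
matches = narrow(catalog, kind="server", required_tags=["sqlalchemy", "docker"])
print([app.name for app in matches])
```

Each question in the tree adds one more constraint, so the candidate set shrinks until a handful of comparable applications remain to study.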

21:00 Michael Kennedy: This portion of Talk Python to Me is brought to you by Linode. Are you looking for hosting that's fast, simple and incredibly affordable? Well look past that bookstore and check out Linode at talkpython.fm/linode. That's L-I-N-O-D-E. Plans start at just five dollars a month for a dedicated server with a gig of RAM. They have 10 data centers across the globe. So no matter where you are or where your users are, there's a data center for you. Whether you want to run a Python web app, host a private Git server or just a file server, you'll get native SSDs on all the machines. A newly upgraded 200 gigabit network. 24/7 friendly support, even on holidays. And a seven day money back guarantee. Need a little help with your infrastructure? They even offer professional services to help you with architecture, migrations and more. Do you want a dedicated server for free for the next four months? Just visit talkpython.fm/linode. Compare this to the Cookiecutter library of templates for us.

21:57 Mahmoud Hashemi: Yeah, I mean that's actually. I hadn't really thought of that but that's a very good point. In a way Cookiecutter is trying to codify these architectures as well. I guess not to put too fine of a point on it, right. I have looked through 300 of these. I've looked through over 1,000 applications and very few of them still bear the marks of having started as a Cookiecutter application. I think that Cookiecutter certainly has its place for getting a kickstart. Especially for people who maybe don't want to write their own setup.py or something like that. But yeah, I'm not sure that it's going to be enough architecture to really get you completely off the ground. There are emerging technologies in containerization around Flatpak and say AppImage and Snap and so forth. Especially for these GUI applications on Linux. Just as an example and maybe there's a Cookiecutter for something like that. But I don't think it's been done enough times for it to show up on that radar. And so basically you'll have to look for the existing community within the existing. I mean it's very hard to call it a community actually. Within the existing ecosystem. Who has actually adopted this? And so that's like, I'll get to that in a minute here. But I did clone every single repository. Clocked in at around 30 gigabytes for 250 repos. I sort of ran a bunch of analysis and technology discovery within it to figure out who's doing what.

23:22 Michael Kennedy: That's cool.

23:23 Mahmoud Hashemi: Yeah, in fact let's just jump right into it because I've got these numbers and I've just got to sort of get them out there.

23:28 Michael Kennedy: Sure, let me throw one quick comment though on the cookiecutter bits...

23:31 Mahmoud Hashemi: Sure.

23:32 Michael Kennedy: While you're pulling up the numbers. I feel like Cookiecutter is great to help people jumpstart and the idea is you kind of get this prepacked app which is great. Like here's a way to do Flask that already talks to a database or a queue. Or it already sends email or something. Right. That is a long ways away from here is an actual application, right. Even as a developer when you're building an application. You think aw man, I'm almost done, I'm 80% done. You're like less than half done. There's all these little edges you have to smooth off in these edge cases that you don't think about. And these, these are real applications that shipped. Which means...

24:07 Mahmoud Hashemi: Right.

24:08 Michael Kennedy: They're way more polished and I don't even know how comparable they are. They're both good for people starting but I feel like this is a much different set of things in the catalog.

24:18 Mahmoud Hashemi: And to be clear, they're polished for the end user. What I think is really interesting about them too is like all the rough edges they have. That can be really good for a programmer's ego. To be like look, this worked. Okay, like I don't need to overthink this. I'm just going to say this is good enough. If it's good enough for them, it's good enough for me, ship it.

24:36 Michael Kennedy: Yeah, it's easy to get hung up on trying to make it perfect or trying to set up infinite scalability. So it's like Google and you're like you have no users yet, forget scalability. Just get it out, right.

24:46 Mahmoud Hashemi: Exactly and so I got my numbers up.

24:49 Michael Kennedy: Let's hear them.

24:50 Mahmoud Hashemi: So yeah, the quick methodology here is I just cloned every repo, Mercurial, Bazaar, Git, pulled them all. I ran SLOCCount and some other analytics over the version control history. We're looking at 19 million lines of code. Two million commits with around 50,000 committers.

25:10 Michael Kennedy: That's an insane amount of effort. That is huge. And this is Python code?

25:14 Mahmoud Hashemi: Yeah and it was like hundreds of years of maintainership or whatever you want to call it, you know. Sometimes you're on maybe Steam or something like that. And it sort of aggregates for you how many hours were played on a game. I just sort of feel like, I feel a little bit terrible about humanity perhaps. But, you know. We all got to have fun. Anyways, but this I had sort of the opposite feeling, right. This, I think I actually experienced the true meaning of an awesome list. I was in awe at the amount of stuff I was looking at. And I sort of broke it down. So real quick thing. About a third of them are server software with 64% being sort of desktop or CLI. And just sort of a quick breakdown there. I think the scariest thing for me was basically Python compatibility. Because we hear a lot of doom and gloom at times. But like contrasted with a lot of excitement about Python 3. And I was worried that these maintainers are stretched so thin. They have not had time to add Python 3 support and here we are with something existential facing the Python application community. And I was very pleasantly surprised to find that two thirds of these applications already support Python 3, like they run on Python 3.

26:33 Michael Kennedy: That's great.

26:34 Mahmoud Hashemi: It was a huge relief. This isn't the same thing as a library supporting Python 3. This is them having converted a code base that was often quite large to Python 3, many of them. They didn't start this way because I think the average. Or sorry, the median application on the list is something like 10 years old. And some of them much older than that. Like Mailman, the thing that Python uses and many other organizations use for managing their email lists. That's written in Python and it's made the jump, you know. It's ported to Python 3. And that's pretty impressive and if you look at sort of the compatibility over time, you do see the Python 2 applications kind of tapering off. You haven't had another Python. I think the most recent Python 2 application was started in something like 2017 versus Python 3 applications which are being started today.

27:20 Michael Kennedy: Right, there's probably a bunch more Python 3 apps starting now than that are Python 2 I would expect.

27:26 Mahmoud Hashemi: You see a nice healthy trend there, thankfully. I was super relieved.

27:30 Michael Kennedy: I'm sure, you either have to be crazy or depend on a library that is Python 2 only deeply to start a Python 2 app these days.

27:39 Mahmoud Hashemi: And those are rarer and rarer. So it's a little bit harder to justify. There's still some interesting case studies in here. So for instance, I think Oil Shell actually converted to Python 3 and then back to Python 2. He has a whole blog post about why he did that, the lead maintainer there. So, that one's an interesting case study. And I think that's sort of what I hope to. One of the deliverables I hope comes out of the list is just all the interesting case studies. I think we'll get to those too.

28:07 Michael Kennedy: Yeah, there could be a bunch of stuff coming out of there I think. You know, another thing I think would be valuable. Not that you don't already have enough going on. But maybe some kind of newsletter where just the stuff that gets added that month or something comes in and just on there, that'd be great.

28:22 Mahmoud Hashemi: I'm too much of an engineer for that Michael. That's why I've added an RSS feed, you know.

28:26 Michael Kennedy: We could just pull the data ourselves.

28:28 Mahmoud Hashemi: Yeah exactly, just pull the data yourself right. If you want to do a newsletter and all that. I fully support that but yeah. I'm going to take a technical approach to this social problem.

28:35 Michael Kennedy: Awesome, okay.

28:38 Mahmoud Hashemi: It would be good to get there, it would be good to get there. Once the list sort of maybe starts plateauing a little bit more and also once. Yeah, once I get some more stuff in place to keep the quality high, you know. Because some of these projects might go into an unmaintained status. And I'm not going to be the one vetting 350 projects every month, you know. So I need to automate some things first.

28:58 Michael Kennedy: How much automation could you do? For example, every one of these has a link to their repo. Could you look for unresponded to open issues or the lack of a commit for one year? Puts it onto like a warning list and then...

29:13 Mahmoud Hashemi: Exactly.

29:14 Michael Kennedy: contributor to come in and go this looks suspicious.

29:17 Mahmoud Hashemi: That sounds exactly like what I have planned and I sort of have. I have a little command in the application I built for managing awesome lists. It's surprising how many awesome lists are just manually maintained via PRs. But yeah, all my stuff is in a YAML file. I think I have something like a thousand links in there. All of them are going to need to be checked for 404s and stuff like that. But yeah, I do sort of have an informal bar of you need to have a commit in the last year. There are exceptions to that. For instance you mentioned Reddit earlier. Reddit is a massively important Python application for the internet. And they only have an archival version of the site up from around 2017. But still I mean it was still a very big site in 2017 with a lot of real lessons.
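
The "commit in the last year" bar with exceptions could be sketched roughly as follows. This is only an illustration: the entry fields (`name`, `last_commit`, `exempt`) and the sample projects are hypothetical, and the real list's YAML schema and tooling are not shown in the conversation.

```python
from datetime import date, timedelta

# Hypothetical entries; the real list lives in a YAML file whose schema is not shown here.
PROJECTS = [
    {"name": "active-app", "last_commit": date(2019, 8, 1)},
    {"name": "quiet-app", "last_commit": date(2018, 3, 15)},
    # An archival project kept on purpose, like the exception described above.
    {"name": "archived-app", "last_commit": date(2017, 11, 1), "exempt": True},
]


def find_stale(projects, today, max_age_days=365):
    """Return names of non-exempt projects with no commit in the last max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        p["name"]
        for p in projects
        if not p.get("exempt", False) and p["last_commit"] < cutoff
    ]


# Projects that would land on a "warning list" for a human to review.
warning_list = find_stale(PROJECTS, today=date(2019, 9, 24))
print(warning_list)  # ['quiet-app']
```

A human contributor would still make the final call; the check only narrows down which entries look suspicious.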

29:58 Michael Kennedy: It is such an important site that I think it's clearly deserves to be on there. It actually surprised me that it was there.

30:04 Mahmoud Hashemi: There were a couple other surprises in there too I think. I'll just sort of refer the audience maybe to the repo where we have a Jupyter Notebook that has all of the graphs in it. Like some of the numbers on the site right now are going to be out of date in just a week or two when I give a PyGotham talk. 'Cause I'm at 350 applications instead of 250. But we'll keep the Jupyter Notebook and the repo up to date. But suffice to say for now, there's some really interesting findings in there.

30:31 Michael Kennedy: I'm looking at it now. This is super well done. I'm so glad you put this up as a notebook so it can just like live, that's great.

30:37 Mahmoud Hashemi: The most surprising trend. This doesn't affect me very much but it was surprising to me just how stark it was. The increase of Qt based GUI applications...

30:48 Michael Kennedy: Yes.

30:49 Mahmoud Hashemi: compared to GTK based GUI applications. When you look at the graphs in there. Remember all of these applications have had a commit. Like 95% of it had a commit in 2019. It may look like oh, this is an old application. But it's maintained, it's used. It is awesome and deserves some respect there. But people are starting to choose these applications these days, not GTK.

31:10 Michael Kennedy: Yeah, yeah. I'm looking at that list. It's almost 50% Qt and then 2% Pygame, 2% Kivy, 17% WX and 30% GTK. You know, pretty interesting. There's a bunch of graphs like that. I don't know how long this is but the scroll bar is very small. That's great.

31:27 Mahmoud Hashemi: All due credit there. I was on a huge time crunch coming off of a speaking tour in Africa let's just say. I gave a keynote in Tunisia, nothing too mysterious. But basically I was super time crunched and so that notebook right there is basically all my wife's work. She's in the commit log and so forth. But credit where credit is due. I did not have time to do that stuff and she really saved my ass.

31:48 Michael Kennedy: Yeah she came through, that's awesome. Congrats, that's good work.

31:52 Mahmoud Hashemi: Yeah anyways, I can go on about the stats but I think that probably at this point people have got to be curious what other case studies are hiding on this list.

32:00 Michael Kennedy: Yeah, they've got to be. So maybe pull out some highlights that stand out for you and then I grabbed a couple that maybe we could go more quickly through, like more of them. Just skip across to give people a flavor of what we're talking about.

32:11 Mahmoud Hashemi: Absolutely, so one of the ones that I highlighted was Open edX. I know that Ned Batchelder is a big deal in the Python community. And this is where he works last I checked. But the edX platform, edX platform has something like 51,000 commits at the time of writing. And what's really interesting to me is it is a monorepo and it represents a whole team's work. And so there are all these dynamics that you can see happening there. And it's one of only three projects where no one developer has more than 10% of the commit history.

32:45 Michael Kennedy: Wow.

32:46 Mahmoud Hashemi: Yeah, it's the third largest Django project on the list with 300 committers. And it powers all of edx.org.

32:53 Michael Kennedy: Yeah and I think MIT's OpenCourseWare and a bunch of others as well.

32:57 Mahmoud Hashemi: Yeah and just for contrast. 41% of applications are mostly written by one committer. How's that bizarre for you?
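
The kind of statistic quoted here (the share of commits belonging to the most active committer) can be computed with a few lines. This is a minimal sketch under stated assumptions: the author names are made up, and treating "mostly written by one committer" as a share above 50% is an assumption, not the definition used in the notebook.

```python
from collections import Counter


def top_committer_share(commit_authors):
    """Fraction of commits made by the single most active author (0.0 if empty)."""
    counts = Counter(commit_authors)
    if not counts:
        return 0.0
    top_count = counts.most_common(1)[0][1]
    return top_count / sum(counts.values())


# Hypothetical commit log: one author name per commit.
authors = ["ana"] * 6 + ["bo"] * 3 + ["cy"]
share = top_committer_share(authors)
print(share)        # 0.6
print(share > 0.5)  # True: "mostly one committer" under the assumed 50% threshold
```

For a real repository, the list of authors could come from something like `git log --format=%an`, one name per line.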

33:05 Michael Kennedy: It is bizarre but thinking about it. It makes sense to me right. It seems like... Somebody just wants this app to exist so they're going to go create it. But then you see the things like Reddit or Open edX or some of these others. You're like these are built by large groups of people.

33:18 Mahmoud Hashemi: I sort of want to find a way to highlight these organizationally supported, foundation supported, corporate supported applications. Because they definitely follow a different sort of path compared to your average just sort of side project.

33:32 Michael Kennedy: What do you think about using, folks out there listening. Maybe they're a PhD computer science folks or other types of researchers. Do you think that this list might put together some interesting set of data for people to look at how apps are built?

33:47 Mahmoud Hashemi: I think that if there's someone who's doing some sort of combination computer science ethnography. Like you know study social science studies. There's absolutely a ton to learn from here. I mean I didn't sort of pick and choose applications based on these findings. These findings were, you can't exactly call it randomly selected. Especially not with the Python qualifier in there right. But it is still a very interesting cross section. If somebody wants to do that sort of thing, I'm all for it, happy to support that.

34:13 Michael Kennedy: Alright so tell us some of the other ones.

34:15 Mahmoud Hashemi: Yeah, absolutely. So, another one that I highlighted was Sentry. And so Sentry is very interesting because it's so big and often promoted on podcasts and conferences and so forth. A lot of people are surprised that their entire code base is just sitting on GitHub. It's got like 26,000 commits since 2008. For those who don't know, it's a web service and front end for cross platform application monitoring. So it's sort of like New Relic and stuff like that. A little bit different than Datadog though 'cause the focus is on error reporting. But we're looking at a million lines of Python. I think it's one of the very few projects to actually cross that million line threshold 'cause you know, a million lines of Python is like 10 million lines, a hundred million lines of less efficient languages.

35:01 Michael Kennedy: Yeah, yeah exactly. A million lines of Python, that's a lot of code.

35:04 Mahmoud Hashemi: That does include about 120,000 vendored lines. Just so you know I went looking for that possibility. I went looking for libraries that maybe got vendored and tweaked and just copied in.

35:14 Michael Kennedy: Maybe define that for folks cause not everyone will know exactly what it is you're talking about.

35:17 Mahmoud Hashemi: Yeah, of course. So vendoring is what you do when a library does something you don't want. Like drop support for intercompatibility with something else. And you're like okay, well I'm just going to create my own mini fork of this inside of my repo.

35:30 Michael Kennedy: Right, like if I didn't want to depend upon requests. Theoretically I could just jam the source code for requests into my app.

35:37 Mahmoud Hashemi: Yeah, you probably didn't pay for it but it's called vendoring because the person who publishes the library is a vendor. It's kind of like that.

35:46 Michael Kennedy: You've taken responsibility for it right?

35:48 Mahmoud Hashemi: Right.

35:49 Michael Kennedy: Okay but still, that's still a lot of code. That is, it's...

35:51 Mahmoud Hashemi: Huge amount.

35:52 Michael Kennedy: 880,000 lines, that's not vendored.

35:55 Mahmoud Hashemi: Yeah, the largest Flask app I've found in comparison is Pagure, P-a-g-u-r-e. And that's what's called a forge. It's like a GitLab or a GitHub but written in Python and it's written in Flask. So, it's almost like 10X smaller. So Sentry, its code is sitting out there and it's also BSD-3 licensed. Which is very permissive for a for profit application. So that's kind of interesting.

36:17 Michael Kennedy: Yeah, that's super interesting.

36:18 Mahmoud Hashemi: The quirkiest thing I found though. The quirkiest case study was this application called Ganeti and that's G-a-n-e-t-i. And it had like, I don't know, 100, 200 stars on GitHub or something like that. And it was just like what even is this thing? Because it was written in half Python, half Haskell. So the very pure functional thing mixed with the very pragmatic sort of systems thing. And it had 16,000 commits since 2007, so very mature. It's a cluster management tool focused on long lived VMs used for workloads that don't have built-in redundancy. So with web servers, with web services, you can shut down a worker and start up a worker, and the other workers will take over, right. But if you need a job to not fail and it's going to be very long lived, right, you might use this and so it was actually developed at Google. And that was also very strange because Python and Haskell, especially Haskell, these are not super common languages to find at Google for something that appears to be this important. And it's pretty widely deployed. I found a nice discussion at Wikimedia just talking like should we use OpenStack for this or should we use Ganeti for this? And I think they ended up going with Ganeti. So, it's still pretty commonly used and that one was an oddity. Another one along those lines is a thing called LocalStack. And that's a developer tool. It's useful because it sort of does a mock service of AWS that you can run locally. And so if you're doing dev ops code you can run a mini AWS locally and write tests against it. And that one is, I think, a third Java.

38:01 Michael Kennedy: It's one of these blended stories right.

38:03 Mahmoud Hashemi: Yeah, yeah so hybridizing with JavaScript and HTML and CSS, everyone expects for a web service, right. But to see things mixing with Java and Haskell. These are some pretty interesting case studies that have a whole story of their own I'm sure.

38:17 Michael Kennedy: This portion of Talk Python to Me is brought to you by Tidelift. Tidelift is the first managed open source subscription. Giving you commercial support and maintenance for the open source dependencies you use to build your applications. And with Tidelift, you not only get more dependable software. But you pay the maintainers of the exact packages you're using. Which means your software will keep getting better. The Tidelift subscription covers millions of open source projects across Python, JavaScript, Java, PHP, Ruby, .NET and more. And the subscription includes security updates, licensing verification, indemnification, maintenance and code improvements, package selection and version guidance, roadmap input and tooling and cloud integration. The bottom line is you get the capabilities you'd expect and require from commercial software. But now for all the key open source software you depend upon. Just visit talkpython.fm/tidelift to get started today.

39:12 Mahmoud Hashemi: So which ones did you like?

39:13 Michael Kennedy: Well you know I went through and I just wanted to pull a couple that stood out to me on each of the categories. So if we go to the Awesome Python Application list. There's a ton of categories. You sort of put it into major categories and then the developer space just became a meta category. So you've got...

39:31 Mahmoud Hashemi: Oh yeah.

39:32 Michael Kennedy: And audio and graphics and productivity and education and science.

39:35 Mahmoud Hashemi: I pulled out the developer stuff compared to the non-developer stuff because I think that the developer stuff in general is a little bit easier to find. Like you're going to find blog posts and higher GitHub stars for developer focused applications in many cases. It's not a super, super stark difference when I ran the numbers. But definitely like most of the listeners have heard of Ansible, you know.

39:56 Michael Kennedy: Right, right, right or OpenStack or something like that, yeah.

39:59 Mahmoud Hashemi: Exactly, exactly.

40:00 Michael Kennedy: Yeah cool, so there's a bunch of stuff under there and I didn't pull them out quite like that. But maybe just to give people a sense of what's out there. Some of these are really small and niche and you can tell they're for somebody and others, they've been the top 10 sites on the internet.

40:13 Mahmoud Hashemi: Yeah.

40:14 Michael Kennedy: Let's just go, I'll go through the rest and we can just maybe give me some real quick thoughts. So under the internet category we have Deluge which is a popular lightweight cross platform BitTorrent client.

40:25 Mahmoud Hashemi: Deluge which is how I pronounce it. I mean I don't know how to pronounce it. Well I don't know. It doesn't come with a pronunciation guide last I checked but basically it's a BitTorrent client and it has a few very interesting things about it. One, it's a GUI application. It uses Twisted. If you just look for best BitTorrent client, you'll find it making those rankings. And people aren't saying like this one is great 'cause it's written in Python, you know. It's just a good application, very solid. People like the UI, et cetera. And it happens to be written in Python and so that was one of the ones that I got off the top of my head when I made my list of 20 because I am a Deluge user. Proudly backing up my Linux images. The Deluge thing also has a little hidden gem which is that they manage to have a web UI that is almost identical to their GUI UI. But it's just like sort of allows for remote administration. Still classified as a desktop application 'cause it's mainly written for single user use. But it does have a web UI and I'm not sure how they did it but the experience is the same.

41:32 Michael Kennedy: Wow.

41:33 Mahmoud Hashemi: In the browser as using I guess like. I'm not sure if they're Qt or GTK, I can check.

41:37 Michael Kennedy: Sure, sure, sure.

41:38 Mahmoud Hashemi: But yeah, I'm a big fan of Deluge. I highly recommend checking out their repo.

41:41 Michael Kennedy: Yeah, all these have links to repo, to demos, to their home page, things like that. Another one is Elixire, a featureful file host and link shortener with API. So kind of like Bitly maybe.

41:53 Mahmoud Hashemi: This one's an interesting one. It seems to be used by a community I don't fully understand. But it gets a lot of use. So yeah, this is a recent addition. I'll probably explore it a little bit more later. But yeah, it is sort of like an Imgur combined with a Bitly. Which sort of makes sense. If you're pasting things into an IRC chat or something like that. It's got everything you need.

42:16 Michael Kennedy: Yeah super, the next one people might have heard of, it's called Reddit.

42:20 Mahmoud Hashemi: Yeah.

42:22 Michael Kennedy: We don't need to say a lot more about that other than this is sort of a snapshot from 2017. But I think it's still super cool that this is here because here's a high scale application. I don't know exactly where it ranks but it was in the top 20 sites at some point for destinations right?

42:36 Mahmoud Hashemi: Definitely, I'm not sure where it's at now either. But yeah, definitely a major destination. Especially I'm sure for many listeners. Whether you like it or not. You're going to probably find some useful stuff through Google search or something.

42:47 Michael Kennedy: Yeah.

42:48 Mahmoud Hashemi: But one interesting thing about Reddit is that they have a very interesting approach to their schema design. If you dig in, they're using what's called an OAV pattern or an EAV pattern, Entity Attribute Value. And it's kind of like, it's sort of like object storage or document storage built on top of a relational database. And so some might consider it an anti-pattern to use it to this extent. Because you're losing out on the constraints of a relational database. But it managed to work for Reddit so it can't be all bad. Anyways...

43:20 Michael Kennedy: Theory meets reality right.

43:21 Mahmoud Hashemi: Theory meets reality right. Yeah exactly and some might say like oh, it's very slow compared to an optimized schema with better indexing and so forth. But one, databases have done some degree of optimization for this pattern. It has a Wikipedia page, it's a pretty well-known pattern. But also Reddit is fast because of caching. That's the other main learning there. If you look in there it has layers and layers of caches. Very interesting exemplar to learn from.
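
To make the Entity Attribute Value idea concrete, here is a minimal sketch using SQLite from the standard library. The table and column names (`thing_data`, `thing_id`, `key`, `value`) and the helper functions are hypothetical, chosen only for illustration; this is not Reddit's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One generic table holds every attribute of every entity as a row.
conn.execute(
    "CREATE TABLE thing_data ("
    " thing_id INTEGER, key TEXT, value TEXT,"
    " PRIMARY KEY (thing_id, key))"
)


def set_attr(conn, thing_id, key, value):
    """Store one attribute; adding a new kind of attribute needs no schema change."""
    conn.execute(
        "INSERT OR REPLACE INTO thing_data VALUES (?, ?, ?)",
        (thing_id, key, str(value)),
    )


def load_thing(conn, thing_id):
    """Reassemble an entity as a dict from its attribute rows."""
    rows = conn.execute(
        "SELECT key, value FROM thing_data WHERE thing_id = ?", (thing_id,)
    )
    return dict(rows.fetchall())


set_attr(conn, 1, "title", "Hello")
set_attr(conn, 1, "ups", 42)
print(load_thing(conn, 1))  # {'title': 'Hello', 'ups': '42'}
```

Note that `ups` comes back as the string `'42'`: the database no longer enforces per-attribute types or constraints, which is the trade-off mentioned above.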

43:47 Michael Kennedy: Yeah, I would definitely think so. Alright, next category is audio and I grabbed two of those out of there. One's called Exaile, something like this. Which is a cross platform audio player and library organizer tag editor type of thing that looks pretty interesting.

44:00 Mahmoud Hashemi: I found this because I went on. Everyone's familiar with Wikipedia. There's also this thing called Wikidata and Wikidata allows you to use a SPARQL query which is sort of like a. I don't know how to describe it exactly. But it's sort of like SQL but it's sensible for graph networks kind of. So I ran a query for all of the software that had Wikipedia pages that was known, you know, with Wikidata to have been written in Python. And so I actually found it through there. I'm not a user myself but yeah. It's actively developed. It had a release come out just a few months ago. And it's sort of like, kind of like Amarok if anyone remembers that. But it's got Last.fm support and plugins and so forth. Seems pretty cool.
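
A query along those lines might look roughly like the sketch below, which only builds the request URL and does not contact the network. It is not the query actually used for the list. The identifiers are assumptions that should be verified on Wikidata before use: `P277` is taken to be the "programmed in" property and `Q28865` the item for Python.

```python
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed IDs: P277 = "programmed in", Q28865 = Python. Verify before relying on them.
QUERY = """
SELECT ?software ?softwareLabel ?article WHERE {
  ?software wdt:P277 wd:Q28865 .
  ?article schema:about ?software ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""


def build_request_url(query, endpoint=WIKIDATA_ENDPOINT):
    """Return a GET URL asking the endpoint for JSON results of the query."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


url = build_request_url(QUERY)
print(url[:60])
```

The pattern matches items with a "programmed in" statement pointing at Python and keeps only those that have an English Wikipedia article.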

44:42 Michael Kennedy: Nice, another one. MusicBrainz Picard which automatically identifies, tags and organizes musical albums which sounds pretty cool.

44:48 Mahmoud Hashemi: It's a funny name but just amazing software, let me just say. I don't use it as much as I used to back when I used to DJ and stuff. But basically the way that Picard works. So there's a MusicBrainz foundation that develops it. And they also have a sort of like. It's sort of like a Wikipedia but for album information. It's kind of like Discogs or something like that but open. And what it'll do is you can take one of your CDs, make a backup right. And that comes through and it just says track one, track two, track three, track four. But it'll take the number of tracks combined with the length of each track to generate kind of a fingerprint. Match that against an internet database of album listings and so forth. And then automatically bring in all the fully filled out ID3 tags. So everything shows up very searchable.
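
The fingerprint idea described here (track count plus track lengths, matched against a database) can be illustrated with a toy version. This is a simplified sketch, not the identification scheme Picard or MusicBrainz actually uses; the hashing choice, the in-memory `ALBUM_DB`, and the album name are all hypothetical.

```python
import hashlib


def disc_fingerprint(track_lengths_seconds):
    """Hash the track count and each track's length (whole seconds) into one ID."""
    parts = [str(len(track_lengths_seconds))]
    parts += [str(int(seconds)) for seconds in track_lengths_seconds]
    payload = ",".join(parts).encode("ascii")
    return hashlib.sha1(payload).hexdigest()


# Hypothetical stand-in for the online database of album listings.
ALBUM_DB = {disc_fingerprint([215, 187, 242]): "Hypothetical Album"}


def lookup(track_lengths_seconds):
    """Return the matching album title, or None if the fingerprint is unknown."""
    return ALBUM_DB.get(disc_fingerprint(track_lengths_seconds))


print(lookup([215, 187, 242]))  # Hypothetical Album
print(lookup([215, 187, 243]))  # None
```

An exact hash like this breaks on a one-second difference, so a real system would need some tolerance for small variations between pressings and rips.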

45:37 Michael Kennedy: Oh, that's awesome. Yeah, with Album Art and all that yeah.

45:40 Mahmoud Hashemi: Man Michael, I used to spend. I mean you're in the audio world now too. I'm sure you know. I used to spend so many hours getting those tags just right.

45:48 Michael Kennedy: Yes.

45:49 Mahmoud Hashemi: And still have typos and then is where we press one button. 20 seconds later it's just all perfectly organizable and tagged and so forth. It made me feel like a fool but also made me feel pretty magical.

46:01 Michael Kennedy: Yeah, well I'm sure you appreciated it more than if you hadn't done that stuff by hand.

46:04 Mahmoud Hashemi: And then one day I found out it's written in Python and I'm like that's going on the list.

46:07 Michael Kennedy: Yeah, that's definitely on the list. Alright, switching from music to audio to video. We have two editors, Flowblade and OpenShot. Flowblade does multi track non-linear video editing for Linux. And OpenShot is a cross platform video editor crop for media platforms.

46:22 Mahmoud Hashemi: Yeah, that one likes support OpenBSD and Windows as well. So with these ones I think what's interesting is that a video editor is a very large application. It's going to have to have a lot of Codecs and other things. And those things are very touchy based on what platform they're running on.

46:37 Michael Kennedy: They're incredibly finicky, it's super painful to bit develop that stuff. I've done it.

46:42 Mahmoud Hashemi: Exactly, Talk Python Training. So very finicky stuff and one sure shot way to ruin it too is to miss a dynamic library in your package. So, being able to go in there and look at how they do their packaging. Sure it's a little bit dirty but I'm sure that you're already deep in the dirty if you're looking for how to do this kind of application freezing. So, it's really useful to have those to refer to.

47:09 Michael Kennedy: Those are great examples. For the graphics world we have FreeCAD for the general purpose CAD editor. That sounds pretty intense and it's awesome that that's in Python.

47:17 Mahmoud Hashemi: So with FreeCAD part of me was a little bit sort of skeptical, right. Could an open source community come up with something like CAD? Because when you're dealing with the BIM world, right, you know it's like construction and design and so forth. It's just so much and who could possibly have the side resources to do this? But you know, they're doing it. They have a wiki and they don't seem to be giving up, it's pretty impressive. I read some sort of third party reviews and they basically say like look, this may not be top tier, right. It may not be ready for full professional shop usage and so forth but it's surprisingly usable. I'm not much of a CAD user myself so I'll just have to take their word for it.

48:02 Michael Kennedy: Yeah right.

48:03 Mahmoud Hashemi: If somebody listening is a CAD user, let me know. Let me know if I should expand my description here. Write a review, support Python CAD.

48:09 Michael Kennedy: Speaking of supporting, some of these have fund next to them.

48:12 Mahmoud Hashemi: Exactly, so the source of links that I collect for each one. Basically I'll do repo, home, Wikipedia if they have it. GitHub if they're not on GitHub but they do have a GitHub mirror. Demo if there's a version of it running that you can just try it out. Docs which is usually a read the docs link or something similar. And I added fund which is basically going to be like a Patreon link or a PayPal link. If I can find a way to show that these people are making it into this current generation of approach to open source. Which is like yes, please do pay me so I can spend more time on this. Then I will absolutely link it here because they deserve all the help they can get.

48:53 Michael Kennedy: Yeah absolutely, I think it's great that you're highlighting that. The other one in graphics is Quru Image Server. So I guess you give it a large image and you could say right now I want it at 250 x 250 or whatever. And you don't have to keep redoing it. It just automatically regenerates and probably caches it.

49:11 Mahmoud Hashemi: Yeah exactly, you upload an image and then you can scale it to all different sizes. They say they have all sorts of tweaks and optimizations to do that sort of stuff efficiently and performantly. And yeah, it has caching and stuff too. That's a pretty tough task, you know. That stuff can be pretty expensive. So I was like maybe someone is interested in performance tweaks for images and so forth. This would be a good project to go and see how they achieve that.
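
The "resize on request, then cache" behavior can be sketched in pure Python. This is an illustration of the general technique only, not how the image server in question works: the tiny nested-tuple "image", the `ORIGINALS` store, and nearest-neighbor scaling are all simplifying assumptions.

```python
from functools import lru_cache

# Hypothetical in-memory originals: image id -> rows of pixel values.
ORIGINALS = {
    "logo": (
        (0, 1, 2, 3),
        (4, 5, 6, 7),
        (8, 9, 10, 11),
        (12, 13, 14, 15),
    )
}


@lru_cache(maxsize=256)
def resized(image_id, width, height):
    """Nearest-neighbor resize; repeated requests for the same size hit the cache."""
    src = ORIGINALS[image_id]
    src_h, src_w = len(src), len(src[0])
    return tuple(
        tuple(src[y * src_h // height][x * src_w // width] for x in range(width))
        for y in range(height)
    )


print(resized("logo", 2, 2))  # ((0, 2), (8, 10))
resized("logo", 2, 2)         # second call is served from the cache
print(resized.cache_info().hits)  # 1
```

The cache key is the combination of image id and requested dimensions, so each distinct size is computed once and then reused.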

49:34 Michael Kennedy: Alright, let's go through the games real quick. There's a game section. I'll just quickly go through them. You can just give maybe some general thoughts.

49:39 Mahmoud Hashemi: Sure.

49:40 Michael Kennedy: There's Frets on Fire X which I'm guessing is one of those sort of music things that come down and you hit them. And then you've got...

49:45 Mahmoud Hashemi: Exactly.

49:46 Michael Kennedy: Streaming platform is like a homemade Twitch which is pretty cool. You've got Lucas Chess and Unknown Horizons which is a cool strategy game.

49:52 Mahmoud Hashemi: The game section is one that I definitely wish was larger. A game is again, a very finicky large application and one that's tough to undertake. Also there aren't a ton of open source and free ones. I know that Python was used in one or two of the Civilization games for scripting. But that's not open source.

50:10 Michael Kennedy: Exactly.

50:11 Mahmoud Hashemi: But Unknown Horizons is pretty similar to that sort of thing. It is an RTS, it makes extensive use of Python. So I'm happy to see that there. Frets on Fire is a little bit older but from what I can tell it still works for some people. So I guess Dance Dance Revolution never gets old. Or maybe, I don't know. Is there a rock band type thing, it says Frets. I don't know, it's something like that.

50:28 Michael Kennedy: Something like that yeah. EVE, EVE Online is another good one to throw in there but obviously it's not open source. So, you know you can't really do that, put it in there.

50:35 Mahmoud Hashemi: What else was Second Life, if you consider it a game. Maybe you consider it life.

50:42 Michael Kennedy: That's right, maybe you do. Alright, under productivity. The top one here that I grabbed is actually something I used to manage all of my servers and my infrastructure and I love it. Just love it, Glances.

50:51 Mahmoud Hashemi: Yeah, it looks really good. I guess I don't manage enough servers to really need it. I'm just like an htop user but it's like htop but better.

50:58 Michael Kennedy: Yeah, you get all sorts of cool stuff. If I just run Glances real quick, there you go. It gives you graphs for CPU, memory, swap. How much RAM you're using. Like server load over one minute, five minutes, fifteen minutes. You can sort by memory usage, by CPU usage. It shows you your disk IO rate, your network rate. Just all sorts of stuff and great little hot keys for it yeah. It's like htop but, yeah. Like all the stuff you want.

51:27 Mahmoud Hashemi: Yeah, I'll have to give it a shot.

51:28 Michael Kennedy: It's pretty good. Let's see, we also have BleachBit for privacy. So, it cleans up some stuff out of your system but also I'm guessing it rewrites every empty bit of space or something like that. Just given the bleach part, I don't know.

51:41 Mahmoud Hashemi: I used to work way back in the day in high school for a brief time at a computer repair shop in South Dakota no less. But basically people, they would come in with Malware, Spyware, all this stuff. And you know...

51:55 Michael Kennedy: My computer is slow and I have all these popups.

51:57 Mahmoud Hashemi: Yeah, exactly. I only install nine toolbars on my IE 5.5 or whatever.

52:03 Michael Kennedy: It's like a 50 pixel bar in the middle where the actual content is.

52:07 Mahmoud Hashemi: Exactly, oh man. You're putting too much of an age out. Everybody's going to keep this content evergreen. Anyways, alright but basically what BleachBit is. I didn't realize it was written in Python but it is one of these cleaners that will go through and not just empty out your temp directories and so forth. But also clean up your registry because a lot of this software will make excess registry writes. And I guess especially back in the day that would slow down Windows. So, yeah it removes tracking things. Increasingly that's become the focus too. Sort of clean out your browsers of any weird flash cookies and that sort of thing.

52:42 Michael Kennedy: Cool yeah, that's great. Last one on the productivity space is Gmvault, Gmail backup.

52:46 Mahmoud Hashemi: I guess we sort of forgone. I personally forgone some of the stuff in my Gmail but probably I should back it up. I never know when big G is going to do something uncouth.

52:57 Michael Kennedy: Yeah, you never know. It's kind of nice to have that. I actually went and found out that if you go to the Google Doc, just the Google data export. You can say export my docs and one of the problems, you could run Google Drive and stuff. And it'll have your docs from Gmail in there but they're just a hyperlink back to Google Docs right. But if you run the export you could say export it as Word documents, Excel spreadsheets and PowerPoint. And actually get the content of it, not just links to it back in Drive, it's pretty cool.

53:31 Mahmoud Hashemi: Yeah.

53:32 Michael Kennedy: This sounds like this is kind of sort of in that realm. Organization, a lot of archiving stuff in this world and library bits. There's a funny one as well. Archivematica, digital preservation and then ArchiveBox which is like a self hosted sort of Wayback Machine.

53:47 Mahmoud Hashemi: Exactly. Archivematica, it's an interesting one because it's sort of like. I think it's sort of targeted at like libraries and actual archives. Whereas ArchiveBox is a little bit more like guerrilla. Like Yahoo says I can't delete GeoCities, okay. Let's like you know, pull it on down. It's sort of those two schools which are definitely adjacent but a little bit different.

54:08 Michael Kennedy: Yeah, similar to I guess we have Open Library which is a web application for a library catalog. So for some reason you have a small little library. You don't have to start from scratch.

54:18 Mahmoud Hashemi: You'd be surprised, yeah. Libraries, I was at a library recently. It only was open a few hours a week but they had a tremendous collection. And they could definitely use something like this because they didn't have anything to search through. You know, you just had to walk the stacks.

54:31 Michael Kennedy: That's crazy and the last one, that one I said was funny is called I Hate Money.

54:34 Mahmoud Hashemi: There are some people who are very into sort of like personal finance management. I guess people who sort of picked up wanting to balance the checkbooks epigenetically somehow passed down to them and... But no, there's this whole movement of people who do plain text accounting. And Git-version their, like, actual financial books. I think I Hate Money sort of comes from that domain combined with the self hosting domain.

55:01 Michael Kennedy: Yeah, interesting. They're like we're not doing Mint, forget Mint.

55:04 Mahmoud Hashemi: This one seems to be about sort of like shared budgeting and stuff too. So this is like Fava is the one that I was thinking of I added recently. But I Hate Money is basically like you have roommates and you just wanted to keep track of who bought what when. And it's like a little better than a Google sheet.

55:18 Michael Kennedy: Yeah, that's pretty cool. Alright so I'll speed through a few more here real quick. So communication with Askbot which is very some other stack of flow quite interesting. And SecureDrop which is like a whistleblower submission system for media organizations, these are cool.

55:32 Mahmoud Hashemi: If you're into self hosting or you have a team that you don't want to pay for Stack Overflow or something like that. Then you can run your own Askbot. And SecureDrop is a really important one. That one was originally written by Aaron Swartz and it's managed by the Freedom of the Press Foundation. I think they came out with a new release and yeah. And that's huge for journalism in our time.

55:51 Michael Kennedy: Yeah absolutely, one that I think is probably going to be really a welcome one for a lot of teachers and professors out there would be nbgrader. This is under the education section, and it's a Jupyter notebook based system. Basically create assignments in there and it will grade them automatically for you. That sounds wonderful.

56:07 Mahmoud Hashemi: Not quite that level of teacher myself but I think that it's a pretty cool idea. Instead of having just workbooks you can actually give a live notebook. Have them fill in some blanks, do some things. And then have them submit the .ipynb and grade that.

56:22 Michael Kennedy: Yeah.

56:23 Mahmoud Hashemi: Another one that I'd like to point out. I don't know if it's under education but a lot of people seem to know about it, even if they're not developers. But sort of like education related. It's called Anki, A-N-K-I. And it's basically like a flashcard program. And I meet all these lawyers and doctors that are like oh man. I would not have made it through school without Anki, right. It's like as important to them as Wikipedia or something like that because it's sort of like a spaced repetition, memorization tool written in Python.

56:45 Michael Kennedy: So you can do anatomy or something like that right?

56:48 Mahmoud Hashemi: You got to memorize it all.

56:50 Michael Kennedy: There's no reason or rhyme. It's just that's part of the bone, it's called that so you can learn that. There's probably some reason but yeah, it's a lot of memorization.

56:58 Mahmoud Hashemi: You can't Wikipedia things during surgery I'd imagine.

57:02 Michael Kennedy: Just hold real still, I'm researching. Yeah so, last one speaking of this kind of stuff is science that I want to dig into. And I think we're going to be probably out of time for touching on it. But I felt when I looked at the science area I felt like there's about equal amount as some of the other categories. But this is like really polished, really serious stuff. So we have Ascend which is a mathematical chemical processing model. Then you have CellProfiler which is interactive data exploration of biological image sets. You have SageMath which is a competitor to MATLAB and Mathematica. These are especially SageMath, these are real things.

57:38 Mahmoud Hashemi: No, SageMath is kind of a triumph, right. If you're maybe a MATLAB user or something like that. You should definitely check it out if you haven't already. But yeah, all of these science applications. They get a lot of usage from their academic counterparts, student counterparts and so forth. But we don't often think to go find them as exemplars for applications we might want to build. But yeah, there's some really interesting ones in here. Like CKAN was one that jumped out at me 'cause it's a data management system. So it's sort of like DataHub that you can host yourself. And so if you are running an organization like a university or government. And you want to do anything in open data. You're going to need some sort of DataHub, some sort of portal for people to find that data. And you need to manage your data on there. And there's an open source one written in Python.

58:24 Michael Kennedy: Yeah.

58:25 Mahmoud Hashemi: There's another one that's called Orange. It's kind of like a component based data mining software for graphical interactive data analysis and visualization. But you can train machine learning models graphically and it sort of has a signal processing type metaphor. So like you have this thing and then you drag a little curve into that thing. And make like a little flow chart and then once it's working. You can export a Python program from that. And I've been using that since I think 2012, 2013. So it's like from somewhere in Eastern Europe and I think the professor who leads the lab that writes it. I think I have his book as well. Itself like kind of a triumph too. It also uses Qt 4 and Qt 5. So if you're undergoing a Qt transition with your application. It might be an interesting one to look at too.

59:10 Michael Kennedy: Oh wow, that is a super interesting angle.

59:11 Mahmoud Hashemi: Yeah.

59:12 Michael Kennedy: Yeah okay great. So I think we're probably out of time for diving into any more. But there's so much more to cover. So there's this CMS category, the ERP category...

59:18 Mahmoud Hashemi: Business software.

59:20 Michael Kennedy: Yeah, oh my goodness. SaaS and all that. So static sites and then there's the dev super category I'm calling it because there's 129 items in there. And a bunch of sub categories like source control stuff. And people can just go through the list and find it. I think this is great.

59:34 Mahmoud Hashemi: Yeah, I could go on for hours, I really could. Each one is more exciting than the last.

59:39 Michael Kennedy: Doing so good, Alright. With that I think we should sort of leave it there and the people can go and they can explore that.

59:45 Mahmoud Hashemi: Please.

59:46 Michael Kennedy: There's probably about 250 we haven't even touched on at least. And then people also out there who are listening. Maybe they maintain one of these projects or they use one of these projects. They want to recommend it to you. What's the story there?

59:59 Mahmoud Hashemi: I have a GitHub issue template. Jump in there, make sure that it fulfills the criteria of being maintained and so forth. Again, I'm not super super strict on that but if one particular category is really over populated. And it's the seventh link shortener or something like that. I got to be a little bit decisive there. If it's something that is in a pretty under nourished category, right. You know more is better, the more the merrier. So if you have a game that you know of written in Python. It'll probably make its way in.

01:00:28 Michael Kennedy: Yeah, super. Maybe someday the game category will break up into tower defense and strategy or whatever, who knows.

01:00:35 Mahmoud Hashemi: I really want to do a taxonomy sort of refactoring because some of these categories are bursting at the seams. I recently added the storage category because I found all of these database related things written in Python. I don't know, there's just so much.

01:00:48 Michael Kennedy: Yeah, super cool. Alright, now before you get out of here. The last two questions real quick...

01:00:53 Mahmoud Hashemi: Sure.

01:00:54 Michael Kennedy: For you. When you write some Python code what editor are you using these days?

01:00:57 Mahmoud Hashemi: Yeah, I'm still getting it done with Emacs but you know. I'm open to experimentation and one of these days maybe one of these other editors will get its hooks in me.

01:01:04 Michael Kennedy: Yeah, cool well I'll keep asking you next time you come on the show.

01:01:07 Mahmoud Hashemi: Sure.

01:01:08 Michael Kennedy: And then notable PyPI package. You've got some good ones out there.

01:01:11 Mahmoud Hashemi: The thing that sort of dominates for me is glom. So I have this Python package, it's called glom. Use it for deep getting into a dictionary but also a variety of other things. It's sort of like a data templating system. And more and more it's becoming like a higher level programming language almost. So, right now we're building streaming support into it because a lot of people have brought up sort of the size of the data that they want to manipulate with glom. And so we got to have some sort of streaming metaphor in there. But I'll also point out that a lot of these awesome Python applications are distributed through PyPI in some way. So I have PyPI URLs on a bunch of them too. So, shout out to them.
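
[Editor's note: to illustrate what "deep getting into a dictionary" means, here is a minimal standard-library sketch. The helper name deep_get and the sample data are hypothetical and are not glom's implementation; glom itself covers this basic case with a call like glom(target, 'a.b.c') and does a good deal more.]

```python
def deep_get(target, path, default=None):
    """Follow a dotted path such as 'a.b.c' through nested dicts.

    Hypothetical helper for illustration only; not part of glom.
    """
    current = target
    for key in path.split("."):
        # Stop early if we hit a non-dict or a missing key.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Made-up nested data, purely for the example.
config = {"database": {"primary": {"host": "db.example.com", "port": 5432}}}

host = deep_get(config, "database.primary.host")
missing = deep_get(config, "database.replica.host", default="n/a")
print(host, missing)
```

[The point of a helper like this is to replace chains such as config["database"]["primary"]["host"] with a single path, and to choose what happens when a level is missing.]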

01:01:49 Michael Kennedy: Yeah, yeah awesome, awesome. Alright well final call to action. People want to get involved in your project. What are the ways? I already touched on some of it for submitting stuff. But what are some of the ways people can get started or get involved?

01:02:01 Mahmoud Hashemi: Yeah, definitely check out the repo on GitHub. Check out the listing in the README. Check out that Jupyter notebook that's in there. Look for the RSS link as well if you're an RSS user like me. A bunch of interesting Python RSS apps. If you're not you can just download one on that page too. And yeah, then submit some more applications and also keep your eye out for the PyBay talk and probably the PyGotham talk.

01:02:25 Michael Kennedy: Alright, excellent. Well, this was so much fun to talk to you about all this stuff. And nice work on having this angle cause I don't feel like it was covered and you've done a really good job. It seems like it's taken off. It's got almost 10,000 stars on GitHub.

01:02:40 Mahmoud Hashemi: Yeah I mean, it's not really about that but I am happy to see that it sort of gets out there more. Speaking of giving it more coverage. I did technically start a YouTube channel you can like and subscribe if you'd like. It's Yak.party, Y-a-k.party. Maybe I'll post some APA stuff to it in the coming future once I'm done with these talks.

01:02:57 Michael Kennedy: That sounds great. Alright well, thanks for being here as always.

01:03:00 Mahmoud Hashemi: Absolutely, anytime Mike.

01:03:01 Michael Kennedy: You bet, bye. This has been another episode of Talk Python to Me. Our guest on this episode was Mahmoud Hashemi. And it's been brought to you by Linode and Tidelift. Linode is your go to hosting for whatever you're building with Python. Get four months free at talkpython.fm/linode, that's L-i-n-o-d-e. If you run an open source project Tidelift wants to help you get paid for keeping it going strong. Just visit talkpython.fm/tidelift. Search for your package and get started today. Want to level up your Python? If you're just getting started try my Python Jumpstart by Building 10 Apps course. Or if you're looking for something more advanced. Check out our new Async course that digs into all the different types of Async programming you can do in Python. And of course if you're interested in more than one of these be sure to check out our Everything Bundle. It's like a subscription that never expires. Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You could also find the iTunes feed at /itunes. The Google play feed at /play and the direct RSS feed at /rss on talkpython.fm. This is your host Michael Kennedy. Thanks so much for listening, I really appreciate it. Now get out there and write some Python code.
