
#382: Apache Superset: Modern Data Exploration Platform Transcript

Recorded on Monday, Sep 19, 2022.

00:00 When you think data exploration using Python, Jupyter notebooks likely come to mind. They are excellent for those of us who gravitate towards Python. But what about your everyday power user? Think of that person who is really good at Excel but has never written a line of code. They can still harness the power of modern Python using a cool application called Superset. This open source, Python-based web app is all about connecting to live data and creating charts and dashboards based on it using only UI tools. It's super popular, too, with almost 50,000 GitHub stars. Its creator, Max Beauchemin, is here to introduce it to us all. This is Talk Python to Me, episode 382, recorded September 19, 2022. Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy, keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython. We've started streaming most of our episodes live on YouTube. Subscribe to our YouTube channel to get notified about upcoming shows and be part of that episode.

01:20 Did you know one of the best ways to support the show is by taking one or more of our courses? In fact, we have one of the largest libraries of Python courses out there, with over 240 hours of videos.

01:32 Before we get to the conversation, I want to quickly let you know that we just released three new ones: Django Getting Started, Getting Started with Pytest, and Python Data Visualization. All three are excellent courses, and their landing pages each have a video introducing the course. Visit talkpython.fm and click on Courses in the nav bar to learn more. Thank you for making Talk Python Training part of your career journey. Now onto the show.

01:58 Max, welcome to Talk Python to Me.

02:00 Well, thank you. Excited to be here. Love to talk about Apache Superset. So hit me up.

02:04 Yeah, it's quite the thing that you've created, and it looks like it's really going strong. So we're going to talk about tools for data exploration in general, and then we'll dive in and focus on Superset, which is what you've created. So I'm really excited to do that.

02:19 Excited to do it too. I've been kind of bathing and kind of swimming in this whole world of data orchestration, exploration, and visualization for the past 20 years or so. That's really been my focus. So I should have a lot to say about everything related to this space.

02:35 Yeah, fantastic. And you've got a lot of experience at many of the big tech companies that people would think of as having lots of interesting data to look at. So we can dive into that just a bit at the beginning here. Before we get to any of those things, let me kick it off with the beginning question: how did you get into programming and Python and all these things?

02:52 Okay, yes. I did kind of an associate degree back in the late 90s, which says something about how long I've been doing this, but I never finished it and never got my actual diploma. I got an internship at a company called Ubisoft. It's a billion-dollar company, one of the major video game companies out there. I went on to my first internship, never looked back, and never finished the program. So that's where my career started.

03:22 That's awesome.

03:23 This program was called something like "technician in informatics." I'm from Quebec City originally, so I grew up speaking French, and the program was in French. It's a technical program. The goal was to train technicians, as they call them, people who are very technical and really focused, and give them the skills they need to be effective when joining companies. So some programming, some data modeling, a little bit of SQL, and really the skills you need to get started and start coding. Not necessarily thinking about computer science and data structures. It was much more:

03:59 What do you need to get started?

04:01 Let me interrupt you for just a second there, and we can maybe talk a bit about this. I feel like a lot of people looking in from the outside feel like, oh, I need a computer science degree in order to do X, Y, or Z, whatever it is: create APIs, create a business, do data science, or whatever. And so much of the focus of a CS degree seems to be on algorithms and operating systems. While those are really good to know, they're not necessarily skills where you sit down and go, let me remember my algorithms. You just call a function on a data structure. Or, let me remember my operating systems stuff. You just run the code. It's helpful to have that, but I don't feel like it's that necessary. I don't want people out there listening to think, oh, I've got to go get a CS degree or I'm not going to do anything. Right?

04:44 Yeah, I think the bootcamp idea has, like, flipped that upside down to say, oh no, all you need is the technical skills to get started and build an app, and you don't need those fundamentals, or maybe the premise is that you pick them up later. There needs to be some balance there. The CS approach is, let's start with the foundations and how we got here, and then the rest should follow. I don't think that's right. To me, you don't really have that curiosity about how we got here until you've been a practitioner. So I'm like, hey, teach people the skills they need to be successful and useful to employers. It seems like university, or education in general, should be oriented towards teaching people the skills they need to contribute effectively to the market. And then maybe the CS constructs are something you would learn along the way, as you build.

05:41 So you pick it up and some of you are like, I really need to know. I've been doing this for a year now. I need to know how this thing works. And you'll dive into it, right?

05:47 Yeah.

05:48 But when you're motivated and you have that experience already.

05:52 Yes. It's like when you have to solve the problem, there are certain problems. And then maybe at that point, I don't know, if you're writing a bunch of SQL and you're building a lot of data structures, maybe you need to understand data modeling constructs. And that's a good time to go and understand the history of the different approaches to data modeling. But maybe you don't start from the theory, right?

06:12 Yeah.

06:13 So, going back to your question. I was a web developer, building internal apps for about a year. Then very quickly I got into data, building data warehouses at Ubisoft and using the business intelligence toolkit to build all sorts of reports, dashboards, and self-service things so people could consume data. It was a little bit later that I started doing more scripting. When I joined Yahoo in 2007, I believe that was around the birth of Hadoop, and Yahoo had some Perl. So I learned to script a little bit and used interpreted languages more. Then I started building more websites for personal projects, so I learned a bunch of Python there and played a lot with Django, and by the time I joined Facebook in 2012, I knew Python very well. It became my main language, and that's really what we used for a lot of things at Facebook. It just became more and more the established language for everything data related right around that time.

07:25 What a cool set of experiences. You were at, was it Lyft, Airbnb?

07:30 Yeah. Facebook? Yeah.

07:32 Facebook and Ubisoft. Yeah. So Ubisoft is interesting. They're a Canadian company, right?

07:38 They are a French company. Their headquarters is in Montreuil, next to Paris. They were actually from Bretagne, so Brittany somewhere.

07:48 Okay.

07:49 They're a French company. They have a huge studio in Montreal, though. There's like amazing tax breaks in Quebec, in Canada, and they decided to build one of the biggest, if not the biggest video game studio in the world in Montreal. So that's where I started my career.

08:02 Well, the reason I bring it up is I want to ask you about what it's like working at a game company versus a more traditional company. I don't know if you'd call Yahoo traditional, but, like, standard. A lot of people dream of being at these game companies, and that's maybe even why they got into programming. I don't know. Tell us what your experience was like there.

08:20 Yeah, it's a dated experience, right? So I don't know what it is like now. I left Ubisoft in 2007, so it's pretty dated, 15 years ago. I can talk about what it was like at the time. It was a mix of super fun, super young, a bit bro-ish in a lot of ways, and a masculine environment also.

08:39 Some of it is because it's 15 or 20 years ago. I think it was a slightly different world, and a lot of things that were maybe dubious back then are definitely not okay anymore. There's that culture people talk about. I think Electronic Arts has been famous for that, and a lot of the big video game companies had work environments that were really dubious in some ways. But I think Ubisoft was a great place to be at the time, and maybe one of the better ones, perhaps ahead of its time. My experience at Ubisoft was so interesting because it's difficult for me to talk about what Ubisoft is, since I worked at three different studios. I was in Montreal for about a year, then at Ubisoft Paris for about three years, and then at Ubisoft San Francisco for another three years. The three offices were vastly different. And there are the things that have plagued the video game business: long hours, kind of low pay, I think.

09:43 That's like grinding people out sort of thing.

09:47 Yeah, kind of like there's never enough time, a lot of crunch time all the time. It's kind of a great place, maybe, to start your career. But then as people mature, they tend to go elsewhere. Then again, things have changed so much. The whole culture has changed a lot.

10:04 Yeah. As you have relationships and families and you want to see them, things like that might evolve.

10:12 Yes. Seriously. You age out of working overtime, or just by necessity too.

10:17 Yeah, exactly. Let's kick off our conversation focusing on data exploration. When I think about data exploration, not from a developer or data scientist perspective, but in the super broad sense, I don't know what comes to mind for you, but for me it's Excel. I feel like most people are like, I've got some data, and I need to think about it a little more analytically than just a bunch of numbers. Let me throw it in Excel and see what I can do with it.

10:44 Yeah, I think Excel is super open. If you think about Excel as a playground or as a framework, it's super open ended. You can do so much in there, and there are not a lot of constraints. The constraints that exist in an Excel file are the ones that you make for yourself. Maybe one constraint used to be, I forgot exactly what it was, but something like 65,000 rows for a long time, and now there's no such limit anymore. But there's still a limit to how much your laptop is going to be able to handle in terms of the size of a pivot table. At the past companies where I was, there's no way you could bring the dimensionality and the raw data that you need into Excel. So you need to prepare and extract the stuff you're going to play with before you can be in Excel. And then there are things that BI historically has not been really good at: what-if analysis, creating different scenarios, forecasting, things like that. That's an area where spreadsheets dominate and will keep dominating. If you want to tie certain things to variables, change the numbers, and see how other charts and models react, modeling is a really good case for Excel. The downside is, how do you collaborate on these things? Versioning is kind of a mess, where you end up with, like, SharePoint.

12:03 The files are binary, so you can't diff them or anything easily, right?

12:07 Oh, yeah. They're not in source control. And there's no introspection as to how things got there. They're a mix of data that comes from a source and kind of made-up stuff sometimes, like, I'm going to tweak this, I'm going to change that. So you don't know the list of all the changes that were applied to the source data. Yeah, I think it's a good tool, but it's definitely incomplete. Right. It'll always be part of it.

12:32 I don't bring it up as a recommendation. I bring it up because I feel like a lot of people are starting here. So how can we look around and see what a better option out there might be?

12:41 Yes, I think if you've used Excel a lot in your organization, you discover, you kind of hit your head on, the limitations and the problems that come with such an open framework.

12:54 This portion of Talk Python to Me is brought to you by Sentry. You know Sentry as a longtime sponsor of this podcast. They offer great error monitoring software that I've told you about many times. It's even software that we use on our own web apps. But this time, I want to tell you about a fun conference they have coming up. Sentry is hosting DEX: Sort the Madness, the conference for every developer to join as we investigate the movement and trends for a better and more reliable developer experience. What is this madness, you ask? It's the never-ending need to deploy stable code quickly. Come to DEX to engage with developers who will share their epic fails and their glorious saves. Sentry can't fix the madness, but they can start sorting through it with you. Register today to join in San Francisco or attend virtually on September 28 at talkpython.fm/dex. That's talkpython.fm/dex. The link is in your show notes. Thank you to Sentry for supporting Talk Python to Me.

Out in the audience, one listener says: my local data extraction people default to Excel, and they seem limited by the number of sheets available in a workbook.

14:02 Yeah, well, I guess now it's not the number of lines in a file, I guess.

14:06 No worries. That's right. So, sort of stepping up a level from this, and maybe heading down a more structured path: one of the problems with Excel is that talking to databases and APIs, and bringing in other, more live data, is really limited. I know there's, like, BI stuff, but not really.

14:27 What do you think? Is that Jupyter? What's the next level here?

14:31 We're talking about consumption now in some ways, but I feel like in a lot of ways we should be talking about data engineering too. So where is your data is the first question.

14:42 Some of the data may live in Excel, but that's not where your data lives nowadays. It lives in the SaaS applications you use. Even any startup or company today uses hundreds of SaaS applications: CRM, applicant tracking systems, GitHub, and just a million different data sources. It feels like one of the first things you need to do is bring that data together in a central place, into some sort of data mart or data warehouse. I think that's an early construct you need as an organization, because data is most useful when it's put alongside the other data you have in your organization.

15:24 It does make sense to hoard all this data and bring it all to a central place if you want to do consumption. Otherwise consumption is going to be kind of a stitching story, right. So let's say you're in Excel or you're in the local database or whatever it might be. The first thing you have to do is bring the things that are related in one place so you can do that visualization consumption analysis, right.

15:47 How do you join on a thing that's partly in an API and partly in Airtable or something, right?

15:53 Let's say we take a notebook. Super open ended, right? What is a notebook? It's just like a script with a REPL, where you can run chunks of the script sequentially, and you have a persistent kernel or interpreter supporting what you're doing at any point in time. But if you don't have a data warehouse or your data all in one place, the first thing you're going to do within your notebook is probably some data engineering: how do I get the data that I need from the source or sources that are interesting to me? The notebook will enable you to do this, for sure. But then can other people build on top of the work that you do in the notebook? Probably not, or not as easily as you'd want them to. So you take the data warehousing kind of approach of saying, hey, let's bring the data that we need in our organization to a central place and try to stitch it together there, so it can then best be used for consumption.

16:50 Analysis is still a very important step in the process.

16:54 Sure, I totally agree. Jupyter and JupyterLab get a lot of the mindshare, but there are many, many choices. I interviewed Sam Lau, and he did a research project where they categorized over 60 different notebook environments, of which Jupyter was one. It's off the hook. So there are a lot of choices out there. But let's focus on Superset.

17:17 I'd love to talk about why we need 60 different notebooks. I feel like I missed that evolution of notebooks. I'm very familiar with Jupyter, and I deployed JupyterHub at Airbnb a while ago, and then I followed Hex a little bit, which is also one of the players in the space. At Lyft, we kind of built our own little notebook service. We had a Kubernetes cluster where you could say, I want this Docker image as the base for my notebook. You'd pick, like, I want the AI/ML package, or basically, what's the base for your notebook? And then you could pick some hardware, like, I need GPUs, or I need a big machine or a small machine. Then we'd spin up these environments for people. But trying to understand why there are 60 notebooks, and what the different flavors are and how they differentiate from each other, is a dubious question.

18:10 It was crazy. I was kind of blown away by this. If you look, it seems like they always differ on some axis. Like, we want more collaboration, like Google Docs, or we want it to run in a different place, like Pyodide, where we want it to run in the front end with some sort of Python in the browser. There are just all these crazy variations.

18:31 So I only highlight that to point out it's not just Jupyter. There are a ton of these things, where Jupyter is the main environment that lives in a web browser where people go and explore data. And I feel like Superset is a pretty modern, interesting player in that space of many choices.

18:51 Yeah, happy to talk about Superset too, and to try to introduce it in the context of what we were talking about before. Yeah, so think about Superset, right?

19:01 Tell us about Superset.

19:02 So Superset is essentially a data exploration, dashboarding, and visualization tool that very much caters to organizations. Superset solves the challenges of data consumption for entire teams. We're not necessarily focused on people who know Python, or on data scientists, data analysts, or data engineers. We very much cater to the entire team, and the idea is a single place to explore data, visualize it, interact with it, share, and create dashboards. And we have a SQL IDE on top of that too. On the GitHub page, I don't know if we have good screenshots here. I think an image is worth a thousand words, and I know not everyone is looking at what we're looking at, but here we have the drag-and-drop Explore view. I think the screenshot is a little bit dated; there might be a more recent one on the GitHub page, where you can see we have this drag-and-drop interface very similar to what people are familiar with in business intelligence: you have access to your data set, you drag and drop your metrics and dimensions, pick your visualization type, and get to the exact chart that you want. You can assemble these charts into interactive dashboards with dynamic filtering and expose them to business users, so they can explore on their own, create their own dashboards, and answer their own questions.

20:32 This sort of thing lives in a really interesting space.

20:34 And that's why I brought up Excel as well, because Excel is not meant for programmers, but it's meant for people who are trying to do serious stuff with it. Maybe they'll write an equals sign and find a formula they can put in there, or they'll do a VLOOKUP. They're trying to go beyond just, I need a grid of stuff. And while Jupyter and those things are awesome, Superset feels like it caters a little bit more to a power user type of person, with Python extension capabilities. But you don't have to start as a Python developer to get into it, is that right?

21:09 Exactly right. So the premise is you don't need to have any Python skills. The skill that may help if you want to go deeper inside Superset is knowing some SQL.

21:20 Knowing SQL is not a requirement.

21:23 Think about Tableau, for people familiar with Tableau or Looker. That's really the space that we're in. It's platformy in the sense that, okay, you access your database connections and interact with data sets. But think about the experience of someone just consuming a dashboard. You open a dashboard, a collection of charts, maybe titled something like Financial Forecast for 2023. You don't really need technical skills to use it; you mainly need business knowledge to consume it. These dashboards are interactive, so you'll be able to apply a filter on a specific quarter, a specific customer type, or a market, and interact with the dashboard that way. But primarily the dashboard interface caters to the business user, or anyone who is trying to understand.

22:13 I see. Almost more of a BI type of user, rather than...

22:18 It is. Superset is a BI tool, to be clear. It's a BI tool that is maybe modern in many ways. If you want to go deeper, say in Explore, and I don't know if you can click on the upper left on Explore. So here, for context, we're looking at the drag-and-drop place in Superset where you pick metrics, dimensions, and the visualization type you want. It's your typical Tableau-like interface, and here you can essentially just drag and drop. But if you do know SQL, you're able to create your own metrics and express them as SQL expressions, for instance, right?

22:57 Exactly.

22:58 You can have computed columns and aggregations and stuff like that, right?

23:02 Exactly. So you'll define metrics as SQL aggregate expressions, like the sum of this divided by the count distinct of that, and it has to be a valid SQL expression. So people who are a little more technical, who maybe understand the data better and have a little knowledge of SQL, don't have to, but they can use SQL as part of that exploration experience. For instance, if you pick a filter, you'll be able to pick a column and an operator, like customer ID IN, and then go pick the customer IDs. But you can also go to the little SQL editor in the filter popover and write a more complex SQL expression if you want to. We didn't want to bury SQL, as we feel like more and more people are learning SQL. It's becoming the lingua franca of data. We feel like a certain percentage of the workforce is going to become more data literate in the next decade, in part by learning SQL and understanding data structures and what the data sets in their particular organizations are made of.
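To make the "sum of this divided by the count distinct of that" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the way the query is assembled are all invented for illustration; this is not how Superset builds its queries internally, only the general shape of a metric expression plus a hand-written filter.

```python
import sqlite3

# Hypothetical sales table, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "EU", 100.0),
        (1, "EU", 50.0),
        (2, "EU", 30.0),
        (3, "US", 200.0),
    ],
)

# A metric is just a valid SQL aggregate expression.
metric = "SUM(amount) / COUNT(DISTINCT customer_id)"
# A dimension is a column to group by.
dimension = "region"
# A custom filter is a SQL boolean expression, like "customer ID IN ...".
custom_filter = "customer_id IN (1, 2, 3)"

# Assumed query shape: the metric and filter are slotted into a GROUP BY query.
query = (
    f"SELECT {dimension}, {metric} AS revenue_per_customer "
    f"FROM orders WHERE {custom_filter} "
    f"GROUP BY {dimension} ORDER BY {dimension}"
)
rows = conn.execute(query).fetchall()
print(rows)
```

The point of the sketch is that the metric and the filter are plain SQL fragments, so anyone who can write an aggregate expression can define a new metric without touching Python.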

24:11 Right. And by using SQL, it means you can connect to different data sources, and you can connect to live data, not some kind of export or whatever. You just connect to Postgres or whatever and go from there.

24:24 Exactly, yeah. So the way things work in Superset: you go and create your database connection, or connections, to whatever SQL-speaking databases you use as a data warehouse or data store. The things that are really popular right now are the big cloud data warehouses, Snowflake and BigQuery, but there's still a lot of Postgres and MySQL, even for analytical use cases. So you connect to that database, and then you have different ways to get started. One is to start exploring the tables or views that exist already. Or you have the SQL IDE that you're pointing to now, where it's possible to step down to that level and interact at the SQL level. Here you can also create data sets, what we call virtual data sets, which are essentially views, for people familiar with the database construct of a view. That allows people to explore that virtual data set, assemble dashboards, create visualizations, collaborate with others, share links on Slack, annotate, and add comments.
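For listeners less familiar with views, the comparison can be sketched with sqlite3 from the standard library. The `signups` table and the `daily_signups` view are hypothetical names, and a real database view is only an analogy for a virtual data set here, not a description of how Superset stores one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (day TEXT, country TEXT, users INTEGER)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?, ?)",
    [
        ("2022-09-01", "CA", 10),
        ("2022-09-01", "FR", 5),
        ("2022-09-02", "CA", 7),
    ],
)

# A saved query plays the role of the virtual data set; a view is the
# closest plain-database analogue: a named query that reads like a table.
conn.execute(
    "CREATE VIEW daily_signups AS "
    "SELECT day, SUM(users) AS total_users FROM signups GROUP BY day"
)

# Charts and dashboards can then query the view as if it were a table.
daily = conn.execute(
    "SELECT day, total_users FROM daily_signups ORDER BY day"
).fetchall()
print(daily)
```

Because the view is just a stored query, it always reflects the live rows in the underlying table rather than a one-time export.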

25:31 Yeah, I want to dive into the data sources more, but I want to make sure we highlight this for people listening who don't know about Superset. Two things, and you've hinted pretty strongly at one already. First of all, when I go to Excel, I don't see a Fork me on GitHub banner. I'm looking, and I don't see anywhere on this page where it says Fork me on GitHub.

25:51 Over on apache/superset on GitHub, yes, it's clearly right there. So, one, it's open source. And two, it's very popular. It's got almost 50,000 stars and 10,000 forks. That's Django and Flask level of popularity, for people keeping score, I guess.

26:10 Yeah. Stars are just some sort of proxy for hype or interest, and forks are a good proxy for how many people have wanted to play with the code, which is also a proxy for a different kind of hype and interest. But yeah, it's up there, probably in the top 50 to 100 open source projects of all time in terms of value delivered, which is way beyond what I expected in 2015 when I started. Same with Apache Airflow. I also started Apache Airflow, which is also very popular and used in tens of thousands of organizations. I think it speaks to the scale, and to how the problem is super validated. Everyone needs to visualize data, explore data, create dashboards, write SQL, see results, and visualize results. So it's very popular, definitely the leading open source project in this space, call it business intelligence or data consumption. And it's a very mature project. It's used by thousands of people at places like Airbnb, Microsoft, and Tesla, people who have forked the project or use it super heavily internally. The In the Wild section that you're pointing to, which tries to list the people who use the product, is a very limited version, the tip of the iceberg of the people who have self-reported using the product.

27:35 So you have a link in the GitHub repository called In the Wild, and it just lists them out under these different verticals. You'll find these companies using it. On one hand, it doesn't matter if these other companies are using it or not. But if you're trying to sell it to your organization, or just trying to decide if you can trust it, well, if you're in education and it works for Udemy and the Wikimedia Foundation, maybe it'll work for you.

28:02 It's a bit of validation, right?

28:03 Yeah. And then, especially since those are people who opened a pull request to add their name to this hidden file on the repo, it shows how much of a tip of the iceberg it is. But one thing I've been telling people, and in the context of this podcast it makes sense: if you want to contribute to open source, there are a lot of ways you can contribute. The obvious one is to use the software and open a pull request. But the less obvious one, the most basic and the very minimum when your organization is getting significant value from an open source project, is just to be public about it and let the world know. If you work at Uber and you get tons of value from, I don't know, Gatsby or whatever, let the world know that you do. That's a vote of confidence, it speaks to the scale of the community, and if it works for others, the chances it's going to work for you are much greater.

29:00 Yes. Another thing that's interesting about the GitHub repo source code really, I guess is what I'm thinking of two things here. One is it's super active, right. If you go in here and you look around like sometimes you'll see last change two years ago or whatever. Right. But last change 7 hours ago. A couple of days ago. Two days ago, right. There's a lot of activity here, right?

29:20 Yeah. It's super intense in terms of how many people work on the project. There's a Contributors tab you might be able to click on, right there, click Contributors. So 832 people have contributed to it, and that's just looking at code contributions. It's possible to see the history of who's contributed. Something that's interesting is that we distribute on PyPI and the project was largely Python code. It looks like we have too much data and the GitHub UI is struggling.

29:48 We're going to break GitHub, sorry, GitHub.

29:51 Yeah, right now, because there are too many contributions here, you can kind of see the scale contribution. You can also see how I've been selling into my CEO role unless a bunch of people contributed over time. But I was going to say we decided to distribute on PyPI and was largely a Python project from the get go. Like more and more if you look at the code distribution, a lot of the code is in TypeScript JavaScript now because the nature of the product is such a front end project. And something that's interesting about open source is we have seen less like application GUI type, up the stack type projects really succeeding at scale. And Superset is definitely one of those like very much a front end application type product that's open source and then succeeding at a massive scale too, where typically in open source, we see libraries, we see back ends and frameworks. Right. Like, being really massively successful. But that was part of the reason that I really wanted to I wanted to prove that super set and that Open source can succeed up the stack, too. And we've been working very actively on that in this community.

31:12 Yeah, it's a super good point, because it's clearly Open source is one on the frameworks in the libraries level, but there's fewer examples of it creating beautiful user interface experiences and types of applications.

31:26 Yeah, and pretty good theory on that, too. Like, why is it the case that we think open source has been very much playground for engineers? Right. Like, the tool set and GitHub and Git and source control on the full request. And issues like, all of these things have been, historically the way that engineers build software. And it's been a little bit hostile to PMS and designers. Hostile and, like, actively hostile, but it was not welcoming. Yeah, it's just built by engineers for engineers like GitHub. And Git was built by engineers for engineers. And we never really thought of how do we include product designers and product managers to the workflows there and then the interest? I think a lot of engineers add this great image of open source and see it as an outlet for their careers, and then they love the idea of working the open that does not exist, that drive of working in the Open with designer. So we've been thinking about how do we create, enlarge our community and open up our community to very much welcome PMS and product designers as part of this community? And it's been I think we've made some headway. We should blog about how we did this in the Superset project. But we opened up and we created some processes where we also do design review, we do product reviews that our PMS get together with other people in the community to kind of design beyond technical solutions.

32:54 Yeah, there's a ton of visualizations here. For people who haven't seen it yet, just visit the website and you'll see right away that it's primarily a visual tool, a tool for visualizing data, right?

33:04 Yes. It is a GUI tool in all ways. But I think what's interesting, too, is that it's a GUI tool first, right? It's a BI tool in the sense that a lot of what you do is point and click and drag and drop and hit a save button. But because we're open source, we're also pushing the APIs and SDKs very strongly. So it's probably the most platform-y BI tool around, because we're open from the ground up. So, say, visualization is a plugin system, so you can create your own visualizations and distribute them. The back end is in Python, and the coverage of the API is like 100%. Everything you can do in the GUI, you can do through code too.

33:47 Okay.

33:47 Yeah.

33:47 Right now the audience is asking, does it expose an API to your data?

33:52 Yes. And it should be in the docs. Right? So if you go to docs here, somewhere in there, there should be, maybe it's API at the bottom there. I don't know how well documented it is here. It should be. It looks like it's not rendering right at like 480 x 320 pixels.

34:10 Here we go. Outside.

34:11 There you go. So command minus, but yeah, exactly. So very good API coverage and well managed API behind the scenes.

34:21 Yeah. It looks like you even directly expose some of the OpenAPI/Swagger type of documentation, which you could maybe even auto-generate and stuff. Does it have a library, a Python package that talks to the API, anything along those lines? Or is it just HTTP?

34:36 I think it's OpenAPI and then Swagger. Right. I think I set up the first version of that a long time ago. But yes, it's self-documenting, in that if you put in the right decorators and the right docstrings, it documents itself. I think we use marshmallow too, and other things, to do schema definition of what can come in and out. I think that's self-documenting too, in terms of the input and expected output schemas, and the Python 3 type annotations get picked up properly, which is great. Beyond that, there's JavaScript stuff, there's the plugin system. I think if you were to Google "Superset plugin examples", you'll find all sorts of resources.
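To make the "is it just HTTP" question concrete, here is a minimal sketch of preparing calls to the REST API from Python with only the standard library. It assumes a local instance at http://localhost:8088 and the /api/v1/security/login and /api/v1/chart/ endpoints as listed in Superset's OpenAPI docs; the helper names and credentials are hypothetical, and nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import urllib.request

# Assumption: a local Superset instance; adjust to your own deployment.
BASE_URL = "http://localhost:8088"


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a login request that asks for an access token."""
    payload = {
        "username": username,
        "password": password,
        "provider": "db",  # database-backed authentication
        "refresh": True,
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/security/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_chart_list_request(access_token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request that lists charts."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/chart/",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

Against a running instance, the login response is JSON and, in this sketch's assumption, carries the token under an access_token key, which is then passed to build_chart_list_request.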

35:19 There you go. Oh, there's even a whole collection of them. Yeah, look at that.

35:22 Yes. Manage a different report.

35:24 I didn't Google it, I Kagi'd it. I don't know what the equivalent of Googling with Kagi is, but there you go.

35:28 Got it.

35:29 Yeah.

35:29 And then we have a good blog post on the Preset blog. If you go there, we have a post on how to get started and write your first Superset plugin. That's much more like JavaScript. It's 100% TypeScript/JavaScript front-end code in a plugin. It has to be, right?

35:46 You don't want to be in the back end trying to figure out how to lay things out, or use a Python library to do interactive visualization. It just doesn't work super well. So the plugin framework is all front-end code.

36:00 That makes sense.

36:02 There's more to the API than that. There are component libraries as part of Superset, and there's much more than the REST API for the back end. There are other SDKs and component libraries.

36:13 So the first thing I wanted to point out about the source code and the GitHub repo is just the popularity and all the contributors and whatnot there. The other is this: while it's not necessarily made for Python people the way Jupyter would be, but rather for BI users, it is open source in Python, built on Flask and tools like that. Right. And you talked about extensions on the back end and pieces along those lines. So maybe just talk about, for people who want to dig in from the Python side, what can they find?

36:44 Yeah, we could try to open the requirements folder, because at this point it's not even a requirements.txt file here.

36:51 For people looking for setting it up.

36:54 Okay, requirements.

36:56 Yeah.

36:57 Are you guys using pip-tools here? Nice.

37:00 I believe it's pip-tools and pip-compile.

37:03 Yeah, I love working that way. That's my way these days. It's great.

37:07 Yeah. Because we need to pin the versions. For people not familiar with it: you define a .in file that holds your version ranges, and then you pip-compile it, and that turns into frozen libraries with specific version numbers.

37:23 You can have everything pinned. We use so much stuff here, and we use stuff that uses a lot of stuff. So if you import just Flask, Flask itself is likely to import a bunch of things. So once you recurse through that dependency tree and expand it, it's a massive dependency tree on the Python side. It's also a massive dependency tree on the JavaScript side. It's a big application made out of hundreds of open source packages, because we kind of need it all to build this application. So dependency management is a little bit of a struggle when you manage such a big piece of software that's connected to everything.
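The "recurse through that dependency tree" idea can be sketched with the standard library's importlib.metadata, which reads the requirements that installed packages declare. This is only an illustration, not how pip-compile resolves versions: the function names are hypothetical, optional extras are skipped, and version ranges are ignored.

```python
import re
from importlib import metadata
from typing import Callable, Iterable, Optional, Set


def _installed_requires(name: str) -> Optional[Iterable[str]]:
    """Declared requirements of an installed distribution, or None if unknown."""
    try:
        return metadata.requires(name)
    except metadata.PackageNotFoundError:
        return None


def _normalize(name: str) -> str:
    """Normalize a project name (lowercase, runs of -, _, . become a single -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def transitive_dependencies(
    root: str,
    requires: Callable[[str], Optional[Iterable[str]]] = _installed_requires,
) -> Set[str]:
    """Collect every package reachable from `root` through declared requirements."""
    root_name = _normalize(root)
    visited = {root_name}
    stack = [root_name]
    while stack:
        current = stack.pop()
        for spec in requires(current) or []:
            # Skip requirements that only apply to optional extras.
            if "extra" in spec.partition(";")[2]:
                continue
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", spec.strip())
            if match is None:
                continue
            dep = _normalize(match.group(0))
            if dep not in visited:
                visited.add(dep)
                stack.append(dep)
    return visited - {root_name}
```

In an environment where Flask is installed, transitive_dependencies("flask") returns the names of everything Flask pulls in, which gives a feel for how quickly the tree grows.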

38:03 Yeah, there's no joke. There's a lot of dependencies here, but there are ways you can run it without worrying too much about that, right?

38:09 Yeah, I mean you can definitely just run the Docker container. You can pip install superset. There's a somewhat straightforward way to set it up locally and get things going. It's kind of interesting, building an application nowadays, if you think about the dependency tree that goes behind any kind of solution or application that's not just a library. A library should have a very minimal dependency tree. It should be self-contained and focused, I think. But here, to build such a large-scale application, we just need to have a lot of dependencies, and then these dependencies have a fair amount of dependencies of their own, I'm surprised to see. Now we're looking at Click, for people not necessarily watching, and just Click itself probably adds a lot of its own sub-packages now too.

38:56 Exactly. And there are a lot of things that want Click to be in your dependencies here.

39:02 We'll talk about running it in just a minute. And there are a lot of architectural layers at play here. You've got Superset, but you've also got Celery, you've got Redis, you've got some database layers. There's a lot of technology that people would know, working as a group, that luckily Docker just takes care of for us. More like Docker Compose.

39:21 Yeah. Docker compose.

39:23 Docker Compose and a Helm chart. I believe we have a Helm chart too. It was always important for me to keep it such that you can just pip install superset and run a few commands and get it running locally. So you don't need to have Redis out of the box and Celery out of the box. It's similar to Airflow in that way, where it's runnable as a very self-contained thing at first. But then if you want to run any modern web app that does serious work and solves some real problem, it's likely that you need to have web servers and an application server, because you need to have the whole front-end stack, right, like something like webpack. And you probably have front-end infrastructure just for how you build your front end. It gets pretty complicated quickly. Then you probably need async workers. So all of a sudden you need something like Celery and something like Redis as a message queue to talk to the async workers. Then you probably want to start caching some things, so you need a caching back end. And then, in open source, you probably need to support different databases. Some people might want to use MySQL as a back end, or Postgres, or some other things. So then you need to support these things, ideally through an abstraction layer. So it gets complicated fairly quickly.
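The layers listed here (metadata database, cache, async workers with a message queue) typically get wired together in a Python config file. Below is a hedged sketch of what a superset_config.py might look like: the setting names follow Superset's and Flask-Caching's documented conventions as best I recall them, and every host, port, and credential is a placeholder, so check the current configuration docs before relying on any of it.

```python
# superset_config.py (sketch): all values are placeholders, not recommendations.

# Metadata database that stores dashboards, charts, and saved queries.
SQLALCHEMY_DATABASE_URI = "postgresql://superset:superset@localhost:5432/superset"

# Assumption: one local Redis server shared by the cache and the task queue.
REDIS_URL = "redis://localhost:6379"

# Caching back end (Flask-Caching style settings).
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,  # seconds
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_URL": f"{REDIS_URL}/1",
}


class CeleryConfig:
    """Async workers: Redis acts as the message broker and result store."""

    broker_url = f"{REDIS_URL}/0"
    result_backend = f"{REDIS_URL}/0"


CELERY_CONFIG = CeleryConfig
```

The point of the sketch is the shape: each extra capability adds one more service to run and one more block of configuration, which is the complexity being described.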

40:42 Yeah, it was really cool though, that you can just pip install it and there's a more lightweight version without going through all the details. Let's talk about getting going, get it running, exploring it a bit and hosting it. But before we do, I said 15 minutes ago, two quick comments before we talk about databases. Let's just talk about the database thing real quick here. Sure. Over here near the bottom, obviously, where your data comes from. We opened this, I pointed out that Excel is bad at getting data from different data sources. People have operational data, they have data warehouses, they have data lakes, whatever you call them, things like this, right. So there's a lot of different places people are putting data. Maybe just touch a bit on the database integration here.

41:25 Yeah, and I think in the context of this Python podcast too: for us, we use SQLAlchemy very heavily. SQLAlchemy is a SQL toolkit first, and then an ORM built on top of it, and probably much more than that. First, I would say, there's the metadata database for Superset, right? In Superset, when you save a dashboard, save a visualization, or save queries, that goes to the metadata database. And we tend to recommend Postgres or MySQL as the back end for the app, just to keep the state of the app somewhere in a proper relational database.

42:02 And then we connect to all these databases to do analytics on them. Right, and that's what we're looking at here, the supported databases in the sense of: what can we build charts off of, and what can we enable exploration around? And this is powered by SQLAlchemy. So that means anything that has a DB-API driver and a SQLAlchemy dialect can be supported. And maybe that's an opportunity to talk a little bit more about database abstraction in the Python world, since we have a Python-centric audience. The DB-API spec is one of the PEPs out there. I forgot the number of the DB-API PEP, but it's just a common interface for all the databases in Python. So that's called DB-API, and then SQLAlchemy, the SQL toolkit, knows how to speak certain dialects and builds an ORM on top of things. And it's PEP 249, so it came a little bit later in the story of Python. I don't know, what's the latest PEP number?

43:01 They're pretty high these days, although they seem to be organized by concept. Let's see, we've got some of the.

43:08 8000 here because there's like some encoding and the numbers, there's some kind of grouping.

43:15 Yeah, I'm not sure exactly what it.

43:16 Is, but 3000 are for a specific thing in the 8000, I think.

43:20 So.

43:22 Anyhow, what you need in order for Superset to connect to any flavor of database is a viable DB-API driver. And once that's built, SQLAlchemy dialects are fairly easy. We've written a bunch of DB-API drivers and SQLAlchemy dialects in the past. They're not that hard to implement. So that means pretty much anything that speaks SQL out there, we can talk to, essentially.
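For listeners who have not used DB-API directly, here is a small sketch using sqlite3, the PEP 249 driver that ships with Python. The table and data are made up for illustration; the connect, cursor, execute, and fetchall calls are the common interface being described, and a SQLAlchemy dialect sits on top of a driver like this.

```python
import sqlite3  # standard-library driver implementing the PEP 249 (DB-API 2.0) interface


def top_products(conn, limit):
    """Return (product, total) rows ordered by total sales, using plain DB-API calls."""
    cur = conn.cursor()
    # The "?" placeholder is sqlite3's parameter style; other drivers may use a different one.
    cur.execute(
        "SELECT product, SUM(amount) AS total FROM sales "
        "GROUP BY product ORDER BY total DESC LIMIT ?",
        (limit,),
    )
    rows = cur.fetchall()
    cur.close()
    return rows


# Hypothetical in-memory data set, just to have something to query.
conn = sqlite3.connect(":memory:")
setup = conn.cursor()
setup.execute("CREATE TABLE sales (product TEXT, amount REAL)")
setup.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("bike", 120.0), ("car", 900.0), ("bike", 80.0)],
)
conn.commit()
setup.close()

print(top_products(conn, 2))  # [('car', 900.0), ('bike', 200.0)]
```

SQLAlchemy addresses a database with a URL of the form dialect+driver://user:password@host/database, which is also the kind of connection string a Superset database connection asks for.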

43:47 Yeah, so we've got the standard MySQL and Postgres. Microsoft SQL Server is probably a big one in the BI space, because a lot of enterprises are backed with that. But it also has more unique ones like Snowflake and Druid and Google BigQuery and Firebird, and a lot of different places that people can talk to.

44:07 Yes, that's what we see in the Superset community and with Preset customers. So I started a company three years ago that's essentially commercializing Apache Superset and offering a managed service so you don't have to run it. So we're on call, you're not on call. There's a freemium tier too. So if you want to try Superset, you can pip install Superset and kind of struggle with Docker and all this stuff, or you can try it directly at Preset, so you can just start for free and see if it works for you. Then you can postpone the decision: do I want to run it on my own, or do I want to use a managed service and pay as I go instead? But yeah, in terms of what our customers use, we see a lot of Snowflake, a lot of BigQuery. These cloud data warehouses are kind of a no-brainer nowadays. If you have a true analytics workload, just put all your data in something like BigQuery. And then there's still some Redshift, and there are still all sorts of database engines, for whatever historical reasons people have, or because they have constraints on them to run something on premises or in their own cloud.

45:07 Absolutely. So because it's open source, they can go and host it to their heart's content, or they can go SaaS style and work with you all.

45:15 That's right. So for us, we do offer the managed service as freemium with pay-as-you-go pricing, $20 per user per month. It's pretty straightforward and easy to grow into. Then we have something called Managed Private Cloud, if you do want to run a managed service inside your cloud because you don't want your data to leave. Maybe your data is not already in a cloud data warehouse, maybe it's inside your VPC and you want to keep it there. So we offer a service that's still a managed service with a centralized control plane, but it runs in your cloud. So we do offer this, and then you're always free to run it on your own, right?

45:52 And there the question is, you have to think through the math of running a piece of open source software on your own versus paying a vendor, like running Kafka versus buying Confluent, or running Spark versus Databricks. It comes down to whether you're interested in the bells and whistles that the vendor offers, the constraints you have around quality of service, and total cost of ownership. The reality is that running something like Superset at scale in your organization, if you want the latest, greatest, secure, patched-up version of it, is pretty expensive. The total cost of ownership of open source is fairly high. So often the vendors can do it at a better price and better quality.

46:39 To patch Celery and redis memcache and databases and your servers hosting them and keeping them all going, it's nontrivial. And then there's disaster recovery and failure and as soon as you are thinking well, maybe we should hire somebody to do this then all of a sudden a paid service starts to sound pretty appealing, right?

46:57 Oh, yeah.

46:59 When you think about what it really takes to manage a piece of software or a collection of pieces of software like Superset and Kafka and Airflow and all these things, and you want it to be the state-of-the-art, latest, greatest version and secure and compliant, if compliance is a concern, then generally, at least for smaller organizations, it makes sense, in the sense that the best people to run the software reliably are the people writing the software. Even at Preset, we have a multi-tenant version of Superset that we run, and you can't really have that if you run it on your own. So that means how much we pay per cycle in terms of infrastructure cost is going to be much cheaper than what you can get running it on your own.

47:44 Sure. Not every user is asking an active bi question all the time. So you have extra resources to share.

47:51 And then you provision for peak. It's a little bit the same with infrastructure, right? If you run a database server on your own, you have to provision for peak access, whereas with a cloud service, the cloud vendor has to provision for the total peak across all the customers. There are tons of economies of scale there, and we pass that on to our customers. Cool.
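The provisioning argument is simple arithmetic, and a toy example makes it visible. The numbers below are entirely made up: three tenants whose busy hours do not coincide. Self-hosting means each tenant provisions for its own peak, while a shared service only needs to cover the peak of the combined load.

```python
# Hypothetical hourly demand (workers needed) for three tenants over six hours.
demand = {
    "tenant_a": [2, 8, 3, 1, 1, 2],
    "tenant_b": [1, 2, 7, 2, 1, 1],
    "tenant_c": [1, 1, 2, 9, 2, 1],
}


def sum_of_peaks(demand_by_tenant):
    """Capacity needed if every tenant provisions separately for its own peak."""
    return sum(max(series) for series in demand_by_tenant.values())


def peak_of_sum(demand_by_tenant):
    """Capacity needed if one shared service covers the combined hourly load."""
    return max(sum(hour) for hour in zip(*demand_by_tenant.values()))


print(sum_of_peaks(demand))  # 24
print(peak_of_sum(demand))   # 12
```

In this made-up case the shared service needs half the capacity, and the gap only closes when every tenant peaks at the same time.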

48:12 All right, well, let's talk about getting started and the first-touch type of experience before we run out of time. Here you have a nice doc that says Installing and Using Superset. And I went for the easy way. On my Mac, I have Docker for Mac already set up, which means I have Docker and Docker Compose. So basically that's: clone the Superset repo, go in there, and then just run docker compose pull and then docker compose up on a certain definition file, a configuration file.

48:42 And then pray. There should be a comment that says pray and hope for the best. Fix yourself a coffee.

48:49 One thing that's really interesting, and I'm sure a lot of other open source leaders can relate to this, is that no one agrees on the best way to run something for production use cases, for sandbox use cases, and even in developer mode. Right? So for me, I hate Docker and Docker Compose because I don't have enough control, and I tend to just set up my own environment. I run tmux and I do my own builds. I prefer having more control instead of trying to understand the abstraction layer that Docker and Docker Compose impose. So there's alternative documentation somewhere, I think, and there's a big CONTRIBUTING.md that's more geared towards people asking: how do I run my setup if I want to actually develop on the tool? So somewhere in the Superset repo, there's a CONTRIBUTING.md file that says if you want to develop with Docker and Docker Compose, you do this. If you want to develop using a different, more raw level.

49:52 Environment and go, yeah, that's it.

49:55 And some people use Pyenv.

49:56 M py?

49:57 Pyenv. Some people prefer using virtualenv more directly. So it's really hard to come up with one way. We've got a prescribed way to do it with good documentation, but then half of the people are going to go their own way anyway. Docker Compose here is like that too: a lot of people prefer Helm charts for Kubernetes. So then we have Helm charts, we have the Docker Compose construct, and then we have other documentation on how to do it. It's been really difficult to have a very clear prescribed way to do it and then maintain the different ways individually and keep them all working.

50:30 Sure. So as much as I'm not a huge fan of developing code in Docker, I do think this is a nice way to get a low-effort first-touch experience. I just want to run it and log into the web app and see how it feels and play with it. And you get all the various moving parts: you get Celery and Redis and whatnot, which is pretty cool.

50:50 That's also kind of a map of how to run it on your own. Right. So maybe you're like, oh, I don't like Docker Compose, I prefer my own version of something else. But I'm going to look at the Docker Compose file and see what it's doing. That recipe is still very useful for people in different ways.

51:05 Sure. Or just knowing, look, there has to be or maybe it's good if there's a redis server. Okay, well, I have redis. Let me just set it up to connect to that one, for example.

51:14 Right, that's it. Yes. I'm just going to change that part of the recipe because I already have that ingredient run.

51:20 Yeah, exactly.

51:21 When you run the Docker container, it says wait a moment, then it says everything works. Go over to localhost:8088 and log in. The super secure default username and password is admin/admin. So you're going to change that.

51:36 But it's an easy way to get in there. And what you get is you get some example dashboards and some example charts. Right. You want to maybe tell us about the things we find when we get here so people know how to go explore when they get started.

51:47 Right. And you probably want to zoom out a little bit, because the rendering here is going to look a little bit better. It's kind of interesting too, because out of the box you don't get the thumbnail back end. So you don't get the pretty thumbnails that we have at Preset, or that you can set up if you spend a little more time setting up your Celery back end and getting all the thumbnails to compute in the back end. So what you get out of the box is a set of very small data sets, and charts and dashboards built on top of them, that you can navigate and play with. If you really want to get value and get a real POC, you probably want to connect to your real data warehouse, probably not from your local machine, or maybe you have a copy of your data warehouse or some data you want to play with, and you can connect here. The data sets come from your database connections. So somewhere in the upper right you have settings.

52:38 I see.

52:38 Right. So database connections are here, and you can create a new database connection in the upper right. If you click, you'll see a screen to connect to your database. You pick the database you want to connect to and enter your connection string. And then you can start playing with your own data. If you don't want to play with your own data, you can play with the data we provide. It's fairly limited.

52:57 It's been a lot of cycles work out adding the latest, most fun data sets to play with, the best dashboard examples that allows you to get started and get a sense for what superset can do.

53:07 Yeah, so we have a couple of major building blocks. We have dashboards, we have charts, we have data sets, and we have the SQL IDE thing that'd be right here. We'll pull up a sales dashboard. Nothing screams BI more than a sales dashboard.

53:23 That's right. We had to have an example there, but it loads like a few bar charts and it's not like the best design dashboard. It shows that we support, but it looks good.

53:33 There's some beautiful stuff here.

53:35 Yeah, you can do so much more. I feel like our examples are dated. You can do so much better with Superset if you take a little bit more time. We should work on our examples as a community to have really compelling data sets to play with, but it gives a good overview. And here, for people who can't see the visuals, if you click on the dots for any chart and click on edit chart, that will send you to our explorer. So we were in the dashboard, looking at a specific chart, and we just moved to our chart editor. That's very much your exploration session. So here you can click on a metric, you can drag and drop different.

54:10 Metrics, change my sum to max and see what happens. There we go, look at that. Biggest sales.

54:15 Yeah. So you can update the price.

54:17 If you were to click on view all charts, I don't know if you see that somewhere at the top middle. There are all sorts of visualizations that are supported. Here we've got a big list of all the visualization plugins that ship with Superset today. So all your common charts, but also some geospatial stuff and some more advanced and complicated charts.

54:37 Nice.

54:41 Here maybe just to do a little bit of the flow of the demo. I apologize for people not watching and just look at it. Hit cancel and then click on the upper right.

54:52 Not settings, but the dot, dot, dot here. So you can say view the query or run in SQL lab will allow you to go a step deeper where now the SQL that happened to be running behind this chart, now you can alter and push your own analysis.

55:09 Yeah, cool.

55:10 So we went from a dashboard to kind of your exploration session and into a SQL IDE. You can go deeper here and just like run. Your own analysis. Big playground for data. Yeah.

55:20 And you can pull up your table or your SQLAlchemy model. Maybe it is. I'm not sure.

55:26 You'd call it a schema navigator. In this case, it's very much like you're navigating your database and its objects. Right. So you can navigate your schemas and see the tables and the views. And then there's good autocomplete. That's very much an IDE. If you start typing, it will autocomplete the table names and then the column names.

55:45 Yeah, super cool. You also get a query history, which seems nice. If you're playing around and you think, five versions ago of typing this in, I had the picture I wanted, you know where to go, right?

55:55 Yeah, totally. I think that for the people who speak SQL, they can go deeper and run more complex analysis.

56:01 Sure. Yeah. Very neat. All right, well, maybe let's close it out with a quick conversation on this, and then I know we're out of time. I picked on Excel for having very poor source control options.

56:13 What's the story here about versioning and sharing and collaboration?

56:17 There's this thing in BI called headless BI. It's the ability to manage BI assets as code. At Preset, we have built a CLI on top of the Superset API that allows you to import and export objects between the BI tool and the file system. So it's really easy to say, I want to store this dashboard, or this set of dashboards, or this set of objects, and manage them as code. So there's a CLI that allows you to push and pull between the BI tool, Superset, and Git and GitHub.

56:47 All right, let me see if I got this right. So I might create a folder, init that as a GitHub repo, or a Git repo rather, then I would export all my stuff, commit that, and then I would just write over it and keep committing. And those commits would track my changes. And if I ever need to, I can reinstate or rehydrate that thing out of the file set into Superset.

57:06 Yeah. So there's more to it than that, and I'm going to try to explain it well. Once you hit the eject button, so to speak, which would be exporting the BI assets as code, you get a collection of YAML files that represent your dashboards, your charts, your data sets, and your database connection definitions. Right. And that represents the code when you push things. Well, first, you can template things. It's YAML, so you can use Jinja, which is a great Python package, to template the files, so you can inject some templates into your BI tool. If you wanted to, say, broadcast an object to multiple Superset instances, or do permutations or variations on a theme, you can do that through templating.
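The "variations on a theme" idea can be sketched without any third-party package. The speaker describes Jinja templating of exported YAML; the sketch below swaps in the standard library's string.Template as a stand-in so it has no dependencies, and the YAML field names and file names are purely illustrative, not Superset's actual export schema.

```python
from string import Template

# Hypothetical dashboard definition; field names are illustrative only.
DASHBOARD_TEMPLATE = Template(
    "dashboard_title: Sales ($region)\n"
    "description: Revenue for the $region region\n"
    "default_filter: region = '$region'\n"
)


def render_variants(template, regions):
    """Render one YAML document per region, keyed by a made-up output file name."""
    return {
        f"sales_{region.lower()}.yaml": template.substitute(region=region)
        for region in regions
    }


for name, body in render_variants(DASHBOARD_TEMPLATE, ["EMEA", "APAC"]).items():
    print(name)
    print(body)
```

With Jinja the placeholders would be written as {{ region }} and could include loops and conditionals, but the workflow is the same: one template in source control, many rendered assets pushed to one or more instances.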

57:48 Okay.

57:49 And that's through the Preset CLI. As you push, there's a flag, I believe it's on by default, that will prevent people from updating the object in the GUI, saying this object is managed as code. The source code lives here for reference, so you can click and go see the code on GitHub, but you can't save it because it's essentially read-only and managed by source control. I think in the future we're looking to have a companion for each Superset workspace on Preset to be able to have the full history over time of what has changed, so you can go and restore assets as they were a while ago. There's always someone who's going to delete something, or delete the dashboard, or change it in a way that's destructive, and people want to roll back. So it's possible to do that through the CLI.

58:36 It makes a lot of sense to have some kind of source control story, but at the same time, because it's kind of a SaaS thing, either self-hosted, your little baby SaaS, or at Preset, it's kind of a shared asset that doesn't need to be synced and pushed and pulled and cloned as much to allow people to work on it, right?

58:54 Yeah, there are different things. I think the Google Docs approach, which is to keep a GUI revision history and be able to see who changed what when, is also valuable. And for sure, we're going to see that in the future of Superset: being able to say, I want to look at the history of the dashboard from a GUI perspective. So that's something that has been requested and that we'll have in the future. So call it your Google Docs kind of GUI there.

59:21 Managing assets as code is a different use case. Right. If you have an embedded dashboard, if you publish a certain dashboard as part of your application, that calls for more rigor. Like, I want to have it in source control, I want a version, I want.

59:34 It's kind of like having a DevOps team versus someone keeps the server running. Right. There's different levels of maturity around different things. And companies, yeah, people want to have.

59:45 Flexibility too. Infrastructure as code, for instance, is great, but that doesn't mean that everything should start there. If you and I want to go and create an AWS account and spin up some resources, maybe we don't need to start with a Terraform script, and you can generate the code later. Maybe you can say, hey, AWS, can I generate the Terraform code for all the stuff that I've done in the past three days, so that GUI to code can flow. Right, and then you have the other way, code to GUI. But yeah, it's important for this sort of tool, managing critical assets, to have these workflows, GUI to code and code to GUI, and to have the flexibility of both worlds as you go up in your maturity lifecycle and your need for rigor.

01:00:29 Makes a lot of sense to me. All right, well, we are well out of time here, Max, so congrats again on creating such a cool project. And I guess with Airflow as well. Not even the first one. So very popular, and it seems like it's definitely taken off. It's great.

01:00:43 Yeah, it's been super exciting, way beyond my expectation. And I think really often the original creators get too much kind of recognition and reward compared to the rest of the community. Right. So what it takes for something like Superset to exist is it takes 800 people contributing and it takes an entire slack community. And really often we give a lot of credit to the person who created the thing. But you should look at, like, how bad Superset was.

01:01:09 When I was the first person, the only person working on it. And it really took off when we saw a set of really good contributors coming in and pushing it to the next level. Sure.

01:01:20 I definitely got some people excited about it in the comments from the audience: "This project has me stupid excited," which is lovely.

01:01:27 Love to see that excitement. Like, a lot of the validation comes through usage and value and people getting excited contributing and more, like, just using. Here, we'd love to see people just say, hey, we're using this.

01:01:38 We're getting tons of views in the wild. Put your stamp on it. Right.

01:01:43 I get, like, the communistic together. We can build better things than vendors on their own can. It's just like open source is a better way to not only to collaborate and build software, it's a better way to discover software, adapt software, and just, like, get to solutions.

01:01:59 Very cool. All right, well, before you get out of here, final two quick questions. Write some code. What editor do you use?

01:02:05 I'm still a Vim person. I feel like I need to modernize. I'm not like, "Vim is better than all the IDEs." It's just muscle memory at this point. I'm just very command line and very into Vim and my specific tune-up for Vim. And it's not because I think it's better, it's just habit.

01:02:26 Yeah. Cool. There's a funny guy, I think he's German, who does this YouTube series making fun of different programming languages and communities. And in one, this guy talks about how he fought in the Vim-Emacs wars. Yeah, it's pretty good.

01:02:40 All right, so you're on the same side. Fantastic.

01:02:43 But I'm not on a side, though. That's what I use. But at the same time, I encourage people to find something that works for them. Then I talk about the power of muscle memory. Right. Once you really know your tool set and the shortcuts, it's like your computer becomes an extension of your brain and your muscles, and there's beauty in that. So it's good to have a tool that enables you, and to have that self-training. Like, I'm going to train my muscle memory so I can do the things that I do all the time without thinking.

01:03:09 Right? You think, I want this to happen and then it happens. And you don't have to be conscious of it happening right in your editor.

01:03:16 That's the way clicking around. I do the sequence of like six clicks. This thing's punished up all the time. Why can't you just do, like, command Shift R?

01:03:25 Exactly.

01:03:27 Just happens magically.

01:03:29 Yeah, absolutely. And then a notable PyPI package or other library you've found, like, this is awesome, people should know about it?

01:03:34 Yeah. So I wanted to talk about that.

01:03:38 We use SQL very heavily. As you saw, if you're a data practitioner, you write a lot of SQL. I spend quite a bit of time writing tons of SQL in Airflow, a little bit in dbt too. More recently, there's this SQL linter that came out. It's called SQLFluff. It's been around for a little while. So people should check out, on PyPI, sqlfluff. There it is. And it's a very configurable SQL linter and fixer. So, you know, we all love PEP 8 and things like PEP 8 that are very deterministic and opinionated. I think we're not there in the SQL world. People have not agreed on a PEP 8 equivalent for SQL. This is, like, highly configurable.

01:04:18 So you can agree with your team on the set of, like, linting rules for your repo, and then it can fix a lot of stuff for you. So I think it helps. We're going to manage mountains of SQL. I don't like SQL that much, but it seems like this generation of data teams is going to rely a lot on a lot of SQL. Then having a linter helps, making that a little bit more bearable.
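
To make the "linter and fixer" idea concrete, here is a minimal sketch of calling SQLFluff from Python. It assumes the `sqlfluff` package is installed and uses its simple API functions `lint` and `fix`. The helper names `lint_and_fix` and `summarize` are hypothetical, and the `code` and `description` keys are assumptions about the violation dictionaries the installed version returns. Teams usually keep the agreed rule set in a `.sqlfluff` configuration file in the repo; this sketch only picks a dialect.

```python
def lint_and_fix(sql: str, dialect: str = "ansi"):
    """Return (violations, fixed_sql) for a SQL string (hypothetical helper)."""
    # Imported lazily so this module loads even where sqlfluff is missing
    # (install with: pip install sqlfluff).
    import sqlfluff

    # lint() returns a list of dicts, one per rule violation found.
    violations = sqlfluff.lint(sql, dialect=dialect)
    # fix() returns the SQL string with auto-fixable violations corrected.
    fixed_sql = sqlfluff.fix(sql, dialect=dialect)
    return violations, fixed_sql


def summarize(violations):
    """Format each violation as 'CODE: description' for quick review."""
    # Assumes each violation dict carries 'code' and 'description' keys.
    return [f"{v['code']}: {v['description']}" for v in violations]
```

Calling `lint_and_fix("SELECT a  from tbl\n")` and passing the first element of the result to `summarize` would list whichever rules the query breaks under the default configuration; the second element is the rewritten query.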

01:04:40 Excellent. SQLFluff. Very cool. All right, final call to action. People are excited about Superset. I want to get started. I want to play with it. What do you tell them?

01:04:47 I mean, come to the GitHub repo. Check out superset.apache.org. We haven't talked about the Apache Software Foundation, but we're supported by the Apache Software Foundation in many ways. And then you should be able to find tons of resources. It is a little bit harder to get started than other things because it's such a broad piece of software that's very layered. We have a Slack.

01:05:12 I think there is a type of issue that's probably called like starter issues. I forgot the exact name of it.

01:05:19 And then we have a Slack to get involved. And I believe in Slack there's a way to introduce yourself, and there's a bunch of channels that are more like, how do I get started? How do I contribute? So there should be outlets for anyone who wants to get involved to get connected. If you fail at doing that, you can probably reach out to me directly on Twitter or elsewhere, and I'll give you some...

01:05:39 There's a few people committing to the project, so there's got to be a lot out there.

01:05:44 That's like a thing, though. When the project gets bigger and there's more contributors, that doesn't mean it's necessarily more welcoming and easier to get into. There's more people, but if you don't have a BDFL, sometimes it's a little bit harder to talk to a single person and get the exact pointer that you need. So I would say just get on the Slack, talk to a few people, think about how you want to get involved, too, and be clear about your intentions, and then we'll be able to connect you with the right place, the right person.

01:06:14 Fantastic. All right, Max, thank you for being here. Thank you for creating this cool project. Looks like tons of people are getting value from it.

01:06:20 Yeah, thank you for having me on the show, too. I'm going to go and look back at the episodes. I'm always looking for good content and keeping in touch with the Python community, so I'm going to go and dig in your archives there.

01:06:31 Right on.

01:06:31 And listen to a bunch of episodes.

01:06:33 Seven years, almost every single week. So there's a bunch of episodes back there. So yeah.

01:06:38 Thanks so much.

01:06:39 Yes. See you later.

01:06:40 Thank you. Take care.

01:06:41 Bye.

01:06:42 This has been another episode of Talk Python to Me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show.

01:06:50 Join Sentry at their conference.

01:06:53 Sort the Madness, the conference for every developer to join as they investigate the movement and trends for better and more reliable developer experiences. Save your seat now. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

01:07:38 We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
