
#343: Do Excel things, get notebook Python code with Mito Transcript

Recorded on Monday, Nov 8, 2021.

02:07 Yes.

02:08 You built some really neat tools to help people get started and get up to speed and just be more efficient than just solely writing Python code, but not excluding that either with your product Mito. So that's super fun. And we're going to talk about that. But before we do, let's just go around. And how did you get interested in data science and working on this Python tool? Aaron, want to go first?

02:29 Yeah.

02:29 So a little background. Jake and I, you can't tell just by the first names, but we are twin brothers, so we've been working on projects together for a long time. Nate has been our best friend since middle school. I think I didn't get invited to his 8th grade birthday, so maybe starting in high school, and then he also went to college with us.

02:47 Fantastic.

02:48 Yeah. We got our first taste of data science at Penn. We all studied a mix of computer science and business, and in the business classes you do a lot of Excel-based, mostly, unfortunately, Excel-based data analytics work in stat classes and finance classes and stuff like that. So I think that's really where we got our first taste of data analytics or data science work. And then we've each had some experiences through internships and jobs that we've had over the past few years in the data science space as well. But really, I think it all goes back to the coursework that we did at Penn.

03:22 That's great that you all are able to stay together, obviously, brothers, but continue to work together on this project. Business schools, a whole business program just run on Excel, don't they?

03:33 It's kind of amazing, the contrast, because Aaron and I specifically both got degrees in computer science and in the business school. And so there was this transition where you'd be hanging out in class in the engineering school and you'd be writing code, and then you would, like, walk up campus into the business school, and it'd be like returning to the Dark Ages in some ways, right?

03:51 Yeah.

03:51 And what's really cool, I think, about Excel generally is that it let our peers in school, and us as well, complete some really amazing projects that we might not have been able to do with code, because our skills were still developing at that point. So it's this really beginner-friendly, amazingly powerful tool for what it is. But then we would go back down to the engineering school and be like, oh, my God, there's all this tooling here. We could have superpowers, but we don't know how to use this stuff, right?

04:15 Yeah.

04:15 There is very direct contrast that we witnessed where there's very cool stuff happening all over the place. But the tooling differential is pretty dramatic between business school and computer science world. And I think that's kind of what initially made us interested in this space.

04:27 Sure.

04:28 You don't want to underestimate the power of just firing up Excel, selecting a section, throwing up a graph or two. And that's incredible. And all of the functions and stuff. But if you think of bad programming practices, one of the worst ones has got to be like, do these three things, then go to over here and do a couple of things, then go over there and get that thing, and then go back. We banished this type of programming from regular programming long ago. And Excel is like that, without even being able to see where the goto points to. It's really not very predictable, right?

05:01 Yeah. It's really amazing.

05:03 It's crazy, I think.

05:04 Because once you start thinking about spreadsheets, you quickly realize there's a couple of crazy things about spreadsheets that people don't really acknowledge. We talk about them internally because we're spreadsheet nerds at this point, and we like hyping up spreadsheets and stuff. But the original killer applications of computers were spreadsheets, right? Oh, yeah, and more than that, spreadsheets are the most successful programming environments in the world. Hundreds of millions of people can program in Excel; the next leading programming language has 10 to 20 million. You know what I mean? So there's really an order of magnitude difference in how well adopted these things are. But you're totally right. The number of footguns in Excel, and the amount of, like, 50-megabyte insane models that we've seen, where there are, like, 75 tabs and they're all linked to each other in some crazy circular way.

05:45 It's really slow.

05:46 I don't know why. Not sure what's happening.

05:49 But, yeah, we probably have built a couple of those ourselves, probably even worse ones.

05:54 We've seen that. And those are some of the problems of spreadsheets that we kind of initially we were like, maybe these are some things that we can try to help solve.

06:02 Yeah, absolutely. And if you would go from business school back to the computer science side, I guess, specifically to the business school, did your business peers look at you like, oh, these are the guys that have the power to make the thing happen. They can help us build the thing that we can't quite automate or can't quite pull off.

06:20 I think trying to work in a group in an Excel spreadsheet is always a miserable experience. I hope you haven't had to do it, but it's like, you have a Google Drive, and then you end up uploading new versions to the Google Drive, and then it's usually a Google Drive paired with a text message group chat. And it's like, I just finished this sheet. Why don't you go and download it again or something? Yeah, exactly. Make sure you're not doing it at the same time I'm doing it. I don't think Nate and I solved those collaboration problems over at Penn, but I think it was those experiences, and our thinking of programming as a superpower, that made us want to start doing this. I don't know if it was always recognized by everybody else, though.

07:01 I think the big thing was maybe we weren't the best at attending class. It was hard to be a good group member in the first place, but one day we'll be helpful.

07:08 Yeah. Group work was always hard for me as well. Jake, how about you? How did you get into this whole project here?

07:13 Well, mostly by bloodline. I'm like some sort of Covenant. I think comes through that. But no, we started working really in this, like, Excel collaboration space. My background: I worked at a software company during college, sort of on the project management side of some data science projects. So I had that quite the business side of it yet, but at least one separate you from, like, the coding product side of it. We started working with collaboration issues, and we built a few other products before and had some modicum of success there. But then we took a step back at one point and looked at, like, what are the biggest problems with spreadsheets? It's the speed. It's the inability to handle large data sizes. And it's the lack of repeatability, being able to do repeatable processes in an efficient way. And the place we found that does all of those really well is Python, so the idea was to stick a spreadsheet interface on top of Python. And that was sort of like a sentence we had written down. We were like, okay, now we need to backtrack and figure out what that means. How do we do that? And we opened a few thousand cans of worms doing so.

08:20 Yeah, I'm sure. But it's a super neat idea. There's a lot of things that you can automate with Python, but with what you guys built, we'll get to it in a little bit. But what you all built lets you interact in this spreadsheet way, and then it writes the Python code. It doesn't just allow you to make changes. And then you got to stay in your tool, right. You use the tool to write code that otherwise might be a little bit of a stretch for you.

08:45 Yeah.

08:45 I think I can talk more high level, and they can talk about exactly why and how we do that. But just from the business side, a lot of other tools will try to abstract Python away. So they'll allow you to do the types of workflows that you would do in Python, but in a GUI, visual environment. We are much more tethered to Python, or try to be more tethered to Python and the notebook. It's really important: staying in your Python environment, you're not at a disadvantage because you don't have the code.

09:12 The code is right there.

09:13 It's being generated real time. And that's important for yourself. If you're learning Python, if you're trying to use the code or if it's a communication layer, you want that code because you want to send that to a developer who's working in Python as well. So it's really important for us to be tethered there.

09:27 Right? Yeah. It probably allows you to bring more people into the actual project than before.

09:32 So yeah. You're much less siloed by being in the environment.

09:35 Yes. Absolutely.

09:36 There's that kind of mentality that when you're building tools for beginners, or people that maybe don't know the professional software, you make it really point and click and kind of hide a lot of the complexity. I think that's something that we've experienced with tools that we've used. For example, we use Stripe, and Stripe creates a bunch of dashboards for you. But the problem is we have no idea what the nitty gritty details are of how those numbers are calculated. And so we have all these metrics, and we have a really poor understanding of, like, what is this actually telling us? And so I think something interesting that we definitely try to do is give people that are maybe less familiar with writing the syntax themselves the ability to, as you said, point and click and use the spreadsheet environment, and then generate the code. And then if you ever have questions about, oh, what is this pivot table that I created? What does it really mean? Then you can look at the generated code and see exactly what's going on. And I think that kind of like understanding where users need help and where users want as professional as possible.

10:34 Nitty gritty details is a stratification that we've thought about. And I think we have a somewhat unique approach when it comes to these, like, low-code, no-code tools. Yeah.

10:43 Absolutely. So I kind of want to set the stage by talking about some of the different things people are doing with notebooks, because notebooks have really taken over in the data science space, for good reason. I think we had IPython notebooks, and we had Jupyter. We had JupyterLab, which is doing a little bit more than just Jupyter, and people really love them, I think. And while JupyterLab is great, I think there's a bunch of creative things going on, trying to extend it and use it in different ways. And I feel like Mito falls in there. So I wanted to throw out a couple and just see if you all have heard of these, and if so, get your thoughts on them. One of them is this thing called Jut, J-U-T. Maybe something like that. And what it allows you to do is view notebooks in, not the browser, but the terminal. Have you guys seen this?

11:36 I haven't seen it, but I do have a lot of sympathy for the unknown pronunciation of Jut or Jute, because we get Mito and mido a lot. So my heart goes out to them too, for sure.

11:46 Yeah, I'm sure I'm messing it up, but yeah.

11:49 So here's a way to say, well, these notebooks are so popular, let's see if we can show them in the terminal. Like if I'm SSH'd into a remote machine and it has .ipynb files, and I just want to see them, how do I look at them? So you just say Jut, I'm going to go with Jut. You say jut and then the file name, and then boom, it shows it right there, using Rich, I think.
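
[Editor's sketch] To make the idea of viewing a notebook in the terminal concrete: an .ipynb file is JSON with a list of cells, each carrying a cell type and its source. The snippet below is a minimal illustration of reading that structure and printing it as plain text. It is not how Jut itself is implemented; the function name `render_notebook` and the sample notebook are made up for this example.

```python
import json


def render_notebook(nb_json: str) -> str:
    """Return a plain-text view of a notebook's cells (hypothetical helper)."""
    nb = json.loads(nb_json)
    lines = []
    for i, cell in enumerate(nb.get("cells", []), start=1):
        source = cell.get("source", "")
        # The notebook format allows "source" to be a string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        lines.append(f"--- cell {i} [{cell.get('cell_type', 'unknown')}] ---")
        lines.append(source)
    return "\n".join(lines)


# A tiny hand-written notebook, standing in for a real .ipynb file on disk.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Sales report"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["total = 2 + 3\n", "print(total)"]},
    ],
})

text = render_notebook(example)
print(text)
```

A real viewer would also render outputs, syntax highlighting, and images; this only shows that the file itself is easy to inspect without a browser.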

12:10 Super cool.

12:11 Yeah.

12:11 I think that, and I'm sure this comment will expand out as we see more of these, this really demonstrates something that we've observed, which is that notebooks are just like Excel in many ways. They're not a tool that's used by one person for one specific thing. Right. So we use a tool called Mixpanel, for example. It helps us track some metrics, for example, product analytics.

12:32 Yeah.

12:32 I've used it before. It gives you an insane amount of analytics, like, where did these people come from? How did they find my product, and stuff like that?

12:39 Right.

12:39 Exactly. You can kind of imagine, though. In some ways, it's just for product analytics. Right. And the really interesting thing about a lot of our users is that one thing as we're trying to learn about our users and work with them to improve the tool. One thing we really realized is there's people from all over the place and they're interested in notebooks for 472 different reasons.

12:57 Right.

12:57 And so some people using notebooks are people who've never written any Python code before in their life. Some people have two weeks of experience, and some people are 75 year old developers who only use the terminal. And if you show them anything else, they'll try and fight you. Right. So I think this demonstrates that there's really an appetite for a wide range of ways of consuming these things and presenting these things and editing these things on these notebooks. Specifically.

13:21 Great analysis.

13:22 I totally agree. That's another cool thing about a tool like this, or tools like this more generally. I think what they sort of have to do from a product perspective is condense down what the really powerful things about notebooks are, because they're taking a notebook and bringing it out of a notebook environment. And I think kind of what we're trying to do with the spreadsheet is like, what are the values of a spreadsheet? How can we bring a spreadsheet into other environments? So I think Jut is just trying to do that. It's an interesting way to think about product.

13:48 What is the essence? What are the essential parts of a notebook? What are the essential parts of a spreadsheet? How do we translate those and bring that value to other environments? I'm interested in tools like that.

13:57 Yeah, for sure. This is an interesting one. Another one is they just came out with JupyterLab Desktop, and I suspect that you guys could even integrate with the JupyterLab Desktop app. Right?

14:09 I hope so. Most likely. Yes. The answer is probably yes, but I'll be honest with you, candidly, if I spend one more minute on installation problems, I might chop my arms off or something. I don't know, but I'm sure I don't have to preach to you. I'm sure you've heard this a million times, but the Python installation ecosystem environment issues.

14:29 It's a massive blocker. I knew it was a massive blocker from an individual level.

14:33 Oh, I bet, if you're trying to reach people who are going, I just want to use Excel and have a little bit of code.

14:40 Conda, virtual environments, pip, versions of Python. Yeah. You probably get a couple of questions about that every now and then.

14:51 Yeah.

14:52 Exactly. I think that this is a really cool example of I guess the Jupyter devs realizing that distribution is one of the primary problems here. Certainly with us, it's like the primary bottleneck in users trying our product is not they can't figure out how to use it. It's that they can't even get the thing installed in the first place, making that as easy as possible. I'm sure there's still work to do here, but really hats off to them and definitely something that we're interested in working with them in the future. If we don't already, there's probably some hack to do it currently.

15:18 No, I have not done anything in earnest with JupyterLab Desktop, but I have installed it and run it, and I think it comes preassembled with Python and conda. You don't have to have it; it basically comes all set, and then it just hosts JupyterLab locally inside of an Electron app, so it might even be better for you guys. I'm not totally sure, but if you can make it work at all, I bet it's better.

15:41 I think what Nate is talking about is a trade-off that any tool or product built as an extension to Python is going to face, especially one that's trying to make Python and data science more accessible to a newer audience. If you're building it in the Jupyter environment, there's a lot of freebies you get. There's a lot of great nuggets, valuable things you get with these. But installation can be such a nightmare that you might be casting away a certain part of the funnel just by doing it.

16:05 One of the benefits I guess that you all will receive is people can go Google for help setting up notebooks and getting the notebook started and all that and you don't have to be part of that, right? Like there's a whole ecosystem of people running notebooks, people writing articles about using notebooks for beginners. And so you can just sort of level up on top of that and say once you go through all what they show you over here, here's how you go. Obviously you want to help people succeed, because if they can't get Jupyter going, they can't use Mito.

16:32 Yes, I was creating documentation for getting set up with Mito, and I went to the JupyterLab documentation, and for things like creating a new sheet, I just grabbed the JupyterLab YouTube video and put it in our documentation for free.

16:44 That's one of the things I like about your documentation and your site: it's sprinkled with screencasts, like little examples of how to do stuff or how to demonstrate stuff. And I think more places should do that, right? There's so many places or so many projects that I just don't understand. It'll be a UI framework and there won't be a single picture of anything.

17:06 It's about pictures. The whole purpose is pictures. Give us a picture, at least like a gallery or something. And similarly with how you use things. Just hats off to you guys for putting those in there, because I think it makes a big difference.

17:19 Yes.

17:19 One of the reasons I think we've done that and had some good videos is just in terms of growing the tool. We partner with a lot of people in the YouTube data science community, and they're all really good at making demos. I think we learned a lot from them. People like the Data Professor, Krishna.

17:34 We work with them a lot, and we, or at least I myself, learned a lot from them about how to present a tool in a video, and it's obviously really valuable for growing.

17:43 Yeah, I prefer to just fire up a two or three minute video and watch it instead of reading through and see what I really got to pay attention to.

17:51 This portion of Talk Python to Me is brought to you by Shortcut, formerly known as Clubhouse.io. Happy with your project management tool? Most tools are either too simple for a growing engineering team to manage everything or way too complex for anyone to want to use them without constant prodding. Shortcut is different, though, because it's worse. Wait, no, I mean it's better. Shortcut is project management built specifically for software teams. It's fast, intuitive, flexible, powerful, and many other nice positive adjectives. Key features include: Team-based workflows: individual teams can use default workflows or customize them to match the way they work. Org-wide goals and roadmaps: the work in these workflows is automatically tied into larger company goals. It takes one click to move from a roadmap to a team's work to individual updates and back. Tight version control integration: whether you use GitHub, GitLab, or Bitbucket, Clubhouse ties directly into them so you can update progress from the command line. Keyboard-friendly interface: the rest of Shortcut is just as friendly as their power bar, allowing you to do virtually anything without touching your mouse. Throw that thing in the trash. Iteration planning: set weekly priorities, and let Shortcut run the schedule for you with accompanying burndown charts and other reporting. Give it a try over at 'Talkpython.FM/Shortcut'. Again, that's 'Talkpython.FM/shortcut'. Choose Shortcut because you shouldn't have to project manage your project management.

19:20 JupyterLab Desktop is one of these sort of interesting things. Another one is JupyterLite, which is Jupyter written in WebAssembly that runs just in the front end of the browser. Have you guys checked this thing out?

19:31 No, but this sounds amazing. I don't know if I saw the WebAssembly version, but I saw a Python compiled to JS version of this, etcetera, but it would be very cool if everything could happen in the browser because I feel like installation problems would evaporate, which is the goal. It's the goal at the end of the day. So this is super interesting. We'll definitely check this out.

19:49 Yeah, this might also be relevant to you guys, because it's basically Jupyter running, and then they run in-browser WebAssembly language kernels: a limited set of CPython in WebAssembly in the browser, and probably also Julia and R and all those things.

20:04 Maybe partially nontechnical audience.

20:08 What is the benefit here? I can take a shot here.

20:10 So how I would describe this is that normally the way JupyterLab works is that it's a client-server model. Your client is effectively the thing that you, as a user, interact with. You have a notebook that you see, you write code in the notebook, and you press Shift+Enter to run a cell. What actually happens when you press Shift+Enter is that the code gets sent to the server. The server, in this case, is a Python kernel, and the Python kernel actually runs that code and then sends you back the results. So there's what you see in your web browser, and then there's a server running, usually on your command line or something, and you send code to it. It executes the code and sends you the results. The benefit of this thing is that you can get rid of that back end. So there's no more installation, and instead you only need this kind of notebook that lives on the front end. WebAssembly is a tool that allows us to run the code on the front end. And so instead of taking code, sending it to the back end, and getting it back, we can do that all in one location.
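
[Editor's sketch] The client-server loop described above can be imitated in a few lines. This is a toy model only: the "kernel" is a background thread, the "wire" is a pair of in-process queues, and the names `kernel_loop` and `run_cell` are invented for this example. The real Jupyter messaging protocol is considerably more involved.

```python
import queue
import threading


def kernel_loop(requests: queue.Queue, replies: queue.Queue) -> None:
    """Toy 'kernel': run each code string it receives and reply with what it printed."""
    namespace = {}  # persists between requests, like variables in a live kernel
    while True:
        code = requests.get()
        if code is None:  # sentinel value: shut the kernel down
            break
        output = []
        # Shadow print() inside the executed code so its output can be sent back.
        namespace["print"] = lambda *args: output.append(" ".join(str(a) for a in args))
        try:
            exec(code, namespace)
            replies.put("\n".join(output))
        except Exception as exc:
            replies.put(f"error: {exc!r}")


requests, replies = queue.Queue(), queue.Queue()
kernel = threading.Thread(target=kernel_loop, args=(requests, replies), daemon=True)
kernel.start()


def run_cell(code: str) -> str:
    """Toy 'client': send code to the kernel and wait for the result."""
    requests.put(code)
    return replies.get()


first = run_cell("x = 21")         # defines x inside the kernel, prints nothing
second = run_cell("print(x * 2)")  # a later cell can see x
print(repr(first), repr(second))

requests.put(None)
kernel.join()
```

The in-browser approach discussed here removes the separate server process; in this toy model that would correspond to calling `exec` directly in the client instead of passing messages.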

21:00 The challenges you've already been talking about are the challenges of setting up the back end to get Python to run on your machine, the environments and the dependencies. And this would, theoretically, in principle, at least avoid all that. Just load it up and it goes.

21:11 Yeah, it's super cool.

21:12 Now we'll definitely look into it because this is an area of continual research for us how to improve our installation process.

21:18 Yeah, very cool.

21:19 All right.

21:19 What else have I pulled up here?

21:21 Papermill lets you treat notebooks almost like functions: you can execute a notebook and then get a value out, for all sorts of things. This is probably the least relevant to you guys of the things that I pulled up here, because it's exactly about making it not interactive and kind of skipping that. But it's just another interesting thing that people are doing to kind of add more or do more with these notebooks.
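
[Editor's sketch] The "notebook as a function" idea can be shown without any notebook tooling: run the cells top to bottom in a fresh namespace, override the defaults with caller-supplied parameters, and read a value back out. This is a conceptual toy, not Papermill's API; `run_notebook` and the sample cells are assumptions made for illustration.

```python
def run_notebook(cells, parameters):
    """Run code cells in order in a fresh namespace and return that namespace.

    The first cell is treated as the 'defaults' cell; caller-supplied
    parameters are applied right after it, overriding those defaults.
    """
    namespace = {}
    for index, cell in enumerate(cells):
        exec(cell, namespace)
        if index == 0:
            namespace.update(parameters)
    return namespace


# Stand-in for the code cells of a small notebook.
cells = [
    "price = 10\nquantity = 1",   # defaults cell
    "total = price * quantity",
    "label = f'total={total}'",
]

default_run = run_notebook(cells, {})
custom_run = run_notebook(cells, {"quantity": 3})
print(default_run["label"], custom_run["label"])
```

Because every run starts from an empty namespace and executes every cell in order, the same inputs always give the same result, which is the repeatability that interactive use does not guarantee.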

21:44 Yeah, it seems super cool. Arrogantism. Now I was just going to say we've seen people who have notebooks where they call functions in other notebooks and all of that. The pandas syntax is confusing, but then linking notebooks and all that is confusing as well. So I think it's all interesting to see. I think just like spreadsheets have so many different use cases, and they touched on this earlier, notebooks have an incredibly large number of different use cases, with people from top-level data scientists to people just getting started. So the tooling ecosystem that's being created, and that already exists, is really quite diverse and really powerful.

22:18 Absolutely. I think I've got one more for you here. This might be the one that you heard of, Nate, called Notebook JS. No, this one parses them and then renders them down to HTML, which is, I guess I don't have one to open, but that's all right. So, yeah, that's pretty neat, that it'll basically allow you to turn notebooks into HTML.

22:39 And I think just to kind of draw a categorization model over what we've kind of seen before, I think there's a couple of areas of focus, and these are things that we focus on internally. And I think they're themes that will come up throughout the rest of the conversation. There's things that we'll call, let's say, presentation.

22:53 Right.

22:53 And presentation is looking at the outputs here in the terminal, or looking at the outputs easily on a web page without running a server. Right. And presentation is something that we encounter. How do we present data? How do we present conclusions from analysis, etcetera. Then there's another thing, which I think this Papermill thing that we looked at last is super relevant to, which is this idea of repeatability. As anyone who's worked in notebooks knows, they're really great for being a scratch pad, but it can get pretty confusing sometimes when you've got, like, all these cells lying around and you're executing things out of order.

23:23 That's the thing. You know, we have sort of solved the Excel "this cell refers to that cell" problem, but you have this human aspect of notebooks, right? I can go, I want to try this. Try this, try this, go back and change this, and then run this and the stuff below it. You can run them in different orders and even change them, and then have some of the invisible changes stuck in another un-run cell. Yeah. That could be a problem.

23:47 I mean, Aaron and I spend all day in notebooks. And the amount of times where I'll go up to Aaron and be like, Aaron, dude, look at this. This code makes no sense. What is this bug? I don't understand it. And Aaron's like, you fool, you've run one, three, four, not one, two, three. What were you thinking? Just the amount of times that even we, as people who use notebooks all day, do that is really dramatic. And so I think that area of reproducibility and repeatability is something that we spend a lot of time thinking about, mostly because a lot of our users are really interested in it, and it's a problem that they struggle with. And it's something I think we'll definitely get into with what we actually ended up building.
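
[Editor's sketch] The "you ran one, three, four" bug is easy to reproduce in miniature. Below, the same three cells give different answers depending on which ones were executed, with no error to warn you. The helper `run_cells` and the cell contents are hypothetical, chosen only to illustrate the problem.

```python
def run_cells(cells, order):
    """Execute the cells at the given indices, in that order, sharing one namespace."""
    namespace = {}
    for index in order:
        exec(cells[index], namespace)
    return namespace


cells = [
    "data = [3, 1, 2]",       # cell 1: load the data
    "data = sorted(data)",    # cell 2: clean it up
    "smallest = data[0]",     # cell 3: assumes cell 2 has already run
]

in_order = run_cells(cells, [0, 1, 2])["smallest"]
skipped = run_cells(cells, [0, 2])["smallest"]  # cell 2 was never executed
print(in_order, skipped)
```

Nothing fails in the second run; the result is simply wrong, which is why this class of mistake is hard to spot in a long notebook.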

24:18 Yeah.

24:18 Awesome.

24:19 So we could get into it. Let's talk about Mito. I think Mito lives in this realm of these interesting ways to do more with notebooks. And as we've already hinted, you basically take an Excel user interface, stick it into a notebook, and then allow people to interact with the data within the notebook in an Excel-style way. Right?

24:41 What's your elevator pitch?

24:42 Elevator pitch. Essentially, it's a spreadsheet interface for Python, so everything you do in the spreadsheet is going to generate the equivalent code for you in the cell below. If you're watching the video right now, you're seeing a little demo happen. We have features for exploratory analysis. We have features for data wrangling, data munging, I've heard lots of different words for that type of process, and then graphing as well. And we also have the ability to save and replay analyses, sort of like a macro. In terms of users, we have people who are newer to Python using it to sort of introduce themselves to the Python world. They're learning as they go. But the nice thing is that they're not held back by the syntax at any point. So you're not Googling syntax, not going to Stack Overflow; you're in your notebook the entire time, sort of staying in a state of flow. And then we also have more advanced Python users, and people that are maybe in the middle, using it just to get their analysis done more quickly. Usually the code for something like graphing or pivot tables or merging takes a lot of time to type out and get the syntax right. So in our tool, you just do it in an Excel interface that you're used to, and it writes the correct code for you.

25:45 Yeah, it's cool. People definitely need to see the little video. So if you haven't checked it out, obviously, I'll put a couple of videos in the show notes. But the idea is you come in and you work basically inside the notebook, as part of the cell. You might be familiar with having an interactive widget for some kind of graph, right, where it's got some sliders and stuff. It's like that. But it's Excel-ish, right?

26:08 Yeah.

26:09 Exactly.

26:09 Was this your first idea or where did this whole idea start?

26:12 We've worked on a lot of different spreadsheet-related tools. I guess a year and a half or two years ago, a long time ago at this point, we started building, like, a GitHub for Excel. We were building essentially difference detection, allowing you to merge, going back to those original problems that we talked about, that we experienced at school. So we were building kind of this GitHub for Excel platform. We were primarily talking to investment bankers and people in private equity, these really Excel power users. We built the tool, we interacted with a lot of them, eventually realized that wasn't maybe the most helpful space that we could be working in, and ended up finding our way to this, which is more Python-based: Jupyter notebooks, and just making them more accessible and, as Jake said, helping people across the spectrum, from beginners to more advanced data analysts, get their analysis done faster.

27:05 Sure. I think there's two angles here. One is just that you could be really fast, right? You could say, okay, I see the data. I want to sort by this. Now I want to drop these three columns, and then I want to do a computed column sort of thing, or maybe join two pieces of data, like two data frames, and create a larger one out of that. And so on. You could be really quick with that. But I think maybe even more important than that is helping people who are just stepping into the data science side of the world. Right. They've been working with some other tools or no tools whatsoever, really. And then they hear, oh, I should go do Python and I should do notebooks. And then they're confronted with pandas, which is great, and it's not that hard to use. But there's a zillion things you can do with pandas, and it's not super discoverable what it is you should choose. And also the notebooks don't really help as much as they could, I think, in presenting the features of an API, right? Like if I'm working in PyCharm or VS Code and I say dot, boom, there's a bunch of descriptions, and if I get the mouse near, they'll give me examples and documentation. And here it just lets you type unless you hit tab and explicitly ask for it. And a lot of people don't. I suspect if they're coming from economics, they don't know that tab has this magic to show me what I can do, and stuff like that. Right?
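
[Editor's sketch] The point-and-click steps listed above (sort, drop columns, add a computed column) map naturally onto lines of pandas code. The snippet below is a rough illustration of how a tool could record such steps and emit source text. It is not Mito's implementation or its actual output; the class `StepRecorder`, its methods, and the column names are invented for this example. It only produces code as strings, so pandas is not required to run it.

```python
from dataclasses import dataclass, field


@dataclass
class StepRecorder:
    """Collects spreadsheet-style edits and renders them as pandas source code."""
    df_name: str = "df"
    lines: list = field(default_factory=list)

    def sort(self, column, ascending=True):
        # Sorting a column in the grid becomes a sort_values call.
        self.lines.append(
            f"{self.df_name} = {self.df_name}.sort_values(by={column!r}, ascending={ascending})"
        )

    def drop_columns(self, columns):
        # Deleting columns in the grid becomes a drop call.
        self.lines.append(
            f"{self.df_name} = {self.df_name}.drop(columns={list(columns)!r})"
        )

    def add_product_column(self, new, left, right):
        # A simple computed column: new = left * right.
        self.lines.append(
            f"{self.df_name}[{new!r}] = {self.df_name}[{left!r}] * {self.df_name}[{right!r}]"
        )

    def to_code(self):
        return "\n".join(self.lines)


recorder = StepRecorder()
recorder.sort("price", ascending=False)
recorder.drop_columns(["notes", "internal_id", "legacy_code"])
recorder.add_product_column("total", "price", "quantity")

generated = recorder.to_code()
print(generated)
```

Keeping the generated text in the notebook is what lets a user read, edit, or hand off the analysis later, rather than having it locked inside the interface.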

28:21 Those people making the transition to the notebook environment are doing so along a spectrum of willingness as well. Some people are doing it because they're really excited about leveling up their skills, learning Python, making their own analysis faster. And then some people are doing it because upper-level management wants them to. And I think those people are really grateful for having a tool that will help them meet the workflow requirements that they're working with, without necessarily having to. I'm sure you've read, at some point, the pandas documentation, and it's one of those that has lots of great examples, in fairness, but no pictures.

28:59 Probably not a lot of tutorial videos in there.

29:01 Yeah.

29:03 Got you. I would add sort of a third category to what I was saying, which is people not forced by management, but it's not out of their own interest in Python either. It's more that they're just sort of forced to, based on what they're trying to accomplish. If you have a data set of a certain size, you simply can't do it in Excel or Google Sheets. So I need a medium that's going to allow me to facilitate data analysis on this scale. And it's really frustrating if you're going to Pandas to do that and spend all your time doing simple things that you're so used to doing in Excel, like adding columns, pivot tables, whatever it is, using formulas. And now you're spending 90% of your time going through syntax and making sure that you're typing out and capitalizing the right letters in your code. That can be really frustrating. So our tool is really trying to allow you to stay focused on the actual analysis the entire time and less on getting your code right.

29:48 Many people are Excel literate, right? And so kind of leverage that a little bit.

29:52 Yeah.

29:52 And then for people who aren't coming from Excel, but are just data scientists, they're aware it's not Excel literate, but they're sort of like visual literate.

30:00 That's a horrible phrase.

30:01 But what it means, essentially, is that they understand how to do this. They understand.

30:05 Okay.

30:05 These are the things I need to do, and here are the functions I could use. So it's still a really valuable environment for everyone, Excel or Python or otherwise.

30:12 Nice. So question from the audience out there phone says, Hello, what are the limitations of Mito?

30:18 That's a great question. I think the honest answer is that Mito is certainly a work in progress, and there's a large portion of Pandas functionality that we don't currently support. I think a helpful context here is how we're actually developing this tool. Mito is very much, I would say, a collaboration with our users. What that practically means is: join our Discord. There's a feature request channel, and anyone has the ability to show up and essentially say, hey, this is missing from my workflow, and we'll say, oh, that's on the roadmap, or, oh, that wasn't on the roadmap, help us engage with your workflow and understand where that's coming from. Essentially what's happening is, over the past six to nine months or so, we've been working really heavily with our users to build out the core pieces of Pandas functionality that most of our users work with, one by one, and investing their workflows back into the tool.

31:03 Yes. Which way do you take it? Do you go and say, these are the things people need to do in Pandas, how do we surface that in our interface? Or do you say, these are the Excel-type things people are happy with, how do I make that happen in Pandas? Which direction do you find yourself going?

31:18 It's a really great question. You can definitely speak to this more because this is kind of part of our process internally that's been evolving, but really kind of what we try and do at this point is we try and work very heavily with our users to understand at the highest of levels. What are they trying to accomplish? Someone really isn't usually trying to make a Pivot table usually what they're trying to do is conclude, should I tell a salesperson to do X or Y?

31:41 What state are we spending getting the most sales in? And all I have is zip codes or something like that.

31:46 Exactly.

31:47 A lot of people that we work with, and this isn't the only thing, but one of the things, for example, is they're operating at the level of, I'm looking to predict this feature or understand what affects this piece of my data. And so what we do is we work with them to understand their workflow, and then we use that to internally figure out what features drive this, what does Pandas support, and how can we provide an interface on that that really lets people get this done as quickly as possible, in a way that gives them as much flexibility at the end of the day if they need to take this from Mito and go run with the code somewhere else.

32:14 Yes.

32:18 I feel like in the beginning they were saying we were very focused on let's put Excel into Python. And it was like more about the Excel functionality. But I think over time we've come to think more about the questions. More what is the best visual interface for Python for data science?

32:35 Some of the things are from Excel. Some of the things are around creation. It's still a spreadsheet interface, but the goal is certainly not to take Microsoft Excel and give you all of that in Python.

32:45 Interesting. Yeah. Cool. I think maybe a good way to understand this. How this workflow works and get a peek inside of the features is over here somewhere.

32:56 I think it's on the documentation side of things, right at the beginning, the quick tutorial here. I'm bouncing around, not quite finding where it was to show you. But there's this example where you go and load up a couple of CSV files and then join them to create a pivot table and stuff like that. Maybe talk us through what working with that data flow looks like. I talked about this idea of turning zip codes into states and that kind of stuff. What does working with Mito feel like? Yeah, I can talk a little about that.

33:28 So I think working with Mito feels, hopefully, very fast and very intuitive. Those two things, and maybe robust, are the three things we strive for. In terms of workflows that we see people doing, a lot of people fit this feature-creator and automator use case that we think about. So it's people that have some sort of business question, like where are all my sales coming from when I just have these zip codes? And they're probably trying to do that in Python because they've been doing it in Excel over and over again each month, and they're looking for a more robust and more automatic way of doing that. So a lot of the ways we see these workflows play out is people either have a CSV file or they have, like, a Snowflake connection to get some data. And then they start by doing some simple EDA and trying to get a better sense of what their data actually looks like.
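As an illustration of the "where are my sales coming from when I only have zip codes" question, here is a minimal pandas sketch. The data, the column names, and the lookup table are all hypothetical and are not taken from the Mito tutorial; it only shows the kind of join-then-summarize workflow being described.

```python
import pandas as pd

# Hypothetical sales records that only carry a zip code.
sales = pd.DataFrame({
    "zip": ["19104", "10001", "19103", "94105"],
    "amount": [120.0, 80.0, 50.0, 200.0],
})

# Hypothetical lookup table mapping zip codes to states.
zip_to_state = pd.DataFrame({
    "zip": ["19104", "19103", "10001", "94105"],
    "state": ["PA", "PA", "NY", "CA"],
})

# Join the two frames on the shared key, then total the sales per state.
merged = sales.merge(zip_to_state, on="zip", how="left")
sales_by_state = (
    merged.groupby("state", as_index=False)["amount"]
    .sum()
    .sort_values("amount", ascending=False)
    .reset_index(drop=True)
)
print(sales_by_state)
```

Once this lives in a script, the monthly rerun is a matter of pointing it at the new file rather than repeating the manual steps.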

34:23 Right. Without Mito, it might be something like df.head() or df.sample() and just kind of get a visual look at a grid of data. Right?

34:31 Exactly. Or even, some people want the more visual interface, so they're downloading it to Excel initially and doing some manipulation or something like that there. But then that workflow is very separate from the more automatic script that they're creating, and then they have to reconcile any changes that they've made back into the Python workflow. But yeah, Mito has a bunch of features. There's some here. If you scroll down in the documentation on the left-hand side, you might see some stuff that might be helpful. So there's things like summary stats right there, which will show you a graph, a distribution of the data in each column, a lot of those describe-style functions. It also has really intuitive filtering, so things like filter by value: you can see all the unique values in your column, and then you can toggle them in and out of your data set. Or you can add more customized filters, whichever you would like. But then once you move past this initial data cleaning, some people do write spreadsheet formulas, so we have a bunch of Excel's most popular formulas, like data manipulation and date parsing. Here's a few of them. Once you get a sense of what your data looks like, you can do some of these transformations, and then ultimately you'll end up with a script that you can run over and over again and never have to go back into the Excel world to fight through the manual process.
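For readers who want to see the hand-written equivalent of that first look at the data, the sketch below uses plain pandas calls (`head`, `describe`, `value_counts`, `isin`). The data frame is made up for illustration, and this is not a claim about what Mito generates for these features.

```python
import pandas as pd

# Hypothetical data set for illustration only.
df = pd.DataFrame({
    "region": ["East", "West", "East", "South", "West"],
    "units": [10, 4, 7, 3, 9],
})

# Quick visual look at the first rows of the grid.
preview = df.head(3)

# Summary statistics for a numeric column (count, mean, min, max, quartiles).
summary = df["units"].describe()

# The unique values in a column and how often each occurs.
region_counts = df["region"].value_counts()

# "Filter by value": keep only the values toggled on.
filtered = df[df["region"].isin(["East", "West"])]

print(preview, summary, region_counts, filtered, sep="\n\n")
```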

35:50 This portion of Talk Python to Me is sponsored by Linode. Cut your cloud bills in half with Linode's Linux virtual machines. Develop, deploy, and scale your modern applications faster and easier. Whether you're developing a personal project or managing larger workloads, you deserve simple, affordable, and accessible cloud computing solutions. Get started on Linode today with $100 in free credit. For listeners of Talk Python, you can find all the details over at 'talkpython.fm/linode'. Linode has data centers around the world with the same simple and consistent pricing, regardless of location. Choose the data center that's nearest to you. You also receive 24/7, 365 human support with no tiers or handoffs, regardless of your plan size. Imagine that, real human support for everyone. You can choose shared or dedicated compute instances, or you can use your $100 in credit on S3-compatible object storage, managed Kubernetes clusters, and more. If it runs on Linux, it runs on Linode. Visit 'talkpython.fm/linode' and click the Create Free Account button to get started. You can also find the link right in your podcast player show notes. Thank you to Linode for supporting Talk Python.

To get data into this spreadsheet-like front end that is Mito, you basically just have to have a data frame, right? And there's a ton of flexibility for doing that. You could load a file, you could get it from the Internet, you could even read HTML off of a URL and then go grab a table, and there's your data frame. Work with that, right? There are simplifications, I guess. When you're in Mito, you can hit, like, the file load equivalent, browse to the files, and then it'll write the Pandas code, say, pd.read_csv of whatever you selected, right?
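A small sketch of the "just have a data frame" idea follows. The CSV content is inlined with `io.StringIO` so the example is self-contained; in practice the argument to `pd.read_csv` would be a file path. The commented-out `mitosheet.sheet(df)` call reflects how the package is typically invoked in a JupyterLab cell, but treat it as an assumption here and check the Mito documentation for your version.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk; the contents are hypothetical.
csv_text = "zip,amount\n19104,120\n10001,80\n"

# Read zip codes as strings so leading zeros are not lost.
df = pd.read_csv(io.StringIO(csv_text), dtype={"zip": str})

# Other sources produce the same kind of object, for example
# pd.read_html(url) returns a list of data frames, one per HTML table
# (it needs an HTML parser such as lxml installed).

# Assumed usage inside JupyterLab with the mitosheet package installed:
# import mitosheet
# mitosheet.sheet(df)
print(df)
```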

37:42 Yeah. Exactly.

37:44 One of the things I think would be really interesting to talk with you about is generally in your kind of survey of the Python ecosystem. It's interesting because Python is code, right?

37:53 Yeah.

37:54 In some ways, we're on this boundary, this flexible border between pure code and low-code, no-code tools. And there's been 100 million low-code, no-code tools that have existed over the past 25, 30 years, whatever. Some of them are still around, and some of them aren't. And really the question that we ask ourselves is, what unique thing do we bring to the table here, and as a low-code tool, how do we differentiate ourselves? And it's that exact idea that you can also just pass a data frame. We don't want to stop you from writing Python code. We want to enable you to write Python code as easily as you possibly can. And really, that's how we see ourselves spanning that spectrum.

38:29 I think it's a really good place to be, because my hesitation with all these low-code, no-code tools is that they usually lock you into their thing, which is often a SaaS thing. So you're locked into having your data there and continuing to subscribe to a thing, which, don't get me started on subscribing to so many things. It was suggested that I subscribe to my Internet speed checker app, not pay for it once.

38:56 You need to check right?

38:58 Yearly. I can subscribe to it if I pay yearly for my speed checker. What are you doing? Too many things that are subscribed to any. Not that subscriptions don't make sense for a lot of tools. But what I like about this is like you said, you bring the data from wherever and you could do this out of a Postgres database and you build up a data frame, and then you could throw into this visual place that speeds you up, and then what comes out the other end are more data frames. Right. So you could even do multiple, maybe tell me this is true. It seems like you could you could do some regular Python code, some Mito that generates a really interesting transformation, some more Python code, and then maybe another Mito block that takes another bit of output and then brings in more data. Does other stuff you can kind of mix and match throughout these, right. Throughout the notebook.

39:43 Yeah. Exactly. And actually, the other day we went into GitHub and we searched for mitosheet to see how people are using it. And you see people using it in that exact way, where they're importing data, they then generate some code using Mito, and then... One of the things, admittedly, that the tool is maybe not the best at right now is graphing. We support creating basic graphs, but not changing the colors, changing the titles.

40:05 All that. What UI frameworks do you support?

40:09 Matplotlib? Altair?

40:12 Okay.

40:12 Yeah. We generate Plotly code, which has great documentation.

40:16 You can go in there, we give you a link, and you can go in and make your edits. But this is one of those places where generating Python code, and in this case Plotly code, is super helpful for everybody involved. For us, it's helpful because we didn't have to recreate the Plotly library, which is massive. But for everybody else, it's also helpful because, since we didn't recreate the Plotly library, you are able to use the entire Plotly ecosystem, and you're not locked in at all. So for something like Alteryx, where you can use their graphing features, if they don't have the graphs that you want to create, then you're kind of out of luck because you're locked in, and there's not really an easy customizability path. But you own the Python code that Mito generated, so it's up to you to do whatever you want.

41:00 Yeah. I suspect you guys don't recommend this, but you technically could delete out in a lot of cases the mito bits after you've generated your Python code and keep going like, okay, this was really helpful, but we actually don't need this anymore. There are cases where some of the Excel like functions really come from you guys. Right. But a lot of it is what is writing is pandas and NumPy and Plotly code, right. Just speaking to the lock in or not the lock in side of story, right?

41:27 Yeah. I think we definitely support that in Mito. Actually, we have a button: Clear. Data analysis is all about very iteratively building your understanding of what is useful and where you want to go. So one of the most commonly done things in the tool is actually clearing all of the edits that you've made to your analysis and getting rid of Mito, now that you have that understanding, and wanting to be taken in another direction. I think we're big supporters, in our own development process, of cleaning your workspace. Nate is proud of how clean our code is. And if Mito isn't helping you right now, definitely get rid of it, so you have an easy workflow, an easy notebook to clean up and debug. Yeah.

42:05 I think that's a big contrast to going to some other no-code SaaS system. There is no, I don't want to use this anymore, just let me carry on with my analysis. There's none of that in most of these tools, and with yours, I think it's there to support you. And I can see it being incredibly valuable, but at the same time, it's not the essence of what you're doing. It's the UI on top of Python, as Jake likes to say.

42:26 Yeah, I was just going to say, on the point of deleting out the Mito sheets, we have a society of secret Mito users who are trying to convince their bosses that they're really good at Python to get.

42:36 So we support that workflow very well.

42:39 Exactly. So one of the challenges that you can have when you have a machine write code is that it writes bad code that's hard to understand. And Nate, it sounds like you might have had some input on this. When you interact with Mito, you do certain operations, like I do a filter, I create a pivot table, or I filter out certain things. It'll actually write: step one, you did this; step two, you did that. And it writes what looks like pretty well-formatted Python code with little bits of documentation, like pivot table, reset the column names and indexes and stuff. It'll even comment the code that it writes. You want to speak to that a little?
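To make "step one, step two" concrete, here is a rough sketch of what step-commented pandas code for a filter followed by a pivot table could look like. It is written by hand for illustration, with hypothetical data and column names, and is not actual Mito output.

```python
import pandas as pd

# Hypothetical input data.
df = pd.DataFrame({
    "state": ["PA", "PA", "NY", "CA", "CA"],
    "product": ["A", "B", "A", "A", "B"],
    "amount": [100, 50, 80, 0, 200],
})

# Step 1: filter out rows with no sales
df_filtered = df[df["amount"] > 0]

# Step 2: pivot to the total amount per state
pivot = df_filtered.pivot_table(index="state", values="amount", aggfunc="sum")

# Step 3: reset the index so "state" is a regular column again
pivot = pivot.reset_index()

print(pivot)
```

Each commented step maps to one user action, which is what makes the result easy to read and to edit by hand afterwards.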

43:14 Yeah. Absolutely. I mean, this is a really interesting area that we've put a bunch of research time in, if you can call it that. So there's a couple of things that I want to highlight here, and I'd love to also hear your thoughts as well. So you're totally right. Machines writing code is all the rage these days, in some ways, with these fancy machine learning systems where you can write a little prompt and everything will get written for you. Mito has taken a little bit of a different approach, where, exactly as you say, when you, for example, add a filter, it'll generate a line of code that corresponds to that filter. Of course, immediately the question becomes, this is not exactly how I want the code to be. I didn't mean to do that. I meant to do something else.

43:49 Really.

43:49 The way we think about it is giving the user that code in the cleanest way that we possibly can. So this is an example of this. If the user adds a column and then immediately after renames that column, a feature that we're actually releasing this week is that it's going to get collapsed into just the adding of the column with a new name. Right.

44:04 Nice.

44:04 That's ultimately what the user was intending to do. There's other, more fancy things that you can do. You can start getting into code optimization, where it's like, you made a pivot table, then you overwrote the pivot table. And those are all things that we're definitely interested in and are on the roadmap for improving. But generally, you're totally right. If we want users to be able to learn from this code, to be able to use this code, we need to generate clean, semantic Python that really works in the wild and is actually editable by the users. Otherwise, it's just a blob, a mass that you can't actually interact with.
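The add-then-rename collapse described above can be sketched as a small pass over a list of recorded steps. The `AddColumn` and `RenameColumn` classes and the `collapse_steps` function are hypothetical names invented for this illustration; the transcript does not describe how Mito represents steps internally.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class AddColumn:
    name: str


@dataclass
class RenameColumn:
    old: str
    new: str


Step = Union[AddColumn, RenameColumn]


def collapse_steps(steps: List[Step]) -> List[Step]:
    """Merge 'add column X' immediately followed by 'rename X to Y' into 'add column Y'."""
    out: List[Step] = []
    for step in steps:
        prev = out[-1] if out else None
        if (
            isinstance(step, RenameColumn)
            and isinstance(prev, AddColumn)
            and prev.name == step.old
        ):
            out[-1] = AddColumn(step.new)
        else:
            out.append(step)
    return out


def to_code(step: Step) -> str:
    """Render one step as a line of pandas code (new columns default to 0 here)."""
    if isinstance(step, AddColumn):
        return f"df[{step.name!r}] = 0"
    return f"df = df.rename(columns={{{step.old!r}: {step.new!r}}})"


recorded = [AddColumn("new_col"), RenameColumn("new_col", "profit")]
for line in map(to_code, collapse_steps(recorded)):
    print(line)
```

The same shape of pass could handle other patterns mentioned here, such as a pivot table that is immediately overwritten, by dropping the earlier step.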

44:33 Yes. What you get, at least from the examples I've seen. I would be happy to take that and then start writing directly on that. Even though you don't see it on the screen. It's like right above. I scrolled it right a little bit off there, but it says, don't edit the section. This is Mito code. I guess that's if you still want to be able to use Mito on it, don't mess it up.

44:51 Right. That's a feature we're working on this week is improving that communication. You can take Mito Code and edit it and change it however you want. The only problem is Mito might have trouble reinterpreting that if you try and then later replay an analysis. But that's also a work in progress, and ideally the kind of perfect version of what we envision long term, and maybe we'll get there. Maybe this is impossible, but one really cool thing that we're kind of thinking about is edit a spreadsheet to generate Python.

45:17 Yeah.

45:17 I was thinking if this could be bi directional, that would be fantastic.

45:21 Right.

45:22 It's this world where you're really fluidly writing code and editing a spreadsheet. If something is easier in code, go write the code. If something's easier in the spreadsheet, go edit the spreadsheet. And that's definitely a vision that we've seen ourselves realizing over time. As you mentioned, you can go Mito, Python, Mito, Python and do that currently, but making it easier for our users to really use the code in the dynamic and real way that you would any other Python code that you write is definitely something that we're actively investing in right now and really trying to improve.

45:48 Nice.

45:49 Yeah.

45:49 I like the idea of being able to have it do some optimizations rather than create a variable and then overwrite the variable with some other things and do it all at once. I'm sure there's a ton of stuff in Pandas that could be done better rather than like a really naive straightforward, like multi step stage, right.

46:05 Yeah.

46:06 And the other thing I will say is, the one benefit actually to generating this code is that, and not to insult anyone, because I'm the biggest offender of this, most data science scripts you see in the wild are not the pinnacle of clean, well-kept code. It's usually out-of-order notebooks, because it's such a dynamic process that it's just very hard in practice to keep these things well organized. What we can do in practice is generate some documentation for what's happening and help users save and manage these scripts in a more linear and organized way, etcetera, and help users adopt these best practices. That's the sort of stuff that we've been exploring recently, and Aaron was actually working on some of this today, improving some of this code generation stuff. It's not the highest of bars to meet, and we definitely think that we can continue to improve and surpass that and make sure that the code we're generating is really great stuff you'd be happy to edit.

46:53 One thing, while I'm looking at it, is to throw this out there as a piece of feedback with very little actual experience, so take it with a grain of salt. You've got, like, step one, step two, step three. It would be cool if those were actually separate cells. So at the end of, say, step three, I could do, like, pivot_table.head() or something, just to touch on it and explore it a little bit along the way.

47:13 Yeah. No, definitely. And a similar thing that we've also certainly thought of that multi cell approach would enable is changing the order of steps, switching things around and saying, oh, I actually want to filter first and then Pivot versus Pivoting and then filtering. Yeah, certainly on the roadmap, but definitely something we want to do, giving users more options on how they actually export this code and what they do with it at the end.

47:32 Yeah. Super neat. I got a question out here that kind of leads into where I was going to go with this anyway, and I switched the order, so I don't cover your head, Jake. Bonnie also asked, does Mito support switching from or to, or using, Dask?

47:47 Got you?

47:48 Yeah.

47:48 Not currently. No. So the one really cool thing is because of the way Mito works internally, which is something we can definitely get into depending on if you think your audience will be interested in it what the appetite is, but we really have the ability to kind of switch out. Let's say what the back end is of mito. What code do we actually end up generating is something that we can leave up in the air. And really our interface can be a more general thing than just Python code or just Pandas and Python code.

48:14 Yeah.

48:14 Especially for these frameworks that are near compatible API compatible with Pandas. Right.

48:20 Like, Dask is almost exactly the same. For most of the basic operations, they've aligned on the Pandas API for how to handle things. So it's definitely something that in the future we could support. And if we have users who are working with huge data sets and need Dask, that is definitely something we'd be interested in learning about.

48:40 Dask.

48:41 When I first think of dask, I think large clusters, massively scaled out data. But then at the same time, right over here I have my MacBook Pro Max, which has ten cores on it. And when I run Python code, I get 10% of that CPU, right? And Dask will allow you to scale across your CPUs, even on your local machine or scale larger than your memory and stuff. And so I feel like even for just making your local work feel better is actually probably under realized or underutilized that's super cool.

49:12 It's interesting, the range of data set sizes that we actually see in practice. I would say, at least from my observations, and Aaron and Jake can chime in on this, it's very bimodal, in that there's a lot of people hanging out with 100,000 rows, and then there's some people who are like, hey, I have 100 million records I'm looking to analyze, and we're like, well, good luck on your 2012 MacBook.

49:35 Get a coffee.

49:37 Chill out, go to sleep, wake up tomorrow. Hope it hasn't crashed.

49:39 I think that's the general strategy. That does lead me towards my final two little areas I want to speak about before we run out of time here. Now, on one hand, this is just writing Python code, so its performance and its limitations are effectively what Pandas can deal with. On the other, it is showing that stuff and allowing you to sort it visually. So there might be some constraints on, like, the amount of data you can work with. What's the data size story?

50:04 It's a great question. We have a release coming out within the next two weeks generally, but here's our motto. Obviously, we're providing a visual interface. There's going to be a little bit of overhead. But the way we like to think about what we do is that it's a tiny little bit of flat overhead, no matter how big your data is.

50:21 Okay, for example, you won't try to show the entire 100 million rows in a grid or like, you'll do some sort of virtual lazy load list or something like that.

50:29 Exactly a lazy load list. We actually have a lazy load of the entire data set. That's a feature coming out within the next two weeks or so. It's all written. We just got to test it a bit better in the wild, but yeah, effectively, we're a very thin wrapper on top of Pandas functionality. And in practice, what that means is anything you can do in Pandas, you should be able to do in Mito from a data set size perspective, something that was very important to us, especially because a lot of our users are in Python because of data set size limitations in the first place. I think Jake mentioned.
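The "flat overhead" idea can be illustrated with a windowed fetch: the back end keeps the full data frame in memory and hands the display only the rows currently in view. The `get_window` function and the payload layout are assumptions made up for this sketch, not Mito's actual implementation.

```python
import pandas as pd


def get_window(df: pd.DataFrame, start: int, size: int = 100) -> dict:
    """Return only rows [start, start + size) as records for the display to render."""
    window = df.iloc[start:start + size]
    return {
        "total_rows": len(df),  # lets the grid size its scrollbar
        "start": start,
        "rows": window.to_dict(orient="records"),
    }


# Hypothetical large data set; only three rows are ever serialized.
big = pd.DataFrame({"id": range(100_000), "value": range(100_000)})
payload = get_window(big, start=500, size=3)
print(payload["total_rows"], payload["rows"])
```

Because the serialized payload depends on the window size rather than the number of rows, the cost of showing the sheet stays roughly constant as the data grows.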

50:55 Yeah, that's a good point.

50:57 This is something that we spent a lot of time on. We were previously using AG Grid as our actual display unit, and it just wasn't working, a combination of probably us not implementing it 100% how they might implement it and them not optimizing for these ginormous data sets. They in particular just spent a huge amount of time recreating the entire grid from scratch so we could have complete customizability over it and show as much data as possible.

51:24 Nice. Sometimes you got to do that, right. You're like this control is amazing, but we've outgrown it and bite the bullet and just do it right?

51:31 Exactly.

51:32 Yeah. Now, interesting question from Samir out there: can I use Mito in VS Code? And I'll expand on that just a little bit. Can I use it in some of these other tools that are not exactly notebooks? So VS Code has kind of its own way of presenting and showing notebooks. We've got PyCharm, and we've got DataSpell, which is a brand new data science IDE thing. What's the story with these environments?

51:58 So unfortunately, right now, Mito only works in JupyterLab, but this question is something that comes up all the time. I think the places we hear the most interest are VS Code and Google Colab, and we're definitely excited and really want to support those environments as well. And I think we've done a lot of work internally on how we design Mito to make it extendable.

52:20 We have a lot of functionality that we're trying to pack into the tool, and then handling these new environments is a decent amount of development work as well. So it's all a prioritization game at this point.

52:31 Yeah, of course.

52:32 Short answer is we don't support them now, but in the future, we definitely will.

52:36 Cool. All right. Two other areas quickly I want to touch on. One, tell us a little bit about how this is implemented internally.

52:43 I don't know how it's implemented, but I'm guessing that it's somewhat like a lot of Jupyter stuff. Like, I want to do Jupyter things for Python, but I've got to write them in JavaScript. Is that the story here as well?

52:53 Actually, your earlier comment, imagine it's just like a slider widget that you can use with a graph in JupyterLab, was spot on. The ipywidgets interactive widget or whatnot? We are actually just a very fancy ipywidgets interactive widget. In practice, how that actually works, for your audience if they're interested, is there's two pieces to our code base. There's a JavaScript front end and there's a Python back end. The JavaScript front end is the sheet that you see; it's the buttons that you click. And what that actually does is, it's a very thin wrapper that just sends a message to this mitosheet Python package in the back end and says, hey, I just clicked this add column button, you should add a column, excuse me, to this data frame. And then that Python processes that message and responds to the front end and says, okay, great, display the new data frame and also write this code to the cell below. And that's kind of the high level of what happens there. I guess it can get more complex and nitty-gritty. In practice, we're a React code base. We use TypeScript because we like strong typing and we kind of hate Python's weak typing.
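The front-end-to-back-end exchange described here can be sketched as a single handler on the Python side: it receives a message, applies the edit to the data frame, and returns both the new frame and the line of code to write into the notebook. The message fields (`event`, `column_name`) and the `handle_message` function are hypothetical; the real mitosheet message format is not described in this conversation.

```python
import json

import pandas as pd


def handle_message(df: pd.DataFrame, raw_message: str):
    """Apply one front-end event and return (new data frame, generated code line)."""
    message = json.loads(raw_message)
    if message["event"] == "add_column":
        name = message["column_name"]
        new_df = df.copy()  # leave the previous state intact
        new_df[name] = 0    # new columns start out filled with zeros in this sketch
        code = f"df[{name!r}] = 0"
        return new_df, code
    raise ValueError(f"Unsupported event: {message['event']}")


# What the JavaScript side might send after an "add column" click.
df = pd.DataFrame({"a": [1, 2]})
raw = json.dumps({"event": "add_column", "column_name": "b"})

new_df, code = handle_message(df, raw)
print(new_df)
print(code)
```

Keeping the handler a plain function of (state, message) is one way such a back end could later emit something other than pandas code, which is the flexibility mentioned earlier in the Dask discussion.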

53:48 But do you use type annotations on your Python side?

53:50 We are gradually adding them to our code base. We don't currently type check. Actually, we mostly use them as IDE support to make things easier.

53:57 That's the main thing I use them for as well, because a lot of times the Ides will show you the errors if you make them anyway, right?

54:04 Yeah.

54:04 No, you get some support. But Pandas typing support, and obviously Pandas is the main library we interact with, is not perfect in all cases. It's a very complex library, for sure. But effectively, in those cases, things kind of break down, and the errors that you get are sometimes false positives, sometimes false negatives. And you can shoot yourself in the foot sometimes.

54:22 Yeah.

54:23 Interesting. Okay. So it's a blend. It's the JavaScript, React, TypeScript front end and then the Python back end.

54:28 Yes. Exactly.

54:29 That question of, let's say, that stratification and that architecture is exactly what's going to have to evolve as we move into other places, like Google Colab, VS Code, et cetera. They all have slightly different extension architectures, and so architecting our code bases so these things are separable and reconfigurable in the ways that other data science environments expect is something that we've been trying to do. But a plea to everyone who's developing data science IDEs: settle on one extension environment, please. I know it's never going to happen, but it would be really nice for us as extension developers. We'd love it, for sure.

55:02 Right. Or making an adapter, right? Somebody creates a thing so that if you have a Jupyter-y UI and you want to put it in Google Colab, you just insert this thing and talk to it, and then magic happens.

55:12 Yeah. Shims when they work are great. So if someone's done that, let us know. Please reach out and we really appreciate it.

55:18 Yes. Super cool. All right. Another thing I do want to make sure that we touch on a little bit is up here at the top. I see plans. So this is not, for every possible use case, a free tool, right? You have a free version and you have a higher-order paid version for teams, if I'm understanding that correctly. Right?

55:36 So we have a free community tool. 91% of our users are free. Please download it. You can install it. It's free. We work with some larger organizations in a more bespoke manner, doing custom development, custom integration, and there are payments happening there with some of those larger enterprises. What we're building out now is that middle piece, where we want to have a plan for teams, maybe with better security, some free development hours, things like that. That business model is evolving, but it'll probably be run as a SaaS model, where you're paying $10 a month or something.

56:14 It does seem like some kind of online system. I mean, you go to Jupyter through a browser anyway, some kind of system that's really already configured because you're helping people come into Python who probably don't totally want to pip install and manage their path and activate virtual environments and all those kinds of things kind of, like, really help them there. Right.

56:34 Totally.

56:34 And that's one of the things we do when we work with some of our enterprises: help them with the setup of the JupyterHub environment, help them get the packages they need, help get Mito installed, obviously. But I was going to say to viewers, we're definitely looking to partner with more organizations or teams. If you want to reach out, my email or something will be somewhere.

56:51 Yeah. We'll put your contact information in the show notes.

56:53 But yeah, we're definitely looking to work with teams now, as we have a really strong user base.

56:58 Yeah. Cool. So I'm glad you guys have some kind of business model, because a lot of these things, they come and then people kind of lose interest, and then they go and there's a real big difference of this is my job and my investment. So I'm really going to work on it versus this is the thing I'm kind of excited about for a few months. So it's cool you got a free plan for people to use that. That's awesome. It's also cool that there's a path to support to just make it better.

57:21 Yeah. We're here for the long haul.

57:23 And the other thing I'll add here is, I in no way mean this as a knock on the hundreds of amazing data science tools that are out there. But I think there's this level of polish that we really feel a desire to reach with our tool. When you're delivering a tool to a paying customer, there's often a different expectation that comes from the paying customer, and we do our very best to hold ourselves to the highest standard possible. But when somebody who is paying you reaches out and says, hey, this button doesn't look the way it should, that extra level of polish really kind of kicks the tool over the edge. And all of that feature development ends up getting pushed back to the individual users. And so really, the free users, I'll say, are most of the people who use our tool. And so really, we're trying to build the best tool that we can here, and making sure that we can do that sustainably, long term, and really invest in what we're doing and build a team around it is something that really is necessary if we're going to deliver on what we think we can and the promise.

58:13 Yeah.

58:14 Cool.

58:14 Another nice thing about the paid plan is just that we get to work in a much closer relationship with those customers. So that's where we get to zone in on specific use cases, around financial services or around bio research. We get to work on specific features and specific workflows that we definitely wouldn't otherwise. I think it's going to really benefit everyone.

58:34 Huge. Yeah, very cool. I'm fascinated with the different ways people are working and operating in the open source space or building on top of open source tools to create businesses. We've got Anacondas, we've got the MongoDBs and stuff out there, so yeah, good luck to you guys. I'd like to see you succeed here.

58:52 Yeah.

58:52 Thanks so much. And the last thing I'll say is we do our best to give back to open source tools as well, especially the ones that we work on. So you'll see me sometimes being annoying, opening issues on GitHub. And I think that's another big piece of this: as we build on open source tools, making sure we contribute back to them in ways that are meaningful and actually helpful is certainly really important as well, and definitely something to say.

59:12 Something fantastic.

59:14 All right. Well, I think that's about it for the time that we have to talk about extending notebooks and Mito, all this really cool stuff that you all built. Before I let you out of here, there's the final two questions. However many, in what order and whatnot you want to take this, just jump in. A notable PyPI package out there, something you've come across where you think: this library is awesome, it doesn't get enough attention. Anything coming to mind?

59:34 Yes. Honestly, something that a lot of our users use, and Jake, actually, feel free to hop in after me, but I would say Pandas Profiling. It's a tool that does somewhat similar stuff to us, but it's a super great tool for many of our users, and I don't think a lot of people know about it, and it works right in JupyterLab as well.

59:48 So which is cool. I don't even know if it's actually still being supported or developed at all.

59:52 But I read lots of

59:54 blogs, newsletters. If you look up Lux, Python, it's cool. It does, like, automatic graph suggestions, so you can pass in a DataFrame and it will suggest, sort of give you options of, visualizations you can just click on and use. It's a really quick thing. It's not the most, like, fully fledged package, but for what it does,

01:00:09 I think it's really good. Now, these little things that people don't know about, they're like, oh, that's cool. I'm going to go check this out.

01:00:14 It might help.

01:00:14 Yeah.

01:00:15 Python API for Intelligent Visual discovery.

01:00:18 Not to be exact, but we're really close, and we love the Deepnote product. It's pushing notebooks forward, trying to add collaboration, live collaboration like Google Sheets and Google Docs. It has a potentially more friendly interface than Jupyter notebooks, which are a little bit bare bones at times.

01:00:38 Okay.

01:00:38 There's also another notebook.

01:00:40 Yeah. Really cool. I've spoken to the Deepnote people just a little bit, and they're doing cool stuff for sure.

01:00:44 Obviously, there's another notebook called Hex. We got to talk to the founders recently about some potential collaboration, but they're doing cool stuff as well.

01:00:51 Okay. Fantastic. All right. And then if you're going to write some Python code, notebooks obviously, but if anything else, what editor do you use?

01:00:59 We're big VS Code users. Okay.

01:01:01 All three of you guys.

01:01:01 Jake, when he dabbles.

01:01:06 I was thinking about this recently. It's like, I wish at school someone taught a class called Actually Doing Software Development in the Real World, because I feel like I lived my whole school life writing Java code in Eclipse and crashing my computer. And then I started using VS Code, and it was like this transcendent moment of bliss where I was like, oh, programming can actually be fun. And it turns out the tools I was using just made my computer heat up to 1000 degrees and burn my lap.

01:01:30 Yeah.

01:01:30 I really agree with that statement that there's a bit of a mismatch between what is taught in sort of computer science and what is expected of people when they get out there in the world. It might not be as academically highly valued, but working with tools like VS Code and PyCharm, or these other tools that help you write code better and quicker, and some of the software engineering side, I think that really could be valuable.

01:01:54 No, totally. And also, the other thing I'll say is for developers like us, maybe, who came out of school and moved into a startup and didn't have a ton of experience, let's say, writing Python in production in the wild. The other thing I would highly recommend is continuous integration. You can set it up through GitHub, or GitLab, I'm sure, if you use that as well. Testing your code automatically on a server meant huge productivity gains for us and really increased our confidence that we're able to deliver the best possible product. And that's not something anyone ever told us about when we were in school.

01:02:24 No, go implement a database in Lisp. Okay, John, out in the audience, has a quick, funny comment about the editors: Neovim. Of course, that's starting to get some attention lately as well.

01:02:35 Very cool.

01:02:36 I'm scared of anything with the word vim. It scares me, but I'm sure you're superhuman for using it.

01:02:40 But you know how you generate a proper random number or character set? You get a first year computer science student into Vim and then you ask them to quit.

01:02:52 I mean, you've seen the most Stack Overflow question is like, I'm trapped in Vim. How the hell do I get out of here?

01:02:58 So funny.

01:02:58 It's like half of what Stack Overflow does is answering just that question specifically.

01:03:02 Exactly. For sure. That's great. All right, so final call to action. You guys, people are excited about this. How do they get started? Where do they go from here?

01:03:10 trymito.io, which is a documentation website. And if you're an organization or enterprise, just reach out to me at my email, jake at sagacollab.com, and we'll have the links here. Or go to the Plans page; there's a link there. But yeah, the documentation, docs.trymito.io. Download it.

01:03:24 Yeah, right on. Also, I'll throw out there, while you're on the docs, watch the videos. That's a quick and easy way to really see what it's about. Before we wrap it up here, Mr. Hypermagnetic has a little comment, like: Vim is the 8th.

01:03:38 It's weird, though, because it's not just the 8th of the deadly sins. It's the 8th of the deadly sins that also, like, 10% of the population swears is the greatest thing since sliced bread. So it's like half the population, like me, I'm terrified of the damn thing. But my father is like, son, like my dad from the 80s is like, son, have you heard of this? I'm like, dad, please. I can't take this right now.

01:03:56 Yeah, it's amazing. My co-host on Python Bytes right now, he's all about Vim. Everything is Vim. It's great, but I did some Emacs and then I kind of did some other, more UI oriented things. Awesome. All right. Well, Jake, Aaron, it's been fun to have you here. Congratulations on this project. I think it's going to help a lot of people get into Python and data science quicker, and more easily also.

01:04:17 Yeah.

01:04:17 Thanks for your time.

01:04:19 Talk soon.

01:04:20 Bye bye bye.

01:04:22 This has been another episode of Talk Python To Me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show. Choose Shortcut, formerly Clubhouse.io, for tracking all of your project's work, because you shouldn't have to project manage your project management. Visit 'talkpython.fm/shortcut'. Simplify your infrastructure and cut your cloud bills in half with Linode's Linux virtual machines. Develop, deploy and scale your modern applications faster and easier. Visit 'talkpython.fm/linode' and click the Create Free Account button to get started.

01:04:55 Do you need a great automatic speech-to-text API? Get human level accuracy in just a few lines of code. Visit 'talkpython.fm/assemblyai'. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async, and best of all, there's not a subscription in sight. Check it out for yourself at 'training.talkpython.fm'. Be sure to subscribe to the show. Open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play and the direct RSS feed at /rss on 'talkpython.fm'. We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at 'talkpython.fm/youtube'. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
