#217: Notebooks vs data science-enabled scripts Transcript
00:00 Michael Kennedy: On this episode, I meet up with Rong Lu and Katherine Kampf from Microsoft while I was at Build this year. We cover a bunch of topics around data science and talk about two opposing styles of data science development and related tooling. Using notebooks in Jupyter or Python code files in editors, like Visual Studio Code and PyCharm. The conversation was a lot of fun and I'm looking forward to sharing it with you. This is Talk Python To Me, Episode 217, recorded May 8th, 2019. Welcome to Talk Python To Me, a weekly podcast on Python. The language, the libraries, the ecosystem, and the personalities. This is your host Michael Kennedy. Follow me on Twitter where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm. And follow the show on Twitter via @talkpython. This episode is sponsored by Linode and Backlog. Please check out what they're offering during their segments. It really helps support the show. Rong, Katherine, welcome to Talk Python.
01:05 Panelists: Thank you. Thanks so much for having us.
01:06 Michael Kennedy: Yeah, it's great to be here with you and literally here with you at Microsoft Build at the conference. Yes. Doing some live, live recording. So, quite cool. How's the conference going?
01:17 Panelists: It's been great. I'm a little sad because it's so sunny out, but the stuff inside is great, too. Yeah, really exciting news happening here and see a lot of people getting excited, so.
01:27 Michael Kennedy: Yeah, is this an event that you've been planning for a long time?
01:30 Panelists: Every year. We have the back to back PyCon to Build, so it's been a lot of conferencing, a lot of fun announcements. A lot to keep up with, but awesome and a lot of great people.
01:40 Michael Kennedy: Yeah, so all of us, we were also at PyCon, right?
01:41 Panelists: Yes
01:42 Michael Kennedy: Yeah, that's pretty awesome, what did you think of PyCon?
01:43 Panelists: It's great, it was my first PyCon. So, pretty cool out there and then see the community and meet people, so. It was my first sort of community run conference. I've only been to Microsoft ones in the past, so it was very cool to see that and have more interaction with Python users and Notebooks users.
02:01 Michael Kennedy: Yeah, that's cool. I think both the conferences are pretty special and really nice and I enjoy going to both of them but they're really different.
02:08 Panelists: Yeah, definitely. Oh yeah.
02:10 Michael Kennedy: For people who haven't gone to Microsoft Build, like, the experience walking around here is, like, all this cool stuff, and there's, like, robotics and other kinds of neat things going on, but it's mostly put on by the different Microsoft teams, right? There's a section or, the Visual Studio for Mac people, and you can go talk to the people that work on it and that's great. And you go to PyCon and it just doesn't have like, that centralized structure, right? It's just, like, a thousand flowers blooming and just, you know, whatever's happening's happening out there, so it's a little more chaotic, but a little more independent as well. I don't know, I think they're both fun and they're both a cool experience.
02:45 Panelists: There are some nice surprises as well at PyCon. Like, as I walk in, I walk through the booth area, like, at Build you kind of expect all these Microsoft teams there, but at PyCon you'll see different companies showing up there because they use Python for development. And, like, really unexpected companies there, but it's fun.
03:06 Michael Kennedy: Yeah, it's super interesting.
03:08 Panelists: Very different.
03:08 Michael Kennedy: Yeah, you know, one of the parts I really like is the first two and a half days, like, the opening night and then the first two days are the expo hall days. We've got a booth. You all had a booth. Were you at the booth?
03:21 Panelists: Yeah, I was. Oh yeah. Many hours at the booth.
03:23 Michael Kennedy: Yes, the booth is a blessing and a curse. I was at a booth as well, have been for the last couple years. And it's great, you meet so many people, but also you don't get to experience as much of the rest of stuff going on. But then the next day, they take that down as the poster section, and then the job fair.
03:39 Panelists: Right.
03:40 Michael Kennedy: And the job fair, I think it's pretty revealing because you can walk along and see all these companies are hiring Python developers, data scientists, web developers, whatever. And it's super clear and obvious kind of like what they're up to. So you can kind of really put your finger on the pulse of a community by just walking through the job fair.
03:57 Panelists: Yeah, it was super crowded. So.
04:00 Michael Kennedy: Yeah, I think that's...
04:01 Panelists: The community is super engaging and super fun.
04:04 Michael Kennedy: Yeah, yeah, absolutely. I think that's for sure. Both of these events are super cool and I'm just glad we're all at them, that's great. So let's start with your story and I guess, Rong, I'll start with you. How did you get into programming and Python?
04:16 Panelists: I actually, about 10 years ago when I was in college, I actually majored in data mining. And then back then I learned C. But after I graduated, I joined Microsoft, I mostly worked on C# related developer tools, and then moved to C++. And just recently, I just moved from C++ team into the Python teams, like all these people are talking about Python, I got to see what's going on there. And then, of course, my own knowledge from 10 years ago about data mining, that no longer applies. Like so many new things happening in the AI data science world and that's this is super exciting.
04:52 Michael Kennedy: Sure.
04:52 Panelists: And that's why I got into the Python team. I was like hey, maybe I can use them on my own knowledge here but lot of learning for me actually. Given how things have moved.
05:02 Michael Kennedy: Yeah, I'm sure, I mean, it's so, changing so fast. What was your impression of coming into Python from these more statically typed, compile type languages and all that?
05:13 Panelists: I definitely feel like the level of experience required to get started is very different, coming from a C++ world. And I see all these users we have, and even younger kids, they start to learn about programming and Python is the first language that they learn. And up to people who are just using Python for various different projects in real life, and that's super cool. And there are a lot of applications in real life that could change people's daily lives using the power of Python, like...
05:46 Michael Kennedy: Yeah, that's for sure.
05:46 Panelists: Earrings. Predicting different things.
05:50 Michael Kennedy: Yeah, did the white space drive you crazy at first?
05:53 Panelists: A little.
05:53 Michael Kennedy: And now?
05:56 Panelists: Now, getting better.
05:56 Michael Kennedy: Good, good.
05:59 Panelists: But yeah.
06:00 Michael Kennedy: Cool, Katherine, how about you?
06:01 Panelists: I kind of had a similar experience. I started programming just randomly took a course in my high school and then ended up majoring in computer science and that was on C++. And I remember, one of my years I ended up in an AI course. And the course was mainly around Python. And I remember I like hated Python at first. I was like, this feels too easy, what's going on here?
06:21 Michael Kennedy: Someone's tricking me, I don't know what's going on.
06:24 Panelists: The white space things. This is, no, like I want my C++. And then over the course of the semester, I just fell in love with Python. And then when I started at Microsoft, I was exposed more into the Big Data landscape. So using things like PySpark and getting exposure to Notebooks there, and then yeah, Python is very much grown on me. My friends from school always laughed because they're like, remember when you hated it and now you're on the Python team?
06:46 Michael Kennedy: Life takes a weird turn.
06:47 Panelists: Oh, yeah.
06:48 Michael Kennedy: Yeah, yeah, for sure. It's like no, I want to use a real language. Compiling and linking and headers, come on.
06:53 Panelists: Space doesn't matter, Sir Michael.
06:55 Michael Kennedy: That's right, I write it on one line. Yeah, that's pretty cool, I love it. So, Rong, you both work at Microsoft but you don't work on the same projects or the same team, right?
07:06 Panelists: We are on the same team. Same team. Different projects. Different projects.
07:09 Michael Kennedy: Different projects, okay. So, Rong, let's start with you, what do you do day to day?
07:12 Panelists: Yeah, so I am a program manager on the Python team at Microsoft. Primarily I'm focusing on building the developer experience around data science and machine learning inside Visual Studio Code. Visual Studio Code is our lightweight cross platform editor. And it's been pretty good for general Python development already. But now we're adding a lot of new things around how to make it easier for people who do data exploration, analysis and machine learning. How do we bring all the power of Jupyter Notebooks into an editor so you can use it for everything you do? So there are people who prefer an editor versus notebooks. That has been an ongoing debate in the community and that's definitely their personal preference. But we also see a place where VS Code can be where you take your notebook file after you're done with your experimentation and you're ready to turn your code into production code. And this is where you can kind of bridge the two worlds together and VS Code can play a role in that. So that's the new area that we're investing in, and that's what I've been working on since I joined this team.
08:23 Michael Kennedy: I think it's cool that you're working on it because I feel like there's a bit of a divide, not in a negative way, but there's certainly people that will open up a notebook in Jupyter, JupyterLab, something like that, to do exploratory data. And there's people who write production code that's structured with tests and coverage and all of that stuff. And you know, going between those, I don't know what people do today but it feels like well, you take the notebook and just copy bits over it. Is that pretty much how it goes?
08:50 Panelists: Right, yeah, so we're trying to hopefully minimize the manual copy and paste between the two. So when you have a notebook ready to turn into production, we can help you import a notebook into Python code, in VS Code directly, so you don't have to do all the work yourself. And then once you're in VS Code, of course, you get all the editing features that you expect. There's IntelliSense, refactoring, debugging, everything in there, so you can leverage a developer tool to finish your developer work.
09:17 Michael Kennedy: Yeah, that sounds awesome. And Katherine, how about you? You're on the other side of the design? Do you fight?
09:22 Panelists: Yeah, we fight. She's on the notebook side. Every day, eight a.m., Rong and I face off to see who's going to win that day. So I'm a program manager on the Azure Notebooks team. And Azure Notebooks is basically an Azure hosted, free, no setup required Jupyter Notebooks environment. So in my day to day I spend a lot of time talking to customers, learning about their experience, whether it be in an editor like VS Code or in notebooks and learning how we can make that better. And what our next generation of Azure Notebooks should look like to enable every set of scenarios from educators to enterprise data scientists. And learning about any features or things that they need to be happy on Azure Notebooks.
10:04 Michael Kennedy: Yeah, awesome, that sounds like a really fun project.
10:06 Panelists: It's a blast, yeah, I love it.
10:07 Michael Kennedy: Yeah, cool. So I want to talk a little bit about just, kind of notebooks in general and find out where you think they're really useful or what's sort of coming next in it. And I just want to give a quick shout out to an article in The Atlantic, I think, which I've talked about a few times, but it's called The Scientific Paper Is Obsolete and Here's What's Next. And it's got like this, sort of traditional paper, like literally on fire, this academic paper. Talking about how these computational notebooks, Jupyter notebooks and others, are really super important for reproducibility, and sort of interacting with science, even. And the paper's hypothesis is that the next thing is these notebooks, Jupyter notebooks and whatnot, for helping scientists write a paper that can be understood basically, right? Because if you don't publish the code but here's the graph, like have you really published your work, right? Katherine, I guess maybe start with you. What are your thoughts on the role of notebooks and science these days?
11:07 Panelists: Yeah, I think notebooks is such a cool thing to be a part of right now. And I think it's awesome that the Python community was kind of early on to attach the new Jupyter Notebooks because I think they really haven't reached their widespread potential. Like you can think about it in academic papers we've talked about. Every month we do business reporting documents just internally and have graphs of what our growth rates look like, but what if that was just a notebook and you can dive deeper and see that data hands on? So there's scenarios like that all the way to, you know, we have a security team at Microsoft who's building their anomaly detector and threat detection, threat hunting, all within notebooks. So I think it's just super cool to be a part of and see the vast array of use cases for notebooks.
11:49 Michael Kennedy: Yeah, it sounds like maybe it's not even really shaken out, right?
11:52 Panelists: Yeah, exactly.
11:53 Michael Kennedy: We kind of know what the use case for code looks like, right? Like regular executable files, websites and whatnot. But I feel like with notebooks, there's just all these different attempts to make use of them. Could you treat them as like units that can be combined as like a function or something? Or like are they a paper that's better?
12:13 Panelists: Which is why, yeah, it's cool to see, I mean we get within our community it's easy to see the applications and data science. But then it's always great to hear outside perspectives, people coming from different languages or different areas of technology and everything and thinking about how they can use this as a platform for experimentation or sharing work and collaborating. So it's very, very inspiring.
12:33 Michael Kennedy: Yeah, yeah, that's pretty awesome. Rong.
12:35 Panelists: Absolutely, I think the interactivity is what Jupyter is most powerful at. Like the fact that we can turn a static report or paper into something that everybody can go and execute. That's super cool. We have seen a lot of people using notebooks for teaching because that's like, well organized documentation plus real code. That's super easy for a teaching setting. And we also use that internally as we report numbers and view dashboards. What if everything is interactive and other people can go and pivot the data independently without having to go into like a pile of code? So that's really, really, really powerful and that's why it's so popular in the data science community. Absolutely. A great thing to have and I think that adoption is going to just keep going given how easy it is to get started. And I think one of the things you mentioned earlier, just reproducibility, I think that's one of the things Notebooks simultaneously does really well and also poorly, in terms of like environment sharing. And that's something we've been thinking a lot about just in our platform. If I'm sharing a notebook with Rong, dependencies get so complex. How can I make sure that Rong has everything she will need to run this notebook? And then even with data sets, if you share a report that connected to a database that was once storing this data from May 2016 and now it's June 2019, how do you handle those cases?
14:05 Michael Kennedy: Right and the report says such and such and like marked down below in a cell below it but it's like the data is not the same anymore, right? That's actually, that's an interesting idea like the life cycle of it, right?
14:16 Panelists: Yeah, I think that's one thing that Azure Notebooks could potentially help with, because it runs in the cloud and we could potentially have pre-configured environments so you can make sure your notebooks will always produce the same results. Yeah, so right now you can use environment setup scripts and import your requirements.txt or environment config. But in the future we definitely want to make that as seamless as possible so I can just share, whether it be an Azure Notebooks project or a GitHub repository, with you that has my requirements and it'll just be in a container and everything's good to go.
14:46 Michael Kennedy: That's pretty cool. I mean, even with the requirements.txt file, right? It might say, we need these things to run this notebook but they might not be pinned, right? And so, like you run it again a year later, one of those things have become incompatible with the code that was written or something like that, right? It's a challenge and as a data scientist, you probably don't think about like, oh, I better pin the versions to my dependencies, you're like oh, look, it's a graph, it's working.
15:10 Panelists: Right. Don't change it, don't touch it, that's my version. But of course later on, if the package has newer versions, then you have to upgrade your notebook to make sure it still runs and stuff like that. It is kind of a pain to manage.
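The pinning problem described above can be made concrete with a small check. This is only a minimal sketch, not a tool mentioned in the episode: the function name `unpinned_requirements`, the sample package list, and the rule that only `==` counts as pinned are all assumptions made for illustration.

```python
import re

# Matches a requirement pinned to one exact version, e.g. "pandas==0.24.2".
# Assumption: anything without "==" is treated as unpinned.
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\]]+\s*==\s*[^\s;]+")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical requirements.txt contents, for illustration only.
sample = """\
numpy==1.16.3
pandas>=0.24
matplotlib
# plotting extras
scikit-learn==0.20.3
"""

print(unpinned_requirements(sample))
```

Running a check like this before sharing a notebook flags the entries that could resolve to a different version a year later.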
15:23 Michael Kennedy: It's pretty tricky. So I want to dig in more to Azure Notebooks but you said there's like this battle you all do at eight. So I think it's Rong's turn to say a little bit about, if not notebooks, if it's going to be more like code editor style of work but still in the data exploratory way. You talked about this other thing that you're doing, what's that look like?
15:46 Panelists: Yeah, essentially, to summarize what we have done is we brought the power of Jupyter Notebooks into VS Code. So today, in Jupyter Notebooks you can run piece of code and get your results right away, whether it's a data frame or it's a plot or just text markdown. And imagine can do all that inside VS Code. It's slightly different view but you get the same results.
16:10 Michael Kennedy: Is it actually the same format, the ipynb file? Or is it something different?
16:14 Panelists: So actually, at the moment, the scenario we support is you work with Python scripts. So once you have your Jupyter notebooks ready, you're ready to import, we grab all the code in there and put everything into a Python file. So you're literally working with a Python file.
16:31 Michael Kennedy: Right, just a .py file?
16:32 Panelists: We have converted this...
16:32 Michael Kennedy: Yes.
16:34 Panelists: To .py, yes. So the idea is, hey, this is really an editing centered experience. And the fact that you can visualize all those things like plots and data frames and all that, that's like an add-on bonus. But you are really working with...
16:48 Michael Kennedy: Yeah, do you have a demarcator for the different cells? Like...
16:51 Panelists: Yes.
16:52 Michael Kennedy: A double hash or something?
16:53 Panelists: Right, a single hash.
16:55 Michael Kennedy: A single hash, okay.
16:56 Panelists: A single hash, percent percent is the magic command. Not command, a comment style, a special comment you can put in Python file. Then we're going to recognize, oh, there's a cell. And then we do magic things. So we will visualize, like, kind of visually we'll highlight those cells in the code editor. So you kind of get a feel of, like Jupyter Notebooks. But then you can also run single cell using Shift+Enter. Like that's the key that people use, Shift+Enter.
17:23 Michael Kennedy: Right, that's normally the key in Jupyter.
17:24 Panelists: Yeah, you can do the same in VS Code. And we show all the results in a separate window in VS Code. That's separate from your code. That's just the design we decided to go with. But there's no reason we can't show results in line, because we have got some requests around that. So we're trying to bring the two to work together. I think they're very different users who want different things depending on their background. People coming from Jupyter Notebooks are more familiar with the in line results. But those who come from a software engineering background, who are used to working with editors and code files, they're more like, oh, I just want something on the side, so I can see all the history, like all the results, and I can always go back. So nothing is getting replaced in line.
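For readers who have not seen the cell comment described above, it is written `# %%` in a plain `.py` file. The following is a minimal sketch of how such a file divides into cells; the helper `split_cells` and the sample script are hypothetical and are not how the VS Code Python extension is implemented.

```python
# The special comment that marks the start of a cell in a .py file.
CELL_MARKER = "# %%"


def split_cells(source):
    """Split script text into a list of cells at each '# %%' marker line."""
    cells, current = [], []
    for line in source.splitlines():
        if line.strip().startswith(CELL_MARKER):
            if current:  # close the previous cell before starting a new one
                cells.append("\n".join(current))
            current = [line]
        else:
            current.append(line)
    if current:
        cells.append("\n".join(current))
    return cells


# Hypothetical script with two cells, for illustration only.
script = """\
# %%
values = [1, 2, 3]

# %%
total = sum(values)
print(total)
"""

cells = split_cells(script)
print(len(cells))
```

Because the markers are ordinary comments, the file still runs top to bottom as a normal Python script when no editor is involved.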
18:15 Michael Kennedy: Right, because they want more of an editor experience.
18:17 Panelists: Right, that's more the editor experience. So I think that we just provide different options for people who come from different backgrounds. So you can pick if you want a notebook, if you want a notebook style in VS Code, you can do all that.
18:29 Michael Kennedy: This portion of Talk Python To Me is brought to you by Linode. Are you looking for hosting that's fast, simple and incredibly affordable? Well look past that bookstore and check out Linode at talkpython.fm/linode. That's L-I-N-O-D-E. Plans start at just $5 a month for a dedicated server with a gig of RAM. They have 10 data centers across the globe. So no matter where you are or where your users are, there's a data center for you. Whether you want to run a Python web app, host a private git server, or just a file server, you'll get native SSDs on all the machines, a newly upgraded 200 gigabit network, 24/7 friendly support, even on holidays, and a seven day money back guarantee. Need a little help with your infrastructure? They even offer professional services to help you with architecture, migrations, and more. Do you want a dedicated server for free for the next four months? Just visit talkpython.fm/linode. Well, you'll have to help me out with this one because I haven't done a lot of teamwork on notebooks, but I hear that merging notebooks can be challenging. Especially if someone's rerun a cell and it has a new output.
19:37 Panelists: Oh, yeah.
19:37 Michael Kennedy: Right, because it, you know, called an API and it's like, called the weather, but they ran it and the weather's now different, and so that's like a merge conflict and stuff like that. Is this the problem?
19:47 Panelists: This is a problem. The way notebooks are stored, an ipynb is just kind of a JSON file. And it stores, you know, the output and specific metadata about your file as well, so you can go and change one line, and you'll suddenly, you know, try to merge it and you'll see eight lines or like 80 lines, and everything's moved around. So this is, yeah, one of our big focuses in the near future for Azure Notebooks: how do you make sure that source control is as easy as possible for notebooks? And we were just talking to a user last week, who was like, we tried to use GitHub with notebooks but then it got to be complex. So, you know, they were a team of four, they all created their individual notebooks, and then in the end, one person was tasked with copying the bits and pieces that were good. They got the short end of the stick. Yeah. So you know, we were talking to a bunch of users, as I said, like that, to understand, not just having GitHub integrated right here, because that doesn't solve the issue with the formatting of notebooks and that still doesn't make it perfect, but making sure that we have a complete, super easy end to end experience for that problem in the notebooks world.
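To see why rerunning a cell creates diff noise, it helps to look at what the JSON holds. The sketch below is one possible workaround, clearing outputs before committing, and is not something the panelists describe Azure Notebooks doing. The function `strip_outputs` is hypothetical, the in-memory notebook is a hand-written stand-in for a real `.ipynb` file, and `get_weather` only appears inside the cell's source text.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs cleared."""
    cleaned = json.loads(json.dumps(notebook))  # simple deep copy via JSON
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # results change on every rerun
            cell["execution_count"] = None  # so does the run counter
    return cleaned


# Hand-written stand-in for the contents of an .ipynb file (nbformat 4 layout).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "source": ["print(get_weather())"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["sunny\n"]}
            ],
        }
    ],
}

clean = strip_outputs(notebook)
print(json.dumps(clean, indent=1, sort_keys=True))
```

With outputs and execution counts removed, two people who ran the same cell at different times produce identical JSON for that cell, so only real code edits show up in a merge.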
20:51 Michael Kennedy: It sounds tricky and it sounds like the code thing that you guys are doing maybe sidesteps that?
20:55 Panelists: Right. We tried to solve that by avoiding notebook files. That's why we convert everything into a .py file. Now you can use source control and everything around it. Just plain code.
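The conversion to a `.py` file can be sketched in a few lines. This is an illustration under stated assumptions, not the VS Code import feature itself: the function `notebook_to_script` is hypothetical, the input is a hand-written notebook dict in the nbformat 4 layout, and the `# %%` and `# %% [markdown]` comments are the cell markers assumed for the output.

```python
def notebook_to_script(notebook):
    """Render a notebook dict as Python source with '# %%' cell markers."""
    chunks = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a string or a list of lines
            source = "".join(source)
        if cell.get("cell_type") == "code":
            chunks.append("# %%\n" + source.rstrip("\n"))
        elif cell.get("cell_type") == "markdown":
            # Keep prose as comments so the file stays valid Python.
            commented = "\n".join("# " + line for line in source.splitlines())
            chunks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(chunks) + "\n"


# Hand-written stand-in for a two-cell notebook, for illustration only.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["Load the data"]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)"]},
    ]
}

script = notebook_to_script(notebook)
print(script)
```

The result is plain text with no stored outputs, which is why ordinary source control tools handle it without the merge noise discussed earlier.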
21:07 Michael Kennedy: Right, and then the output goes somewhere else.
21:09 Panelists: Yes.
21:10 Michael Kennedy: And it's not part of the diff or whatever.
21:11 Panelists: Right, right. But you can actually export the results as a notebook file, if you want, to go back, like if you want to share the results with someone, send someone a notebook file. So we joke that Rong and I fight but really, we work really closely together because so many scenarios are, I explore in notebooks, now I want to build it as a .py file in VS Code, and maybe I make some changes that I actually want to now share as an updated notebook. We work very hard to make that complete end to end scenario pretty easy.
21:39 Michael Kennedy: Yeah, that's cool. I guess while we're on the subject maybe we could talk about collaboration. If I'm working on an Azure notebook, I go over there, can I share that, kind of Google Docs style with collaboration?
21:52 Panelists: Not yet. We're thinking about how we could maybe leverage something like VS Code Live Share where you could do live sharing with individual people and watch them edit, all within the same document. But this is something that has come up in a few notebooks products before and then sometimes been rolled back because it's not super easy with the formatting. But yeah, so right now in Azure Notebooks you can share a notebook publicly and you can clone it and we can collaborate that way. But not yet with the live co-editing.
22:18 Michael Kennedy: Cool.
22:19 Panelists: That's something you can do in VS Code with the Visual Studio Live Share feature.
22:23 Michael Kennedy: Yeah, maybe people don't know about that. I did talk to Dan Tiller in the previous episode. But, you know, maybe people didn't hear.
22:28 Panelists: So Live Share enables real time co-editing and co-debugging across different machines, across different OSs. So the idea is, any number, I guess, up to 30 at the moment, any number of developers can collaborate in real time. And it's not just screen sharing. It's a real session being shared. So anyone can make changes to the same codebase and everybody else can see it and you can run the code and debug and everyone else can see that too. So essentially you can enable a lot of scenarios around, like, pair programming, or just remote troubleshooting. And we also see that being used in classroom settings too. Like a teacher is hosting a session with 30 students looking at the same thing. And that's pretty interesting, that actually generated a lot of interesting scenarios.
23:21 Michael Kennedy: Yeah I'll bet it did, like office hours. But I'll help you by just being on your system. Basically, right?
23:26 Panelists: Yeah, absolutely. So we actually extended that functionality beyond the basic support. So now we have added the Jupyter support on top of Live Share as well. So now you can imagine you're working on your notebook, you want some help and somebody else can connect to your session, you can run a cell, and both of your machines are going to see the plots, the data frames and everything the same on the screen. So that's pretty cool.
23:52 Michael Kennedy: Yeah, that seems super cool. One of the problems of course, especially I think, in data science would be the actual data, right? If you have a gigabyte of data and then you're talking to that from your notebook or from these notebook-like py files. It's one thing to say, well, yeah, you can just, you know, grab the notebook or grab the Python file out of GitHub, but then there's a gig of data. You got to get to them somehow, right? So being able to connect in and just sort of run, run in place would be cool.
24:19 Panelists: Yeah, actually everything happens on the host machine. Meaning, if the guest machine doesn't have the same environment setup, that's okay. Because nothing is required on the guest machine. You literally just need VS Code installed. And then all the building and IntelliSense, debugging and all the packages, environment settings, everything is streamed from the host machine. So you can literally see everything exactly the same.
24:45 Michael Kennedy: Yeah, that's really nice. You know, I have some friends who did a lot of pair programming, it was always, well, let's set up a virtual machine, or shared remote desktop so we can both type or other like, weird things.
24:56 Panelists: Yeah, it's a lot of work to get that set up.
24:58 Michael Kennedy: I'm really excited about this, this is cool. And, how about on Azure Notebooks? Can you like, if I have a gig of data, can I upload that in there? Like, what's the story with the data backing though?
25:08 Panelists: Yeah. So you can store things locally in Azure Notebooks within your project. And so if you cloned it you would clone that data with it, and none of that's going to your local machine so you don't have to worry about those things. It's all Azure hosted.
25:19 Michael Kennedy: It happens really quickly.
25:20 Panelists: Yeah, but you can also connect to an Azure Blob store. So for instance, in the demo Rong and I are doing today, we put a bunch of pet images that we're classifying in a Blob store and we pulled them down in Azure Notebooks and then send it out and test it on our web servers from there.
25:36 Michael Kennedy: Yeah, that's cool. I guess that would be the most cloud natural, cloud native, sort of solution is, we'll put it in cloud storage and then talk to it or read it or write.
25:44 Panelists: Right, right.
25:45 Michael Kennedy: Yeah? Okay, that's pretty awesome. So I know that Visual Studio Code is free, right? Free open source and so on.
25:51 Panelists: Yes, everything we do here. The Python extension and the Jupyter support is part of the Python extension. Which is all free and open source.
25:59 Michael Kennedy: You don't need to install anything else. It's just like when the Python extension updates itself then magically...
26:05 Panelists: It's all in there.
26:05 Michael Kennedy: It's in there. Okay, super cool. What about notebooks?
26:09 Panelists: Notebooks is free. You can, we let you run for free on our hosted compute. You can connect to an Azure VM if you want a more beefy machine, GPUs, et cetera. You can connect to an Azure VM, in which case those would be paid but, everything within...
26:23 Michael Kennedy: So I would pay for a fancy VM that has GPUs, and then I would talk to it?
26:27 Panelists: Yes, exactly.
26:28 Michael Kennedy: How does that work, right? Like instead of, I wouldn't just go to my VM I set up and just like pip install Jupyter and do stuff there, right? Like, it sounds like it would be different.
26:37 Panelists: Yeah, so we actually use something called a data science virtual machine in Azure and with that, Jupyter's already installed there. So from Azure Notebooks you just, we have an easy compute picker that will automatically populate with the list of your data science virtual machines that you have access to. You just click on it, hit run and everything just works. You're running your notebook on that VM.
26:57 Michael Kennedy: Yeah that's cool. Are there like, scientists and companies and stuff using that a lot to do cool stuff?
27:02 Panelists: Yeah, we have a bunch of companies as well as education research, we see a lot of researchers using it, especially because, for the free classroom settings, people use that a lot.
27:12 Michael Kennedy: Yeah, that seems pretty cool. One of the things I saw on the Azure Notebooks that I thought was kind of cool is a lot of different languages support it.
27:20 Panelists: Yes.
27:21 Michael Kennedy: Yeah, so what languages can I run there? I mean, there's Python, obviously.
27:25 Panelists: So Python 2 and 3, and then F# and R.
27:28 Michael Kennedy: Okay. So F# is kind of surprising, right? F# is like a functional .NET programming language. Which I haven't really done very much with. But what I didn't see there was C# and I figured if you're going to have F#, maybe you would also just throw in C# for the fun of it. Like, that's kind of interesting. Why not?
27:44 Panelists: It's something we're thinking about. We see so much of our usage is Python. So from that we really focus on making that our optimal experience and that's where most of our energy goes. But we definitely do hear requests every now and again for C#. And it's one of those, similar to what I was talking about earlier, where you see so much Python usage with notebooks, but there's so many other scenarios. And C# is a perfect example. How can you experiment with C# on notebooks? So it's something we'd love to do, we just haven't done it yet.
28:11 Michael Kennedy: Yeah, yeah sure. No, I mean, I think definitely focusing on Python and Python 3, that's like, the sweet spot, right? Of course, but just yeah, the F# being there kind of stood out to me. So if I'm building these notebooks, or even with the VS Code, what are some of the other things in Azure that people are using and connecting to? I know, there's a bunch of other data science machine learning stuff going on there.
28:34 Panelists: Oh, yeah.
28:34 Michael Kennedy: What are some of the cool scenarios?
28:35 Panelists: So the very basic thing we can start with is definitely just beefy machines, right?
28:40 Michael Kennedy: Give me some GPUs.
28:41 Panelists: Right, GPUs, faster machines. And then on top of that, Azure provides what we call data science VMs. Think of those as VMs with data packages pre-installed. So when you go there, everything is already there. Like, Jupyter Notebooks is installed. All those packages are already there. So you can definitely connect to that. We have Windows, we have Linux, different OSs, depending on what you want. So that's kind of the fairly common use case of Azure resources. Of course, you can use storage if you want to store data there. So you can do all that already. On top of that, Azure also provides a service that's machine learning specific called Azure Machine Learning Service. It is a comprehensive set of services that covers the entire workflow in machine learning. So starting with data. Azure Machine Learning Service can help you manage data. We provide what we call a data store, which is part of the service. So you don't have to manage your storage separately. Machine Learning Service knows where to find your data for training, for example.
29:44 Michael Kennedy: I see, you don't have to use the Blob Storage API...
29:47 Panelists: And then you figure out how to connect to that thing, right? So this is part of the whole workflow that Machine Learning Service manages for you. And then moving on, of course, for training, you need more compute, like VMs. So Azure Machine Learning Service can help you manage compute as well, at scale. So you can say, I want 50 machines, up to 50 machines, but scale them back down if I'm not using them. Which is the best part because I'm terrible at controlling my cost. Scale down to zero. Yes, scale down to zero. Or you can scale up to as many as you want, pretty fast.
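The scale-to-zero behavior described here can be pictured with a small sketch. This is plain Python, not the Azure Machine Learning API: the function name `target_node_count` and the one-node-per-queued-job rule are assumptions made purely for illustration; only the limits of 0 and 50 come from the conversation.

```python
def target_node_count(queued_jobs, min_nodes=0, max_nodes=50):
    """Return how many nodes a cluster should run for the current queue.

    Hypothetical rule: one node per queued job, clamped to the
    [min_nodes, max_nodes] range. With min_nodes=0 an idle cluster
    scales all the way down, so no nodes are left running by accident.
    """
    if queued_jobs < 0:
        raise ValueError("queued_jobs must be non-negative")
    return max(min_nodes, min(queued_jobs, max_nodes))


if __name__ == "__main__":
    for jobs in (0, 12, 200):
        print(jobs, "queued ->", target_node_count(jobs), "nodes")
```

The point of the clamp is that forgetting about the cluster costs nothing extra: an empty queue maps to the minimum, and a flood of jobs never exceeds the cap.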
30:22 Michael Kennedy: Yeah, that's cool. And that's good if you forget.
30:23 Panelists: Right, yeah. Which I always do. Especially when they're like, you know, you have like a 16 node GPU cluster.
30:32 Michael Kennedy: Right, keep them all running for days.
30:34 Panelists: Big bill. So it also helps you manage your training jobs. So you have different training runs, and you have different results you want to keep around. So Azure Machine Learning Service can help you manage all that. So for example, you have one experiment that has like 100 repeated runs, each run has different results. And at the end of a run, you generate a different model file. So that's all being stored as part of the service. So you can always go back, look at the model and say, oh, this is the run I did that got to this model. So you can always trace back. So that's all managed. And then of course, once you have the model at the end of training, you can download it to local and do whatever you want with it. But you could also have the Machine Learning Service manage it for you. Imagine if you have multiple models, and they all have multiple versions, that becomes a nightmare to manage. So you can have Machine Learning Service manage all that in a central location. Not only that, but also once you have that registered with the service, it's super simple to turn a model into a runnable service. Like literally, we can do that in five minutes.
31:41 Michael Kennedy: That's cool.
31:42 Panelists: Turning that into a service that's running in a single container that Azure can spin up for you fairly quickly, really five minutes, everything's done. Then you can start using this model already.
31:52 Michael Kennedy: So you talked about this pet image.
31:54 Panelists: Yes.
31:55 Michael Kennedy: So, what are you trying to find from these pet images?
31:58 Panelists: Yeah, so the end result is basically you can send our web service an image of a pet, cat or a dog, not like an obscure pet.
32:05 Michael Kennedy: Not a lizard?
32:07 Panelists: And our model will send back probabilities of different breeds. So if you send a golden retriever, my roommate wanted to test it with her golden retriever from home, sent that, and it was 99.7% confident that it was a golden retriever.
32:21 Michael Kennedy: That's awesome. So you could do that with this thing she's talking about, right? You train it up and you say, turn into a service. And then it's like a...
32:27 Panelists: That's what we do.
32:28 Michael Kennedy: Http post an image to it or something, and then it says here's a JSON result?
32:31 Panelists: Yeah, exactly. Exactly. So we're going to talk about all that in our Build talk. Yes.
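The request-and-response flow described in this exchange might look roughly like the following from the client side. It is a sketch under assumptions: the scoring URL, the raw-bytes request body, and the shape of the JSON reply (a flat mapping of breed name to probability) are all hypothetical, since the episode does not spell out the service's actual contract.

```python
import json
import urllib.request

# Hypothetical endpoint; a deployed service would have its own scoring URL.
SCORING_URL = "https://example.invalid/score"


def build_request(image_bytes, url=SCORING_URL):
    """Build an HTTP POST request carrying the raw image bytes."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def top_breed(response_text):
    """Pick the most likely breed from an assumed {"breed": probability} JSON reply."""
    probabilities = json.loads(response_text)
    breed = max(probabilities, key=probabilities.get)
    return breed, probabilities[breed]


def classify(image_path, url=SCORING_URL):
    """Send one image file to the service and return (breed, probability)."""
    with open(image_path, "rb") as f:
        request = build_request(f.read(), url)
    with urllib.request.urlopen(request) as response:
        return top_breed(response.read().decode("utf-8"))
```

Keeping the request building and the response parsing in separate functions means both can be checked without a live service; only `classify` touches the network.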
32:36 Michael Kennedy: Oh, yeah.
32:37 Panelists: Which is now available online at the time of this podcast.
32:40 Michael Kennedy: Yeah, we can put a link to it in the show notes actually. Thanks to the power of time travel and recording, and all that. This portion of Talk Python To Me is sponsored by Backlog from Nulab. Developers know the importance of organization and efficiency when it comes to collaborating on a team. And Backlog is the perfect collaborative task management software for your team. With Backlog, you can create tasks, track bugs, make changes, give feedback and have team conversations right next to your code. You track progress with features like Gantt and Burn-down Charts. You document your processes right alongside your Wikis. You can integrate with the tools you use every day, like Slack, JIRA, and Google Sheets. You could automatically register issues from email or web form submissions. Take your work on the go using their top rated mobile apps available on Android and iOS. Try Backlog for your team for free for 30 days using our special URL at talkpython.fm/backlog. That's talkpython.fm/backlog. You know, one thing that I saw when I was going through your website that I thought was super cool was this LIGO project. So you know, Python has been at the center of a bunch of super cool science stuff, right? Like, Kyle Cranmer and his team did a bunch of stuff with Python around the Higgs Boson discovery. There was Dr. Katie Bouman and the black hole picture. And then the LIGO experiment, which is detecting, you know, I still just don't have my mind around general relativity, but the space time, you know, with the curvature of gravity and all that. And so the idea is if there's enough of a disturbance in it, it should have a wave, right? And, you know, Einstein predicted that. But just recently, the LIGO project detected black holes colliding and actually captured that in, you know, that was Azure Notebooks.
34:33 Panelists: Yeah. Yeah, they do use Azure Notebooks. I'm not sure if it was used for the specific black hole collision. But they do use it.
34:39 Michael Kennedy: I think it's just so cool that Python and notebooks, in particular seems to be showing up around all this super cool science. I guess, let me ask both of you. You know, when you talk to people who are coming into the Python world and into the notebook type of space, what are they coming from? Is that like MATLAB or other tools? You know, where are they coming from?
35:01 Panelists: We see a lot of users coming from R and R Studio. That's kind of the world. I've seen users who use a mix of those two languages. They still feel like R has done something really well, and they already have things set up there. But they also see all these Python packages that make things easy, so they want to use that as well. So I definitely see a lot of that. Yeah, I think what's cool about Python is you do kind of land on it from a lot of different paths. So I had a user I was talking to last week, they're business analysts and they've been using Excel for so many years, and they were like, I feel like Python has become the next thing once you've learned Excel. And now you want to do more advanced things or go from Excel into Python. But then I was talking to a researcher a couple weeks ago who also teaches a course and he was saying, the course was in MATLAB two years ago, then they moved to R for a year. And now they're standardizing across their department on Python. Because people started in MATLAB, then they needed R for a different course, and it was just too much. But they felt like Python would cover all their courses' use cases and make their students happy.
36:09 Michael Kennedy: Yeah, that's pretty interesting. I'm sure they have a little bit of whiplash going on.
36:12 Panelists: Yeah, I know.
36:15 Michael Kennedy: But Python's interesting because it's not just useful for this focused data analysis, right?
36:20 Panelists: Right.
36:21 Michael Kennedy: Like if I go and learn MATLAB and I, say I'm getting a degree in math or biology or whatever. And then I leave and I'm not still doing research in biology. My MATLAB skills are probably not helpful. For the most part, right? It's not like, well, I'm going to go and get recruited to work as a MATLAB person. Maybe, but not nearly to the extent that Python skills will open up the door to you in parallel careers. And so I think, whenever I think of students or academic programs, I think, you know, it's really good that they're doing something like Python. Even if it weren't Python, something that is like a general skill, independent of the research or the PhD or whatever, right?
36:59 Panelists: And I know that high school, the AP curriculum in the US, has standardized on Python as the introductory language, which I thought was really cool. Because when I took it, it was Java and I never used Java. Right, yeah. Yeah, Python is definitely widely used in so many different areas. Like for one, being able to build web apps, that's huge. In the data science area. And beyond those two, we also see a lot of just general purpose scripting using Python. And on top of whatever else you're using, right?
37:30 Michael Kennedy: Yeah, yeah, and the IoT stuff, as well, right?
37:34 Panelists: Oh, yeah.
37:35 Michael Kennedy: Like if you got CircuitPython and MicroPython...
37:36 Panelists: Oh, that's super cool.
37:37 Michael Kennedy: ...and all this stuff coming along. So I think that's pretty killer. Now, one thing I did want to ask you about because I think this is super positive but I don't really know why it's so different in data science. I feel like in data science, the core projects, NumPy, Jupyter, and whatnot are well funded, at least compared to say Flask for web development or, you know, SQLAlchemy. Like you go and you look at NumPy and it's got like, I don't remember exactly, like Sloan Foundation funding and DARPA funding maybe. But there's like these large groups funding these data science projects. Why do you think data science open source gets so much support like this, whereas a lot of the rest of Python kind of doesn't? I mean, it's not like Instagram isn't worth, you know, billions of dollars and based on Python as well or, you know, there's a hundred examples like that.
38:26 Panelists: Since I'm relatively new to the area, to this space, I'm not super familiar with those particular projects and how they got funded. But I felt like just my gut feel, like the popularity in the space, interest in the space, just AI and data science in general got a lot of attention, for sure, from all sorts of people. And it's not just like, hey, we build this for fun, right? But also we want to take AI, we want to build that into our businesses and that's something serious. They better make good predictions, right? If this is going to impact my business or people's real life, like making predictions for patients, and that's serious stuff. So we need to get it right. It better work great. All those packages we rely on, they're fundamental to the prediction and data analysis. And that might be why there's, it's easier to get attention and funded because the value to the business, like we see in real life.
39:25 Michael Kennedy: I have a guess for myself as well but I want to hear your thoughts, Katherine.
39:28 Panelists: I was thinking something similar just in terms of the enterprise use cases. And of course, there's all the power of buzzword and data science being searched, you know?
39:37 Michael Kennedy: If it's based on AI, we're going to pay for that.
39:38 Panelists: Right, yes. Yeah, I love that. AI infuse everything. So I think that likely definitely helps. And then like you were referencing, a few of those projects are, you know, science and research, which might be more ingrained in grant worlds versus something like Instagram.
39:52 Michael Kennedy: Exactly like there's, there's already a grant infrastructure, right? Like it's super normal to go, I'm going to apply for a three million dollar grant for this project.
40:02 Panelists: Right.
40:03 Michael Kennedy: Right and it's I guess, because it lives in that world so much, maybe then that's a little bit why.
40:06 Panelists: Yeah, no, no, it's interesting to think.
40:08 Michael Kennedy: Yeah. The other reason I was thinking it's the data scientists usually report to, like executives.
40:13 Panelists: That helps too. Leverage those one on ones.
40:19 Michael Kennedy: Exactly. You know, yeah, we need a little bit of support here. Yeah, so let's see, I got a few other things I want to ask you about before we run out of time here. So, you know, not too long ago, GitHub was acquired by Microsoft. What changes have you seen internally or in the Azure data science side of things as a result?
40:39 Panelists: Actually, long before the acquisition, Microsoft internal teams have already started to adopt git for source control, like in our internal teams. That has happened even before this. So I will say, after the acquisition, there's definitely more use of GitHub and also we see more integrations that we are building with GitHub, being a front door to developers. And so more integration into like Visual Studio Code, better experience there. And we also build new features on GitHub, as a result of GitHub being part of Microsoft.
41:15 Michael Kennedy: Yeah, that's cool. And you guys did do the virtual file system stuff. Contribution back to GitHub so you could actually put Windows in there. Because apparently it was broken. Katherine, what about you?
41:26 Panelists: Yeah, it's been super exciting. Like Rong was saying, I mean, git is so ubiquitous that we were using it a lot beforehand. But now the acquisition has just opened the door to so many opportunities in terms of product integrations, and it kind of has re-shifted a lot of our thinking across our different products, particularly in developer division where Rong and I work. What should we build into our product first? What would be a good integration where we could leverage GitHub for this? It's opened up a bunch of new scenarios and hopefully, customers will continue to see the value from it.
41:56 Michael Kennedy: Yeah, that's cool.
41:57 Panelists: Everybody at Microsoft uses Git now. Even as PMs, we write documentation, and to check it in we have to use Git. Microsoft docs are all on Git now, which is fun. Yeah, all documentation runs on Git. So that's one single system that we use.
42:13 Michael Kennedy: Yeah, that's super cool. So with Azure Notebooks, what's the version control story? I know what it is for the Python, the py files, just check them into GitHub, or wherever.
42:23 Panelists: Yeah, yeah. So for Azure Notebooks right now, you can use, you can launch a terminal within Azure Notebooks that'll connect you to the container you're running your project in. And then from there you can use a Git command line as you're used to. And then we also have two views in Azure Notebooks. There's the vanilla classic Jupyter UI. And then you can also use the JupyterLab UI and install any Git extensions you want to in either of those ecosystems.
42:48 Michael Kennedy: That's cool. Yeah, I hadn't really thought about that. I guess it is baked into the JupyterLab probably, right? Cool. So one thing I did think was pretty neat is on the Azure Notebook page you have a bunch of featured projects that are kind of cool. So, maybe like I could talk a little bit about something. So there's one that talks about getting started with Azure Machine Learning, that's pretty cool. But then the second one was the Python Data Science Handbook by Jake VanderPlas.
43:11 Panelists: Yes, yes.
43:12 Michael Kennedy: That's cool. Yeah, maybe want to just talk about those a little bit.
43:14 Panelists: Yeah, sure.
43:15 Michael Kennedy: How people can find them.
43:16 Panelists: Yeah, we feature a few projects, six or so, that we kind of rotate in or out depending on new releases or anything like that. Jake's book is usually always there because people love that and we love it. It's such a big book of notebooks. Listeners are familiar with it. It's like a whole O'Reilly book written all in Jupyter Notebooks, so it's a massive GitHub repository. So we use it for like stress testing our systems and stuff. Because it's so big, it's kind of become our de facto standard. But we also have things like, I think, University of Cambridge Introduction into Python course is featured there, which is super cool. And it's really great for all of our education scenarios that use Azure Notebooks to see that. And then it also feeds a lot of online course inspiration as well. So we'll see a lot of people coming from different edX, Coursera courses, so we try to feature that content as well.
44:03 Michael Kennedy: That's cool. You see people creating like online courses and books and stuff like that on there.
44:07 Panelists: Yeah, and something like any cloud hosted notebooks makes such a great case for online courses, so you don't have to spend 10 minutes telling people how to install things and then you're not there in person to help them with any installation. So being able to use something like Azure Notebooks where everything's just ready and you can clone this repo and you're good to go. It definitely has been super helpful.
44:28 Michael Kennedy: Yeah, that's cool. I definitely, when I've done training and stuff, a lot of it is, let's make sure everyone can run Python. Just type python3 -V, tell me what you get. Is it an error? Is it a number? Is it 3.4?
44:44 Panelists: That's a very important first step. Right. So we try to limit the time spent there and maximize time spent on code.
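The same sanity check can be done from inside Python. A minimal sketch, assuming a course that requires at least Python 3.5; that minimum and the helper name `is_supported` are illustrative choices, not something stated in the episode.

```python
import sys

MINIMUM = (3, 5)  # assumed minimum version, for illustration only


def is_supported(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    needed = ".".join(str(part) for part in MINIMUM)
    if is_supported():
        print("Python", found, "is new enough")
    else:
        print("Python", found, "is too old; need", needed, "or later")
```

Comparing tuples rather than strings avoids the classic mistake where "3.10" sorts before "3.9".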
44:51 Michael Kennedy: That's cool. Yeah. Did you hear about the announcement of Python shipping with the next version of Windows? The next major release on Windows? This was Steve Dower talked about this at PyCon. And basically, you know, Steve got Python 3.7 in the Windows Store.
45:08 Panelists: Yes.
45:09 Michael Kennedy: And that was cool. So if you need to go install it that would set it up. So apparently now Python is going to come as a little shim in Windows. So if you type python, it'll pop up that store thing and say, click here to install it. And then type it again.
45:22 Panelists: Yes, I know Steve worked on that hard. Finally made it happen.
45:26 Michael Kennedy: Yeah, that's awesome. I really think that's going to be a huge, huge deal.
45:30 Panelists: Yeah, so many people get lost in that first initial hurdle so... Right. ... hopefully it'll help.
45:34 Michael Kennedy: Yeah, I mean, it still doesn't solve all the pip install this, deactivate an environment.
45:39 Panelists: First step.
45:41 Michael Kennedy: We're getting closer. At least there's one less bit of friction there.
45:44 Panelists: Definitely. Yeah, absolutely.
45:45 Michael Kennedy: So let's see, I guess, you know, maybe you can each chime in about what was your favorite thing that you saw or coolest thing here at the conference?
45:53 Panelists: I'm super excited about the new terminal that ships in Windows.
45:57 Michael Kennedy: Yeah, that sounds pretty cool.
45:58 Panelists: Yeah, I haven't used it myself but it looks pretty cool. And then I already heard comments about... I love that. ...hey, I want to switch to Windows now for development. Because now I can do everything on one machine, like, it's pretty cool. I'm looking forward to starting to use it.
46:15 Michael Kennedy: Yeah, that thing has not changed in a really long time.
46:18 Panelists: Yeah, I know. And I'm glad they focused on that. Because command line is the center of developer workflow. And we would definitely think that's important too.
46:29 Michael Kennedy: Yeah, that's actually an interesting point. I wonder how much of it is people doing a lot of open source, like Microsoft outreach to other platforms outside of Windows. Bringing people that kind of live in that space, which is often more command line terminal driven, back in. Sort of going like, all right, why is it such a bad experience here on Windows on the command prompt? I wonder, like, there's probably some cycle there.
46:53 Panelists: Oh, yeah. Absolutely. Definitely. They have definitely with like, multiple iterations. But because it used to work very differently than the other platforms. But now it is important that we get that right and make that better. So make Windows a great development environment again.
47:10 Michael Kennedy: Alright, cool, yeah, that's a good one. How about you?
47:11 Panelists: Yeah, so I thought the terminal and the WSL, Windows Subsystem for Linux, announcements are all really great. I also really like, we talked a bit about Azure Machine Learning. And originally that had their own APIs, which they'll still maintain. But they're also introducing compatibility with MLflow, which is a Databricks project from the Spark community around managing ML life cycles. And I thought that was another cool combination where we're enabling the open source community on Azure platforms. So I was excited about that as well.
47:42 Michael Kennedy: Right, bring things that, like, work with that API.
47:44 Panelists: Yeah, yeah. So you can use the MLflow APIs and it will work perfectly with your Azure machine learning workspace and track them on a life cycle.
47:52 Michael Kennedy: Very nice. Yeah, that's really, really cool. All right, well, one more question and then I'll ask you the two sort of closing questions, of course. So the final question is, I feel like over the last five, six years, Python has been super, super popular. Have you seen the Stack Overflow article of the incredible growth of Python?
48:12 Panelists: Yeah, yeah I did. It's in one of our slide deck version.
48:14 Michael Kennedy: So, there it has this huge, huge growth and it's really positive, it's so fun to be part of that community.
48:21 Panelists: Oh, yeah, absolutely. Super exciting.
48:23 Michael Kennedy: I feel like a lot of those, the new folks to that graph, they're not everyone, but a lot of them are coming from the data science side of things.
48:32 Panelists: Oh, yeah.
48:33 Michael Kennedy: What do you think about this?
48:34 Panelists: Yeah, absolutely. I've seen an article actually on Stack Overflow last year, analyzing why Python is growing so fast. I think one of the conclusions the author made was that it's because of AI and machine learning becoming more and more popular in just, like, any business. It used to be a very tech driven thing that high tech companies were doing. But now we want to infuse AI everywhere in all different businesses. And then we see that start to, like, come up. A lot of developers want to do something with AI, and Python is a natural choice for doing that job. That's why we see a lot of Python growth coming definitely from that space. Actually one of the data points he pointed to was how fast Pandas as a data package has grown in the past few years, as one indication of how much the data science workload has been growing in the Python world. So it's super exciting. So Python is now the number four on the most popular language list. Went from like, was it six or seven last year?
49:42 Michael Kennedy: Yeah. It's definitely growing.
49:43 Panelists: And now it's number two.
49:44 Michael Kennedy: Super cool.
49:44 Panelists: Or number four. Yeah, we're definitely working in a very hot area. Where there's Python and AI and the intersection in between.
49:54 Michael Kennedy: Oh, yeah. You put those two together.
49:55 Panelists: Yes, they're keeping us employed. But yeah, much like Rong was saying, I also think it's cool because I think you can see the power of Python really quickly in a data science scenario, especially with the community around it and all the different packages you can use. I think that really shines in data science and ML cases. I mean, even just with like, visualization libraries. There's so many, and there's so many that do such cool, powerful things that I feel like when people are first getting exposed to Python through data science, it really shines and shows the power of it.
50:26 Michael Kennedy: Yeah, it's true, because you can really quickly generate a graph or a model or something like that, right?
50:31 Panelists: Yeah, like a 3D plot, it's, it's awesome.
50:33 Michael Kennedy: Yeah. As opposed to like, let's put a little game, right? Like it still takes a while to build tic tac toe.
50:38 Panelists: Right. Right.
50:39 Michael Kennedy: And then it just looks like a terminal thing. You're like, well that's not so impressive.
50:42 Panelists: Exactly. Yeah.
50:43 Michael Kennedy: That's a pretty good point. Alright, cool. Well, it's been a super interesting compare and contrast, the .py versus the notebooks way of working. But thanks to both of you for sharing the stories. And let me ask you a quick question before we get out of here. Although, especially Rong, I'm sure I'm going to be able to guess your answer here but... So, if you're going to write some Python code, what editor do you use?
51:06 Panelists: Visual Studio Code. You guessed right. I'd say, if I'm writing some Python code, usually notebooks, but I used to be a Vimer and now I use the Vim extension for VS Code when I'm in an editor environment.
51:21 Michael Kennedy: That's cool. You kind of brought them together.
51:23 Panelists: It's just been great. Yeah it's nice. All the key mapping. I need my key... Yeah it's all in there. So when I can get that with the power of VS Code, I'm happy.
51:33 Michael Kennedy: You'll be happy, awesome. And then, you know, there's so many packages out there that people might know about. So, have you come across one that's like, oh, wow, this is really cool, people should check it out? A notable PyPI package?
51:45 Panelists: I haven't done this deeply myself. But I looked at Plotly, which does 3D plots, as one of the first things I tried when I first joined the team, and we actually can render that inside VS Code too and interact with that plot. And I thought that was super cool.
52:03 Michael Kennedy: Oh yeah, that's pretty cool. Plotly is great for graphics.
52:05 Panelists: Yeah. I'm a sucker for a good visualization when I create too. So yeah, I like Bokeh, I like Plotly. Yeah, I think those are probably my favorites. I'm trying to think if there's any...
52:17 Michael Kennedy: Yeah those are really cool. I'll throw one out that's notebook related that's not graphical, is papermill. Have you tried papermill?
52:23 Panelists: Oh yeah. Yes, papermill is very cool.
52:24 Michael Kennedy: Yeah. Kind of turn your notebook into almost like a function you can call or something like that. Yeah, it's pretty wild. All right, well, thank you both for being on the show. It's been a lot of fun to talk about it. I'm happy to see the work you all are doing in VS Code around, sort of editor-fying, notebooks. Yeah, good work on the cloud stuff.
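For readers who have not seen papermill, the "notebook as a function" idea mentioned here looks roughly like this. `papermill.execute_notebook` and its `parameters` argument are part of papermill's API; the wrapper `run_notebook`, the helper `output_path_for`, and the notebook and parameter names are hypothetical. papermill is a third-party package, and it injects the supplied values after the notebook cell tagged `parameters`.

```python
from pathlib import Path


def output_path_for(input_path, parameters):
    """Derive an output notebook name that records the parameters used."""
    source = Path(input_path)
    suffix = "_".join(f"{key}-{value}" for key, value in sorted(parameters.items()))
    return str(source.with_name(f"{source.stem}_{suffix or 'out'}.ipynb"))


def run_notebook(input_path, **parameters):
    """Execute a notebook with the given parameters, much like calling a function."""
    # Third-party dependency, imported lazily so the helper above works without it.
    import papermill as pm

    output_path = output_path_for(input_path, parameters)
    pm.execute_notebook(input_path, output_path, parameters=parameters)
    return output_path


# Example call (hypothetical notebook and parameter names):
# run_notebook("report.ipynb", year=2019, region="west")
```

Each call leaves behind an executed copy of the notebook, so the "return value" is a full record of the run rather than a single object.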
52:42 Panelists: Yeah, thank you. It's been fun. Thanks for having us.
52:46 Michael Kennedy: Thanks for being here. Bye. This has been another episode of Talk Python To Me. Our guests on this episode have been Rong Lu and Katherine Kampf. And it's been brought to you by Linode and Backlog. Linode is your go-to hosting for whatever you're building with Python. Get four months free at talkpython.fm/linode. That's L-I-N-O-D-E. With Backlog you can create tasks, track bugs, make changes, give feedback, and have team conversations right next to your code. Try Backlog for your team for free for 30 days using the special URL talkpython.fm/backlog. Want to level up your Python? If you're just getting started, try my Python Jumpstart By Building 10 Apps course. Or if you're looking for something more advanced, check out our new Async course that digs into all the different types of async programming you can do in Python. And of course, if you're interested in more than one of these, be sure to check out our everything bundle. It's like a subscription that never expires. Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. This is your host Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.