Awesome Jupyter Libraries and Extensions in 2022

Episode #394, published Thu, Dec 15, 2022, recorded Thu, Dec 1, 2022

Jupyter is an amazing environment for exploring data and generating executable reports with Python. But there are many external tools, extensions, and libraries to make it so much better and make you more productive. On this episode, we are going to cover a ton of them. We have Markus Schanta, the maintainer of the awesome-jupyter list on the show and we'll highlight a bunch of Jupyter gems.

Episode Deep Dive

Guest Introduction and Background

Markus Schanta joins the show as the creator and maintainer of the awesome-jupyter GitHub list. He comes from a data analysis and quantitative finance background, having worked at Goldman Sachs, Man Group, and now his own firm, Blue Balance Capital. Throughout his career, Markus has relied extensively on Python and Jupyter notebooks for data exploration, visualization, and report generation. His passion for tooling and community-driven resources led him to curate the awesome-jupyter list, which helps developers and data scientists discover powerful Jupyter extensions and libraries.

What to Know If You're New to Python

If you’re relatively new to Python but want to follow along with the ideas in this episode, here are a few quick starting points:

Remember that Jupyter notebooks let you blend code, visual outputs, and text in a single environment.
Understanding basic Python syntax (variables, loops, imports) is enough to get started with Jupyter.
Libraries like pandas or simple data visualization packages (e.g., matplotlib) can help ground your learning in interesting notebook examples.
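To make the first point concrete, here is a minimal sketch of what a notebook file contains, built with only the standard library. The field names follow the version 4 notebook format; the cell contents are made up for illustration.

```python
import json

# A notebook is a JSON document: a list of cells plus some metadata.
# The cell contents below are hypothetical examples.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "# Sales report\nText lives in Markdown cells.",
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],  # filled in by the kernel when the cell runs
            "source": "total = sum([10, 20, 30])\nprint(total)",
        },
    ],
}

# Serializing the dict gives the text you would find in an .ipynb file.
ipynb_text = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

Because the file is plain JSON, the same structure can be read back with json.loads and inspected or edited like any other dictionary.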

Key Points and Takeaways

1) Curated "Awesome-Jupyter" List

Markus maintains a GitHub repository compiling a comprehensive set of Jupyter-related tools, extensions, and resources contributed by the community. It spans everything from visualization libraries to collaboration and version control utilities, offering a one-stop resource for power users and beginners alike.

Links and Tools:
- awesome-jupyter (github.com/markusschanta/awesome-jupyter)

2) Collaboration and Education Tools

Jupyter notebooks are popular for classroom instruction and team-based data work. Tools like nbgrader let teachers automate assignment distribution and grading, while nbtutor visually explains Python code execution for students learning programming fundamentals.

Links and Tools:
- nbgrader (nbgrader.readthedocs.io)
- nbtutor (github.com/ljvmiranda921/nbtutor)
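The autograding idea can be sketched in plain Python: the instructor writes assertion-based checks, and a submission earns the points for each check that passes. This is only an illustration of the concept, not nbgrader's actual API; `student_mean`, `CHECKS`, and `grade` are hypothetical names.

```python
# Hypothetical student submission.
def student_mean(values):
    return sum(values) / len(values)


# Instructor-defined checks. Each one raises AssertionError on a wrong answer.
def check_basic(fn):
    assert fn([1, 2, 3]) == 2


def check_single(fn):
    assert fn([5]) == 5


def check_floats(fn):
    assert abs(fn([0.5, 1.5]) - 1.0) < 1e-9


# (description, points, test function)
CHECKS = [
    ("mean of small list", 2, check_basic),
    ("mean of one value", 1, check_single),
    ("mean of floats", 2, check_floats),
]


def grade(fn, checks):
    """Return (earned, possible) points for a submitted function."""
    earned = 0
    possible = 0
    for _description, points, test in checks:
        possible += points
        try:
            test(fn)
        except Exception:
            continue  # a failing or crashing check earns no points
        earned += points
    return earned, possible


score = grade(student_mean, CHECKS)
print(score)
```

Every submission runs against the same checks, which is the fairness point raised in the conversation: the result does not depend on when or by whom the work is graded.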

3) Visualization Libraries

Declarative plotting libraries in Python make it easier to build rich, interactive graphs directly inside notebooks. Altair stood out as a favorite, offering a concise syntax built on top of the Vega framework, while bokeh, matplotlib, and seaborn also remain popular.

4) Publishing and Converting Notebooks

Jupyter notebooks can serve as the core of data reports, articles, or documentation. nbconvert transforms notebooks to HTML, PDF, or other formats, while Jupyter Book turns collections of notebooks into polished publications or course materials.

Links and Tools:
- nbconvert (nbconvert.readthedocs.io)
- Jupyter Book (jupyterbook.org)
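As a rough illustration of what a conversion step does, the sketch below flattens a notebook into a plain Python script using only the standard library. It is not nbconvert's implementation; `notebook_to_script` and the sample notebook are hypothetical.

```python
def notebook_to_script(notebook):
    """Flatten a notebook dict into Python source text.

    Code cells are kept as-is; Markdown cells become comment lines.
    """
    chunks = []
    for cell in notebook["cells"]:
        source = cell["source"]
        # The notebook format allows source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if cell["cell_type"] == "code":
            chunks.append(source)
        elif cell["cell_type"] == "markdown":
            commented = "\n".join("# " + line for line in source.splitlines())
            chunks.append(commented)
    return "\n\n".join(chunks) + "\n"


# Hypothetical two-cell notebook.
sample = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "Compute a total"},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["total = 1 + 2\n", "print(total)"],
        },
    ],
}

script = notebook_to_script(sample)
print(script)
```

Real exporters also handle outputs, templates, and rich media; this sketch covers only the text of each cell.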

5) Version Control and Collaboration

Storing notebook outputs directly in .ipynb files can lead to large diffs and merge conflicts. Solutions like nbdime (intelligent diffs), nbclean (removing saved outputs), or jupytext (syncing notebooks with Markdown or .py files) make collaborating through Git more manageable.
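To see why removing saved outputs shrinks diffs, here is a minimal standard-library sketch of the idea. It is an illustration under stated assumptions, not how nbclean or any other tool is implemented; `strip_outputs` and the sample notebook are hypothetical.

```python
import copy
import json


def strip_outputs(notebook):
    """Return a copy of the notebook with outputs and counters removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical notebook with one executed cell.
executed = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": "42\n"}
            ],
            "source": "print(6 * 7)",
        }
    ],
}

cleaned = strip_outputs(executed)
# A stable key order keeps the serialized text identical between saves.
committed_text = json.dumps(cleaned, indent=1, sort_keys=True)
print(committed_text)
```

With outputs and execution counters gone, re-running a notebook without changing its code produces the same text, so Git sees no change.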

6) Reusable Notebook Workflows

Papermill and other tools let you treat notebooks as functional pipelines: define parameters for inputs and produce versioned or parameterized outputs. This is especially useful for automating reporting or chaining computational steps at scale.

Links and Tools:
- Papermill (github.com/nteract/papermill)
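The parameterization idea can be sketched with the standard library alone: a cell tagged "parameters" holds defaults, and a new cell with overriding values is placed right after it. This mirrors the tagging convention Papermill uses, but it is a simplified illustration and does not execute anything; `inject_parameters` and the template notebook are hypothetical.

```python
import copy


def inject_parameters(notebook, parameters):
    """Return a copy with an override cell placed after the defaults cell."""
    result = copy.deepcopy(notebook)
    lines = [f"{name} = {value!r}" for name, value in parameters.items()]
    injected = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["injected-parameters"]},
        "outputs": [],
        "source": "\n".join(lines),
    }
    # Find the cell tagged "parameters"; fall back to the top of the notebook.
    position = 0
    for index, cell in enumerate(result["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            position = index + 1
            break
    result["cells"].insert(position, injected)
    return result


# Hypothetical report template with default parameter values.
template = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {"tags": ["parameters"]},
            "outputs": [],
            "source": 'region = "EU"\nyear = 2021',
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": "print(region, year)",
        },
    ],
}

# One notebook per region, all derived from the same template.
reports = {
    region: inject_parameters(template, {"region": region, "year": 2022})
    for region in ["US", "APAC"]
}
print(reports["US"]["cells"][1]["source"])
```

Because the injected cell runs after the defaults, its assignments win when the notebook is executed top to bottom, and the template itself stays unchanged.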

7) nbdev for “Literate Programming”

nbdev transforms Jupyter notebooks into production-ready Python packages. It supports two-way syncing between notebooks and .py files, automated testing, documentation generation, and a robust build process for distributing libraries or sharing code.

Links and Tools:
- nbdev (github.com/fastai/nbdev)

8) Binder for Live Demos

Binder makes a repository of notebooks instantly executable in the cloud. It automatically spins up Docker containers so anyone can run your notebook with one click, which is ideal for demos, reproducible research, or interactive documentation.

Links and Tools:
- Binder (mybinder.org)

9) IPython Magic Commands

Built into Jupyter’s IPython kernel, these magic commands simplify debugging (%debug), benchmarking (%time), workflow history (%history), and even external shell integration via !ls or !ping. They’re a powerful yet underutilized feature for everyday notebook users.

Links and Tools:
- IPython docs (ipython.readthedocs.io)
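Magics such as %time only work inside IPython, so they cannot appear in a plain Python script. As a rough stand-in, assuming you want the same kind of wall-clock measurement outside a notebook, the standard library can approximate it; `timed` is a hypothetical helper, not part of IPython.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds), similar in spirit to %time."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# In a notebook cell you would instead write:  %time sum(range(1_000_000))
total, seconds = timed(sum, range(1_000_000))
print(f"result={total}, wall time={seconds:.4f}s")
```

The magic is shorter to type and also reports CPU time, which is why it is worth learning for everyday notebook work.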

10) Deepnote and Hosted Notebook Solutions

Deepnote is a cloud-based notebook platform with an emphasis on real-time collaboration, commenting, and easy setup. Hosted solutions like Deepnote or Google Colab reduce DevOps overhead, offering ready-to-go environments for teams, students, or data scientists.

Links and Tools:
- Deepnote (deepnote.com)

Interesting Quotes and Stories

On Multi-Use Collaboration: “Having a shared setup in the cloud means I can pick up my analysis from wherever I am—just need a browser.”
On Teaching with Jupyter: “I can define how I want the assignment to be graded automatically, and students can see right away if they’re passing each test—everyone wins.”

Key Definitions and Terms

IPython Magic Commands: Special commands in Jupyter (e.g., %time, %debug) that simplify tasks like performance measurement or debugging right within a notebook cell.
Binder: A free cloud service that takes a GitHub repo of notebooks and creates a live, runnable environment with one click.
nbdev: A tool from fast.ai allowing you to create Python packages, tests, and documentation from a set of notebooks (literate programming approach).
nbdime: A library that helps you see meaningful diffs for Jupyter notebooks under version control.
Papermill: Parameterizes and executes notebooks programmatically for tasks like generating multiple reports from one notebook template.

Learning Resources

If you’d like to strengthen your Python foundation or learn more about data-focused workflows, here are some hand-picked courses from Talk Python Training:

Python for Absolute Beginners: Ideal if you want a clear, paced intro to Python programming.
Data Science Jumpstart with 10 Projects: Practice building real-world data science apps with Python and Jupyter.
Python Data Visualization: Master the core libraries and techniques for plotting and exploring data within notebooks.

Overall Takeaway

Jupyter notebooks remain a powerhouse for anyone working in Python—especially in data analysis, education, or collaborative settings. By adopting community-driven extensions such as nbgrader, nbdev, and nbdime, you can supercharge your workflow, seamlessly integrate version control, and polish notebooks for wide distribution and production use. The awesome-jupyter list curated by Markus is a testament to just how vibrant and fast-evolving this ecosystem is, offering newcomers and experts alike a springboard for discovering new ways to make Jupyter even more effective.

Links from the show

Markus Schanta: markus.schanta.at
Markus on Twitter: @markusschanta
awesome-jupyter list: github.com
Jupyter book: jupyterbook.org
Jupyter Desktop App: jupyter.org
Talk Python Episode on 60 Notebook Envs: talkpython.fm
nbdev: github.com
Python Tutor: pythontutor.com
Cell Magics: ipython.readthedocs.io
Watch this episode on YouTube: youtube.com
Episode #394 deep-dive: talkpython.fm/394
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

Episode Transcript

00:00 Jupyter is an amazing environment for exploring data and generating executable reports with Python.

00:05 But there are many external tools, extensions, and libraries to make it so much better and to make you more productive.

00:12 On this episode, we're going to cover a ton of them.

00:15 We have Markus Schanta, the maintainer of the awesome-jupyter list on the show,

00:19 and we'll highlight a bunch of Jupyter gems.

00:21 This is Talk Python To Me, episode 394, recorded December 1st, 2022.

00:27 Welcome to Talk Python To Me, a weekly podcast on Python.

00:43 This is your host, Michael Kennedy.

00:45 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython, both on fosstodon.org.

00:53 Be careful with impersonating accounts on other instances.

00:55 There are many.

00:57 Keep up with the show and listen to over seven years of past episodes at talkpython.fm.

01:01 We've started streaming most of our episodes live on YouTube.

01:05 Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.

01:13 This episode is sponsored by the AWS Insiders podcast.

01:17 AWS is changing fast.

01:19 Listen in to keep up over at talkpython.fm/awsinsiders.

01:25 And it's brought to you by Sentry.

01:27 Don't let those errors go unnoticed.

01:29 Use Sentry.

01:30 Get started at talkpython.fm/sentry.

01:33 Transcripts for this episode are sponsored by Assembly AI, the API platform for state-of-the-art AI models that automatically transcribe and understand audio data at a large scale.

01:44 To learn more, visit talkpython.fm/assemblyai.

01:48 Markus, welcome to Talk Python To Me.

01:50 Hi, Michael.

01:51 Thanks for having me.

01:52 This is going to be a very broad, not necessarily super deep episode, but we're going to talk about a ton of cool little extensions and widgets and libraries that you can plug into your Jupyter work and make it awesome.

02:06 I really like talking about these topics where it's like, oh, that's only a 10-minute commitment to see if it's going to help me out or not, right?

02:12 And not like some huge framework you got to learn.

02:14 And there's going to be a lot for people to take away if they do anything at all with notebooks, I think.

02:18 Yeah, and I think one of the nice things about notebooks is they're very easy to use.

02:23 I think the barrier to entry to you just being able to do something tangible with them is very low, much more so than maybe a lot of other things.

02:32 Yeah, they're very, very welcoming environments, especially if someone set them up for you, if it's one of the hosted ones, or if you're not in there trying to configure a kernel with a special virtual environment, like, how do I get to the right installs?

02:45 And why is this thing not there?

02:46 Or a whole JupyterHub running on Kubernetes.

02:49 Yes, exactly.

02:50 But once you're kind of beyond that shell and many people have that set up for them, then it's ready to roll.

02:55 All right, so going to be tons of fun to talk about these things.

02:58 But before we get to them, let's start with your story.

03:00 How did you get into programming in Python?

03:01 How did I get into programming?

03:03 I think it sort of started when my parents decided to buy a computer when I was eight years old, back in the 90s.

03:11 And then sort of at 12, I started to realize, oh, you can program these things.

03:16 And some well-intentioned adult, like, bought me a book on C and, you know, like 12-year-olds and pointers.

03:23 It doesn't mix.

03:24 C, the easy parts.

03:26 Yeah, exactly.

03:27 Just void, star, star.

03:28 Come on, kids.

03:29 Get with it.

03:29 Yeah.

03:30 That was just, like, conceptual a bit too much for a 12-year-old.

03:35 But then luckily, like, once I got into high school, they actually taught us Borland Delphi.

03:40 I don't know if you ever came across that.

03:42 It was a very safe, easy-to-use, like, GUI-driven way to get into programming.

03:48 And you could do a lot of things even that, like, a 14-year-old would find exciting, right?

03:54 Not just, like, C-style hello world and I can print numbers in a loop, but actually, like, fun GUI-like stuff.

04:00 You know, I never used that, but I used Visual Basic and some Windows Forms and other stuff.

04:06 And I really think, even today, that style of development, it's kind of withered and it's too bad, right?

04:13 It used to be so amazing.

04:16 You would just go over there, you know, button, text box.

04:19 I click on this thing.

04:20 Here's what it calls.

04:21 The interaction was so simple.

04:23 And now you're doing, you know, maybe you're doing the web, and it's like, I'm going to do Webpack.

04:28 And then, you know, bring in all these things, and I'm doing TypeScript, and then it's all CSS and HTML.

04:34 And I'm not bemoaning that style of development, but it's like, that's kind of replaced this.

04:39 Just give me a visual of what it looks like, and let me build something simple.

04:43 It has been replaced with...

04:44 I feel like it was a very good pedagogical tool to teach programming and GUI development.

04:50 Like, I remember we did fun things like rectangles on the screen that just chased each other around, right?

04:56 Yes.

04:57 That was a fun way into programming.

04:59 Yeah.

05:00 And I think there's some bits like that still around, isn't it?

05:03 What's the, like, programming learning language?

05:06 Something with an animal?

05:07 Is it Smalltalk?

05:08 Or the thing with a turtle?

05:10 Yeah, yeah.

05:10 The turtle where you can do some drawing.

05:12 There's that.

05:13 Yeah, yeah.

05:13 There's Anvil, actually, which is a Python sort of front-end and back-end web framework that has this.

05:19 It's really, really similar.

05:21 My daughter used that to play with it a lot.

05:23 But yeah, I would love to see like a Visual Basic Pi or something, you know?

05:29 Something like that would be so amazing.

05:30 Maybe it exists, yeah.

05:32 But that's where I really started developing or programming on my own.

05:37 Then I went to university.

05:38 That was in the early 2000s.

05:41 And they taught us mostly Java, but also sort of a weird mix of like logic-oriented programming, functional programming.

05:49 And I really only ever got in touch with Python when I did my master's in the States during some courses.

05:55 And then professionally, I learned proper Python when I was working in London.

06:00 And that was a company, Man Group, with a lot of really good Python developers.

06:05 And there I picked up like proper Python.

06:08 Yeah.

06:08 And stuck with it ever since.

06:10 If I have a choice, I will stick to it.

06:13 It's interesting that we come from, you know, my background was not that different than yours in some ways.

06:17 I did like C++ in the early days.

06:20 And you think as you advance in your career and you get more technically capable and stuff, you would think, well, you're going to be doing more intense stuff and more like deep dive.

06:29 Like now you're writing like kernel drivers.

06:31 Like it's for me at least, and it sounds like you a little, quite the opposite.

06:34 Like more towards some of these higher level languages that let you build really amazing things.

06:39 But you're not juggling like pointers to pointers and like all this crazy aspects, right?

06:44 I think the first thing, like your expectation that you're describing is maybe a function of sort of technical expertise.

06:50 And the second development, what you actually end up doing, right, is a function of age, right?

06:55 That, like, as you become older, you focus more on the high-level things.

06:59 You just want to get stuff done.

07:01 You don't care about showing off.

07:02 Exactly.

07:02 You can make this compile, right?

07:03 What's the shortest way for me to get from A to B?

07:05 And very often that's Python.

07:07 Absolutely it is.

07:08 All right.

07:08 So some real-time follow-ups.

07:10 The audience has helped us out here.

07:11 Don says, Delphi is a wonderful way back into Pascal from the college days.

07:15 Indeed.

07:15 And some of those languages, Logo was one of these visual building ones, as well as the one with the cat is Scratch, which, what an amazing name.

07:25 I haven't really used Scratch, but.

07:26 I think the cats or animals in general seem to be good metaphors for teaching programming.

07:30 Yes, they do.

07:32 And the O'Reilly publishers, the publisher O'Reilly, has built their entire book series on unique animals and programming.

07:39 All right.

07:39 Well, how about now?

07:40 What are you doing these days?

07:41 Right now, I am a founding partner at Blue Balance Capital.

07:45 We are a small, independent, alternative asset management firm based in Vienna, Austria.

07:51 I started this company three years ago with my partners.

07:55 Before that, I worked at an asset management firm in London.

07:59 Yet before that, I worked at Goldman Sachs as a quantitative analyst.

08:05 And so basically, during my professional career, I was always more a sort of data analyst rather than a systems developer.

08:14 So everything or most of the things I use Python for are more from the perspective of here's a piece of data.

08:20 And I want to sort of analyze that data set to better understand, like, I don't know, a company or an economic trend or an industry or a particular trade.

08:31 So I was always interested in Python in the context of, like, some data set and some insight that you can glean from analyzing that data set.

08:40 How much of that was taking a block of historical data, like, here's the last quarters and, you know, reports or whatever, versus trying to make predictions, you know, like real-time trading or other types of real-time information?

08:55 So I think sort of the starting point is almost always some kind of like time series.

08:59 And then the asset management firm, Man Group, they were actually developing automated trading systems.

09:07 So it was like very much the first thing you described, where you have an input that is a time series, you apply some transformation, and then a computer actually produces trades on the back of that.

09:18 Whereas at other times, more like what I do now is you think about it, you get some insight from the data, but then there's also some other real-world considerations or exogenous factors that you think about as a human person and then make your decision.

09:33 So I think for me, it was the whole gamut of sort of the product is a direct trade that the machine puts on or trade that you put on or maybe just some advice that you give to a company.

09:44 Right, right. We're thinking that tech, generally, that the tech indexes are going down over the next three months versus buy today and sell tomorrow, yeah?

09:53 Or buy now, sell tomorrow.

09:53 Predictions are hard, especially about the future.

09:56 Yeah, well, especially now with COVID and wars and like...

10:01 I have been in finance for more than 10 years now, but I've never seen anything like the last two years.

10:07 Yeah, that's nuts.

10:07 I think much more happened in the last two years than in the 10 years before.

10:11 It is living through some history, isn't it?

10:13 Yeah.

10:14 One more question on this background side of things.

10:16 You did a lot of work for these other companies like Goldman, and now you're founding a smaller company.

10:22 How do you feel that your programming and data science background has suited you to be more of leading this new company?

10:30 I think I have a great sort of education from both Goldman as well as Man Group.

10:39 They're both very technically capable organizations with a lot of very smart people.

10:44 And as I was starting out together with my partners as a very small team, one of the things that gave me comfort is that I have this stack of things that I know work together.

10:54 I understand how they work together and how to sort of apply them to do useful things.

10:59 So having that sort of under your tool belt or ready to go and not having to figure out sort of how they go together.

11:07 But just on day one, here's an AWS instance or EC2 instance.

11:11 Spin it up, get a notebook running and produce some nice charts.

11:15 That was something very valuable for me to just be able to do very quickly.

11:19 Cool.

11:19 That's what I would think.

11:21 And I feel even if you have these skills, even if you're not the one writing, if you hire a team or you find a consultant, being able to speak with them and understand like, yeah, no, no, I have a recommendation.

11:32 And I think actually this, you know, like, let me tell you, that tool is good, but this one is better and it would fit because X, Y.

11:38 I think just that's super valuable.

11:40 So I just wanted to kind of check in with you on that.

11:42 Yeah, like a tried and tested stack of tools, and every once in a while you get to branch off and try some new shiny thing and see if it works, see if it sticks.

11:51 But having this trusted thing that you know and how it works is valuable.

11:55 Yes.

11:55 Like if you're doing JavaScript, you want to have a really trusted tool that's been around for at least a month.

11:59 Yeah.

12:00 I mean, just kidding.

12:01 Luckily, we don't have it quite that bad in Python.

12:04 It's a lot more stability there.

12:05 Okay.

12:06 Well, let's talk about this list, which is why we got together and I invited you here.

12:11 It's this awesome-jupyter list.

12:14 You know, what is it?

12:14 Where did it come from?

12:15 I mean, most people are familiar with awesome lists, but maybe what's the philosophy of yours?

12:20 Yes.

12:21 So awesome lists, they're just like curated lists with resources that are useful or pertinent to some particular topic.

12:30 And then in this case, it's Jupyter notebooks.

12:33 So this is really a list of things that I started with 20 entries and then just put it up on GitHub.

12:40 And then over time, just more people added to that list of things they find useful and have some relationship with Jupyter.

12:48 And I think that up to this point, more than 100 people have contributed things that they find useful to this list.

12:55 So it's just like living, breathing thing of whatever people find useful that has a relationship to Jupyter.

13:01 I still find these very valuable, even though a good part of my job is like to track what is new, what's interesting, what's trending.

13:08 Still, I find so many things that are new here.

13:10 And when I first got to Python, I was like, wow, look, it just keeps going.

13:15 I just thought there was three web frameworks.

13:17 There's one way to talk to a database.

13:19 Look how many there are.

13:20 It's so amazing.

13:21 And it's always delightful.

13:22 You probably remember like back in the old days of the internet, you had directories, right?

13:28 I mean, that was Yahoo.

13:29 That was the first search engines, right?

13:31 Yes.

13:31 You had these like catalogs of things.

13:33 And here's a website that is about cats.

13:36 And here's one about dogs.

13:37 And in some ways, it feels like in this day and age, we have come back to like you actually have a person who is keeping some kind of like directory of things that are useful or pertinent to a particular topic.

13:48 It's kind of funny that way.

13:50 And what's quite interesting for me is sort of one of the benefits that I have from doing this is that I see like what other people find useful.

13:58 And so for myself, I just know, hey, these are the things that people are using.

14:02 And so I've got a pretty good radar of the whole Jupyter and Notebook ecosystem just because I'm sort of curating this thing.

14:09 Yeah.

14:09 You probably have had people recommend things and you're like, no idea what that is, but that looks awesome.

14:15 So it belongs on awesome Jupyter.

14:16 I try to be very inclusionist with this.

14:19 So when people propose things for this list, more often than not, I include them.

14:25 And even in the cases where I'm like, I don't see myself using that, but I'm sure there's some category of people who might find that useful.

14:33 And it just goes on the list.

14:35 Yeah.

14:35 You don't want to over-index on your specific use of Jupyter and your vertical, right?

14:42 Because we've got astronomers who are using this stuff.

14:44 We've got economists.

14:45 We've got biologists.

14:47 We've got students, right?

14:49 It's publishers.

14:50 All sorts of folks.

14:51 Yeah.

14:52 Yeah.

14:52 People who care more about, I don't know, like running their computation on a cluster.

14:57 Other people are more into visualization.

14:59 You name it, you have it.

15:00 Yeah.

15:01 Right.

15:01 Like ML folks might have one concern over others.

15:05 A lot of people use it in education. That's actually one of the sections in there, a whole section dedicated to education: people teaching courses using notebooks and what the best tools for that are.

15:18 Like, I don't know.

15:19 It may be even grading homework assignments that you distribute to people in the form of notebooks.

15:24 Yeah.

15:25 I definitely want to highlight that because my, I haven't talked about it very much, but I was a graduate TA.

15:30 So, boy, I graded a lot of, a lot of calculus, a lot of linear algebra and various other applied calculating type things like MATLAB type of stuff and automating.

15:40 Oh, it would have been good.

15:42 Okay.

15:42 So.

15:43 You just replace yourself with a regression test.

15:45 Exactly.

15:47 Submit your calculus test to the continuous integration.

15:51 We'll see how you did.

15:52 This portion of Talk Python is brought to you by the AWS Insiders podcast.

16:00 When was the last time you ordered a physical server to host your functions as a service, your latest API, or your most recent web app?

16:08 I remember the last time I did.

16:10 That was around the year 2001.

16:12 And yes, it was quite the odyssey.

16:14 Of course, we don't do that anymore.

16:16 We run our code in the cloud with near instant provisioning and unparalleled data centers.

16:22 And the most popular cloud provider is AWS.

16:25 But for all the ways that AWS has made our lives easier, it has also opened a massive box of choices.

16:32 Should you choose platform as a service?

16:34 Or maybe it's still VMs with IaaS.

16:36 What about your database?

16:38 Maybe you should choose a managed service like RDS with Postgres.

16:42 Or is DynamoDB better?

16:43 Maybe Aurora?

16:44 No, wait.

16:45 I hear good things about Amazon DocumentDB too.

16:48 And that's where the AWS Insiders podcast comes in.

16:51 This podcast helps technology leaders stay ahead of Amazon's constant pace of change and innovation.

16:56 Some relevant recent episodes include Storage Wars Database Edition, Microservices or Macro Disaster,

17:04 and Exploring Computer Vision at the Edge with AWS Panorama.

17:08 They bring on guests to debate the options and the episodes are vibrant and fun.

17:13 So if you want to have fun and make sense of AWS, head on over to talkpython.fm/awsinsiders.

17:20 Yes, I know you probably already have a podcast player and you can just search for it there.

17:25 But please use the link so that they know you came from us.

17:28 Thank you to the AWS Insiders podcast for keeping this podcast going strong.

17:36 So there's a couple of sections that I'm not sure I really want to dive into because I think,

17:40 I don't know, they're not exactly the notebook ones.

17:43 But one is this collaboration education stuff.

17:47 So maybe we could start there.

17:49 And let me just set the stage by saying that a little while ago I met with Sam Lau and talked with him.

17:55 He and Philip Guo, they did a research project where they studied 60 different notebook environments,

18:02 not just Jupyter, but like Google Colab and 58 others.

18:07 And so just kind of putting it out there, like you might think just Jupyter versus JupyterLab is the discussion,

18:14 but there's a whole lot of different places where you can do notebooks, right?

18:19 Yeah, and some of them, like you can run on your own machine.

18:23 That's sort of what I have in this runtime slash environment section.

18:26 Those things tend to go in there.

18:28 And there's a separate category of, I call them in the list, hosted notebook solutions.

18:34 Those are things that you don't really run on your own machine, but they run somewhere in the cloud.

18:39 So I think basically that is one way you can break them down into categories.

18:44 It's just do you run them yourself or do they run in the cloud somewhere else?

18:47 One thing that I didn't see on the list, but maybe would be kind of its own special.

18:53 Open a pull request.

18:54 Yeah, right.

18:54 Here we go.

18:55 Is the JupyterLab desktop app.

18:58 Have you seen this?

18:59 Not sure I've seen that.

19:01 Yeah.

19:01 So what it is, is it's an Electron app that bundles JupyterLab as the runtime environment.

19:08 And it comes with its own Python and everything.

19:11 So it's a thing you can hand to somebody that runs locally that lets them do notebook stuff

19:16 without them having to have Python installed and set up the environments.

19:20 And it just kind of has a little wizard to get it started, which is, I'm not sure I would

19:24 use it personally, but it's pretty interesting.

19:25 It sounds like a very low barrier to entry notebook environment.

19:30 Yeah.

19:30 I think it could be good for, you know, like in a school environment where you're like,

19:34 all right, kids, just take this and run it.

19:36 I don't want to have to debug why you can't install Python 3.10, but you need 3.7, you know, whatever.

19:42 Lots of different people have different kinds of Jupyter or notebook setups.

19:47 Mine personally tends to be, I've got to, I actually run mine in the cloud because I find it convenient

19:54 to be able to access it from different machines.

19:57 So like I access it from work and then I can access it from my notebook.

20:02 Even like when I'm at a friend's place or something, all I need is a browser to access it.

20:07 And I can just like continue where I left off on the other machine.

20:11 Other people prefer.

20:12 You can probably be closer to the data, right?

20:15 Exactly.

20:15 Right.

20:16 Yeah.

20:16 And you've got a lot faster pipe and you're not that dependent on what your own network

20:22 situation looks like wherever you are.

20:24 Yeah.

20:25 You're just shipping the answer, not the gig of data required.

20:29 Exactly.

20:29 I get the answer.

20:30 Exactly.

20:30 Right.

20:31 And you've got a pretty beefy machine on the other end that can deal with all the calculations.

20:36 Yeah.

20:36 So I find that pretty cool.

20:37 At some point I overdid it and even dockerized the whole thing.

20:42 And then I felt like that was getting more in the way of it than being helpful.

20:47 Yeah.

20:48 You're like, I gave myself a DevOps job.

20:50 Why did I do that?

20:51 Yeah, exactly.

20:52 I only maintain one of these installations.

20:54 Why do I dockerize them?

20:55 Yeah.

20:56 That's actually a really good point.

20:57 I have the same philosophy on web apps.

20:59 It's like, well, if there's just going to be one of them and it's just me, how much flexibility

21:03 does this thing really need?

21:04 Okay.

21:04 Yeah.

21:04 So there's a whole section on these with honestly many places I haven't heard of and ways

21:11 to run it.

21:11 But let's talk about two things in this collaboration education section.

21:16 Three actually, but two are kind of in my mind, put them together.

21:19 One is nbgrader, like real quick.

21:21 Like this is a pretty cool project.

21:23 Tell people about this.

21:24 This is pretty much what I described before in the abstract.

21:26 Right.

21:27 If you are a person in education and you teach a course and you want your students

21:33 to do a particular assignment and then they send in their submissions, you don't want to

21:38 hand grade them one by one.

21:40 What you can do is formalize basically what you want the answers to look like in a form

21:45 of regression tests.

21:46 And that is basically what nbgrader is.

21:49 So like you get one notebook and you define what you want the answers to look like.

21:53 And then it just does the rest of it for you.
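As a rough illustration of the "regression tests" idea described here: an autograded notebook typically pairs a cell the student fills in with a cell of assertions. The sketch below is plain Python, not nbgrader's own machinery. The function name `mean` and the test values are hypothetical; the `### BEGIN SOLUTION` / `### END SOLUTION` comments are the markers nbgrader uses to strip the instructor's answer from the student copy.

```python
# Cell 1: the answer cell. In the student copy, the lines between the
# markers are replaced by a stub for the student to fill in.
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    ### BEGIN SOLUTION
    return sum(values) / len(values)
    ### END SOLUTION


# Cell 2: the test cell. If any assertion raises, the cell counts as failed
# when the notebook is autograded.
assert mean([1, 2, 3]) == 2
assert mean([10]) == 10
assert abs(mean([0.5, 1.5]) - 1.0) < 1e-9
```

Because the tests are ordinary assertions, a student can also run them locally to see whether an answer passes before submitting.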

21:56 That's pretty interesting.

21:57 I've never used it myself because I'm not working in academia.

22:01 Yeah.

22:01 Like the value prop is obviously obvious with that one.

22:04 Well, I think there's two values here.

22:06 Obviously the less effort on the instructor.

22:10 There's also a little bit of more fairness.

22:13 Yeah, sure.

22:14 There's an interesting angle.

22:15 Like I'm sure that this is true for grading.

22:17 You know, is it morning and you're rested and patient or is it late and you're in a rush

22:22 and you're frustrated?

22:23 I don't know which affects which in terms of how the grades go, but it's got to have an

22:27 effect, right?

22:28 I was talking to some folks who did machine learning for discovering planets on Talk Python and they

22:34 said after the afternoon coffee and cake or cookies or whatever it was they had at this

22:41 university, more exoplanets were discovered than in the morning.

22:45 Okay.

22:46 I was always told to like call people after lunch.

22:50 That's when they are usually most content and most open to.

22:54 Exactly.

22:55 And so there's probably a thing about grading.

22:57 So the fact that this doesn't care, it doesn't get coffee.

23:00 It gets electrons.

23:01 That's good.

23:02 And I think there's also like a social science paper on like jury verdicts and sort of the

23:07 harshness of jury verdicts over time of day.

23:10 Right.

23:11 That's a little bit, that's a little bit harsh to think about, isn't it?

23:14 Like I got an extra year in prison because they were grumpy, right?

23:17 That's not how justice should work.

23:19 Yeah.

23:19 They didn't have their coffee yet.

23:21 So the other angle that I think is interesting with this is if you're a student, you get

23:26 to know whether or not you passed that question.

23:29 Right.

23:29 A lot of times when you're doing complicated things, it's like, I think this is right.

23:33 But if it's not just really straightforward, like a calculus, here's what the formula derivative

23:39 is.

23:39 Right.

23:39 But it's a slightly more nuanced.

23:41 It's hard to know what the right answer is.

23:42 And so here you're like, well, the test passed.

23:43 So we're good to go.

23:44 I like that.

23:45 I had a course like that at university once where you could do multiple submissions and

23:50 the system would tell you like how many points you scored.

23:53 And it actually sort of was very motivating to sort of keep going until you score a perfect

23:58 answer.

23:58 I think having something like that in a course would be super cool.

24:02 I totally agree.

24:03 Before we move on real quick in the audience, David says, I use nbgrader for my teaching.

24:07 It's super helpful.

24:08 nbgrader identifies wrong answers and then you can go in and assign partial credit.

24:12 Yeah.

24:13 I love it.

24:13 That's actually really neat.

24:14 Really neat.

24:15 Okay.

24:15 The other one that's more of an educational demonstration or exploration is nbtutor here.

24:22 So nbtutor lets you go in.

24:25 Will you tell people about it if you're familiar with this one?

24:28 No, I haven't used it lately, but it looks like we're dancing.

24:31 I mentioned Philip Guo and Sam Lau.

24:34 They did Python Tutor, which lets you go and write some Python code.

24:40 And it shows you basically how it executes and how variables are related with pointers and

24:45 stuff.

24:45 And this is inspired by that.

24:47 So what it lets you, not this one, it lets you basically run a, what is it?

24:52 A magic command with a percent?

24:54 Cell magic.

24:55 Cell magic.

24:55 You run some cell magic to turn it on.

24:57 And then to the right of the cell.

24:59 It starts showing the pointers and how things are relating.

25:02 So if you're trying to understand computer science and things, I think this would be cool for teaching.

25:07 And you have your code and all you have to stick on to get the visualization is this one short cell magic and you get the rest for free.

25:14 That's pretty cool.

25:15 Yeah, it's really cool.

25:16 They give credit right here to OnlinePythonTutor.

25:18 There's some other ones.

25:20 The Jupyter Drive one to integrate Google Drive looks pretty neat.

25:24 I think it's a little bit expired when I opened it.

25:27 It is definitely more of the experimental flavor.

25:31 Yeah.

25:31 And I imagine sort of whoever develops this is also at the mercy of Google Drive keeping their API stable.

25:39 Absolutely.

25:39 I think that is one way or it's in general, it's a non-trivial question figuring out how to best store your notebooks.

25:46 I mean, if you're just one person, you can probably stick them into Google Drive.

25:49 Yeah.

25:50 But as soon as you have more than a handful of people working on the same set of notebooks, you probably want a solution that is a bit more sophisticated than that.

26:00 I totally agree.

26:01 One of those solutions might be a proper Git story, right?

26:06 And some of the tools we'll talk about are going to cover that, right?

26:08 Yes, exactly.

26:08 The other one could be a collaborative, like a Google Colab or some other, one of these other environments where it's like Google Docs.

26:16 There's hosted environments that sort of have that as a built-in or you basically, the other way is you roll your own and make it Git-based and both have their advantages and disadvantages.

26:27 I think with Git, you always know a little bit better what you have and what it does.

26:33 Whereas with the other one, that might come with some other fringe benefits like being able to comment on it or having versions of the notebook very nicely integrated with your GUI.

26:44 Yeah.

26:44 And it's like whatever you prefer, for sure.

26:46 Like both work.

26:48 Yeah.

26:48 The online ones often have like infrastructure that comes with them too, right?

26:52 With the ability to press go and run it on a GPU if you're willing to pay or whatever.

26:57 Yeah.

26:57 Cool.

26:57 Okay.

26:58 Let's, I think that probably is the interesting ones that jumped out at me from there.

27:03 And then next one is visualization.

27:06 I mean, this is at the heart of the value of notebooks in the first place.

27:10 So Altair, tell us about that one.

27:13 I'm biased, even though I certainly, I'm not a developer of Altair.

27:17 I think what that team has developed is pretty amazing.

27:21 I use it for most of the things, for most of my visualization needs or almost exclusively.

27:27 What's neat about Altair is that it is declarative and it is built on top of a technology, on top of another package, which is called Vega.

27:37 And Vega is a platform agnostic visualization framework.

27:43 So basically what you have to do is if you want to have a chart like that, you just write some JSON declaration of basically, here's your data set.

27:52 This is the URL to the data set.

27:54 It's a tabular format.

27:56 The variable that I want on the X axis is called Foo.

28:00 The variable that I want on the Y axis is called Bar.

28:03 And I want a scatter plot.

28:05 And please make origin in this example here.

28:08 Please make that the color of the dots.

28:10 And you just specify that in a declarative format.

28:13 And then you can, what that allows you to do is you can create this declaration from Python.

28:21 But it might just as well be JavaScript or even like a handwritten JSON, right?

28:27 Right.

28:28 There's some kind of JSON data definition that goes to Vega and that drives the picture.

28:32 So basically Altair generates that data set that goes down to the next layer, right?

28:37 Exactly.

28:37 So like Altair is the Python binding on top of Vega.

28:41 And I think sort of declarative systems, most of the time they have a higher level of abstraction.

28:48 They have more concise notation.

28:51 And the way I always explain this to people when they ask about it is Vega and Altair is to visualization what SQL is to data query, right?

29:01 Your SQL query, you can execute that from within Python.

29:04 You can execute it from within a GUI.

29:07 It's sort of a language agnostic specification of what data you want to query.

29:12 And this is basically the same thing for visualizations.
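To make the "just write some JSON declaration" point concrete, here is a minimal sketch of such a chart spec, written by hand as a Python dict and serialized with the standard library. It uses the Vega-Lite grammar (the higher-level member of the Vega family that Altair emits). The dataset URL and the field names `foo`, `bar`, and `origin` are placeholders that mirror the wording in the conversation, not a real dataset.

```python
import json

# Hypothetical dataset URL and column names, for illustration only.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"url": "https://example.com/cars.json"},
    "mark": "point",  # a scatter plot
    "encoding": {
        "x": {"field": "foo", "type": "quantitative"},
        "y": {"field": "bar", "type": "quantitative"},
        "color": {"field": "origin", "type": "nominal"},
    },
}

# The chart is just data: it can be serialized and handed to any renderer
# that understands the grammar, regardless of the language that produced it.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

In Altair, roughly the same declaration could be produced with `alt.Chart(url).mark_point().encode(x="foo:Q", y="bar:Q", color="origin:N")`, which is why the Python layer stays so short.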

29:16 Yeah.

29:16 It looks really great.

29:17 There's a beautiful picture of a scatterplot with a legend and multiple colors kind of pulling out some nuance in the data.

29:26 And it's like, I don't know how many lines of code you put that in.

29:29 Maybe four if you didn't multi-line one of them.

29:32 I mean, it's really.

29:33 You can probably golf it together in four lines if you're.

29:37 Without semicolons.

29:38 So if you did semicolons, you could do it one, but that would be wrong.

29:41 But like four reasonable lines, you could do this beautiful picture here, including the import.

29:46 And that already gives you a reasonably,

29:47 quite impressive visualization.

29:49 I think it has some nice defaults.

29:52 Like it knows how to nicely space the labels of the axes and stuff like that.

29:56 And then what it still allows you to do is build pretty complex visualizations too.

30:02 So there is this one example where you basically have a scatterplot on top.

30:07 And then on the bottom, you have something like a histogram and you can select the range in the histogram at the bottom.

30:14 And there you go.

30:16 And then you get this beautiful interactive animation and you don't actually have to write any imperative code.

30:22 You just specify what you want and Altair and Vega kind of like do the rest for you.

30:28 So I write some JavaScript, but years ago, I used to see these like really nice and beautiful animations that were built with D3.

30:37 And I'm like, I want to do cool stuff like that.

30:40 That's really pretty.

30:41 But I don't know any JavaScript.

30:42 And I feel like this is like leveling the playing field a little bit more.

30:47 And it allows you to do similar things from within Python.

30:51 It's really nice.

30:51 One of the things that I think is a little ironic is for the people who create these tools, like the people who created Altair, they have to write so much JavaScript.

31:01 And not that much Python, right?

31:03 Because they're building these interactive, beautiful experiences on the front end for us.

31:07 We get to write the Python and there's like a lot of that complex JavaScript is encapsulated into these tools that we don't have to think about, but we get to use, which is great.

31:17 It's a dirty job, but somebody's got to do it.

31:20 And I have a lot of appreciation what those folks are doing for the rest of us.

31:24 I do too.

31:24 All right.

31:26 So Altair is number one, right?

31:27 On the visualization.

31:28 I mean, it doesn't hurt that it starts with A, but also maybe one of the best ones.

31:32 Some of the other shout out.

31:36 In all fairness, like these things are sort of usually alphabetized.

31:40 These every once in a while, like things in the wrong place.

31:44 And by now I even built myself a linter that keeps the lists nicely alphabetized.

31:49 I'm sure that makes a lot of sense.

31:50 Yeah.

31:51 So Bokeh is one that's out there.

31:53 That's pretty well known.

31:54 People use a lot.

31:55 Yeah.

31:56 Yeah.

31:57 There's a lot of them, right?

31:58 There are.

31:58 And I think when you talk about visualization, the other very popular ones are probably Matplotlib,

32:05 which probably was one of the first plotting engines or backends for Python.

32:10 And then Seaborn, which kind of like builds on top of that.

32:13 Yeah, absolutely.

32:14 Seaborn is nice.

32:15 One that I've seen just recently on notebooks is TQDM.

32:19 I've always used this from CLI applications.

32:23 And TQDM is a way to just take a for loop and whatever you're going to loop over, you just

32:29 put that in TQDM, bracket that thing, and it becomes this live animated progress bar, which

32:37 is really neat.

32:37 But I've only thought of this as a terminal CLI type of thing.

32:41 But it works in notebooks too, I just learned, right?

32:43 Yes, it does.

32:45 TQDM is one of those, like, does exactly what it says on the tin kind of things.

32:49 It does one thing and it does it very, very well.

32:51 Yeah, it's not an incredible output.

32:53 But at the same time, it's like, you know what?

32:54 I want to have a little bit of feedback for the users or for myself.

32:59 And you're like, OK, it's just literally wrap your iterator in TQDM and it's good to go.

33:06 It's a very natural thing to want.

33:07 Just imagine you've got like some long running computation over a loop, right?

33:13 And you just don't want to stare at a blank screen for two minutes.
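The "wrap your iterator" idea looks roughly like this. This is a small sketch, assuming the `tqdm` package is installed; `tqdm.auto` picks a notebook widget or a terminal bar depending on where it runs. The fallback branch, the loop body, and the `desc` label are made up for illustration.

```python
import time

try:
    from tqdm.auto import tqdm  # notebook widget or terminal bar, chosen automatically
except ImportError:
    # Fallback so the sketch still runs without tqdm installed: no bar, same loop.
    def tqdm(iterable, **kwargs):
        return iterable

total = 0
for n in tqdm(range(200), desc="crunching"):
    time.sleep(0.001)  # stand-in for a slow computation
    total += n * n

print(total)
```

The loop itself is unchanged; only the iterable is wrapped, which is why it is easy to add to existing notebook code.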

33:16 Yes.

33:16 You can kind of see how maybe the idea developed from there.

33:19 I've got a lot of those type of things.

33:21 This portion of Talk Python To Me is brought to you by Sentry.

33:27 How would you like to remove a little stress from your life?

33:29 Do you worry that users may be encountering errors, slowdowns, or crashes with your app right

33:35 now?

33:35 Would you even know it until they sent you that support email?

33:38 How much better would it be to have the error or performance details immediately sent to

33:43 you, including the call stack and values of local variables and the active user recorded

33:48 in the report?

33:49 With Sentry, this is not only possible, it's simple.

33:52 In fact, we use Sentry on all the Talk Python web properties.

33:56 We've actually fixed a bug triggered by a user and had the upgrade ready to roll out as we

34:01 got the support email.

34:02 That was a great email to write back.

34:05 Hey, we already saw your error and have already rolled out the fix.

34:08 Imagine their surprise.

34:09 Surprise and delight your users.

34:11 Create your Sentry account at talkpython.fm/sentry.

34:15 And if you sign up with the code talkpython, all one word, it's good for two free months

34:21 of Sentry's business plan, which will give you up to 20 times as many monthly events as

34:26 well as other features.

34:27 Create better software, delight your users, and support the podcast.

34:32 Visit talkpython.fm/sentry and use the coupon code talkpython.

34:39 Brian out in the audience says, hvPlot, HoloViews, Bokeh, and Panel are all awesome and

34:46 tightly interconnected.

34:47 Yeah.

34:48 And those are really nice.

34:49 All right.

34:49 So next one is the publishing.

34:51 This might be also a little bit at the very heart of notebooks.

34:55 The original idea of the notebook was, I want to have some explanation and then some executable

35:01 code and then some visualization.

35:03 Almost like I want to tell the story of a research project or something like that.

35:08 Right.

35:08 And so this section, it's right there, isn't it?

35:11 Yeah, exactly.

35:12 I think it's sort of less clear cut what this category is as compared to some of the others.

35:17 But it is basically anything in sort of that space of how do you run it?

35:24 How do you tell a story with a notebook?

35:26 How do you point out little things inside those notebooks?

35:30 One of the entries in there that I find quite interesting and useful, and it's also pretty

35:37 awesome from a technical perspective, is Binder.

35:40 So what Binder allows you to do is you can basically take any GitHub or even GitLab or

35:47 other hosted Git solution URL and put it in there.

35:52 And then what Binder does is it builds a Docker image that has all the dependencies of those notebooks

36:00 and builds that image, finds an executable node somewhere in the cloud in their infrastructure,

36:07 and then points you to a Jupyter instance that has that notebook running.

36:12 So what it allows you to do is you see a notebook on GitHub and you're like,

36:16 geez, what if I want to poke around with this thing?

36:18 You just go on Binder, put in the URL, and you can play around with a notebook interactively.

36:24 It's really cool.

36:25 So if people have seen the launch binder, a little tag or whatever you call that on like

36:30 a GitHub repo or somewhere else, I guess it could even be in an article that then just

36:35 points back over to one of these.

36:37 If you click it, just as you said, it's going to create an executable environment with the

36:40 right dependencies and let you run your code there, which is kind of impressive that that's

36:45 available to the world openly, publicly without authentication, right?

36:49 It's an incredible engineering feat, right?

36:51 Yeah.

36:52 Just all the considerations of like finding a node that has like sufficient sort of resources

36:58 available to be able to do that.

37:01 Assembling.

37:01 Basically, you don't know what people are going to throw at you in those repos, right?

37:06 Doing a pretty good job at sort of making a whole lot of notebooks executable.

37:11 I had Carol Willing, among others on the show recently to talk about Mastodon.

37:16 And we talked about like the federated storing.

37:18 You know, there's a bunch of people who are creating servers and allowing others to use it,

37:23 sort of volunteering to add a little bit of resources.

37:26 And she said, you know, a lot of what she sees over there actually was there's some parallels

37:32 over in the binder space about how certain universities and other places are like, we'll set up the

37:37 ability to run some of these binders to, you know, add a little bit of compute and resource

37:42 to the world.

37:43 And yeah, it's similar.

37:45 It's pretty amazing.

37:46 Like from what I can see, like what goes on behind the scenes, there is some of this,

37:51 I think is the execution is run on Google hardware.

37:56 So like, and then beyond Google, they have three other hardware providers.

38:01 So not only do they manage to sort of make all those code notebooks executable, but they

38:06 even run them on four different sets of infrastructure, which is pretty amazing.

38:11 Very cool.

38:12 Okay.

38:12 So a couple of others.

38:14 Another one here that really jumps out at me is Jupyter Book over.

38:19 Let me pull up there.

38:20 So build beautiful publication quality books and documents from computational content.

38:26 Now, really nice, huh?

38:27 I think there's a couple of sort of projects that try and do similar things, which is basically

38:33 you create a set of notebooks and then you either get a webpage or you get a book.

38:40 And two of the things that these are useful are, well, one, you're trying to write a book

38:46 about some subject matter, a machine learning book or something like that.

38:50 And the other, the other case where it's really useful is documentation, right?

38:56 If you are a developer or a maintainer of a software package and you want to document your

39:01 API, something like that can be very useful.

39:05 So what this gives you is not only the ability to write documentation, but also to include

39:11 code in that documentation.

39:13 And then in some cases, if you have the binder link, you can even set it up so that you've

39:18 got a piece of code in there and people by clicking a tab next to it can even try out

39:24 what that code does, fiddle with it a little bit, and then see what that does.

39:28 So it allows you to do some pretty cool stuff.

39:31 That's a pretty interesting way to bring it back around.

39:33 Like we've taken this computational thing, got it going, turned it into a static book that

39:38 you have.

39:38 But if you click this button, you can go back and kind of.

39:40 What we just talked about also sounds very much like nbdev, which is actually a project

39:45 that I use and that has a very similar flavor, where it's specifically geared towards people who

39:52 write software packages.

39:54 And the idea there is that you take your code and you define your classes inside of

40:01 a notebook.

40:01 And then, so you can actually have your code live inside that notebook.

40:09 You can also define your tests in that notebook.

40:12 And then some added benefits that you get from that is you can run your tests from a notebook.

40:18 And these things don't live in seven different places, like your code base and then your test

40:23 base and then your documentation repo.

40:26 But they all live together in one space.

40:29 And if you make a change that influences or has an impact on all three of them, you don't

40:34 need to do it in three places, but you can just do it in the notebook where it's all together.

40:38 Of all the things that plug into Jupyter, I think I'm most impressed with nbdev.

40:43 It's pretty nuts.

40:44 Yeah.

40:44 I think this is Donald Knuth, right?

40:47 He called this a literate programming environment.

40:50 And I feel like this is the kind of stuff he envisaged back in like 83, when he

40:56 wrote his book around about literate programming, right?

40:59 This is really what he had in mind.

41:01 It took a while to get there.

41:02 Yeah.

41:02 The tools he was working with, they weren't like these.

41:04 Yeah.

41:05 Yeah.

41:05 So with nbdev, you can have your documentation, you can publish, you can take your notebook

41:09 and export it or convert it into or build it into a Python package or a Conda package.

41:15 You can publish it to PyPI and Conda.

41:18 You can have tests, you can have continuous integration.

41:20 Then also you can, if you've got complicated code that you need to integrate in other ways,

41:26 you can sync it to Python files and then back.

41:29 I think that two-way integration is pretty cool.

41:31 Yeah.

41:31 For this, I really need to get this out of the notebook into Python directly, but then

41:36 don't just carve it off, like keep them sort of connected, right?

41:39 Exactly.

41:40 That's one of the key points there is that nbdev allows you to do all of these things

41:45 in the same place, right?

41:47 Sure, you can do all of these things separately and a lot of people do them separately, but then

41:52 having them all in one place is just so much easier when you say make an API change.

41:58 Yeah.

41:58 Another one which wasn't really even shouted out there in that highlights that they have

42:02 is nbclean, which if you're doing, and we talked about the two possible ways for collaboration,

42:08 if you're doing the Git way, you know, these notebooks, their files contain the output,

42:14 which could completely vary from run to run.

42:16 So like every time you save it or rerun it, it's a merge conflict, right?

42:20 And so this will strip out that kind of information to avoid merge conflicts.

42:24 So it can be a real good way to sort of prepare it.
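The stripping step can be sketched with nothing but the standard library, since a notebook file is JSON. This is an illustration of the idea, not nbclean's actual implementation; the function name `strip_outputs` and the tiny hand-written notebook are hypothetical.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # drop the run counter, which changes on every rerun
    return cleaned


# A tiny hand-written notebook, for illustration only.
nb = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Report"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["print(1 + 1)"],
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
        },
    ],
}

clean = strip_outputs(nb)
print(json.dumps(clean["cells"][1], indent=1))
```

With outputs and execution counts gone, two runs of the same code produce the same file, so the diff only shows real edits.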

42:27 I suspect that could be a Git pre-commit hook.

42:29 I'm not sure of it, probably.

42:30 That's one way to do it.

42:31 And sort of, I think sort of the fact that the notebook format contains the cell output

42:37 and also some metadata information.

42:39 It's a bit of a blessing and a curse at the same time.

42:42 On one hand, it sort of, when you have a notebook file, you can load it up and you immediately

42:47 see the output without having to run the cells, right?

42:51 So imagine someone sends you a notebook file and you cannot run it on your machine.

42:55 You still get the benefit of seeing what their output was when they generated it.

43:00 So it's nice for that.

43:01 It's also nice for the fact that if you put it up on GitHub, GitHub can render this notebook

43:07 file with the output in a very nice and sensible way.

43:10 You immediately see what the output was.

43:11 But the disadvantage of doing that is that whenever you rerun it, sort of the contents of that file

43:18 change and it doesn't produce clean diffs.

43:22 For example, if you just change one line and then that line produces a plot, your diff might

43:27 be a couple of kilobytes long, right?

43:29 And you don't want that.

43:30 And one of the tools that sort of deals with that is something we had on the list, Jupytext.

43:37 So Jupytext deals with this problem by basically giving you paired

43:45 notebooks.

43:46 So you have an ipynb file and you tell your Jupyter IDE,

43:51 in addition to that ipynb file, I also want a markdown file or a PY file that contains just

43:59 the input cells.

44:00 And then what you can do is if you want to version your notebook is just check in that

44:06 clean or stripped PY file or markdown file and rather ignore the ipynb.

44:12 So that is what we actually do in our company.

44:16 We have a couple of notebooks that serve as reports and we edit them collaboratively.

44:21 And the way we version them is by stripping out the output via Jupytext.

44:27 And so you really get nice, clean diffs.

44:29 And when someone makes a change, you can tell what they changed in a reasonable way.
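To show what the paired text file looks like, here is a simplified stand-in for the conversion, written with plain Python. It renders cells in the `# %%` "percent" script style that Jupytext supports. It is a sketch of the idea only; the helper `to_percent_script` and the sample notebook are hypothetical, and the real tool handles metadata and round-tripping that this ignores.

```python
def to_percent_script(notebook: dict) -> str:
    """Render notebook cells as a '# %%' script, leaving all outputs behind."""
    chunks = []
    for cell in notebook["cells"]:
        source = "".join(cell["source"])
        if cell["cell_type"] == "markdown":
            # Markdown becomes comment lines under a [markdown] cell marker.
            body = "\n".join(("# " + line).rstrip() for line in source.splitlines())
            chunks.append("# %% [markdown]\n" + body)
        elif cell["cell_type"] == "code":
            chunks.append("# %%\n" + source)
    return "\n\n".join(chunks) + "\n"


# Hand-written sample notebook, for illustration only.
nb = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Orders report"]},
        {
            "cell_type": "code",
            "source": ["total = 1 + 1\n", "print(total)"],
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
        },
    ]
}

script = to_percent_script(nb)
print(script)
```

With Jupytext itself, the pairing is usually set up once, for example with `jupytext --set-formats ipynb,py:percent notebook.ipynb`, after which only the `.py` file needs to go into version control.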

44:33 That's super valuable.

44:34 Do you want to go back to nbdev for just one second?

44:37 Because there's a big, long list here that you would get if you just ran nbdev_help.

44:41 And so many of these are like, as a standalone command, they'd be like, oh my gosh, what

44:46 an amazing tool.

44:47 Like another one I just saw is nbdev_changelog, which will create a changelog.md file just from

44:53 like your closed and labeled GitHub issues.

44:56 And, you know, like that's a cool feature on its own.

44:58 I could see installing that and never even using a notebook and just running that on my

45:02 Git repo, which is, you know, so there's a bunch of, bunch of neat things here.

45:06 You look at the list and what nbdev does for more than five minutes and it makes you want

45:12 to build a software package.

45:13 Yes, it does.

45:15 Which you could do with nbdev, by the way.

45:17 So that's, that's kind of meta.

45:19 Yeah.

45:19 Another one around there is nbconvert.

45:21 nbconvert is, I mean, it's, it's almost there in the name, right?

45:26 Basically one thing it gives you is a command line command where you can do nbconvert

45:33 foo.ipynb and say, like, I want to convert this notebook to HTML.

45:38 So you can script the conversion of notebook on a command line level.

45:43 And then beyond that, what you can do with nbconvert is it's also accessible as a Python package.

45:49 So you can control that notebook conversion from within your Python code.

45:55 And that is useful for a number of things when you really want fine grained control over how you

46:02 execute or convert your notebook.

46:05 So one of the things that I have built with that is basically a way to convert notebooks

46:10 without the input cells.

46:12 So if you want to build a report out of a notebook and you have a non-technical audience and you

46:18 want to get the code out of their way and you really want to just show them the output,

46:23 that is something you can build with nbconvert.
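A report without input cells can be sketched in two ways: through the command line, or through the Python package. The sketch below is not the speaker's own tool. It assumes nbconvert is installed for the second function, and `report.ipynb` is a placeholder file name. The `--no-input` flag and the `exclude_input` exporter option are existing nbconvert settings.

```python
def build_cli_command(notebook_path: str) -> list:
    """Command line for an HTML export with the code cells hidden."""
    return ["jupyter", "nbconvert", "--to", "html", "--no-input", notebook_path]


def export_without_input(notebook_path: str) -> str:
    """Same export through the Python API; returns the HTML as a string."""
    from nbconvert import HTMLExporter  # imported lazily; requires nbconvert

    exporter = HTMLExporter()
    exporter.exclude_input = True  # drop the code cells, keep their outputs
    html, _resources = exporter.from_filename(notebook_path)
    return html


# "report.ipynb" is a placeholder; nothing is executed here, the command is only built.
command = build_cli_command("report.ipynb")
print(" ".join(command))
```

The Python route is the one that gives fine-grained control, since the exporter can be configured or subclassed before the conversion runs.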

46:25 That's really cool.

46:26 I hadn't realized it did that.

46:28 So here's what I want to show you.

46:30 And if you actually want to see the code, click on this binder version or view it statically

46:35 on GitHub.

46:35 But for most people, they just want to see here's the description and here's the figures.

46:40 It's still potentially accessible at least.

46:43 Totally.

46:43 And I think notebooks are actually quite a nice way to design a report, right?

46:49 Because what you have is building blocks.

46:51 And my report is I want the total number of orders for the last week and I want to chart.

46:58 And then I want like the number of orders by zip code or whatever.

47:03 And you build that in a notebook and then you just like nbconvert it with the input stripped

47:10 out and you get an HTML file that you can serve on a web server and have something like a dashboard

47:16 for your team.

47:17 Very good.

47:18 So it's super cool for building things like that.

47:20 Yeah.

47:21 Yeah.

47:21 Very neat.

47:22 One more in this section before we move on.

47:24 There's a bunch, but I think Papermill is pretty unique.

47:28 You want to tell people about Papermill?

47:30 I haven't.

47:31 To be honest, I haven't used it that much.

47:32 I haven't either, but I did read this Netflix Papermill article about what they were doing.

47:40 And they were basically using...

47:44 Ah, Kyle Kelley.

47:44 Yeah.

47:45 Yeah.

47:45 They were doing some work to take notebooks and use those as like building blocks for managing

47:51 their infrastructure.

47:51 So there were a lot of interesting benefits and Papermill will let you basically turn the

47:58 variables at the top of the notebook into inputs.

48:01 And then the variables at the end of the notebook as outputs.

48:04 And you could treat it like a function.

48:05 So they're chaining these together for all sorts of crazy DevOps-y type things, I believe.

48:12 It basically sounds like something I once rolled my own of where basically what I just described

48:18 that I'm doing with nbconvert, I once built a version where sort of you could in the first

48:23 cell have a variable and serve that like a function input.

48:26 And it sounds like they probably did a much better job of sort of imagining you have a

48:31 dashboard and then you've got one for the US and you want one for international orders,

48:36 right?

48:36 Or for Canadian or what have you country, right?

48:39 You can basically do that out of the same notebook file, but just say, hey, these are two separate

48:45 versions.

48:46 One for the US, one for Canada, one for Mexico, whatever.
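The "one notebook, several countries" pattern might look like the following sketch. It assumes Papermill is installed and that the template notebook has a cell tagged `parameters` defining a default `country` variable. The file name `orders.ipynb`, the parameter name, and the output naming scheme are all made up for illustration.

```python
from pathlib import Path


def output_path(template: str, country: str) -> str:
    """Derive an output file name per country (hypothetical naming scheme)."""
    path = Path(template)
    return str(path.with_name(f"{path.stem}_{country}{path.suffix}"))


def render_reports(template: str, countries: list) -> list:
    """Execute the template once per country and return the output paths."""
    import papermill as pm  # imported lazily; requires papermill and a Jupyter kernel

    outputs = []
    for country in countries:
        target = output_path(template, country)
        # Values passed here override the defaults in the cell tagged "parameters".
        pm.execute_notebook(template, target, parameters={"country": country})
        outputs.append(target)
    return outputs


# Only the planned file names are computed here; no notebook is executed.
planned = [output_path("orders.ipynb", c) for c in ["US", "CA", "MX"]]
print(planned)
```

Each executed copy is saved as its own notebook with its inputs and outputs, which is what makes a failed run inspectable afterwards.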

48:49 Exactly.

48:50 One of the benefits that they said they were getting was if something goes wrong, like if

48:55 a step crashes, because as you described the notebook, as it executes, it stores the exact

49:01 output and the inputs.

49:02 It's a snapshot in time of what happened when it went wrong.

49:06 So instead of just having a log message of it went wrong and here was the input, it's like,

49:09 well, here are all the steps and you can see the variables and the output coming.

49:13 And then here's the crash.

49:14 You're like, oh, look, these are the three inputs.

49:16 And here's how this one got.

49:17 And actually gives you nice diagnostics.

49:20 Yeah.

49:20 It's like a report of what went wrong.

49:22 It's almost like a, you know, you have these parametrized tests, right?

49:27 It is.

49:28 Yes.

49:28 Parametrized tests with some diagnostic output about what went wrong during that execution.

49:34 That's pretty cool.

49:35 It's unique.

49:35 I think worth a shout out, I suppose, there.

49:38 How about the version control side?

49:40 Anything you want to give a shout out to there?

49:41 That is a section that is a bit more sort of well-defined and clear what goes in there.

49:46 It's basically all sorts of tools for diffing, merging, code reviewing, changes in notebooks.

49:53 nbdime is probably one of the more well-known packages in that category that looks very well-built.

50:00 Right.

50:01 The diff and merge tools.

50:03 Yeah.

50:03 So imagine you have an image, right, in an ipynb file, you would usually get like a horrendously large diff if one of the things in that image changes.

50:14 And it just displays that diff not as 50 lines that changed, but just like a neat "here is something that changed" in one line that doesn't mess up your whole diff.

50:25 Right.

50:25 This plot changed, not these 700 lines of plot definition changed.

50:30 Yeah.

50:31 Yeah.

50:31 Very cool.

50:32 And merging as well, which is neat.
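To make the "content-aware diff" idea concrete, here is a toy illustration. It is not how nbdime works internally; it only shows why comparing cell sources, instead of the raw JSON with embedded image data, keeps a notebook diff short. The two notebook dicts are made-up examples that follow the `.ipynb` layout of a `cells` list with `source` and `outputs` entries.

```python
import difflib


def source_lines(notebook):
    """Collect the source lines of every cell, ignoring stored outputs."""
    lines = []
    for cell in notebook["cells"]:
        src = cell["source"]
        # In the .ipynb JSON, "source" may be one string or a list of strings.
        text = "".join(src) if isinstance(src, list) else src
        lines.extend(text.splitlines())
    return lines


def source_diff(old_nb, new_nb):
    """Return only the added ("+ ") and removed ("- ") source lines."""
    diff = difflib.ndiff(source_lines(old_nb), source_lines(new_nb))
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Made-up notebooks: the plot color changes, so the embedded image changes too.
old_nb = {"cells": [
    {"cell_type": "code", "source": "import pandas as pd", "outputs": []},
    {"cell_type": "code", "source": "plot(df, color='red')",
     "outputs": [{"data": {"image/png": "AAAA" * 500}}]},
]}
new_nb = {"cells": [
    {"cell_type": "code", "source": "import pandas as pd", "outputs": []},
    {"cell_type": "code", "source": "plot(df, color='blue')",
     "outputs": [{"data": {"image/png": "BBBB" * 500}}]},
]}

changes = source_diff(old_nb, new_nb)
for line in changes:
    print(line)
```

The large image payload never appears in the result; only the one source line that changed does.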

50:34 Yeah.

50:34 All right.

50:34 We're getting a little short on time.

50:37 Let's see.

50:37 I know you want to give a shout out to Deepnote, both because you like them, but also they're a sponsor of the...

50:44 They support the list.

50:44 Yes.

50:45 They support the list.

50:45 Yeah.

50:46 So maybe tell us about them.

50:47 Deepnote is basically one of these hosted notebook solutions.

50:50 We already mentioned Binder.

50:52 We already mentioned Colab.

50:54 And they have built one that is a bit more centered around collaboration of teams.

51:00 You also get an execution environment with different sets of hardware from them.

51:07 And they really emphasize collaboration.

51:09 So you can comment and have discussions about individual cells in your notebook.

51:15 You can very nicely see what changes other people did to your notebook and all of that in a very well done, very dense GUI that makes a lot of things very easy for you.

51:27 If you're not already using one of these environments and you have a team that is maybe also not particularly technically focused on where it runs, how it runs.

51:37 This is a very turnkey solution.

51:39 Less DevOps, more Google Docs type of style.

51:43 They solve a lot of those problems for you.

51:46 It looks really neat.

51:47 The collaboration seems very, very nice.

51:49 I think that's a pretty unique thing to do in a polished way.

51:52 Yeah, and they also have nice integrations for a lot of data sources.

51:57 So you can directly query SQL from their GUI and pipe SQL results into a data frame, right?

52:05 So you have one cell where you write your SQL and that goes directly into the data frame that you can then visualize in the next cell.

52:12 So the ergonomics of Deepnote are probably better than many other notebook solutions.

52:18 That's pretty neat.

52:19 And so like you said, it's worth pointing out they're a sponsor of your list, not of our show, and they are a commercial venture, but they do have a free version.

52:28 There's something for me where I feel like it's, I kind of like to support companies that are purpose built, right?

52:38 Like Deepnote is built for running notebooks and collaboration, whereas a lot of these big tech things, it's like, well, I know Facebook's a social media company, but I could also do this other thing that runs on, or I could run this on Google.

52:50 And there's something about like, okay, there's a company whose only job is to do this versus like, I'm not sure I could ever get support.

52:59 Like if I had a problem with my Gmail, I don't know that I could ever get help, ever, right?

53:03 Whereas like, you know, if I went to an email company and got email from them, they would help me with email because it's their thing, right?

53:09 I think where it shows is that they spent a lot of time thinking about how people use notebooks and what they want to do with those notebooks.

53:17 They think a bit beyond just sort of the Jupyter technical, but more the ergonomics of how you actually use a notebook within a team.

53:27 And I think sort of, you can definitely tell that they know how to use notebooks and have thought about all the ways in which people use notebooks and how to make that better.

53:36 Nice.

53:36 Well, I've never used them, but if I have the need, I'll check it out.

53:40 Sounds good.

53:41 All right.

53:42 What else?

53:43 Maybe we've got time for one more.

53:45 Is there one we haven't talked about yet that you're like, oh, we really got to cover this one?

53:48 Let's do cell magics, right?

53:50 Okay.

53:50 If you maybe just Google for IPython cell magics.

53:54 I'll Kagi for it.

53:54 Yeah.

53:55 They are not strictly a Jupyter thing, but I think they're a not so well-known thing that is super neat.

54:04 And since Jupyter itself was born out of IPython, these are things that are also built into IPython.

54:13 So one of them is, for example, percent debug.

54:17 So when you have a notebook open and you get an exception and your code gives you an error message, you can then, in a cell directly below that, type percent debug.

54:28 And that will open up a debugger session that takes you directly to the point where your code failed.

54:36 Right?

54:36 So if you write something like, I don't know, I accessed foo, the array foo at an index position of five, right?

54:44 And that's where your code fails.

54:46 You can do percent debug and look at, well, at that point in the execution, what was actually stored in the list foo?

54:54 Right.

54:54 And did it actually have five elements or did I run out of bounds?

54:57 So, like, whenever I get an error during an execution of a notebook and it's not obvious, I like to use percent debug to find out what's going wrong with my code.

55:07 Very nice.
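Outside a notebook, the same post-mortem idea can be sketched in plain Python. This is a minimal illustration, not what `%debug` does internally: it walks the exception's traceback to the frame where the error was raised and reads the local variables there. The function `read_item` and the sample list are made up for the example.

```python
def read_item(foo):
    index = 5
    return foo[index]  # raises IndexError when foo has fewer than six items


def innermost_locals(exc):
    """Return the local variables of the frame where `exc` was raised."""
    tb = exc.__traceback__
    while tb.tb_next is not None:  # walk down to the deepest frame
        tb = tb.tb_next
    return dict(tb.tb_frame.f_locals)


try:
    read_item([10, 20, 30])
except IndexError as exc:
    state = innermost_locals(exc)
    print(state)
    # For an interactive session instead, one could call:
    #   import pdb; pdb.post_mortem(exc.__traceback__)
```

In a notebook, typing `%debug` in the cell after the failure gives the interactive version of this, with the list and the index available for inspection at the prompt.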

55:08 Another one that is very useful is percent time, which is, have you used that?

55:13 No, but I often want to answer this question of, like, is this getting a little bit faster, a little bit slower?

55:19 And I don't really want to go into a profiler and I don't want to, like, write the datetime or timestamp code to, like, print it out myself.

55:26 It's just, like, yeah.

55:27 I use it a lot in those situations where, like, I've got, I've got way A to do something and way B to do something, right?

55:34 And then I wonder, is this actually faster or is this actually performance-wise worse than doing it the other way?

55:42 And this gives you just a very quick and dirty answer to, like, orders of magnitude.

55:47 Is this the same level or should I be doing things differently?

55:50 It's very useful for that.

55:51 Yeah.

55:51 Percent time and some function call and see what it takes.

55:54 It's brilliant.
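For comparison, here is roughly what that quick "way A versus way B" check looks like when written by hand with the standard library. It is a rough sketch: the helper `timed` and the two example functions are made up, and a single run like this only gives an order-of-magnitude answer, which is the use case described above.

```python
import time


def timed(func, *args):
    """Run func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def way_a(n):
    # Explicit loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def way_b(n):
    # Built-in sum over a generator expression.
    return sum(i * i for i in range(n))


result_a, seconds_a = timed(way_a, 200_000)
result_b, seconds_b = timed(way_b, 200_000)
print(f"way A: {seconds_a:.4f}s, way B: {seconds_b:.4f}s")
```

In a notebook, `%time way_a(200_000)` replaces all of the timing scaffolding with one line.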

55:55 And then another one I want to mention is just exclamation point and then the shell command.

56:01 Yeah.

56:02 It's very useful for, I think it's also available as %sx.

56:07 Yeah.

56:08 So it's basically a shorthand.

56:09 So you can do exclamation point ls to get a directory listing just to see what is in the directory.

56:17 Or you can do, like, I don't know, whoami to see what user you're running something as.

56:22 Or you can do ping a machine to see if it's up directly from your notebook.

56:27 So this is a quick and dirty way of running command line commands without having to leave your notebook.

56:35 It's really cool.

56:36 You know, if you want to just interact with the file system or see what files are available to me or all these things, right?

56:41 Yeah.

56:41 You wouldn't know because this hasn't published yet.

56:43 But in the sequence of when these shows come out, I just talked to Jeroen from Data Science at the Command Line.

56:51 Wait, have you seen this book?

56:52 No.

56:52 Yeah.

56:52 So it's got a bunch of interesting things that you can do on the command line for, like, querying data or running things in parallel.

56:59 And a whole bunch of these, these sort of ideas of, like, how do I do really cool stuff with the shell?

57:06 Just use your bang command and they become integrated into your notebook, right?

57:10 Which is, I think that's super cool.

57:12 You can even take it to do things like you do an ls and then your Python session gets passed the contents of that directory as a list.

57:21 And you can then use it as a variable and assign that list to a variable.

57:25 So it's like, that is a really, like, easy, but also, like, very shoddy way of listing the directory contents very quickly.

57:33 Yeah.

57:33 I mean, you should probably use pathlib or something.

57:35 We shouldn't even mention that.

57:36 We should not encourage that.
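In a notebook, capturing shell output looks like `files = !ls`. A plain-Python sketch of the same idea is below, followed by the `pathlib` route for directory listings. The helper name `run_lines` is made up, and the command it runs here is a portable stand-in (Python printing two lines) so the example does not depend on `ls` being available.

```python
import subprocess
import sys
from pathlib import Path


def run_lines(command):
    """Run a command and return its standard output as a list of lines."""
    done = subprocess.run(command, capture_output=True, text=True, check=True)
    return done.stdout.splitlines()


# Portable stand-in for a shell command: ask Python itself to print two lines.
lines = run_lines([sys.executable, "-c", "print('a.txt'); print('b.txt')"])
print(lines)

# The pure-Python route for a directory listing, with no shell involved.
names = sorted(p.name for p in Path(".").iterdir())
print(names)
```

The shell version is handy while exploring; the `pathlib` version behaves the same on every platform, which matters once the code leaves the notebook.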

57:37 But if there's, like, really interesting stuff that's happening on the shell you want to use, like, for example,

57:42 there's this sample that says total_lines equals.

57:44 That's a brilliant one.

57:45 Yeah.

57:45 Which is a Jupyter level command.

57:48 And this says bang, then an arrow to redirect input from some text file, word count dash l.

57:53 And that tells you how many lines are there.

57:54 Like, yeah, maybe you want to make this all Python so it doesn't, it's not platform dependent, right?

57:59 Sure.

58:00 If you want to package this up, right?

58:02 In that package.

58:03 But if you only end up doing it once and you just want to know how many lines there are, that's a way to do it.
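If the line count does need to be platform independent, a pure-Python version is short. This is a sketch with a made-up file name and contents; it streams the file so a large file is not loaded into memory at once.

```python
import tempfile
from pathlib import Path


def count_lines(path):
    """Count lines by iterating over the file one line at a time."""
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)


# Demo on a throwaway file (name and contents are invented for the example).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "some_file.txt"
    sample.write_text("first\nsecond\nthird\n", encoding="utf-8")
    total_lines = count_lines(sample)

print(total_lines)
```

One small difference from `wc -l`: that tool counts newline characters, so a final line without a trailing newline is counted here but not by `wc -l`.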

58:08 Right.

58:09 You're in, like, an exploratory situation, right?

58:11 And you just generated this file and now you want to know.

58:13 I don't know.

58:14 So there's, that's a really powerful one.

58:16 I'm glad you brought that up.

58:17 And then just the last one from there is, because it definitely saved me in a couple of very hairy situations, is percent history.

58:25 So that basically gives you a history of the 20, 10, or however many last commands you executed in a Python session.

58:35 Why it's sometimes useful is just imagine you accidentally delete a cell in a notebook.

58:41 I mean, you can always undo that, right?

58:44 But every once in a while, your undo history is so horribly messed up that you somehow sort of lost that cell forever.

58:52 And there was, like, a lot of finely tuned code in that.

58:55 So you can go back to that with percent history and get your deleted cells from beyond the grave.

59:01 Yeah.

59:02 And you can do percent history dash in five.

59:05 Just show me the last five changes.

59:06 Yeah.

59:07 Yeah, it's really nice.

59:08 Cool.

59:09 All right, Markus.

59:10 I think we might be out of time, but, you know, what a cool project this awesome list is.

59:15 We've got a lot of people listening in.

59:17 I'm sure there's a lot more awesome packages, tools, resources that could and should go on this list.

59:25 So if whoever is watching or listening to this has a package that they feel should be mentioned there, it's just as easy as doing a pull request and adding your favorite tool or resource to that list.

59:38 We're very inclusionist on the list, right?

59:41 And we like to include people's suggestions.

59:43 Yeah, fantastic.

59:44 There's a couple that I could see showing up there, like a few comments in the live stream that could be PRs.

59:50 So have at it.

59:51 That'd be awesome.

59:52 Yeah.

59:53 Awesome.

59:53 All right.

59:53 Final two questions before I let you out of here, though.

59:55 If you're going to write some Python code, what editor do you use these days?

59:59 Jupyter.

59:59 Of course.

01:00:00 I overuse it.

01:00:01 Beautiful.

01:00:02 And then notable PyPI or Conda package or something out there or even a Jupyter plugin.

01:00:08 I know basically this entire show has been one after another, but something you want to give a shout out to.

01:00:13 If I can just give one a shout out, it's Altair.

01:00:16 Yeah.

01:00:17 You should be doing the plotting in Altair.

01:00:18 Altair right on.

01:00:19 Yeah, it's quite nice.

01:00:20 All right.

01:00:21 Well, thank you so much for being here.

01:00:22 It's been great to have you on the show.

01:00:24 Thank you for having me.

01:00:25 Yeah, you bet.

01:00:25 Bye.

01:00:26 Have a good one.

01:00:26 Bye.

01:00:26 This has been another episode of Talk Python To Me.

01:00:30 Thank you to our sponsors.

01:00:32 Be sure to check out what they're offering.

01:00:34 It really helps support the show.

01:00:35 AWS is the lead cloud for developers, but with over 250 services, it's an overwhelming set of choices.

01:00:43 That's where the AWS Insiders podcast comes in.

01:00:47 Their job is to help you make sense of all those AWS options.

01:00:51 Listen to an episode at talkpython.fm/AWS Insiders.

01:00:55 Take some stress out of your life.

01:00:57 Get notified immediately about errors and performance issues in your web or mobile applications with Sentry.

01:01:03 Just visit talkpython.fm/sentry and get started for free.

01:01:08 And be sure to use the promo code talkpython, all one word.

01:01:12 Want to level up your Python?

01:01:13 We have one of the largest catalogs of Python video courses over at Talk Python.

01:01:17 Our content ranges from true beginners to deeply advanced topics like memory and async.

01:01:23 And best of all, there's not a subscription in sight.

01:01:25 Check it out for yourself at training.talkpython.fm.

01:01:28 Be sure to subscribe to the show.

01:01:30 Open your favorite podcast app and search for Python.

01:01:33 We should be right at the top.

01:01:34 You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

01:01:44 We're live streaming most of our recordings these days.

01:01:47 If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

01:01:55 This is your host, Michael Kennedy.

01:01:57 Thanks so much for listening.

01:01:58 I really appreciate it.

01:01:59 Now get out there and write some Python code.

01:02:01 I'll see you next time.