#493: Quarto: Open-source technical publishing Transcript

Recorded on Friday, Dec 20, 2024.

00:00 In this episode, I'm joined by JJ Allaire, founder and executive chairman at Posit,

00:03 and Carlos Scheidegger, a software engineer at Posit, to explore Quarto, an open source tool

00:10 revolutionizing technical publishing. We discuss how Quarto empowers users to seamlessly transform

00:15 Jupyter notebooks into polished reports, dashboards, ebooks, websites, and more.

00:20 JJ shares his journey from creating RStudio to developing Quarto as a versatile multi-language

00:25 tool while Carlos delves into its roots in reproducibility and the challenges of academic

00:31 publishing. Don't miss this deep dive into the tool that's shaping the future of data-driven

00:36 storytelling. This is Talk Python to Me, episode 493, recorded December 20th, 2024.

00:42 Are you ready for your host? There he is.

00:46 You're listening to Michael Kennedy on Talk Python to Me. Live from Portland, Oregon,

00:51 and this segment was made with Python.

00:53 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy.

01:01 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython,

01:06 both accounts over at fosstodon.org, and keep up with the show and listen to over nine years

01:12 of episodes at talkpython.fm. If you want to be part of our live episodes, you can find the live

01:18 streams over on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube and

01:23 get notified about upcoming shows.

01:26 JJ, Carlos, welcome to Talk Python to Me. Great to have you two here.

01:29 Thanks. Great to be here.

01:30 Thank you. It's great.

01:31 It's really fun to have you both here. It's the week of Posit for me here at Talk Python,

01:36 just coincidentally having two groups from Posit on at basically almost the same time, right?

01:44 Last recording, last episode was Great Tables, which made me have a much deeper understanding and

01:49 appreciation for what you can do with tables. I thought, well, what is there more than kind

01:54 of an Excel look to it, you know? Formatting, maybe?

01:56 Yeah. That's what most people think, and then they see like four examples and like, oh,

02:00 yeah, that's right. I see.

02:02 Yeah, exactly. But not tables. Well, there might be a table in the presentation today,

02:08 but that's not the topic. It's more about leveling up your notebook game, right?

02:13 Yeah, I think that's the most sort of straightforward win for Python users is sort of the ability to go

02:22 from a Jupyter notebook to a website, to an academic article, to a report with parametrization,

02:30 to an e-book, sort of growing from a notebook to the many different ways in which you want to share

02:34 it with people.

02:36 I mean, put maybe just a little bit more broadly, you know, we data scientists create all kinds of

02:42 very, very valuable things with their code. And this can be, as we said, you know, tables,

02:48 it can be visualizations, it can be the results of models. And then how do I project that out into the

02:55 world? How do I make it valuable for other people? And a lot of times that means turning it into some kind

03:00 of production output, you know, a PDF, a report, a dashboard, you know, a website. And so I'd say

03:07 that's, you know, in the large, what Quarto is about.

03:10 Yeah, I think it's a super cool tech. I've been hearing a lot of people recommend it lately, which

03:16 is always a good sign, always encouraging. Before we dive into that whole side of things, though,

03:21 let's hear a little bit about yourself. You know, who are you, quick introduction, and how'd you get to

03:26 working on Quarto and data science and all that? Sure. My name is JJ Allaire, and I have been a kind

03:33 of software tool builder for, I guess it's almost 30 years now, which dates me quite a bit. But I've

03:40 worked on lots of different... That catches up on you. It's really like, what? Yeah. And I've worked on,

03:46 you know, web servers and development tools and authoring tools and programming languages and lots

03:52 of different things. But about, I guess, coming up on 14, almost 15 years ago, 14, 15 years ago,

03:59 I decided that I wanted to focus on open source software. And I had early in my career, prior to

04:06 becoming a software engineer, I was very into quantitative analysis for economics and political

04:10 science. And I used a bunch of the tools that were available at the time. And so I, when I found out

04:15 about, this is actually even before, like, PyData was a thing, before Pandas was a thing, I was like,

04:21 oh, there's open source statistical computing. R, amazing. I can just work on that. So that's kind

04:27 of what I did. I set out to improve the, I'd say, the computing tools around R, and I worked on RStudio.

04:34 And then that has, I won't go into all the gory details of that. But really, that led into a lot of

04:40 these use cases, because a lot of the people using R were trying to figure out how to share and publish

04:45 their work, whether they be writing scientific articles or inside a company presenting things.

04:51 And so I worked on a thing called R Markdown, which was like, you know, merging R and Markdown

04:57 for publishing things. And then that more recently led to, wow, like, it's great that we did this thing

05:04 for R, but that's like 10% of the universe. So can we generalize this? And that was like the

05:11 genesis of Quarto, like, let's generalize this idea and make it truly multi-engine. And it really is.

05:17 It works with R, it works with Jupyter, it works with, you know, the Julia, is it the Pluto? No,

05:25 no, the Julia. There's a Julia engine. There's two Julia engines, actually. And it works with

05:30 Observable. So the idea was like, let's take this core idea and make it work everywhere where there's

05:35 interesting computations happening. So anyway, that's a little about me.

05:38 Yeah, amazing. And if you capture or have tools for the Python space and for the JavaScript space

05:46 and the R space, that's a good coverage of the data science world.

05:50 It seems to be, yeah. Yeah. That was our idea. Yeah. Awesome. Carlos, how about you?

05:54 Thanks. My name is Carlos Scheidegger. I'm an engineer here at Posit. I've been working

06:00 on Quarto since I joined about two and a half years ago. My background is in computer science. I'm a

06:07 recovering academic, like I tell folks. I came to the US in 2005 for grad school. And I was working on a

06:16 completely different area. I was working in sort of like geometry processing, computer graphics.

06:20 But as part of my work in research, I started realizing that a lot of the experiments we were

06:25 publishing in papers had sort of a depth of like parameter settings and sort of choices and things

06:31 that like really dominated the output result. And it was really hard to go on a paper and see what was

06:36 happening there and sort of be able to figure out like what were the choices that someone made when

06:41 they published the work? How do I compare my work to theirs? And so I switched and I, as part of my

06:45 graduate work, I built a system in Python for sort of reproducible computational analyses and sort of,

06:52 how do you share this work and sort of, you know, as you're sharing like a PDF, you'd get like all of

06:56 the parameters and all of the things that came with them. So this was back in, you know, 2008, sort of

07:01 ancient history at this point. I then sort of joined AT&T research at that point and they have a really

07:08 strong R team there and sort of started doing some related work there. And that's sort of how I crossed

07:13 paths with JJ as they were founding RStudio back a long time ago. And we more or less kept in touch.

07:20 Yeah.

07:20 So I eventually joined academia and kept interested in sort of this work on how do we make sure that the

07:26 experiments that we have are as easily shareable as the artifacts that we end up all reading, like the PDFs,

07:32 websites, but if you need to go back to figure out what actually happened there, we want to make

07:37 that as easy as possible. And so the idea of Quarto resonated really strongly with me. And so as I was

07:42 realizing that there's a lot of academia that I didn't, wasn't a great fit for me personally,

07:50 we can have a separate conversation about that. I reached out to JJ and he talked about Quarto and said,

07:56 this sounds like a great project and we really hit it off and I've been working on it since,

08:01 since we started. So that's how I ended up here.

08:04 That's cool.

08:04 That's right.

08:05 I think there's a lot of overlap on the data science side between R people, Python people,

08:10 and R tools and Python tools.

08:11 There's a lot more overlap than I think folks realize from the outside. And when you sort of notice

08:18 that like the people working on these tools end up sort of picking the tool that's best for the job

08:22 and Python is fantastic. Like I've been writing it for 20 years and sort of you find people trying

08:27 to find similar solutions to problems and sort of some things are more easily expressible or easier

08:33 to reach for in one language or another. But I think in 2024, the story is that more often than we realize

08:40 like teams and even people are like, they're polyglots, right? So they will speak like they will write

08:45 enough Python, they will write enough R, but they want something that sort of works together as well as you

08:50 can. And so that's something that we've focused on in Quarto very, very explicitly. So you can have a

08:55 website in which, you know, half of your team is working in R and half of your team is working on

08:59 Jupyter notebooks, and they both can work in the environments that are most comfortable.

09:05 Yeah.

09:05 Or Observable.

09:06 So directly in HTML and JavaScript.

09:09 So that's really part of what we're trying to do is sort of make it as natural as possible for people to keep

09:16 continuing their work, but then be able to publish it and sort of not forget like all of the computational

09:21 things that came with it and make it easily accessible.

09:23 I think also another demographic, another group in this is the people who are receiving the research,

09:30 receiving the stuff. They might not be either of those people. They don't want to see a bunch of code

09:35 and then a df.head() and then a plot. They want a report-looking thing.

09:41 That's right. Yeah. And there's, we've definitely done a lot because it really does vary. I would say

09:46 the preponderance of people are in that category. I would say maybe 80, 85% of the people are in that

09:51 category, but then you do get the, the odd person who says, well, wait a minute, you know, where,

09:55 where did this come from? So we have a bunch of things, you know, like for publishing notebooks,

09:59 that'll let you, you know, interactively show the code or not show all the code. Don't,

10:04 you know, show the source code. You know, we have a feature where like you see a plot in a,

10:08 you see a plot in a website and it just, it's just as a plot. There's nothing. And there's a

10:13 little text here at the bottom that says source notebook. You click it. Now, suddenly you're in

10:17 the notebook behind that plot. So ways to make it available, make the computations available without

10:22 hitting people over the head. That's cool. It's almost the equivalent of right-click view source

10:27 of the early web. Right, right, right. Exactly. But that's right. Yeah. That's very much.

10:31 But really to your point, I mean, that's why when I say production output, it's like the quality of

10:38 output that people are used to getting, you know, out of, like, Office, or when people make

10:42 professional PDF reports, that's kind of what people are looking for at some level. Sometimes

10:47 people just want, Hey, just give me the information. I don't need it to be, but a lot of times they want

10:50 the production quality. And so we do spend quite a bit of time on making that possible.

10:54 Sure. And I would say the other thing that it seems like this solves a bunch is the,

10:59 I have a notebook I created myself. How do I share it with the world? Yeah. Yeah. Right. A lot

11:04 of data scientists are like, there's interesting solutions out there, but there there's a lot of

11:08 uncertainty on what to do. You know, do I need to learn how to do Django or Flask? Probably not.

11:14 Right. Right. Do you lean on something like Streamlit where you kind of program to their model,

11:19 but it's cool and interactive, right? What do you do? You export to a PDF and just, or HTML, just here's

11:26 the notebook, but static, right? None of those are... Streamlit's a pretty interesting option, but it's,

11:30 it's a very focused sort of tool, right? It's not a publication tool for sure. It's more interactive

11:36 tool. For sure. Yeah. And I think that one of the really interesting things that happens when you

11:41 sort of engage with the folks that are actually trying to produce this, like in, in a company or,

11:47 you know, in an academic department or like in a government agency, those, those stories you just

11:52 said, like, do I want a PDF? Do I want an HTML? They are all true, but for different stakeholders.

11:57 Yeah. Yeah. Right. So, you know, if I'm a data scientist in a company and I want the CTO to see

12:02 what I'm doing, the CTO is not going to have time to go pore over the sort of gigantic report that has

12:07 all the details. They want something that has up-to-date data that looks in a sort of expected,

12:12 like might have the company branding and so on. And ideally something that's like emailed to them,

12:16 like on a daily basis. Right. So we want the same notebook that you've used to sort of create your

12:23 analysis to go as easily from that into something that can be like a scheduled send with CI or with

12:29 perhaps one of Posit's commercial offerings, or, you know, you can easily publish it as HTML,

12:34 through GitHub pages, through Netlify or through your own provider or...

12:38 Yeah. I see. So you get basically a static site that you can...

12:41 That's right. ... publish as you see fit, right?

12:43 Yeah. Yeah. Okay.

12:43 That's right. So at the very basic, right, sort of jumping the gun a little bit, Quarto is a

12:48 command line tool that you can use and you can take your notebook, right, as an input, like your

12:53 .ipynb file and just say quarto render, and then we will produce an HTML file. So you're sharing your

13:01 screen now. So if you go and get started on quarto.org's webpage, which is where we are right now,

13:06 you sort of, you see there's a tutorial for computations on the left sidebar. And then there's

13:12 sort of a number of different pathways that you can go through there. And one of them is Jupyter,

13:15 which is actually the one that you're seeing there by default. And so in here, we're highlighting that

13:20 you can take like your Jupyter notebook or your JupyterLab input. Once it's saved to a file, you can just use

13:25 Quarto directly to produce an HTML. So you can say quarto render to HTML, you can say quarto preview,

13:31 and that will generate sort of like a local hosted website, or you can say quarto publish.

13:36 And then we have publishing sort of targets to Hugging Face, to our own free Quarto Pub service,

13:43 to GitHub pages, to a number of other places. So at a very high level, you can take your notebook as it

13:49 is, and it can produce a really nice looking HTML page that's available.
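
The three commands mentioned here can be sketched as follows. This is a minimal sketch, assuming the `quarto` command line tool is installed and on the PATH; `analysis.ipynb` is a hypothetical notebook name and `quarto_command` is an illustrative helper, not part of Quarto itself.

```python
import shutil
import subprocess
from pathlib import Path


def quarto_command(action, target=None, *extra):
    """Build the argument list for one Quarto CLI call (hypothetical helper)."""
    argv = ["quarto", action]
    if target is not None:
        argv.append(target)
    argv.extend(extra)
    return argv


# Render a saved notebook to HTML (analysis.ipynb is a made-up file name).
render = quarto_command("render", "analysis.ipynb", "--to", "html")
# Start a local preview server for the same notebook.
preview = quarto_command("preview", "analysis.ipynb")
# Publish to GitHub Pages; "quarto-pub" is another provider name.
publish = quarto_command("publish", "gh-pages")

if __name__ == "__main__":
    for argv in (render, preview, publish):
        print(" ".join(argv))
    # Only run the render step when both Quarto and the notebook exist.
    if shutil.which("quarto") and Path("analysis.ipynb").exists():
        subprocess.run(render, check=True)
```

Building the argument lists separately from running them keeps the sketch safe to execute on a machine without Quarto installed.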

13:52 This episode is sponsored by us here at Talk Python. Of course, you can support Talk Python by checking out

13:59 our courses at talkpython.fm/courses and recommending them to your colleagues and team.

14:05 But this time around, I want to just encourage you to connect with us on a few other channels.

14:10 You're listening to the podcast, and that's really great. But did you know that we also have a blog?

14:15 It has a couple of really interesting articles I recently posted. Jump over to talkpython.fm/blog

14:22 and subscribe to the RSS feed there. And we also have a great mailing list where in addition to

14:28 announcements about Talk Python, I also share some tools and tips that I learned recently.

14:32 So if you're a fan of email, just visit talkpython.fm and click newsletter to sign up. That's it.

14:37 As always, thanks for listening.

14:39 And one of the really fun recent things that happened is Hamel Husain, the nbdev guy, sort of

14:48 fantastic author. So he's working for Answer.AI now, and they just published a service where you can

14:54 literally go on Answer.AI and point to a GitHub repository that has an .ipynb file, and they will

15:01 dynamically sort of on the backend, find that GitHub page, create the Quarto page from your notebook,

15:07 and then just publish it. Right? So like sort of zero click, if you have a public repository on

15:11 GitHub and you just want an HTML page to share with someone else, you literally do not have to do

15:16 anything. You can just go there, get the output. And so that's a really sort of like zero click way to do

15:22 that. Yeah. Go ahead. I was going to say it's called nbsanity. You can find it at nbsanity.com.

15:30 In the show notes also, I think there should be a link to Hamel's blog post announcing it.

15:36 And he goes into more details because one of the cool things is that you can actually use,

15:40 even if you're not running Quarto, you can put a bunch of the Quarto options in there. You can say,

15:44 oh, I want to show source code or not show source code. Or make source code expandable. Or,

15:49 and you, if you just put those in your notebook, when they render it, then they actually use those.

15:54 So it's a, it's sort of a way to use Quarto to publish your notebooks without even

15:57 installing Quarto. So it's, it's pretty cool.
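
A minimal sketch of what "putting Quarto options in the notebook" can look like, assuming the usual Quarto convention of YAML front matter in a leading raw cell plus `#|` comment options in code cells. The title, file name, and the `make_notebook` helper are hypothetical, and which options a hosted service honors is up to that service.

```python
import json

# YAML front matter that Quarto reads from the first raw cell of a notebook.
FRONT_MATTER = """---
title: "Sales report"
format:
  html:
    code-fold: true
---"""


def make_notebook(front_matter, code):
    """Return a minimal nbformat-4 notebook dict (hypothetical helper)."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            # Document-level options live in a raw cell at the top.
            {"cell_type": "raw", "metadata": {}, "source": front_matter},
            {
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,
                "outputs": [],
                # "#| echo: false" asks Quarto to hide this cell's source.
                "source": "#| echo: false\n" + code,
            },
        ],
    }


notebook = make_notebook(FRONT_MATTER, "print('hello')")

if __name__ == "__main__":
    # Save it so it can be opened in Jupyter or rendered with Quarto.
    with open("report.ipynb", "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
```

Here `code-fold: true` makes source code collapsible in HTML output, while `echo: false` hides the source of one cell entirely.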

16:00 Yeah. It basically, if you can just figure out the URL to put together, it's, it'll just pull it

16:06 together for you.

16:07 Yeah. And I think, I think that is like the concept of Quarto, which is okay. I have Python code or I have

16:14 a notebook and I want to project it. And I want to do that in a flexible way. And that's, you know,

16:19 that sort of web is like a default case. But I think the thing that people should also understand is that

16:24 we really do go quite a ways beyond that with like making PDFs, making Word documents, you know,

16:29 stitching things together in a website, you know, making presentations, you know? So it's,

16:35 there's a lot there. One of the curses of it is actually, it can be a little hard to explain.

16:40 It's like, what is Quarto? Oh, publish anywhere, anything, any computation, any language.

16:44 That doesn't really concretely connect with anything I'm thinking that I want to do, you know? But,

16:51 but if you dig into it, it does, it does, I think, connect pretty well to the things that people

16:56 want to do. But I think there's, there's interesting value to be gained from those kinds of decisions.

17:01 Michael, I know you had recently Charlie Marsh on the podcast and uv and Ruff and all these tools,

17:06 right? And sort of uv has taken a few of these moves that like, you know, make perfect sense,

17:11 which is as you start sort of like consolidating these tools, there's a lot of really interesting

17:15 use cases that arise naturally from the way you can compose them, right? So yes, we have render,

17:20 we have publish, we have preview, and it takes a little bit of time for you to explain. But in the

17:24 end, you get the system that sort of like grows with you. And like, we hope it will be the tool

17:29 that people reach for when they have sort of like data science, technical content production needs.

17:34 And so we hope to- Instead of trying to, like yourself, trying to piece together, say,

17:38 I'm going to try to convert my notebook to markdown and then use Pandoc on this and run this tool

17:44 against it. If you guys own that whole workflow, then there's new opportunities. Is that what you're saying?

17:48 That's right. That's correct. That's right. Yep. Yep.

17:50 Yep. So as an example, folks have, you know, we talked about Hamel and nbdev. Another web page that

17:58 folks might have known or sort of website and company is fast.ai, right? So Jeremy Howard's

18:02 Yeah. Sort of initiative. fast.ai is a Quarto website, sort of, you know, like it just,

18:08 yeah, so just go on fast.ai right now. If you open the source. Yeah. So fast.ai is the, yeah, got it.

18:17 Yeah. Right there. Yeah. So this is actually a Quarto website. So the blog here is, but it's all of

18:22 these apps. So if you see there, it's published with Quarto 1.6.35 or something in there. And these are all

18:29 directly from notebooks. So the entries in here on the blog entries, they're all .ipynb files. And then,

18:36 Quarto, among the other things we can do is sort of collate all of the different sort of blog

18:42 entries you have, make listings so you can filter them. So this is very useful for a blog. It's useful

18:46 for folks that have sort of like an ongoing set of reports that like need to be sorted chronologically.

18:51 If you have an academic website and you want to list your articles, that kind of stuff comes up all the

18:56 time. But the input for all of those are .ipynb files, right? Yeah, that's right.

19:02 So this will work entirely in .ipynb. And so from there, we generate the HTML. We sort of extract the

19:08 headers and create the listings all sort of automatically. So this is like an entirely

19:12 Quarto website.

19:14 Quarto, I guess it makes sense being a static website that it's pretty well suited for this.

19:19 A lot of these tools that I talk to people about, they're like, this is a great thing for internal

19:25 publishing of our site, or it's a great thing if you were going to interact with it, but it's kind of

19:30 heavyweight and it doesn't really support a lot of design. And it's not a, you would never build your

19:35 public website with this is like what I hear from a lot of these tools, right? But it's,

19:39 this is pretty excellent.

19:40 You definitely can.

19:41 Yeah, we leaned, yeah, we leaned pretty hard into, into, you know, allowing a lot of CSS customization.

19:51 So we use Bootstrap and we have a whole theming system that's really based on Bootstrap themes,

19:56 but you can use SCSS directly to create your own themes. So, you know, it's, it's, you can just go

20:01 like, oh, make me a nice website. I don't want to think about it. People do that. But then people

20:04 who really want to like party on, I want to make this look just so can absolutely do it.

20:10 So yeah, again, we might, because I was going to say, we might miss a little bit of the details

20:15 for folks who are only listening, but another example that you might want to search just to

20:20 show sort of, you know, the range where you can go from that. It's also another Quarto website.

20:24 So if you just search for Real World Data Science, this is an initiative from the Royal

20:28 Statistical Society. They have a website, which yep, right. That one. So they have a number of

20:35 different writers that write sort of like columns and content pieces and ideas. This is also a Quarto

20:40 website. So I believe this one is based on markdown. So they are using instead of .ipynb inputs,

20:46 they use R Markdown, but the same sort of thing. And the only reason I'm, I'm bringing this up is that

20:53 you can see how very different it looks, right? So it has categories and things like that. But if you want to

20:58 take sort of, you know, the time to do these and sort of have all the like the banner images and sort of like that

21:03 kind of thing, you can absolutely do it. And in the end, you get to publish through a number of different

21:10 places that you might want to. Cool. Yeah. For people listening, it kind of has that landing page look with the

21:15 hero section with big pictures, right? You know, it doesn't feel very notebooky at all. That's for sure.

21:20 Another use case out in the chat is Kevin O'Malley says, "I teach a summer school on using Quarto for

21:27 academic publishing in the university I work in. It's a great tool, but we spend a lot of time thinking

21:31 about how to make documentation better. Interesting." Yeah. Documentation is a, I mean, part of,

21:36 we, I think Quarto's documentation is very good. Like I honestly say this without, I think it's, you know,

21:44 I don't want to say best in class because folks have actually realized how important documentation

21:49 is and there's a lot of really good documentation out there, but I think we're up there. I think our

21:53 documentation is very good. It doesn't mean that it couldn't be better. And I think part of the issue

21:59 with Quarto is that diffusion of entry points that we talked about earlier, right? Do you want to make a

22:05 single PDF? Do you want a website? Do you want dashboard? Yeah. And all of those become different

22:11 places. And are you coming from a Julia background? Are you coming from a Python background? Are you

22:14 coming from an R background? So we tried to create these like different entry points on the get started

22:19 webpage and our guides are customized for the VS Code experience, for the Jupyter experience and so on.

22:25 But we a hundred percent agree that documentation remains one of the hardest parts for us.

22:30 If you click on the guide, on the guide tab at the top, you'll see that, I know everyone

22:38 can't see this, but like, it's just such a vast number of things and it is easy to feel lost.

22:44 We try to organize it, but it's like, what does this not do? You know? So, so a little bit of a

22:50 choose your own adventure. It's a little bit of that, which is not always what people are looking for.

22:53 They might be looking for like, no, tell me, I want to get this to happen. Just give me step one through

22:58 four. And, I think we could probably stand to do a little more of that.

23:02 Yeah. But here, this tree, this is kind of a decision tree, you know, like,

23:05 what are you trying to do authoring? Are you trying to do diagrams? Okay. Are you using Python? Okay.

23:10 You go down this path and then we'll help you.

23:12 Yeah. Yeah. Yeah. Yeah. We, one, one thing I want to touch on just, I know a lot of people have

23:16 not experienced this. I know you two have, I have, and it's mind blowing. This is somewhat built upon

23:24 Pandoc, right? Yep. Yep. Yes. We can. And Pandoc is ridiculous, right?

23:29 It's ridiculous. Yeah. It is like the most, the biggest workhorse that no one, software that

23:34 no one knows about, you know, it is, it is an incredible piece of software. Yeah. Yeah. If you're

23:39 like, well, I have a, a LaTeX document and I want it as a MediaWiki markup format. Could I possibly,

23:48 yeah, you could just convert those? Yeah. If you, if people are looking at the pandoc.org website,

23:52 yeah. On the right, there's a little gray. Yeah. I don't know what you would call it,

23:57 but it's a gray. It looks like if you drew the diagram for rock, paper, scissors, but 500 ways,

24:02 it shows all the connections. Exactly. All the connections.

24:05 All the connections. How do I get, yeah. The, basically the given one format,

24:10 what other formats can they go to? And literally it's illegible because it converts,

24:14 converts so many things to others. And that's a really interesting foundation for you guys, right?

24:18 Well, I want to talk about this a little bit because I think that the, the really big idea

24:22 behind Pandoc is the, the, a consequence of the big idea is that you can do all this conversion.

24:27 But the big idea behind Pandoc, if you, if you're familiar, like in programming languages,

24:32 when you're parsing and writing interpreters, you have this abstract, abstract syntax tree,

24:37 right? Or IL, there's sort of an abstract representation of the program. And then you can compute

24:43 on it. And I think what Pandoc has done is they've created essentially an abstract representation of

24:48 a document. And so we think of documents as just like, oh, we just spray a bunch of content

24:53 into a file and it's just a big blob. Pandoc views documents as something you can compute on.

24:58 And so all the things that might be in the document, you know, tables and footnotes and figures and

25:03 bullet lists and, you know, all these things are in an abstract model. And so what, what happens is it

25:09 takes that, you know, whatever that is, word doc or markdown file, and it brings it into its abstract

25:14 model. And now you can actually compute on the document. And that's really actually the heart

25:18 of how Quarto is able to do, you know, Quarto is built on Pandoc. And that's how we're able to do

25:22 almost everything is that we're actually not just dealing with text and markup. We're dealing with

25:27 this sort of abstract model of a document, and then we can do all these powerful things with it. So I

25:32 don't know if you want to expound on that. You probably could expound way too long on that,

25:35 Carlos, if you, if you chose to, but, but I think the shortest, important bit is yeah. So

25:43 Pandoc operates on this abstract syntax tree. It's a document that has paragraphs. Paragraphs can have

25:49 spans with, you know, strong text and other text and things like that. And Quarto, you can think of

25:55 Quarto as a very, very big orchestrator of Pandoc and sort of like configuration,

26:01 orchestrator or like choreographer for Pandoc. So Quarto itself is a command line application. We ship

26:08 Pandoc with it. So like our bundles all include Pandoc with it, but fundamentally we are a TypeScript

26:14 application that sort of puts itself in front of Pandoc and then after it, right? So, you know,

26:20 all of the complicated things you might want to do to generate multiple websites, to extract bits of the

26:26 documents, to know the titles, to create your blog posts and your entries, right? So Quarto gets in front of it,

26:31 does all of that orchestration work, then calls Pandoc a number of times and then calls sort of

26:36 some post processors. And the way this integrates with engines and Jupyter and so on is our, what we

26:43 call engines in Quarto are the things that turn your document that has executable code into the

26:48 document that has the results of the execution, right? And so all that, that needs to happen for

26:54 an engine in Quarto to exist is that it takes Markdown or Jupyter notebooks as input, and then it produces

26:59 Jupyter notebooks as output or the Markdown annotated with those results. And then we just sort of

27:06 process them and send to Pandoc, right? So really Pandoc is, is at the center of what we can do with

27:11 Quarto. And you can think of Quarto as just sort of sitting around it and sort of expanding the scope

27:16 of the things you can do with Pandoc.
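
To make "computing on a document" concrete, here is a minimal sketch that walks a small hand-written tree shaped like Pandoc's JSON AST and pulls out the headings, roughly the kind of information a listing page would need. The sample document, the `stringify` and `headings` helpers, and the API version number are illustrative assumptions; this is not how Quarto itself is implemented.

```python
# A tiny hand-written document in the shape of Pandoc's JSON AST
# (the kind of structure `pandoc -t json` emits). Values are illustrative.
doc = {
    "pandoc-api-version": [1, 23],
    "meta": {},
    "blocks": [
        {"t": "Header", "c": [1, ["first-post", [], []], [
            {"t": "Str", "c": "First"}, {"t": "Space"}, {"t": "Str", "c": "post"},
        ]]},
        {"t": "Para", "c": [
            {"t": "Str", "c": "Some"}, {"t": "Space"},
            {"t": "Strong", "c": [{"t": "Str", "c": "strong"}]},
            {"t": "Space"}, {"t": "Str", "c": "text."},
        ]},
        {"t": "Header", "c": [2, ["details", [], []], [
            {"t": "Str", "c": "Details"},
        ]]},
    ],
}


def stringify(inlines):
    """Flatten a list of inline nodes into plain text."""
    parts = []
    for node in inlines:
        kind = node["t"]
        if kind == "Str":
            parts.append(node["c"])
        elif kind == "Space":
            parts.append(" ")
        elif kind in ("Strong", "Emph"):
            # Formatting wrappers hold a nested list of inlines.
            parts.append(stringify(node["c"]))
    return "".join(parts)


def headings(document):
    """Return (level, text) for every Header block in the document."""
    return [
        (block["c"][0], stringify(block["c"][2]))
        for block in document["blocks"]
        if block["t"] == "Header"
    ]


print(headings(doc))
```

Because the document is a tree of typed nodes rather than a blob of text, extracting titles or rewriting paragraphs becomes ordinary data manipulation.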

27:18 We've done a few more things. Like if you, to use Pandoc typically, you know, there's, you know,

27:23 160 command line options and you just kind of figure out how to, you know, it's tremendously powerful.

27:29 So we've tried to, I would say, organize that experience a little bit. So it's like, oh,

27:33 I just say I want a PDF and it's in YAML and I do a couple options and it does the right thing. So I think,

27:38 you know, you can think of Pandoc properly as like this sort of engine that you can do anything

27:43 with. And we try to give you like the happy path to a bunch of things that you probably want to do.

27:48 Yeah. I think that's hugely valuable because while Pandoc is great, it's also

27:52 super complicated. And a lot of times, if you want to combine different documents, you know, maybe I'm

27:58 actually working on a project that has a bunch of Markdown files, but they really need to be one

28:03 Markdown file in a certain order. And then that thing gets processed. But then as it gets transformed,

28:08 there needs to be changes to it. And right. You know, I can program, I can write that code to

28:12 do it, but there's a ton of common use cases, like take this notebook and publish it on the web.

28:17 Yeah. Right. It could just be built into your tool. Right.

28:20 So one of the thing that, I'm sorry, go ahead, you're going to go ahead.

28:23 I was going to say, even the thing you're talking about, I need to paste together a bunch of files

28:28 and turn it into a single file or turn it into a book or turn it into a, you know, those are things that

28:32 we facilitate as well, you know, like this, this is really a book

28:37 actually. And it's got 20 chapters and each one is in its own file. And when it's a website,

28:43 I just want it to be like a website that lets you navigate my book. When it's a Word doc,

28:46 I want them all concatenated together into a Word doc, you know, et cetera.

28:50 Right. And then the final destination's an EPUB or a PDF or something.

28:53 Yeah. EPUB. Yeah. Or I went, yeah. EPUB or PDF. Sure. Yeah. Make the EPUB,

28:57 you know, create the EPUB archive just so, just so. And yeah. So sorry, Carlos.

29:02 I was just going to say that to Michael's use case of sort of having a number of markdown documents

29:08 and so on. I wouldn't presume to say you should use Quarto. Maybe it's a good tool for you. But one of the

29:14 things that we do believe in very strongly is sort of making sort of, it's a principle that we try to

29:18 abide by. It's sort of making hard things easy, but never at the expense of making very hard things

29:24 impossible. So, you know, we will provide you the happy path. We sort of, you know, the standard

29:29 YAML options, we have validation for them. We have completion, we have sort of integrated documentation

29:34 if you're in VS Code, for example. But if you need to actually extend things, we give you a number of

29:40 escape hatches and entry points. And so if you're in a Quarto project, for example, we have a fairly

29:45 complete system of like pre-render scripts and post-render scripts, or we'll give you the set of things that we

29:51 found on your project. You can run TypeScript code against it. You can run Python code against it. Tell

29:55 us what you've done to pre-process our project or to post-process our project. So if you need to start by

30:00 collating markdown documents or by, you know, going to some database and pulling the documents that you

30:05 want to render on a webpage, all of those things are enabled by this pretty extensive extension system,

30:12 for lack of a better term. Sort of, we try to put extension entry points in as many places as we can,

30:17 so that folks are not stranded when they go from a simple project to inevitably something that's more

30:21 complicated. And so we try to make Quarto grow with them as your project tends to inevitably grow as well.
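
The "start by collating Markdown documents" case could be handled by a small script run before rendering. The following is a minimal Python sketch under stated assumptions: the `collate` helper, the `chapters/` directory, and the `combined.md` file name are all hypothetical, and the wiring into a Quarto project (listing the script under the project's `pre-render` option) should be checked against the Quarto documentation.

```python
from pathlib import Path


def collate(chapter_dir: Path, combined: Path) -> list[str]:
    """Concatenate the Markdown files in chapter_dir, in filename order, into combined.

    Returns the chapter file names in the order they were written.
    """
    chapters = sorted(
        path
        for path in chapter_dir.glob("*.md")
        if path.resolve() != combined.resolve()  # never re-ingest our own output
    )
    parts = [path.read_text(encoding="utf-8").strip() for path in chapters]
    combined.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return [path.name for path in chapters]
```

A script like this would call `collate(Path("chapters"), Path("combined.md"))` and rely on numeric prefixes such as `01-intro.md` to fix the chapter order.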

30:28 Yeah, that sounds really great. And I'm sure there are some pre-built extensions, but I also imagine

30:34 that we could write our own.

30:35 Is that in JavaScript? Is that in any language?

30:39 Okay.

30:40 Is that shelling out to...

30:40 Well, let me tell you. So a couple of the things are pretty open. So what Carlos said about pre-render,

30:52 post-render, sure, any language, that can be just the shell. Okay. There's a thing called Pandoc filters,

30:58 which is essentially take the AST in and transform the AST. Those also can be written in any language,

31:04 filters. So for example, there are two different libraries to write them in Python, but people write

31:09 them in all kinds of languages. That's available. But I would say the official mechanism for extending

31:15 that has more affordances, more APIs, more flexibility that we have is based on Lua, because Pandoc has an

31:21 embedded Lua interpreter in it. So as it's running, they give you access to the runtime and you can do

31:28 all sorts of really flexible things inside Pandoc as you're running. So Lua is, if people aren't familiar

31:36 with it, was originally created actually for embedded game engines. So the idea is we need very, very fast,

31:45 very fast execution, very high level language. It looks not dissimilar from Python, I would say,

31:54 but it runs very fast. So for example, like the Cloudflare kind of, what is it called? Envoy for

32:05 writing super high performance, like REST proxies uses Lua. So like, okay, this code is going to run

32:12 on every HTTP request that happens. We're not writing that in any language that isn't just

32:16 going to be screaming fast. So I think the reason why Pandoc used Lua was they said, well, we can very

32:22 easily embed it and it's going to be fast because when you're processing documents with a lot of nodes

32:27 in them, you really want to be using a language that's very fast at runtime. So it is Lua. And if

32:32 you go to our website, there's a whole section on extensions and we do quite a bit too. Let's see.

32:38 Yeah. Go to where's extensions, Carlos. It's gotta be.

32:41 Yeah. If you just search for extensions, you should be able to find good documentation and sort of,

32:47 yeah, you can start from. Yeah. Yeah. Yeah. Yeah. That's right.

32:51 There's a bunch of ones that people, yeah, these are ones people are, have already done. So they've

32:55 sort of added features. Yeah. Oh, you guys support the concept of short codes.

32:59 Yes, we have.

33:00 Yes, we have short, you can write your own short codes. Yeah, exactly. Yeah. So you, you can write that,

33:05 the two kind of workhorse extensions are short codes and filters. And so short codes are like

33:10 content injectors, filters are content transformers. And so you can see examples of,

33:15 you know, different, different short codes and filters that people have.
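
"Filters are content transformers" can be illustrated with the any-language form of a Pandoc filter: a program that receives the document's AST as JSON and returns a modified AST. This is a stdlib-only Python sketch, not one of the Python filter libraries mentioned and not a Quarto Lua filter. The helper names `walk`, `shout`, and `apply_filter` are hypothetical; the `{"t": ..., "c": ...}` element shape follows Pandoc's JSON representation.

```python
import json


def walk(node, transform):
    """Recursively visit a Pandoc JSON AST, applying transform to every element."""
    if isinstance(node, list):
        return [walk(item, transform) for item in node]
    if isinstance(node, dict):
        node = {key: walk(value, transform) for key, value in node.items()}
        # AST elements are dicts tagged with "t" (type) and usually "c" (content).
        return transform(node) if "t" in node else node
    return node


def shout(element):
    """Example transform: upper-case the text of every Str element."""
    if element["t"] == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return element


def apply_filter(ast_json: str) -> str:
    """Take a JSON-encoded document, transform its blocks, return JSON again."""
    document = json.loads(ast_json)
    document["blocks"] = walk(document["blocks"], shout)
    return json.dumps(document)
```

Used as a real filter, a script would read the JSON from standard input and write `apply_filter(...)` to standard output, and would be passed to Pandoc with its `--filter` option.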

33:19 Yeah. So an example for people who maybe are not familiar with this idea is pretty common in static

33:24 websites. Yeah. Static site generators. For example, Hugo has the concept of short codes as well. Yeah. And

33:30 it's like, I want to write a markdown, but I also want to embed a YouTube video.

33:35 Exactly.

33:35 Exactly.

33:35 Exactly.

33:35 And that is like an iframe and all sorts of things.

33:39 Exactly.

33:39 Yeah.

33:40 The video short code we support natively.

33:42 Yeah. So we have a video short code that's, that's built in. We wrote a video short code built

33:47 in because for this, it's so, it's so common, but you know, people make these specialized ones. One of

33:51 my favorite ones was somebody who was like a short code and you like give it some kind of,

33:54 I don't know what that was. It was like some kind of like chemistry or biology, you know, code. And

34:00 it creates this like interactive molecule visualizer, you know, right in there.

34:04 Yeah. Like a 3d, whatever the language for 3d stuff in the browser is that came out.

34:09 Yeah. And it just, it's wild. And you're just a short code and give it one little code and it just

34:13 does it. So, yeah. So these are, and these are, these are written in Lua, but, and part of that is

34:19 also because we want, again, this idea of, of being, Pandoc supports Lua, but we also want

34:24 people to run these extensions with zero dependencies. So like, if I'm a, you know,

34:28 I write an extension, I don't know if you've got R or Python or like, I don't know what you're, what

34:33 you're, you know, and I don't want to have to satisfy a bunch of dependencies. So we want to have like

34:38 fast, dependency-free, and well integrated with Pandoc. So that's why we ended up with Lua. And we, we like it,

34:43 and it's served us well. We've done, we did some stuff in our Quarto extension. We do a bunch

34:48 of like, we actually add a bunch of like static typing annotations for Lua so that it's like, we get

34:54 really nice auto-complete and linting and stuff. So. Yeah. Cool. You know, JJ, five years ago, if you said,

35:01 I need to write an extension in Lua. Yeah. I would have to learn Lua. Maybe not well, but I would have to

35:07 like, I could probably write that. I could probably write it in Python or, or R and then just tell

35:13 ChatGPT. You definitely could. Yeah. Yes. Yep. So. What do you guys think about this? This crazy LLM stuff

35:20 that's taking over programming? I think, I think it's pretty interesting. I think it's, I would say

35:25 that I was kind of a little wary of it until I saw, I'd say probably Claude 3.5 and o1 are a huge,

35:34 a huge step change in capability. And so I was kind of waiting for that. But seeing that,

35:40 I would say they're already now quite effective at accelerating many parts of development.

35:46 I think part of what we have to do is figure out the, the human, the sort of, what is the right,

35:51 I think for, for many, most types of software for the foreseeable future, it's going to be human in

35:58 the loop. I believe. And so what does the interface for human in the loop? What does it look like?

36:02 Is it, you know, it can go all the way from: do a thing that's complicated, let me see the diff and

36:07 I'll approve it; to: I just want to talk about it and then I'll do it; to: let's talk about it and you

36:12 can do some things, you know, and I'll tell you if they look good or not. And I think that's a really

36:17 interesting area of exploration for, for the tool building community for the next few years.

36:21 It's, I agree. It's super interesting. I, I don't have it build stuff for me, but there's a lot

36:25 of times I'll be like, I could look this up, but I'd, I'd like to see an example from it and then I'll,

36:30 I'll take it and make it mine. Or I have a thing, explain it to me. Like, what does this regular

36:34 expression do? Or what does this curl command do again? I can't be bothered to study them.

36:38 I have a funny anecdote about this. Cause I, cause I do the same thing. Like, let me look this up.

36:43 So I was like, I was writing a VS Code, I was working on a VS Code extension and I was basically

36:47 trying to save some state, you know, per workspace state. And I'm like, I know there's like, there's

36:52 a one liner for this actually VS Code. They have this built in. So I said, I just want to do some

36:56 per workspace state. Like, how do I do that? And it goes, oh, great. And it goes like, I'll make a

37:00 workspace state manager, and it produces like 180 lines of code, you know, and it's got all this kind of

37:05 events. It's got all this stuff. And I'm like, isn't, I was like, isn't there like a one-liner?

37:09 Like, oh, totally. There is a one-liner like here. It's right here.

37:14 You do have to, you know, when you have it, try to build stuff for you, you have to kind of go,

37:18 don't lead the witness too much, you know? Yeah. Yeah. Yeah. For sure. I will just add that

37:26 I think people who try to just say, I'm going to take this thing and just have it build the thing

37:30 and then build the next thing. And if you take it too far, you're going to end up with a pile.

37:34 You're dead in the water. You cannot understand. You cannot maintain. It doesn't quite do the thing.

37:39 That's exactly right. Yeah. Yeah. Carlos.

37:41 Carlos, you spent a lot of time at universities teaching people.

37:47 This is... Yeah. It's interesting. I left, I think, just before this became really,

37:54 really bad. My spouse actually teaches Intro to Python. And so, you know, she's sort of much closer

38:00 to the thick of the bad parts of like sort of exposing like this really access to LLMs for folks

38:05 who don't otherwise understand what's happening. And so I'm of two minds of this. One is I think there's

38:12 something fundamentally complicated for us to handle, us as in like society and humans, which is it is much

38:20 harder to tell that something might be wrong than to generate the output, right?

38:24 Yeah. Sort of it's really complicated, right? Like it's actually just genuinely hard to do so.

38:29 And it's even harder when you don't have the expertise. And so it's absolutely the case that

38:33 there's a number of folks that are using ChatGPT to generate code, to generate sort of solutions to

38:38 homework is like the bad case that happens all the time. Right? Yeah.

38:41 Yeah. But even if you sort of step that aside and say like there's no grade in play or anything

38:45 like that, the risk is that folks really don't have the ability to tell when something is going bad or

38:51 wrong. And I think that's a complicated thing to do. The counterpoint that I think is quite interesting

38:56 is that we have computers right in front of us, right? Like, you know, if ChatGPT gives me like a

39:01 snippet of Python code, one thing I can do with it is I can run it myself, right? Like, you know, I can go and

39:06 check it. Right? And so I think the places where I find it to be extremely valuable are like when it

39:11 tells me something that I can very easily verify whether or not it's the case, right? Sort of like

39:16 I say, okay, can I tangentially, I'm trying to teach myself Rust because it's 2024 and like, that's what

39:21 everyone else is doing. And I found it to be sort of exceedingly valuable to just sort of like, you know,

39:25 ask for something, how do you do that in Rust? And it gives me something and I just, you know, sort of

39:29 use the code and then like, I go look at what it does and sort of, but I'm actually the one running the

39:34 Rust code, right? Or something else is there's, there's a like third party that can verify the

39:38 output in like a meaningful sense. And that combination to me seems really powerful, sort of

39:44 like the ability to sort of, you know, generate these potential things with my ability to quickly

39:48 check whether they are true or not, that seems transformative to me. Totally agree. Yeah.

39:52 And so, so that's where I think it's most exciting work that's going to come out is how do we combine

39:58 like these things? And one fascinating thing to me is that if you think about like Pylance, Pyright or

40:04 Pydantic, these sort of like typing efforts for Python that exist in many other languages, something

40:09 like an LLM gets to benefit a lot from it on both sides, because the annotation is information that it

40:15 can use to sort of figure out what it actually needs to tell you. But you can also use the annotations

40:20 because the annotations means that there are fewer valid programs out there, right? That's what a type system

40:25 does. It disallows some programs to exist. And so if it gives you a wrong result, it's much more

40:29 obvious that that's the case. Right. You might see a type error.

40:32 That's exactly right. That's right. Yeah. And so that to me is where I think all of this stuff is

40:37 happening. If you, so you mentioned you're a mathematician by training, Michael. So Terence

40:43 Tao is a Fields Medalist, sort of very famous number theorist guy. He's been talking a lot about his

40:48 efforts to use Lean, which is an automated theorem prover. So people are sort of trying to formalize

40:53 mathematics through like, you know, automated theorem prover programming language that has

40:57 like formal semantics and so on and using LLMs for sort of coming up with ideas for like the proof

41:03 strategies and so on. And there, if it doesn't, if the theorem is proven wrong, it will tell you

41:08 right away. Yeah. So that's the kind of thing that's very, very powerful to me.

41:12 Yeah. Yeah. I'll often, I'll often for something that's a little complicated, you know, I'll, I'll

41:21 have it generated. I'll look at it. Look, as Carlos said, I'll look it over, I'll run it. And then I'll

41:25 say, I'll, I'll say like, now write me a ton of tests, you know, and then like, did you test for

41:30 this? Did you test for that? Did you? And then suddenly there's like 80 tests and that helps it's,

41:34 that's a little bit, it's not, that's not a guarantee, but it helps with that closed domain kind of thing.

41:40 Right. Not a proof checker, but, but you know, it's a proxy.

41:44 Yeah. I didn't mean necessarily take us this deep down the LLM path, but it's amazing. It's

41:47 super interesting. It's easy to go there. It's easy to go there. Yeah.

41:50 It's affecting all of us so much. And I think it's going to bifurcate the, the software developer.

41:56 Yeah. World on one hand, I think folks who are like, you guys are saying, I can check it. I'll take it.

42:01 Then I'll run with it. It's only going to make it better, faster, right? You want to do extension.

42:06 You don't know Lua. No problem. It doesn't matter. Yeah.

42:08 Right. We'll make that work fine. On the other hand, I think, you know, to your wife's position,

42:14 Carlos, I think there's a really serious danger that LLMs will stunt the growth of students.

42:20 Yeah. Yeah.

42:21 You already see it happening.

42:22 It's already happening.

42:22 Yeah. It's already happening.

42:23 This is already happening.

42:24 Yeah.

42:25 Yeah.

42:25 And it's, it's a really, I think it's, it's a shame because in computer science specifically,

42:31 the, the fix is easy, right? The computer is right in front of you. You can test that, right? Like that

42:37 kind of perspective that like, you know, you have a computer in front of you that can run code and tell

42:41 you what it's doing. There's nothing stopping you from doing that. That perspective is so important,

42:45 right? That's why open source is so amazing. You can like go into the Python code and like download it and

42:50 try different things. There's no one stopping you from doing that. So for computer science, I do

42:55 think that we have an easier way out where I think this is terrifying is where folks use LLMs as a

43:01 search engine with having no idea that this thing invents the fact that I don't know, like George W. Bush

43:08 has issued a part in the history of the app, right? That kind of stuff comes up, right? People publish stuff

43:12 on the news without minimizing this. It's already here and it's not just students, it's everyone.

43:17 Yeah. Yeah.

43:18 It's a slightly broader point. And then we can get off the LLM train here, but a slightly broader

43:22 point that typically to get the most out of these things in a lot of domains, and this could be law

43:27 or medicine or programming, you really need like substantial human expertise to oversee them.

43:33 And that person probably has 10 years of experience. And so great. Okay. So those people are going to

43:40 become more and more valuable. I know how to be the human in the loop because I, you know, but how do we

43:45 develop the new, you know, the, the, the 10 years of experience? How do I ever get the 10 years of

43:49 experience if I'm constantly letting the LLM do things and I'm never learning? So that's the crisis.

43:55 I would say. It's an absolute, it's going to be a real problem in a few years. How do you,

43:59 how do you make the gap from new out of school to I have the 10 years experience? That is going to be

44:04 a huge problem. It is a huge problem. Yeah.

44:07 But do you know, William Gibson's quote about like, the future is already here. It's just not

44:10 evenly distributed. That problem already exists. It doesn't need LLMs for that, right? This problem

44:15 of checking expertise, right? To tie back to the Quarto conversation, right? Like a number of, there's

44:22 what got me started in the reproducibility crisis is this inability for us to tell whether some output

44:26 of an academic paper is correct or not. Right? Like it's really much harder to check. That is the same

44:31 problem that now absolutely everyone is going to run into on every place. Right? And where I think

44:36 Quarto has something to say about it and like this entire project of computational reproducibility and

44:41 documents that have the code that generated them is that you have the ability to check, right? So, you

44:46 know, our manuscript support that JJ alluded to earlier, right? When you create a website, right? You have

44:53 the entire website and there's a link to the side that's like, here's the code that generated it. You can

44:56 run it on binder, right? You can run it in code spaces. Yeah. And so you can check when you look

45:01 right here. That's in the upper right. Yeah. This thing right here. That's right. Yeah. So that

45:04 shows the code statically, right? So there's a separate mode in which you not only have like the

45:09 source code that you can see, but you actually have the full ipynb like available for you to run it

45:14 on Binder and like check for yourself that it's actually producing the results that you expect.

45:18 Right? So you can go from the prose that is hard to tell whether it's true to the code that actually runs.

45:24 So you can at least see what the person is using to make the claims that they did. And I think that

45:29 kind of like integrated metadata that like sort of corroborates the statements you're making is only

45:35 going to become more important as we go forward rather than less. Yeah. Yeah. I agree. That harkens

45:40 back to your reproducible science initiatives, right? That's exactly right. Yeah.

45:45 So let's talk about using Quarto. Where can I run it? So it's, it looks like it's supported at least on

45:51 the major OS's Mac, Windows, Linux. Yeah. Right. That's right. It's a, it's a command line tool

45:57 that you can install separately. You can also just pip install it. Oh, interesting. Yeah. Yeah.

46:02 I mean, a lot of people do that. And you can pip install older versions and,

46:06 you know, versions from GitHub and all that sort of thing. In terms of, I would say, you can

46:12 just use it with any editor with a command line tool. I do think it benefits from, from some tooling,

46:18 like some auto complete and some amount of just like, hey, preview this thing, show it to me.

46:23 And so there is, there is pretty good tooling across there. So there's like a Jupyter extension.

46:28 There's a VS Code extension that does quite a bit of stuff. There's a Neovim extension.

46:34 There's integrated support in RStudio. So, that's important. If you go to our, like,

46:39 if you go to our, like, get started thing, you can see, we sort of, let's see,

46:45 I think you have to scroll up. you go to get started. It'll kind of, if you scroll down,

46:51 we kind of say to you like, okay, what are, what, you know, what's your preferred tooling

46:56 environment? And then we kind of set you up in that environment with the right

46:59 extensions or other things. And so you can always use a text editor, but you know, I think the tooling

47:05 is helpful. Yeah. So one of the things I was thinking about when I was asking that,

47:08 two of the things really that I think are rough edges that you probably solve that people don't

47:13 necessarily know they're rough edges yet till they try it. One, Pandoc has a lot of dependencies,

47:19 dependencies that are not obviously dependencies. Yeah. Like for example, I was trying to get

47:25 some file out of Pandoc and it was saying this LaTeX thing is not installed on your Mac.

47:29 Yeah.

47:30 I'm like, why is this here? Why is this happening?

47:32 That's right. We definitely, yes, yes, yes. We, we have, we have definitely solved that. So on,

47:37 so it depends on the, now it depends on the scenario. So for LaTeX, we don't embed a LaTeX distribution. However, if we notice you don't have LaTeX, we tell you just

47:46 type quarto install tool, you know, it's a, what is our LaTeX distribution?

47:52 TinyTeX.

47:53 TinyTeX. Install TinyTeX. And then that's actually a TeX distribution that we maintain

47:58 that is like very small, has different form factors and isn't like five gigabytes to start

48:03 with and does auto installation of packages. So for that, we kind of say, yeah, you need,

48:07 you need LaTeX. We're going to make it super easy. So it's a one liner. You can get LaTeX.

48:11 In other cases, like, Typst is another one many people may or may not have heard of. It's a,

48:16 it's sort of a competitor to LaTeX. It's another PDF, among other things, PDF engine.

48:22 And it's a phenomenal tool. It, you know, creates incredible output. It has, to me, a much

48:30 more sort of natural and scalable, programmable interface than LaTeX does. We just embed

48:36 the Typst runtime in Quarto, 'cause it's not very, it's not, there's not much to it.

48:42 You know, it's not, it's not huge. It's not, it's not like that's cool.

48:45 If you haven't used Typst and you use PDFs, you generate PDFs as part of your day job, and you've

48:53 heard of LaTeX and you have the reaction that every single person that needed to use LaTeX is,

48:58 it's like fondness for the pain it caused and the quality of the documents it generates.

49:03 Typst-

49:04 It generates incredible, it doesn't get better.

49:06 Typst gets to what I would say is 90% of the quality that LaTeX does. There are some

49:11 micro typography tricks that they haven't quite incorporated yet, but they do a number of other

49:16 things. And it, so Typst is genuinely amazing. I could not recommend it more strongly. It's a little

49:23 weird to type: it's typst.app. It's another open source product. They have an offering that is sort of like

49:31 a collaborative editor where you do those things, but you can use it as a command line. T-Y-P-S-T.

49:36 Typst. Typist without an I. Typst. That's right. Yeah. Okay. Yeah, got it. All right. Wow.

49:44 Yeah, we can add that to the show notes. It's hard to, it's hard to phonetically tell me how

49:49 to type it and search it. But it's amazing. Once you know, it's going to hit, this is not going to

49:53 come up. It's very searchable. Okay. I'll put the show notes. It's a fantastic, fantastic tool.

49:57 Yeah. So yeah, Typst is incredible. Yeah.

49:58 Pandoc supports outputting from Markdown to Typst because of course they do. And so we can do that as

50:04 well. And it's phenomenal. It's incredibly, like I hope in five years we will not have to reach for LaTeX.

50:13 It's early, right? It's sort of existed for a couple of years, but I am genuinely, the team is extremely

50:19 impressive. It will not surprise anyone if I say it's a Rust application because again, of course it is.

50:26 But amazingly technical work. Compilation is impossibly fast. To me, the difference between

50:34 using LaTeX and Typst is the difference between using Mercurial and Git. If you ever remember the

50:39 first experience when you said Git something and you just got the result back and you thought

50:44 something went wrong, it could not possibly be this fast. That's the same impression you have from Typst.

50:49 It generates this amazing PDF in like 0.2 seconds. And you just look at your computer and like,

50:55 I cannot believe this just happened. It's really nice. So we support that.

50:59 We've made a big investment in Typst and so we bundle the runtime and we're bullish on it.

51:06 All right. I'm going to have to learn more about this. This is very exciting.

51:09 So yeah. Okay. So that was the one thing I want to talk about was the rough edges of tools like

51:15 Pandoc and others. The other I want to talk about with you all is continuous integration. So you could

51:22 set up say an automation in GitHub where somebody pushes a new version of a notebook, it automatically.

51:28 Yeah. That's right.

51:29 Yeah. So yeah, we have the integration. We've got GitHub Actions integration. What's the story?

51:34 Yeah. So if you go to, if you go in the guide here, just, I know folks listening won't see this,

51:38 but we'll just talk through, go to publishing. You can see, we sort of have this idea of, yeah,

51:44 I guess you can go to publishing basics there and it'll give you an overview of everything.

51:48 So we basically have this idea. We have a quarto publish command that can kind of publish to lots

51:54 of different destinations. So it can publish to Netlify, GitHub Pages, you know, some Posit things.

52:01 We have our own little hosting service, Hugging Face Spaces. So the idea is we know how to talk to all

52:06 those. And so you can just say quarto publish and it'll remember, oh, you published to this URL before, you

52:10 want to update it. So that's like the workhorse for publishing. And then all of that works in,

52:16 you can see at the bottom, there's a note publishing with CI and all of this can work

52:20 by basically like put your token in CI, you know, you know, export the token in the environment,

52:26 call publish. And so lots and lots of people use CI to update things with Quarto. And they may be,

52:34 again, they may be publishing to GitHub Pages, maybe Netlify, maybe Hugging Face, maybe, maybe another

52:38 service. Yeah. You want to give a shout out to Quarto Pub? Yes. Yeah. Quarto Pub is just a free

52:44 service. I don't really know much about it. What is this?

52:45 It's literally just a free service that you can use to publish static, static documents and

52:51 websites. And that really all we wanted to do here, it doesn't support all the fancy features

52:56 like authentication and custom domains and all that. And there are other services that do that.

53:01 But our goal was to have a free service that makes it dead simple to publish your thing.

53:06 You get your own, you know, your own like sub domain and you can just publish stuff. So we

53:10 wanted it to be just like, I can definitely publish my thing to the web. And so that's kind of what,

53:14 what, what it's for. And then if you want to do things like, for example, if you're using

53:18 GitHub Pages, well now I can, you know, I can like have it authenticate. So only people who have access

53:22 to the repo have access to the pages, or, you know, if I'm using Netlify, I can do all manner of

53:27 fancy things. So it's, it's designed to be simple, easy, and free, to kind of get your stuff out to

53:32 the web. So I use that. Like when I'm, if I publish, I go give a presentation and I have a slide deck and

53:36 it's written in Quarto, I just put it on Quarto Pub, you know? Yeah. Beautiful. So I suppose one of the

53:41 other publishing destinations is just a pile of HTML, CSS. That's right. Exactly. Yep. Put it

53:48 wherever. Yeah. It's just going to make, Quarto is just going to make a directory for a website. It'll

53:52 be _site, you know, for a PDF, it'll be a PDF and whatever.

53:56 Put down some Nginx or Caddy and off. Exactly. Do whatever. Yeah, that's right. That's right.

54:01 One thing I will note about CI, just before we switch subjects, is we have a

54:06 repository that I think we linked to somewhere where we have a number of examples of using GitHub

54:11 Actions. So it's github.com/quarto-dev/quarto-actions that we maintain. And we document

54:17 with a number of different use cases that folks might want to do if they need to like grab dependencies,

54:22 right? So your CI often where the sharp edges are that you're going to have to install R, you're going

54:27 to have to install Python and so on. And so we have sort of, you know, a number of actions that will

54:31 talk Quarto will render it and it will publish it. So from there, you can just take it where it is,

54:37 drop some of these into your own repository, and then you're off to the races. So this is another

54:41 place where we're trying to like reduce the sharp edges to like actually get people to see the stuff

54:46 you've built. Yeah. Yeah. Cool. I'll link to it in the show notes so people can find it and use it.

54:50 All right. We have time, maybe, if we lightning-round it a little bit. We've got time

54:56 for two. We can do it. We can do it here. So, one, this is open source under what license?

55:04 MIT. MIT, which basically means that commercially you can do whatever. So

55:10 Quarto is sort of unencumbered in that sense. That's right. Right. Yeah. That's right. Exactly

55:15 right. Yeah.

55:15 What's the business model? How does this tie back to Posit and why is this?

55:19 So Posit has a product called Posit Connect that is essentially,

55:25 I would call it, an enterprise publishing platform for what we call data products. And so

55:31 Posit Connect does a lot of things. It publishes Shiny apps. It does Streamlit apps. It does Dash

55:36 apps. It does Flask apps. It does apps. And it also does content. So it does websites or documents.

55:43 And then it adds some things like, oh, you can schedule the content. So like, you know,

55:48 run this, update this once a day, update this website once a day, or whenever this thing

55:53 updates, you know, send an email to these people. So that's the idea. So Posit Connect is

55:58 a commercial product that we sell. And so I think, from a business model standpoint,

56:03 if people are successful with Quarto, obviously we make it very easy to publish it

56:08 everywhere. We're not trying to privilege, you know, or say, oh, you know, it's a Roach Motel,

56:14 you have to publish with Posit Connect. But we're, you know, we do try to have,

56:18 yeah, we do try to have a value-added product. It focuses, I'd say, honestly,

56:23 more on the, like, internal publishing, you know, than the public publishing.

56:29 Yeah. That's cool. And just for disclosure, if people haven't listened to other episodes,

56:32 you guys have sponsored the show before for Posit Connect. And I want to say, thank you.

56:36 Okay. Okay. Yeah. Awesome. I really appreciate it. But, but this is not sponsored, right? This is just,

56:40 no project. So, okay. So that was, yeah, that was one of the ones I wanted to talk about. Just like,

56:48 how open is it? It feels a little bit like the Visual Studio Code model. Microsoft makes this.

56:53 It would be awesome if there were some great integrations to Azure and encourage people.

56:57 Oh, right, right, right, right. Yeah. I mean, right. People love it for whatever,

57:02 but if it's, if there's the right use case, then, Hey, it's a perfect tool.

57:05 Yeah. So a little bit of that. We've sort of, like, hey, we've tooled this up for internal,

57:10 you know, enterprise usage, as much as we can. You know, I'd say

57:15 we've had lots of contributions. I think our biggest external contribution was,

57:20 would you say, Carlos, the folks from Julia added the Julia engine, which is a

57:25 pretty substantial piece of work. And so we very much like to collaborate and take

57:30 pull requests and yeah.

57:32 PRs are welcome, huh?

57:33 For sure.

57:34 PRs are very welcome. Yeah.

57:36 And currently Quarto is, I think, you know, a large code base that is not super approachable.

57:42 I think that's a fair assessment that we would agree with. We are spending a lot of time in 2025,

57:47 actually figuring out ways to sort of let the community in and sort of change the ways that

57:51 people can write extensions and sort of make it easier for people to build on it. I will say that

57:55 in contrast to the way that Microsoft does, like, VS Code versus VSCodium, right? So we don't have to split

58:00 products, right? So we have just Quarto, but it is true that inside Quarto, there are sort of things that

58:06 only make sense for, so there's a publish to, like, RStudio Connect, for example, right? So

58:11 there are things that integrate with RStudio that are inside the Quarto code base that are there

58:16 because of RStudio. But that's an entirely fair trade-off that people can make, right? They don't

58:21 have to use those tools if they're, that's right.

58:23 That's right.

58:23 Not at all. Not at all.

58:24 Yeah.

58:24 Yeah.

58:25 But you want to serve your customers as best you can. That's right.

58:28 Yeah.

58:28 Yeah.

58:28 Okay. I don't know who wants to take this one, but people know Node.js.

58:33 And one of the great ironies of a lot of these data science tools is a lot of the people working

58:37 on them, for example, Jupyter themselves, it's kind of like, we'll take one for the team and we'll

58:42 write all the JavaScript to make it interactive. So you all can stick in your data science language

58:46 and not worry about it.

58:47 Yeah.

58:47 That's right.

58:47 It sounds a little bit like that here. And so you guys are using Deno, Deno Deno.

58:52 Yeah.

58:52 Yeah.

58:52 One, how do you say that? Two, like, what is it? It's like an alternative to Node.js,

58:58 right?

58:58 That's right.

58:58 What's the deal with this internal story?

59:00 Yeah. I don't know if it's pronounced Deno. I just usually pronounce Deno. But it is a,

59:06 I think Deno is right, right?

59:06 Yeah. So it has V8 inside it, just like Node does. It was actually founded by the same,

59:13 by Ryan Dahl, who is the founder of Node, and he's now CEO and founder of Deno, the company behind it.

59:18 So the main thing with Deno is sort of, it really wants to be TypeScript first. So you can run

59:25 JavaScript, but Deno has sort of like a really built-in type checker that uses sort of like,

59:30 you know, the TypeScript type checker, but it's, it's really quite built in and it has a really

59:35 advanced sort of capability system for folks that need to bundle things. You can run Deno and say,

59:41 run Deno, but do not allow it to go on the net. Right? So run Deno, but do not allow

59:45 - Oh, interesting. - it to, like, import other things. Run Deno, but allow only access to the

59:52 file system, or do not allow access to the file system, or do not allow. So they have a really

59:56 fine-grained capability system that is quite nice. And for us, so one of the things that we

01:00:03 wanted was something that sort of let us just do that, and quickly, and it's also easily

01:00:09 embeddable. So Deno is like a single binary that you get to ship and you get V8 with it. And so

01:00:14 we just bundle Deno internally. And so it's just a pleasant, like, modern alternative

01:00:20 to Node where most of the standard library and all of those are TypeScript first. And so you have the

01:00:27 really nice type annotations. The standard library is all in TypeScript and you get access to that. And so

01:00:32 that's what drew us to it. - This is cool.
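
To make the capability system a bit more concrete: Deno scripts get no network or file system access unless it is granted on the command line with flags such as `--allow-net`, `--allow-read`, and `--allow-write`. The sketch below is only an illustration in Python of how such an invocation is assembled. The helper name `deno_run_command` and the script name `main.ts` are hypothetical, and this is not how Quarto itself launches its bundled Deno.

```python
def deno_run_command(script, allow_net=False, allow_read=None, allow_write=None):
    """Build an argv list for `deno run` with explicit permission flags.

    allow_read / allow_write take lists of paths; omitted permissions
    stay denied, which is Deno's default.
    """
    cmd = ["deno", "run"]
    if allow_net:
        cmd.append("--allow-net")
    if allow_read:
        # Paths are passed as a comma-separated list after the flag.
        cmd.append("--allow-read=" + ",".join(allow_read))
    if allow_write:
        cmd.append("--allow-write=" + ",".join(allow_write))
    cmd.append(script)
    return cmd


# No flags: the script runs without network or file system access.
print(deno_run_command("main.ts"))
# Read-only access to one directory, still no network.
print(deno_run_command("main.ts", allow_read=["./data"]))
```

The resulting list could be handed to `subprocess.run` on a machine where Deno is installed.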

01:00:34 Yeah. I didn't know, I mean, I'd heard of it, but I hadn't really paid much attention.

01:00:39 Wow. It's popular a hundred thousand GitHub stars and it's less than two years old or something like that.

01:00:43 It's not bad. - Yeah.

01:00:44 Yeah. It's quite popular. They have a number of interesting things. So the company behind it

01:00:49 is trying to use this Deno Deploy, which is sort of like this, you know, serverless where you ship

01:00:55 - Edge computing. - TypeScript, edge computing. And so, you know, you can have like a very small piece

01:00:58 of TypeScript, instead of having Node, that runs on the edge. And so that's their business

01:01:02 model. For us, it's just a modern alternative to Node with all the lessons that they've learned.

01:01:08 Awesome. Right time.

01:01:08 Yeah.

01:01:09 Yeah.

01:01:09 That's cool.

01:01:10 All right, guys, I think we're out of time here as much as I want to talk more about it.

01:01:14 Absolutely.

01:01:14 But let's leave it with a little bit of a call to action. People are interested in Quarto.

01:01:20 Yep.

01:01:20 They want to get started. What do you all tell them?

01:01:22 I would go to the quarto.org website, and literally, there's a Get Started

01:01:26 button, and it basically will let you pick your tool, whether VS Code or Jupyter, or just

01:01:31 Neovim. And then it'll take you through a tutorial that's step-by-step that shows you the basic

01:01:37 mechanics of the tool. And then it shows you how it works with computations, how you can embed

01:01:43 code and run the code and control how the code is run and all that. So I think if you go through that

01:01:48 tutorial, it'll probably take you, you know, a half an hour or something like that. You'll

01:01:53 be up and running, you'll be in your favorite tool, and you'll have a very

01:01:57 strong idea of what the thing can do and how it's useful. And then from there, I'd kind of just go

01:02:01 back to the homepage and sort of browse through and say, okay, what can I do with this? So I can make

01:02:05 dashboards, I can make presentations, I can make reports, I can make websites. And I think it'll be

01:02:10 easier to connect concretely with those things once you've gone through the tutorial.
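
For a feel of what "embed code and control how the code is run" looks like before opening the tutorial, here is a small sketch that writes a minimal `.qmd` document from Python and, if the `quarto` command-line tool happens to be installed, renders it. The helper names `minimal_qmd` and `render`, the document title, and the file name `hello.qmd` are made up for illustration; the `#| echo: false` cell option hides the source code in the output while still showing its result.

```python
import shutil
import subprocess
from pathlib import Path

# Built at runtime so the example nests cleanly inside a fenced block.
FENCE = "`" * 3


def minimal_qmd(title="Hello Quarto"):
    """Return the text of a tiny Quarto document with one Python cell."""
    lines = [
        "---",
        f'title: "{title}"',
        "format: html",
        "jupyter: python3",
        "---",
        "",
        "Some narrative text, followed by a computation:",
        "",
        FENCE + "{python}",
        "#| echo: false",
        "print(1 + 1)",
        FENCE,
        "",
    ]
    return "\n".join(lines)


def render(path):
    """Run `quarto render` on path if the CLI is available; return True if it ran."""
    if shutil.which("quarto") is None:
        return False
    subprocess.run(["quarto", "render", str(path)], check=True)
    return True


def write_and_render(path="hello.qmd"):
    """Write the sample document and try to render it."""
    doc = Path(path)
    doc.write_text(minimal_qmd(), encoding="utf-8")
    return render(doc)
```

Calling `write_and_render()` in an empty directory would leave `hello.qmd` next to the rendered output, assuming Quarto and a Jupyter Python kernel are installed.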

01:02:14 The other thing I want to point out is we have a pretty active discussion forum on GitHub. We don't

01:02:20 have a Discord. We'll probably fix that in 2025. But we do keep a very close eye on both issues

01:02:27 and discussions, and, you know, we try very hard to be responsive to everyone.

01:02:32 So, you know, if people want to share their work or ask questions, sort of take, you know, the

01:02:38 Quarto documentation and then figure out, like, you know, okay, what are the things? How

01:02:41 do I put them together? And they have questions. We keep a really, really close eye on that.

01:02:45 And we try to be very responsive. So I would encourage people to just hop on GitHub Discussions,

01:02:50 ask questions, and we'll be there to help you.

01:02:52 Yeah, excellent. It's nice to have discussions because a lot of times people will come and

01:02:57 if there's no discussions, they'll file an issue, which is really a question. And you're like,

01:03:00 well, when do I close this? Because I kind of answered it, but you don't necessarily know you

01:03:05 answered it and they won't close it because it's there anyway.

01:03:07 That's right.

01:03:08 Yeah. So awesome. Well, congratulations, both of you on such a cool project. And I know it's

01:03:13 doing good things for a lot of people.

01:03:15 So I want to quickly shout out just the folks. And so, you know, there's a large group of

01:03:20 folks inside Posit that help, like, you know, besides JJ and Charles, who contributed

01:03:25 early: Christophe Dervieux, Gordon Woodhull, Charlotte Wickham, Mine, and Mickaël, who is a huge help on our issues

01:03:31 as well. So it takes, it takes a village and I just want to shout out those folks too.

01:03:36 Yeah.

01:03:36 Yes.

01:03:37 Cool. That's excellent. All right. Bye guys. Thanks for being here.

01:03:40 Thank you. Take care.

01:03:41 Thank you.

01:03:41 This has been another episode of Talk Python to Me. Thank you to our sponsors. Be sure to check

01:03:48 out what they're offering. It really helps support the show. Want to level up your Python? We have

01:03:52 one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true

01:03:57 beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription

01:04:03 in sight. Check it out for yourself at training.talkpython.fm. Be sure to subscribe to the show,

01:04:08 open your favorite podcast app and search for Python. We should be right at the top. You can

01:04:13 also find the iTunes feed at /itunes, the Google Play feed at /play and the direct RSS feed at

01:04:19 /rss on talkpython.fm. We're live streaming most of our recordings these days. If you want to be

01:04:25 part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel

01:04:30 at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much for listening. I really

01:04:36 appreciate it. Now get out there and write some Python code.
