#492: Great Tables Transcript

Recorded on Thursday, Dec 19, 2024.

00:00 Join me as I chat with Rich Iannone and Michael Chow from Posit, where we explore the transformative

00:05 power of data tables with the Great Tables Library.

00:08 We'll cover practical applications of Great Tables, showcasing how thoughtful design and

00:13 advanced formatting can elevate your data presentations.

00:16 And you'll learn about innovative features like nanoplots and interactive elements, as

00:21 well as the importance of structure, format, and style in crafting tables that inspire.

00:26 Whether you're a seasoned data scientist or just starting out, this episode is packed

00:30 with valuable tips and inspiring examples to enhance your data storytelling.

00:34 This is Talk Python to Me, recorded December 19th, 2024.

00:38 Are you ready for your host, David?

00:41 You're listening to Michael Kennedy on Talk Python to Me.

00:44 Live from Portland, Oregon, and this segment was made with Python.

00:48 Welcome to Talk Python to Me, a weekly podcast on Python.

00:54 This is your host, Michael Kennedy.

00:56 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython,

01:01 both accounts over at fosstodon.org.

01:05 And keep up with the show and listen to over nine years of episodes at talkpython.fm.

01:09 If you want to be part of our live episodes, you can find the live streams over on YouTube.

01:14 Subscribe to our YouTube channel over at talkpython.fm/youtube and get notified about upcoming

01:20 shows.

01:20 This episode is sponsored by Posit Connect from the makers of Shiny.

01:24 Publish, share, and deploy all of your data projects that you're creating using Python.

01:29 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards, and APIs.

01:35 Posit Connect supports all of them.

01:38 Try Posit Connect for free by going to talkpython.fm/posit.

01:42 P-O-S-I-T.

01:43 And it's also brought to you by us over at Talk Python Training.

01:48 Did you know that we have over 250 hours of Python courses?

01:52 Yeah, that's right.

01:54 Check them out at talkpython.fm/courses.

01:56 Hey, everyone.

01:58 Two quick announcements before we dive into Great Tables.

02:00 First, thanks for all your patience.

02:01 It's been a few weeks since my last episode release.

02:04 What I thought was going to be a brief winter break, pause, turned out to be longer.

02:08 I had to slow my work down recently to help my dad with some health issues.

02:12 We're getting that sorted and things are back on track with regard to the episodes.

02:15 I have six already recorded and in the queue.

02:18 Now, for some awesome news.

02:20 I have a new feature I know you're going to love for the podcast episodes.

02:24 I named it Deep Dives.

02:25 I sent an email out about this early January.

02:28 And if you're not part of the newsletter, you should be.

02:30 The idea is to add way more reference material and backstory information to each episode page

02:36 so that when you come back, you can skim through the content, pulling out important quotes,

02:41 important topics, and extra detail.

02:43 For example, on the episode I did with Carson Gross about HTMX, the deep dive starts out

02:48 with a section called what to know if you're new to Python to help you get the most out of

02:53 that specific episode.

02:54 Then it features incremental adoption and complexity budgets, HTML fragments and partial updates,

03:00 minimal JavaScript for React-like UIs, and even interesting quotes such as,

03:06 your application has a complexity budget.

03:08 Spend it efficiently by Carson.

03:11 These do not appear in your podcast player, but if you visit any episode page for episodes

03:15 that are published after October 2020, you'll find a deep dive for that show.

03:20 Now, this does not replace the links section, which still exists on the episode page and in

03:25 your podcast players.

03:26 I think it adds a ton of value.

03:28 I actually did a long write-up on this over at the Talk Python blog at talkpython.fm/blog.

03:34 Creative, I know.

03:35 So please give it a read if you want to know more.

03:37 And there's an Easter egg there as well.

03:39 There's an awesome NotebookLM AI mind-blowing audio track I put in the middle that you should

03:46 check out if you're interested in that kind of thing.

03:48 It's really, really interesting.

03:50 Well done.

03:51 And we live in crazy times.

03:52 Let's just leave it there.

03:53 Check it out on the blog.

03:54 So with that, thank you.

03:56 Thank you for all the support.

03:57 And let's dive into Great Tables.

03:59 It's an awesome episode.

04:00 Rich, Michael, welcome to Talk Python to Me.

04:04 Great to have you two here.

04:05 Hey, thanks for having us.

04:06 Great to be here.

04:07 It's a pretty interesting topic, these tables.

04:10 You know, when I first heard about it, I was like, I was like, I, you know, it sounds

04:14 interesting, but how are we going to talk about tables on the podcast?

04:17 And you know, how, how much is it really to talk about with tables?

04:20 Turns out there's a lot.

04:22 And we're going to find a way to talk about it on the podcast.

04:24 So it's, it's a pretty deep topic.

04:26 There's so much.

04:27 You don't need two hours.

04:28 And honestly, like what you said just now is like story of my life for the past year of

04:33 like being like, what, what could tables do?

04:36 Like, yeah.

04:37 And then just being freaked out.

04:38 I mean, df.head(), doesn't it

04:39 just solve it?

04:40 Come on.

04:40 Right.

04:41 We already got that.

04:41 We already live.

04:42 It's already so perfect.

04:44 How could we improve on that?

04:45 But yeah, I'm really excited.

04:47 Cause this is, this is me kind of like fanning over Rich, but like, I think it's like the

04:51 perfect way to frame it.

04:52 Like what's up with tables.

04:53 I feel like having pulled that thread for the last year, I'm always excited to talk about

04:58 these things here because like, I can't even recapture how freaked out I am by

05:03 whatever I just experienced for the past year with tables that we're just chipping away at

05:08 is really exciting.

05:09 Tables.

05:12 So you know what I would say probably my impression, which I will grant you is a pretty

05:17 fresh take on this.

05:18 It's a little bit like, oh, well, we already got Excel.

05:20 You can draw graphs with Excel or Altair, Plotly, those types of, you know, it's like, oh, wait,

05:26 there's, there's actually a lot more we could do.

05:28 That's pretty awesome.

05:28 Yeah.

05:29 We have a language too.

05:30 I feel like having discovered there's a whole vocabulary for tables, I feel like was really

05:35 enlightening.

05:37 Yeah.

05:37 I would think so.

05:38 There's, there's almost like design patterns of tables or to get into that and like structure

05:43 and some history and many things.

05:45 But as per usual, before we jump into that, let's get a little bit of background on you

05:50 guys and, you know, maybe quick introduction on the two of you, how you got motivated to

05:54 get into work with Great Tables and that sort of thing.

05:56 Yeah.

05:57 Rich, you want to?

05:58 Yeah.

05:58 So I'm Rich.

05:59 How do you get into tables?

06:01 I mean, I had to make tons of them.

06:03 So that was sort of a necessity.

06:06 So the problem was it was annoying to do, like really annoying. Doing it in Word was

06:11 repetitive, you know, not reproducible, which kind of annoyed me.

06:14 I had to redo tables all the time because we had to use a tool that was outside of like

06:19 programming.

06:20 Right.

06:20 That was like a major problem.

06:22 Actually had to retranscribe them.

06:24 There was many of them.

06:25 It was kind of blah.

06:26 So.

06:27 Right.

06:27 I swear.

06:27 I swear.

06:28 Like kind of like I opened with you could, you could automate the graphs, but you couldn't

06:34 automate the tables.

06:34 Right.

06:35 Exactly.

06:36 Yes.

06:36 It was like the one last thing.

06:38 So that was like where I came from with tables.

06:40 And years later, I finally got to it.

06:43 I mean, better late than never.

06:44 Yeah.

06:45 Yeah.

06:45 I came in mostly to support Rich.

06:48 So a lot of what I've done is work as, like, a data scientist in R and Python.

06:54 And I've, I've always kind of had a split and kind of split brain.

06:58 So doing a lot of data analysis in R and more engineering tasks in Python.

07:03 But I think as, as I noticed, Rich was kind of bringing this table idea to Python, I talked

07:10 with him a bit and then I peeked through the code he was working on.

07:13 And I was sort of really intrigued by how much was there.

07:16 Like, I would say the first thing that really kind of surprised me is just how much code he

07:23 had was just more than I kind of thought tables would need.

07:27 And so this is more than one file.

07:30 I mean, come on.

07:31 Don't we just put out a header and put out the rows?

07:33 It was kind of like, there's a lot going on here for people who are not familiar with this.

07:36 Yeah.

07:37 Right.

07:37 I surprised myself too with how much I wrote.

07:40 I mean, it just kept snowballing the amount of things that had to be done.

07:43 Yeah.

07:44 I'm sure.

07:45 And we did, we both came from academics.

07:48 So we both did PhDs and I think we both produced like a decent number of tables as we did that.

07:55 But yeah, never, never quite dove as deep into them or as intricate.

08:01 Yeah.

08:01 And probably most of the time, you know, you're knocking out something in Excel or LaTeX or

08:05 something like that.

08:06 Yeah.

08:07 Once it's done, it's kind of done.

08:08 It's not, not live in the sense like every day I want a new popular view of the status or,

08:14 you know, whatever, like the dashboard sort of deal.

08:16 What were your PhDs in?

08:17 I did chemistry, of all things.

08:19 Chemistry.

08:20 I love chemistry.

08:21 Yeah.

08:21 A lot of tables.

08:22 Again, a chemistry was one of my favorite courses I ever took.

08:24 I love the two.

08:25 Yeah.

08:25 Organic was great.

08:26 Yeah.

08:26 Beautiful.

08:27 Yeah.

08:27 And I know, I know very little about chemistry.

08:30 I did cognitive psychology.

08:32 So.

08:32 Okay.

08:33 Tables in the mind or something.

08:35 I don't know.

08:35 Something like that.

08:37 Yeah, absolutely.

08:38 That's cool.

08:39 So you must have had a lot of experience with trying to present your research and your thesis

08:45 and stuff through, through tables, right?

08:47 Yeah.

08:47 Big time.

08:48 I think I counted something like 100 tables, including the raw data section.

08:52 So yeah, no shortage of tables there.

08:54 And a lot of it came up in presentations and stuff.

08:57 Had to remake them.

08:58 Again, no, no good tool.

08:59 So I was like, I felt like, I felt sad, but I had no time to like pursue a tool.

09:04 So I just kept going.

09:06 Yeah.

09:06 I will say, I kind of picked up too, as I started working with Rich, that he's a supreme

09:11 dabbler.

09:12 Like he has collected tables from across the world in a lot of different domains.

09:18 So that's, that's one thing that caught me too, was like, I don't doubt he produced a

09:22 lot of chemistry tables, but I was amazed that I felt like he was like a guy in an alleyway

09:27 trying to like sell me tables.

09:29 Like, oh, you like a sports table?

09:32 He's like, oh, you like basketball?

09:34 You know, you like.

09:35 Yeah.

09:36 Yeah.

09:36 The first, the first bracket's free and then.

09:39 Yeah.

09:39 Yeah.

09:40 Yeah.

09:40 That's how it gets your hooks.

09:42 Yeah.

09:42 It kind of is like that.

09:43 Yeah.

09:44 Try to make tables possible for everybody, you know?

09:46 Yeah.

09:47 So how do we kick this off?

09:48 We got to, I just want to say to all the listeners who are not watching the YouTube live stream,

09:53 but are just listening, like most listeners to the show do.

09:56 We're going to do our best to describe some of these things.

09:59 Some of them are visual, but I think we can pretty much capture the essence of them.

10:03 I will try to set up chapters and put some chapter, chapter art.

10:09 So keep an eye on your podcast player or your CarPlay view of now playing or whatever it is

10:14 that's playing the thing.

10:15 It might show, show some pictures.

10:17 We'll see what we can do with that.

10:19 But let's start off the conversation here, not on great tables exactly, but maybe just

10:24 some examples of some good, good tables to give people a sense of just, you know, what

10:29 more could you possibly do than just, you know, data frame dot header, something to that effect.

10:34 You know what I mean?

10:35 So I think it was Rich who sent this over.

10:38 Yeah.

10:39 What do you guys said?

10:39 A visualization gallery.

10:41 Yes.

10:42 From Andrew Weatherman.

10:43 And includes tables.

10:44 It has got some awesome tables.

10:45 The first thing is a table.

10:46 Yeah.

10:46 And there are 10 tutorials here.

10:50 And there are a lot of them are based on code.

10:53 A lot of them from R and stuff like that.

10:55 But which one should, which one do you think we should talk about?

10:57 Just as an introductory example, maybe.

10:59 Pretty good one is the first one.

11:01 The very first.

11:02 Like it leads with a table at the very top of the page.

11:04 It's called.

11:05 So there's conference realignment travel.

11:08 Yeah.

11:09 And right away.

11:09 So if you, yeah, if we open this table up, there is, there's pictures.

11:13 Like, so this talks about like, what is this?

11:16 A basketball college, American basketball.

11:18 Yeah.

11:19 So you've got the, the logo or the icon, you know, like the Stanford tree or the Miami, Florida

11:26 U, which is like the school colors.

11:30 And you've got little chiclets of the information.

11:34 It's, those are also embedded in sort of a heat map.

11:37 There's a lot more that there's a lot of pieces of this conveying information all at the same

11:42 time.

11:42 You want to maybe just riff on that a little bit.

11:45 Yeah.

11:45 I don't mind trying to tackle this table.

11:47 I'm just taking it in.

11:48 I think, I think it's really nice.

11:50 So this table, yeah.

11:51 Basketball realignment travel in the ACC.

11:54 So they're talking about, I guess, how hard would it be for these teams to drive to games?

12:00 And so in the table, every row is a team.

12:03 And they use like the logos next to the team name.

12:06 So it's really fast to look up.

12:08 And then in the columns are things about like, yeah, visiting arena to home arena.

12:13 So like how, how many miles would it take for this team to travel to the arenas they're going

12:19 to?

12:20 And I think that what's nice about this table is, for example, there are multiple measures

12:26 of how many miles it would take a given team to go to their visiting arenas.

12:32 So for example, on the first row, we have California.

12:35 And in columns, we have like median miles for one column and average miles for another column.

12:43 But I think what's nice about just kind of basic table structure is because these two columns,

12:48 median miles and average miles are both kind of about a similar thing.

12:52 They're able to put a label over it called a column spanner, and they label it visiting

12:57 arena to home arena.

12:59 So they're able to do a little bit of grouping of information.

13:02 So you can see like these two measures, these two columns are related to each other.
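
A minimal sketch of the column spanner idea described above, assuming the great_tables and pandas packages are installed. The mileage numbers are invented placeholders, not the values from the actual table, and the column names (median_miles, average_miles) and the build_travel_table helper are hypothetical names chosen for illustration.

```python
# Placeholder rows: the mileages are invented for illustration,
# not the values shown in the table discussed in the episode.
TRAVEL_ROWS = [
    {"team": "California", "median_miles": 2400, "average_miles": 2300},
    {"team": "Stanford", "median_miles": 2350, "average_miles": 2250},
    {"team": "Duke", "median_miles": 400, "average_miles": 600},
]

# The spanner is one label that sits above two related columns.
SPANNER_LABEL = "Visiting arena to home arena"
SPANNER_COLUMNS = ["median_miles", "average_miles"]


def build_travel_table(rows=TRAVEL_ROWS):
    """Build a display table with one spanner over the two mileage columns."""
    # Imported here so the data above can be inspected without the libraries.
    import pandas as pd
    from great_tables import GT

    return (
        GT(pd.DataFrame(rows))
        # Group the two related measures under a single label.
        .tab_spanner(label=SPANNER_LABEL, columns=SPANNER_COLUMNS)
        # Replace the raw column names with reader-friendly labels.
        .cols_label(
            team="Team",
            median_miles="Median miles",
            average_miles="Average miles",
        )
    )
```

Calling build_travel_table() in a notebook should display the table, and build_travel_table().as_raw_html() should return it as an HTML string.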

13:07 And then I like...

13:08 Yeah.

13:08 Michael, that's a super interesting point that I think is very well represented in Great Tables.

13:13 And that is almost multi-dimensional data.

13:17 Yeah.

13:17 Right.

13:18 And so, like you say, it's got three columns here, median miles, average miles, and then

13:22 this other geographic center column.

13:25 But above those, there's like groupings.

13:27 Like these two columns speak to how far is the visiting arena from home versus the other one

13:34 is just sort of an extra description, right?

13:37 So you're able to say, here's this axis of information, but you can really group them together

13:42 and think of them as something else it's representing almost or something combined, like a projection.

13:47 I don't know, something like this.

13:48 Yeah, totally.

13:49 And I think the interesting thing too is like in databases or in data analysis, this kind

13:54 of structure is there, but it's not visual oftentimes.

13:57 Like someone might have a table and they might have two columns, one for like median miles,

14:03 average miles, but they might actually like, oftentimes people in the name will put the grouping.

14:09 So like if those two columns are related, they might say like visiting, you know, prefix it

14:14 with like visiting underscore.

14:16 And so it's interesting, like tables, raw data tables often have these hierarchies, but

14:21 this kind of lets you pull it out visually.

14:23 So like the human eye can quickly kind of group things.
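
The prefix convention mentioned above can be turned into visual groups mechanically. This is a rough sketch in plain Python, assuming column names use an underscore after a shared prefix; the function name spanners_from_prefixes and the example column names are hypothetical.

```python
def spanners_from_prefixes(column_names, sep="_"):
    """Map each prefix shared by two or more columns to those columns."""
    groups = {}
    for name in column_names:
        prefix, found, _rest = name.partition(sep)
        if found:  # skip names that have no separator at all
            groups.setdefault(prefix, []).append(name)
    # A prefix used by a single column is not a meaningful group.
    return {prefix: cols for prefix, cols in groups.items() if len(cols) > 1}


# Hypothetical raw column names following the "visiting_" convention.
columns = [
    "team",
    "visiting_median_miles",
    "visiting_average_miles",
    "geographic_center",
]
print(spanners_from_prefixes(columns))
```

Each key in the result is a candidate spanner label, and its value is the list of columns that would sit under it.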

14:27 And then to your point about multi-dimensional, that is kind of like one of the big benefits

14:32 of a table is that this, for example, this table has three columns related to miles and

14:39 a plot.

14:39 You could show that in a plot.

14:41 You could plot two columns against each other with a scatter plot and maybe use the third

14:46 column another way through like color or size.

14:49 But at that point, you've kind of maxed out the plot.

14:52 There's not a lot more you can do.

14:53 A table, you can just keep adding columns and they don't even have to be numeric or like

14:59 kind of on the same scale.

15:01 You could add like all kinds of information in columns and keep going.

15:05 And I think to your point of it being multi-dimensional, that's one kind of nice thing tables support is

15:10 many measures on the columns.

15:13 Yeah, absolutely.

15:14 This portion of Talk Python to Me is brought to you by Posit, the makers of Shiny, formerly

15:19 RStudio and especially Shiny for Python.

15:23 Let me ask you a question.

15:25 Are you building awesome things?

15:26 Of course you are.

15:27 You're a developer or data scientist.

15:29 That's what we do.

15:30 And you should check out Posit Connect.

15:32 Posit Connect is a way for you to publish, share and deploy all the data products that you're

15:38 building using Python.

15:39 People ask me the same question all the time.

15:42 Michael, I have some cool data science project or notebook that I built.

15:45 How do I share it with my users, stakeholders, teammates?

15:49 Do I need to learn FastAPI or Flask or maybe Vue or React.js?

15:53 Hold on now.

15:55 Those are cool technologies and I'm sure you'd benefit from them, but maybe stay focused on

15:59 the data project.

15:59 Let Posit Connect handle that side of things.

16:02 With Posit Connect, you can rapidly and securely deploy the things you build in Python.

16:07 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards,

16:12 and APIs.

16:13 Posit Connect supports all of them.

16:15 And Posit Connect comes with all the bells and whistles to satisfy IT and other enterprise

16:21 requirements.

16:21 Make deployment the easiest step in your workflow with Posit Connect.

16:26 For a limited time, you can try Posit Connect for free for three months by going to

16:31 talkpython.fm/posit.

16:32 That's talkpython.fm/P-O-S-I-T.

16:36 The link is in your podcast player show notes.

16:37 Thank you to the team at Posit for supporting Talk Python.

16:42 Out of the audience.

16:43 Carol points out that the ACC used to be mostly Southern based, but now they span the whole

16:48 U.S.

16:48 But what you can see from this chart is all, or this table rather, that all the Southeast

16:54 based universities have a much easier time.

16:59 The color of their little chiclets of distance are all very light colored.

17:05 Whereas if you look at all the California West Coast, it's like distance across the U.S.

17:10 is your average distance you travel to each game.

17:13 And it's pretty wild.

17:15 You could probably put in things like, what is the team's typical winning outcome based by distance?

17:24 If it has to play a team that is twice as far away, are the players more likely to show up tired

17:31 and not quite be able to play as good?

17:34 And then that might somehow affect the rankings just based where you physically are located.

17:39 Yeah.

17:39 That's what makes this a good table because it basically tries to get that slice of the

17:43 data, right?

17:44 That one view of the data and present it so nicely.

17:47 And then it causes you to think those things, right?

17:49 To explore further.

17:50 Yeah, exactly.

17:51 I don't know what I would have been able to do before.

17:53 Yeah.

17:54 Yeah.

17:54 But looking at the colors, you're like, oh, California, as in Berkeley and Stanford and those

18:00 types of schools, they're much darker versus Louisville, which is not very much, or Duke

18:08 or all these other teams.

18:09 And then you start to realize, oh, maybe distance does play an important role.

18:13 It just jumps right out, even though this is not a thing that I think about or care very

18:18 much about on a lot of levels.

18:19 But, you know.

18:20 Yeah.

18:20 And even to Michael's point, that last column is like a total qualitative measure or not

18:26 qualitative, but it's basically just like a label rather than like a number, essentially.

18:30 So it's much easier.

18:32 In a plot, I don't even know how you would do that.

18:34 You just, I guess.

18:35 But here you can.

18:36 Just to describe the last column is, you're talking about like furthest.

18:40 So the last column shows what's the furthest team from.

18:44 So each row is a team.

18:45 The last column is essentially what's the furthest team from you.

18:48 And they're able to just put a logo in to get a quick kind of snapshot.

18:52 And you can play on that extra column.

18:55 Basically, if you were to play an away game at that team, that's the worst travel for you.

19:00 But at least measured by distance.

19:01 It's not like by time.

19:03 And like if you're taking a flight, you could have three connections to a small town versus

19:07 a direct one that's far away.

19:08 Right.

19:08 Like there's different metrics you might apply to what defines furthest.

19:12 But that's a neat idea.

19:13 Yeah.

19:14 It's interesting though.

19:14 Right.

19:14 Because they didn't bother to like rewrite the name of that team.

19:17 They just economize the space and just like threw it in in case you're interested.

19:21 It's like.

19:22 Yeah.

19:22 And the value for that column is not the name of the school.

19:25 It's the school logo.

19:27 Yeah.

19:28 And I will say, I can't believe that the year is 2024 and we just had a five minute

19:34 discussion on the design of a single, I didn't imagine myself being here a year ago, but it's

19:41 really.

19:41 This is the problem.

19:42 I'm telling about it to me.

19:43 Yeah.

19:45 Amazing.

19:45 I feel like Edward Tufte must come into existence a lot for you guys.

19:52 But it kind of goes to show tables can be cool, right?

19:54 Like we're actually talking about this table and it's interesting.

19:58 And it's good to look at.

19:58 And I'm like, I don't know.

20:00 Like this is like a leisure thing.

20:02 I'm not compelled to look at this.

20:04 Yeah.

20:04 I'm looking at this for fun.

20:05 And like, it actually is okay.

20:06 I enjoy looking at this table.

20:09 Yeah.

20:09 Yeah.

20:10 I'll pull one more down.

20:11 I have no idea what I'm going to be getting here, but let's do Spotify listening.

20:15 How long do I listen to my top artists before skipping?

20:18 Yeah.

20:18 I think to your Tufte point too, this is a good example that you can have, you can have a

20:23 little bar chart inside your table.

20:27 So like this, this table, how long do I listen to my top artists before skipping?

20:31 Each row is an artist with some information about them.

20:37 And then the big thing it sounds like is this average stream duration column, which has a

20:42 bar of how long people stream them on average, maybe before skipping.

20:47 Yeah.

20:47 I love your point about Tufte.

20:48 Like this idea that you can encode a small graph, like visual, like a bar is very, I think

20:55 Tufte, like with sparklines and things like that.

20:57 Tables can have a really compact representation that bring in some of the value of charts.

21:04 But also it's like, it's like three quarters table, one quarter chart.

21:09 And that's kind of common.

21:10 For those out there who are not steeped in Edward Tufte sort of lore.

21:15 He's a guy who thinks very deeply about data presentation, not necessarily tables or graphs or

21:21 whatever, but just, I have a bunch of kind of data stats based information.

21:25 And how could I possibly present this in the way that conveys the most information easiest?

21:30 So on.

21:31 So yeah, he's got, he's got some great books, some great presentations out there.

21:35 That's why I brought it up.

21:36 Cause it's his whole career is about how do you present data basically.

21:40 Right.

21:40 And if you guys should add to that, what do you think?

21:42 I think there is a, I don't have it on hand.

21:45 There is a quote, I think from Tufte, I don't want to butcher about, there is some analysis

21:50 he's done of tables of like, like your entry architecture or something, which is essentially

21:54 like the structure of the table where you enter it as a user.

21:58 And then like how you read usually determines your like orientation.

22:02 So like, if you're, if you're reading left to right, there's usually like the table structure

22:06 has like an entry point.

22:07 And then if you're reading left to right, it kind of determines how you're going to go through.

22:12 He's done some thinking about tables as sort of architectures, I think, but I haven't, I haven't

22:19 gone too deep into it.

22:20 Me neither.

22:21 But I guess that informs like things like sorting that we saw in other tables.

22:24 Like there's a dominant sort, the most important things are at the top, maybe least important,

22:28 but there's, there's definitely some sort of like, you know, like conscious sorting.

22:31 And like, it's obvious what it is and it's just to sort of serve a purpose.

22:36 Yeah.

22:37 There's probably an art to finding the thing that attracts your attention first.

22:41 And so then as you progress through looking at it, you can do so in a deterministic, purposeful

22:46 way rather than just, you know, you land anywhere and you start looking around visually.

22:50 Yeah.

22:51 Yeah.

22:51 Yeah.

22:52 It's wild because like it is such a, such a creative act making a table.

22:56 It seems weird to say it.

22:57 It's not just results.

22:58 You just slap down.

22:59 You have to envision yourself in the, the viewer's shoes and like how to, how they would

23:03 ingest or like digest information in the easiest manner possible with all sorts of like

23:09 little affordances, like the way you format things, space in between columns, all sorts

23:13 of wild, little tiny details.

23:15 They all really combine to like make something really good or to make the table shine.

23:20 I guess you can say.

23:21 Yeah.

23:21 Yeah, absolutely.

23:22 I want to come back to this one about the, the listening.

23:25 and I'll, I'll link to all these in the show notes, of course, here we've got the average

23:30 stream duration and it's represented by little bars in the table itself, which we've already

23:35 talked about, which is awesome.

23:36 And the values are things like 87%, 80%, 84%, whatever that little section is fraught with

23:43 possible misrepresentation.

23:45 So if that was just a number and it was average seconds, well, if you listen to say the Scorpions

23:52 or some of the older rock bands or Iron Maiden, I don't know where they have songs that are

23:58 like seven or eight minutes long versus more modern songs that are, it may be two and a half

24:03 minutes long.

24:04 And it said, Hey, look, I only listened to these like two and a half minutes, but I listened

24:08 to the Scorpions for three minutes.

24:09 That must be the winner, right?

24:10 Like, no, you're not even listening close to the same amount.

24:13 So putting numbers even there, even before you would normalize them, right?

24:17 Just put them as a percent of, of that, as like the total average of the song or something.
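
The normalization point above can be worked through in a few lines. This is a small sketch in plain Python with invented numbers: the artist labels, durations, and the helper names percent_listened and text_bar are all hypothetical, and the text bar only imitates the in-cell bars described in the table.

```python
def percent_listened(avg_stream_seconds, avg_track_seconds):
    """Average stream duration as a percentage of average track length."""
    if avg_track_seconds <= 0:
        raise ValueError("average track length must be positive")
    return 100.0 * avg_stream_seconds / avg_track_seconds


def text_bar(percent, width=10):
    """Render a percentage as a fixed-width text bar, clamped to 0..100."""
    clamped = min(max(percent, 0.0), 100.0)
    filled = round(width * clamped / 100.0)
    return "#" * filled + "." * (width - filled)


# Invented figures: (label, average seconds streamed, average track seconds).
ARTISTS = [
    ("Long-song band", 180, 450),
    ("Short-song artist", 135, 150),
]

for name, streamed, track in ARTISTS:
    pct = percent_listened(streamed, track)
    print(f"{name:<18} {streamed:>4}s  {pct:5.1f}%  {text_bar(pct)}")
```

In raw seconds the long-song band looks like the winner (180 versus 135), but as a share of track length it is 40% against 90%, which is the comparison the percentage column makes visible.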

24:21 There's a lot to think about with these things and that can kind of go away with these graphs,

24:26 right?

24:26 You don't have to worry about that.

24:27 You're like, Oh, you don't have to worry about how long is their average song.

24:30 You can see, you know, just how long did I listen to their stuff?

24:34 That's kind of cool.

24:34 Yeah.

24:35 It's really cool.

24:35 And it's a great use of space too, right?

24:37 Like horizontal bar plot.

24:38 I mean, these are like rows, which are not that high, but you still fit a plot in.

24:43 We don't have that many choices, but horizontal bar plot.

24:46 Yeah.

24:46 It could be just a couple of pixels high and you can still see, see that kind of information

24:49 super well.

24:50 Exactly.

24:50 As long as you use enough color or whatever.

24:52 All right.

24:52 Well, let's talk about Great Tables.

24:55 So I feel like we probably have set the stage well enough for, for people to go, okay,

25:01 maybe there's, maybe there's something here rather than just, you know, header rows and columns.

25:06 So tell us about Great Tables.

25:08 When did it come into existence?

25:09 Why did you guys build it?

25:11 What is it?

25:11 Yeah.

25:12 Yeah, sure.

25:13 So I actually started from an R program, or package, a long time ago, back in 2018.

25:18 And like, there was like essentially like a list of projects that were sort of like,

25:23 not being handled at, then RStudio, now Posit, and tables was there.

25:28 It was like, Oh, this seems like such an obvious one.

25:29 I will take that.

25:30 I will work on that.

25:31 And I have an interest in that.

25:33 I'm going to do it.

25:34 Nobody else did it.

25:35 It was on that list for years.

25:36 And it kind of begs the question, like why?

25:38 Because tables are such omnipresent, you know, way of presenting data.

25:43 I see it like in, I don't know, like journal articles, like half the things are tables, half

25:47 things are plots.

25:48 and they're, you know, everywhere else.

25:50 So like, yeah, there should be this thing.

25:52 So I decided to just get going on that.

25:55 And, as Michael said, like, it's a lot of code and like, I didn't envision

26:00 it would take that much code.

26:01 I thought it'd be rather simple working with HTML table and then, you know, expanding towards

26:05 other types of like table outputs, like a PDF, you know, word and stuff like that.

26:10 but I thought, you know, those would be bite-sized little things as it turns out tables

26:14 run pretty deep.

26:15 and I really only found that out when looking at, I was doing some like literature review

26:20 and I found that 1949 census manual, it's like the review of tabular presentation.

26:25 Yeah.

26:26 It's a great PDF that lives on the internet.

26:28 It's always there.

26:28 Thanks.

26:29 Whoever put that up.

26:30 What year is this from?

26:30 1949.

26:31 49.

26:31 Yeah.

26:32 It's, it's like a janky scanned in thing.

26:36 Yeah.

26:36 But that's only a line, but it's, you know, this is like hundreds of pages of table stuff.

26:43 It's both a style guide for the census at the time.

26:46 And also just like an amazing reference for anything in tables and even just presenting

26:51 data.

26:51 Like they, they go so deep on the smallest little things.

26:55 It's incredible.

26:56 Yeah.

26:56 They sketch things out.

26:57 They give things names.

26:58 That was super important because otherwise we'd just be saying that area on the top left,

27:02 whatever that is.

27:03 It's good to have names.

27:04 It's like design patterns for.

27:06 Exactly.

27:07 Yes.

27:07 And this is what was needed.

27:09 And I'm pretty thankful I found this.

27:10 Otherwise we'd have to make it up, but this seems like as close to a standard as we'll

27:14 ever find on this sort of thing on making tables.

27:17 Right.

27:18 So in 1949, we have the formal table and its major parts continued, you know?

27:23 Yeah.

27:24 Yeah.

27:24 And there's so much good text.

27:26 Like it's expertly written.

27:28 this stuff, I mean, of course these are tables in print and of course we have the web.

27:33 So that changes things a little bit, but you'd be kind of surprised how, how, you know,

27:37 still relevant these recommendations are.

27:39 They teach you to be economical with how much data you put on, how much, basically the

27:43 Tufte thing, how much ink are you going to like spend on a visualization?

27:46 And, it just carries it through to all the different parts, which is, you know, amazing.

27:51 And we took a lot of that and applied it here as best as we could, you know, with some

27:54 more modern things as well, but, yeah, of course, but it's yeah, pretty much a very

27:59 useful reference, like the foundational reference for this.

28:02 So right on the great tables page, you've got the components of a table.

28:06 Yeah.

28:07 You want to give us a sense of what's beyond header rows and columns?

28:10 Yeah.

28:10 Michael, would you?

28:11 Sure.

28:12 Yeah.

28:12 Yeah.

28:13 I think so.

28:13 I mean, I think even just the very first thing at the top, this table header with a title and

28:19 a subtitle that's like, that, that seems like a really simple thing, but I think that it's

28:24 easy to overlook.

28:25 like a lot of people think of a table as just the like data and maybe some labels.

28:31 But if you think about like the information hierarchy and someone who has maybe 10 seconds,

28:37 five seconds to figure out what's going on.

28:39 As it turns out, the title is probably the most, one of the most important parts because it tells

28:44 them in two seconds, what should I expect from this thing?

28:48 So like, if you send to your boss a table, if you had to do one thing, you should probably

28:52 put a title on it so that they know what you're sending to them.

28:56 And I think, so that's like, maybe a silly thing.

28:59 Maybe it feels like table stakes, but I think like the title and the subtitle are small elements

29:04 that really kind of, if you're thinking of a table as communication, you, you wouldn't

29:09 write an article without a title.

29:10 so this really is just.

29:13 Probably wouldn't put a graph up without a title.

29:15 Right.

29:16 Yeah.

29:16 Right.

29:17 So it's kind of bringing tables up to the standard of other communications.

29:21 Yeah.

29:22 And then I think, the other thing I really like is these are really simple elements,

29:26 but these footnotes on the bottom and source notes, this idea that if you just create a

29:31 spot for your like references, it's a lot easier to get into the habit of, of referencing

29:37 where things came from.

29:39 So like we have some table examples in our gallery and they, in their footnotes, they often

29:43 say where, we got them from.

29:45 And it's just so, so helpful having that like hook that you can hang into to quickly, like

29:51 as a quick reminder that if you, like, if you make it easy to put your references and footnotes,

29:56 you know, you just can make a habit of it.

29:58 So those are kind of funny.

30:00 Those are like the bookends of a table, but I, I really appreciate that they're there to

30:05 emphasize how critical they are.

30:07 If tables are a form of, communication.

30:10 Yeah.

30:10 The one that stood out to me and I called out at the first table we talked about was the

30:15 spanner label.

30:16 That kind of says these three columns are kind of in some category together.

30:21 Yeah.

30:21 Yeah.

30:21 The hierarchy is pretty important.

30:22 It lets you sort of like, Oh yeah, I care about this package of, or this set of columns,

30:27 or maybe I don't, maybe I care about these ones over here.

30:29 You just quickly get oriented to like the columns that, you know, are more useful.

30:33 And also it allows you to do that thing that Michael mentioned before.

30:36 You don't have to like have tons of like label on the below columns, like on one label or

30:41 tons of text on one label.

30:42 You can split it up between like spanner and then like the things that it's associated to,

30:46 you know, the columns.

30:46 So you might have things like, some type of measure and then like the units for the

30:50 column labels, for instance.

30:51 Yeah.

30:52 In our basketball example, it could be distance from visiting arena that could just have average

30:57 or miles from visiting arena and then just average and median.

31:00 And that's all it has to say, but you wouldn't have average and median.

31:03 And that's called them like, unless it was really, really clear, but in a complicated

31:07 thing, it wouldn't make sense to just have those two words like average or what, you know?

31:10 Yeah.

31:11 As part of that visualization, thoughtful design is about, allows you to be creative and also

31:15 thoughtful for the user or the reader, I should say.

31:18 I think it helps too.

31:19 Cause a lot of tables I've seen go big on columns.

31:21 There's like quite a few, there can be tons of columns.

31:25 And so I think when you have like, let's say you have a dozen columns, if you have like

31:29 two or three spanners over them, it feels easier.

31:33 It feels like closer to a table with three columns in a sense.

31:36 When you get started, you kind of know what you're getting into.

31:39 So you can kind of ease into it.

31:41 Yeah.

31:41 You can kind of zoom in by picking a spanner and then paying attention to those pieces or

31:45 something.

31:46 Yeah.

31:46 Yeah, totally.

31:47 As well.

31:48 Having these tools at your disposal as well.

31:50 Also kind of like lets you be, I guess, creative, but also it gives you options to, to explore

31:55 different ways of creating a table.

31:57 If you didn't have like the ability to put a spanner, say in this program, this, this Python

32:01 library, you'd have to think of other ways you'd have to, it, it basically changes your

32:06 whole presentation, not having options to do so.

32:08 Right.

32:09 You'd want to try to leverage color maybe to try to infer their group somehow, but then

32:14 yeah, it's still, it's incomplete.

32:16 Yeah.

32:16 For sure.

32:17 Yeah.

32:17 So Anon out in the audience asks: R has lots of great examples, but I've

32:23 struggled since there's no equivalent in Python.

32:25 Let me suggest.

32:26 Oh yeah.

32:27 I kind of feel like, like great tables is the entry point for a lot of that.

32:32 Right.

32:32 What do you think?

32:33 Yeah.

32:33 I think too, maybe something, Rich, you can hit on is there's a lot to appreciate from the

32:37 history of R. Like it's really surprising to people how hard people have gone on tables

32:43 in R.

32:44 Yeah.

32:44 And some of everything leading up to Rich working on GT and tables in R in 2018.

32:50 There's actually like a 20 year history leading into that.

32:54 Yeah.

32:55 That maybe, Rich,

32:55 you can talk a bit about the kind of.

32:57 Yeah.

32:58 Before you guys just jump in real quick.

33:01 You guys both are, we didn't catch this in the introduction, but you're both working at

33:07 Posit, the company that created Shiny.

33:09 It was an absolute leader in R and now has Shiny for Python and all those kinds of things.

33:14 So just adding some background.

33:16 Yeah.

33:16 History for everyone.

33:17 Okay.

33:17 Go ahead.

33:18 Yeah.

33:18 So basically R goes back probably about as far as Python, 90s or so.

33:24 But R seemed to have a head start on tables.

33:27 I mean, as far back as 2000, they had a LaTeX sort of table generating package.

33:33 And then in the intervening years, the decades since up to now, they've had upwards of eight

33:39 to 10 table packages.

33:41 I think it had to do with the fact that data frames are just, you know, part of the language.

33:44 So it just made sense to make a table.

33:46 And also they just had, I guess, a lot of tooling for creating PDFs, especially for like

33:51 package like references.

33:54 So basically R has tons of like examples, vignettes and things like that.

33:57 Maybe not so much on the Python side.

33:59 We're trying to get that in.

34:01 Although I've been maximal on the table examples for the R version of this, which is called GT.

34:06 Haven't quite got there yet on the Python side.

34:09 So if you're seeing fewer examples than you might see on the R side, be sure that they

34:13 are coming soon.

34:14 Yeah.

34:15 Still working on it.

34:17 Yeah.

34:17 So let's talk about, let's talk about the Python side about what is my data journey to

34:22 get something into great tables and then maybe what comes out of great tables.

34:27 So it starts with sounds like data frame libraries.

34:30 What ones are supported?

34:32 How's that work?

34:33 What's the most ideal one?

34:35 Yeah.

34:35 So we, we support out of the gate pandas and polars.

34:40 And we've made the decision to lean pretty hard on Polars in our examples, just because

34:47 it has some kind of special moves that work really well with great tables.

34:50 So for example, selectors and stuff, right?

34:53 Yeah.

34:53 Selectors.

34:54 So that really was eyeopening to us.

34:56 And I think the polars team a little bit, I think they hadn't, I think great tables was

35:00 one of the first packages that went really hard in integrating with Polars selectors as part

35:05 of our API.

35:06 So there are like, if you want to set a spanner label, so you want to set a label over some

35:11 columns, you can use Polars selectors to do that in Great Tables.

35:15 And that's a really nice, that's, that's what we do all the time.

35:18 And it's really convenient.

35:19 I think that that was kind of an early use of an external package for the polars team

35:25 using their selectors and stuff.

35:27 So we, we had a lot of communication with them about it early on, but I would say, so it takes

35:34 a pandas or a polars data frame.

35:35 And we've leaned pretty hard into polars just because of some of these really nice affordances.

35:41 Yeah.

35:41 If you go to the get started guide, you'll see it inside of selecting table parts and then

35:46 like selecting columns.

35:47 I believe we have this little sidebar, which sections up the getting started.

35:51 Oh boy.

35:52 Yeah.

35:52 Okay.

35:52 Getting started.

35:53 Oh, if you go to the very bottom on the sidebar.

35:56 So scroll down.

35:58 Yeah.

35:58 Yeah.

35:59 That.

35:59 Column selection.

36:00 Column selection.

36:01 So sorry.

36:01 There we go.

36:02 Yeah.

36:02 There's a whole using Polars selectors section.

36:06 Yeah.

36:06 Here we go.

36:07 Here we go.

36:07 Yeah.

36:07 Yeah.

36:08 So that's just an example of dropping Polars selectors into a Great Tables method to move

36:15 certain columns to the beginning of the table.

36:17 Yeah.

36:18 Or you might do another example where you choose a number of columns to put a spanner over

36:22 top of, and they might be named in a systematic way, which is like the advantage here.

36:26 And you can do it for pandas as well.

36:29 Just not with selectors.

36:30 You just use lambdas and such to have similar functionality.

36:33 Interesting.

36:34 Okay.

36:34 Yeah.

36:35 So yeah.

36:36 Pandas or Polars. Polars has some really nice benefits like selectors.

36:40 And then it also, it's used quite a bit whenever we style a table.

36:44 We often use Polars expressions to kind of target the areas to style.

36:50 So if you want to choose like for this column, make the maximum value of the column yellow.

36:56 It's really easy with Polars expressions to say, you know, when this column is equal to

37:03 the max value in itself.

37:06 I see.

37:06 Yeah.

37:07 That's really cool.

37:08 And because then you don't have to figure out what the max is and then try to come up

37:12 with a gradient range of it.

37:13 You just put it into little bins or whatever.

37:16 Yeah.

37:17 Yeah, exactly.

37:17 Totally.

37:18 One thing I just noticed that's pretty sweet about this actually is you, you already talked

37:23 about the sources and the references and things like that.

37:26 Yeah.

37:27 But I just noticed that you have markdown support for your references.

37:31 Yeah.

37:31 That's a good one.

37:32 Because like sometimes you want to bold italicize things, maybe make a list within your footer,

37:37 your source notes.

37:38 Yeah, sure.

37:39 I mean, even in the official way you're supposed to reference things, there's stuff that's supposed

37:44 to be italicized and stuff that's not right.

37:46 Exactly.

37:46 Yeah.

37:47 So you want these tables to be useful for like a wide range of people.

37:52 And if they didn't have that, I don't know what they would do.

37:54 They'd just use HTML, I guess, and try their best with tags.

37:57 But this is just an easy, obvious way to make it simple.

38:02 Yeah, that's really great.

38:02 So we start with some data.

38:04 We load it ideally into polars, it sounds like, but pandas is also supported.

38:08 And then you kind of visually, I say visually, the indentation of the way the code looks somewhat

38:15 represents the structure of the table.

38:17 A little Flutter or HTML-esque in that sense.

38:21 Yeah.

38:21 Yeah, totally.

38:22 I think we think about a lot of it in terms of three main activities, structure, format,

38:29 and style, and we've tried to make the method names match up a bit.

38:33 So everything starting with tab_ is basically about structuring the table.

38:37 So that's those big pieces in the diagram we looked at, whether it's setting the title with

38:43 tab_header or setting footnotes.

38:47 Those are all structure activities.

38:49 So they all start with tab_.

38:51 And then there's this just incredible bucket of formatters.

38:54 Rich really went ham on formatting methods.

38:57 So it's everything starting with fmt_.

39:00 This is like, yeah.

39:01 Yeah.

39:02 This is like, maybe you want to format the number of decimal places, but Rich has gone even further

39:08 to like, maybe you want to put like a country flag in your table.

39:12 Rich has got you.

39:14 He wanted it to be easy for country flags.

39:17 And I'm here for it.

39:19 Yeah.

39:19 So those little graphical elements are amazing for quickly conveying information.

39:24 Kind of like the logo of the various universities that you would see on SportsCenter or whatever.

39:29 You're like, yeah, that's what I'm used to looking at.

39:30 Yeah.

39:31 That's right.

39:31 You just use fmt_image for that.

39:32 And you just, yeah.

39:33 Give it an image on disk or wherever.

39:35 And it gets in there.

39:38 Yeah.

39:38 Yeah.

39:38 And I was kind of confused why I was adding so many formatters at first.

39:42 But I feel like the thing that really clicked for me is that if you think about, so if you

39:46 go, actually, if you go up to the very top and then into the reference on the top bar, that

39:52 will have lots of formatters under, yeah, format.

39:55 Yeah.

39:56 They're just listed out.

39:57 So what clicked for me is that spreadsheet users.

40:02 So people in Excel making a table are really used to tons of convenient things like formatting

40:07 dates, formatting.

40:08 Right.

40:09 You highlight a column, you press currency, then you press the move, the decimal point, two points

40:13 to the left to drop the portion, the partial ones and all that.

40:16 Yeah.

40:17 Yeah.

40:17 So there's a world like the engineer in me thought like, ooh, I kind of wish this wasn't

40:22 our job to do all this formatting.

40:24 But I think Rich very intuitively picked up on like, well, if people use Excel for tables,

40:30 I think you have to give them a compelling reason to be able to do it all in Python, or

40:35 they're not going to really want to make the jump.

40:37 And so I think having all these formatters is just a way of bringing that kind of nuts

40:42 aspect of Excel into great tables.

40:45 And some of this stuff is tough and fiddly if you do it yourself.

40:49 I'm not just talking about simple format.

40:51 Like strings.

40:52 Yeah.

40:53 Yeah.

40:53 I'm talking about.

40:53 Yeah.

40:53 Can I do like a colon and then a comma and then a 0.2 and we're done?

40:57 Yeah.

40:58 First you got to do that.

40:58 And then like, there might be some other weird things.

41:01 So we try to like make it super simple with just some simple options, some arguments and

41:05 you're pretty much there.

41:06 Yeah.

41:07 One thing I'll maybe give a shout out.

41:09 I know you talked about it, PyCon, was if you've got a bunch of numbers, like let's say

41:14 it's currency.

41:14 You might want to format that using like, if it's over a million, 1.2 M.

41:20 But if it's only 120,000, you might want to say 120 K.

41:25 Yeah.

41:26 Right.

41:26 Something where it's.

41:27 Yeah.

41:28 You need sort of an abbreviator.

41:30 Yeah.

41:30 It's, it's also useful if you have like tons and tons of numbers, like a wall of numbers,

41:33 you're running out of space.

41:35 So you just need to compactly.

41:36 And like the small figures don't really mean much.

41:38 Like you just want broad strokes.

41:39 Like it just can be the compact thing.

41:41 It's great.

41:41 And it works really well for like things where you don't need to have that much precision.
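
The abbreviation described above (1.2M for values over a million, 120K for 120,000) can be illustrated with a small standard-library helper. This is only a sketch of the idea, not Great Tables' implementation; in the library the behavior appears to correspond to the compact option of fmt_number. The helper name compact_number is hypothetical.

```python
def compact_number(value: float, decimals: int = 1) -> str:
    """Abbreviate a number with a K/M/B/T suffix (illustrative helper, not a library API)."""
    for threshold, suffix in ((1e12, "T"), (1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= threshold:
            scaled = value / threshold
            break
    else:
        # Below one thousand: keep the value and add no suffix.
        scaled, suffix = value, ""
    text = f"{scaled:.{decimals}f}"
    # Drop trailing zeros so 120.0 reads as 120.
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text + suffix


labels = [compact_number(v) for v in (1_200_000, 120_000, 950)]
print(labels)  # ['1.2M', '120K', '950']
```

Values that round up across a boundary (for example 999,950 at one decimal) are not special-cased in this sketch.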

41:45 Based on the.

41:46 Yeah.

41:47 You also seem to have taken a pragmatic approach where it's not like, well, you're not, we're

41:52 not going to let you mix units in a column.

41:55 Right.

41:55 If it's, it's not, you know, kilometers and meters, like it's all kilometers.

42:00 It's all meters.

42:01 Take it or leave it.

42:02 You know?

42:02 Yeah.

42:02 I think.

42:03 Freedom, baby.

42:04 That's what you want.

42:05 You can like, you can target.

42:07 So we have like a column targeting system, but you can also like sort of like subset,

42:11 like what rows you want.

42:13 So that's the way.

42:14 So basically you can like format different ways down the column if you so choose to, which

42:18 makes it more flexible.

42:19 Yeah.

42:20 I think, I think it goes back to, to Excel and, you know, Excel is just a big lump of

42:24 clay.

42:25 You can just carve out your table however you want.

42:27 And it, it does like do formatting, but it doesn't prescribe.

42:30 Yeah.

42:31 Like you said, it doesn't constrain you to have to like do it for a whole column.

42:35 And I think that's like, we're endlessly surprised by the types of tables people create and the

42:41 things they communicate.

42:42 And in a way it's like leaving room for surprise that people are so creative.

42:48 You, you want some constraints, but you kind of also want to let them surprise you with things

42:54 you wouldn't think of, but that end up being really useful.

42:57 And I, I don't doubt someone's there like mixing all kinds of things in ways I wouldn't

43:02 imagine, but end up kind of working out.

43:05 I think it was like in our table contests where I was like first surprised.

43:08 Basically we, we, we throw together table contests at, at Posit.

43:13 It's such a nerdy thing, but like people submit tables and they get judged on them.

43:17 Like they, they get prizes.

43:18 Everyone's a winner.

43:19 Rich is a very kind judge to be sure.

43:21 It's like everybody wins in Rich's house.

43:25 Yeah.

43:25 But they're so good.

43:27 Like when I see his tables, I'm like, I never even thought of that.

43:29 Like that, that could be done.

43:30 You'd want to do that.

43:32 And I actually like it.

43:33 Like, yeah, it's actually inspiring every time.

43:36 It's gotta be pretty hard to build a framework around such a visual flexibility, you know?

43:41 Yeah.

43:42 But still have some kind of something prescriptive, right?

43:44 Yeah.

43:45 I guess the trick is just like have easy conveniences just to do broad swaths of formatting and styling,

43:51 but also make it granular.

43:52 Give people power, I guess you could say with, along with simplicity.

43:55 Yeah.

43:56 So tell people about these contests.

43:57 If somebody wants to try, try their hand at great tables and submitting something, you know,

44:02 I think we're even in the Advent of Code time.

44:06 And I guess we just had the Hour of Code stuff, but there's a lot of these things where people

44:09 are trying, I just need some examples to play with, to try to, to learn a thing.

44:13 Right?

44:13 Yeah.

44:14 So we do it in the summer, so not wintertime, which is actually good for submissions.

44:17 So it's still coming up.

44:18 Yeah.

44:19 Yeah.

44:19 And we give a generous amount of time, I think like two or three months, if I remember correctly.

44:23 But yeah, we just, we just made the call, say, submit your table.

44:27 We have a nice form and people submit them.

44:30 And we sometimes get up to 50, if not 80 entries.

44:33 It's like, we get quite a few.

44:34 And then we, we do the tough job of, you know, having a number of judges and just, you know,

44:40 multiple judge grading of the tables.

44:42 And we had so many that we had to split them up into categories, right?

44:45 Because we were going to celebrate all these different types of tables and different domains

44:49 of, you know, have typically different types of tables.

44:52 And so it just makes sense to break it up in that way.

44:55 Yeah.

44:56 And I do, if I had to give advice, it's like much more free form than a lot of the Advent

45:00 of Code.

45:00 There's no, because there's no target or even necessarily starting point.

45:05 It could be really hard for some people, I think, to get started or figure out like, what

45:09 could I do with this?

45:10 But I think if I had to give any advice, I think looking through the past entries is so

45:15 helpful, like scanning past entries and even choosing some that inspire you.

45:20 And then whether it's, whether you take data similar to those and maybe try to do something

45:25 related or go in like a similar direction and look for new data.

45:30 I think the existing entries are really, really inspiring.

45:33 I, if I had to-

45:35 Get that creative juices flowing, right?

45:36 Like, oh, I didn't even think you could go down that direction.

45:38 Let's try that with my data.

45:40 Yeah.

45:41 Yeah.

45:41 Because I think if you start with just a blank page and you're kind of new, even to

45:46 great tables, it can be hard to know where to go.

45:48 But once you start looking at entries, I think you get a feel for all the kinds of things you

45:51 can do with it.

45:53 Oh, that sounds great.

45:54 How important is it that you have interesting data to start with?

45:58 Like, hey, let's grab the data that they use to image the first black hole, try to make

46:04 a cool picture out of, you know, table picture out of that.

46:07 That's a great question.

46:09 I think it really hits on one of the neat things about these contests is that you don't

46:14 have to have interesting data at all.

46:15 Actually, one category of entry is kind of zany.

46:18 It's like recreating something that exists in, say, like great tables or some other library.

46:25 Like Rich and I were talking recently about GitHub issues.

46:29 So like GitHub issues are a really interesting example of a table because they're a table that

46:35 software developers work with nonstop.

46:37 Like I'm all in those tables all day.

46:41 And it's like an interesting entry would just be to recreate that.

46:46 Like if you recreated GitHub issues in a table and it looked really spot on, that's like its

46:52 own kind of special accomplishment.

46:55 Like, and I think those entries are fun to see like someone recreate an existing thing in great tables or something else.

47:02 So that, so yeah, to your point on data, I think you could start with even not having to find interesting data, but find an interesting table out in the wild and just try to

47:02 recreate it.

47:12 Yeah.

47:13 So I don't know if this has anything to do with your contest or not, but let me suggest an interesting angle to this in terms of like competitions for this or just broadly

47:13 for people who are listening.

47:23 One is I've got really interesting data.

47:24 I have a story to tell and let me try to like really inspire you with my custom story.

47:29 You know, if you look at racing, like specifically auto racing, we've got F1 where everyone designs their own car.

47:36 There are some rules, but they kind of, they all have a thousand or 500 people that work on it.

47:40 Then you've got spec racing series, like Indy, where everyone has the same car, the same engine, or there's two choices, whatever, same thing.

47:47 And then they compete.

47:48 So it would be interesting to have a category that is like, this is the data.

47:52 You have to work with this, but tell a story.

47:54 Like who could tell the story the most interesting with like the same data, which sounds a little bit Kaggle-like, I suppose.

48:00 Yeah.

48:01 Or Tidy.

48:02 Have you seen Tidy Tuesday?

48:03 Tidy Tuesday?

48:04 Yeah.

48:05 Tidy Tuesday.

48:06 I think it's exactly what you're describing.

48:07 Every Tuesday they release a data set.

48:09 Yeah.

48:10 And then people, oh, wow.

48:12 It's very, there's a GitHub repo somewhere where they.

48:16 Yeah.

48:17 If you search for Tidy Tuesday GitHub, you'll definitely find it.

48:19 But essentially, yeah, it's just like data and you're left to your own devices about how to present it.

48:24 It could be tables, plots, anything.

48:26 Yeah.

48:27 I'm surprised the big Tidy Tuesdays, it's kind of a tricky one.

48:30 There's a, I think there's a Tidy Tuesday organization, but yeah, same data and yeah, people analyze it and then they share it out on socials.

48:39 Yeah.

48:40 So you get to, I think exactly to your point, like you start from the same place and you see where people get to.

48:45 And I know like one R user, Dave Robinson for a long time, would screencast himself live analyzing for an hour.

48:52 Oh, wow.

48:53 Okay.

48:54 So he plots like, what does it look like for someone to analyze the data?

48:57 And I think that's, that's a fun dimension too, that you don't see often.

49:01 Like just how does it unfold?

49:03 Like, right.

49:03 What point does he plot?

49:05 And when does he go backwards?

49:06 It's like the Bob Ross of, of tables, you know, like the happy little trees come first and then this and that, oh, that's how you make that picture.

49:15 Okay.

49:15 Right.

49:15 It does take time.

49:16 Yeah.

49:17 I mean, like just yesterday I was making new examples.

49:19 I'm always surprised how long it takes to make a compelling example for a table.

49:22 It's like, it takes time.

49:23 You have to mess with the data.

49:25 Yeah.

49:25 And I will say this, this coffee table example that Marco pointed out.

49:30 So the, the coffee table example for people listening is, it's an example table we created where every row is a like coffee device.

49:39 So it's the idea of this table is for like a fictional coffee device shop and every row is a coffee device.

49:46 And then there are columns on like, oh, how many did we sell?

49:50 How much profit did we make?

49:52 And then there's a nice little, there's a column that's little bar charts, that show the monthly sales broken down.

50:00 try hovering over one of those bars.

50:02 Yeah.

50:03 Yeah.

50:03 Yeah.

50:03 Yeah.

50:03 So this is one of the things I wanted to make sure that we got a chance to talk about, because this is, we've talked about having images in here, which is great.

50:11 And the, what is this thing called?

50:14 That this group spanner, I can only think splitter.

50:17 I'm like, no, it's not the spanners and the colors and stuff there, but then you have what's in the framework referred to as nanoplots.

50:26 Yeah.

50:26 Right.

50:27 So that we already saw the bar graph, but that was just an example, but here we've got like a little bar chart.

50:32 And Rich, as you point out, as I interact with it, it's like a little Plotly-type thing.

50:36 Like if I had, if I were real young and my eyes were real good, I could read that.

50:40 Well, they're very small, but you have very low space.

50:45 So like, I mean, they're nano size.

50:47 That's why they're nano size.

50:48 They're not even micro plots.

50:49 They're just nanoplots.

50:50 Exactly.

50:51 But yeah.

50:51 That's all we got here.

50:52 They're meant for HTML and meant to be lightweight, slightly interactive.

50:56 Just to give you a little something, you know, to, to mouse over and to, to drill down.

51:01 Massively important because it shows you, I had Stefanie Molin on a little

51:09 while ago and she has this project that sort of, if you want the same information about the

51:14 statistical summary of some set of data, like the standard deviation, the mean, the max,

51:20 the min, and it could be like one little blob or she has this thing, it'll animate it into

51:25 like a kangaroo or something.

51:26 And then it's, it's got all the same information.

51:28 You have the same problem with tables here, right?

51:30 Like the, the total amount sold, the percent sold.

51:32 But if you look at the, the nanoplots, some of them are really spotty and some of them are

51:38 pretty much flat and that it tells you a whole nother dimension.

51:41 That's kind of, this is kind of what I was getting at.

51:43 Like you could make this multi-dimensional and that that's another piece of communication

51:47 that isn't just, you know, rows and columns.

51:50 Yeah.

51:50 It's a nice conversation too, because each of those bars could very well be a column, right?

51:54 I can have like a January column all the way up to.

51:56 Yeah, exactly.

51:57 Yeah.

51:57 So it just shrinks it down.

51:59 And it's useful.

52:00 So you're having a table and a plot and you got to like jump back and forth.

52:03 Like it's, it merges these two ideas in a pretty cool way.

52:06 Yeah.

52:06 Sort of explain the difference between tables and plots.

52:09 I like too, that with the nanoplots, like these little bar charts, maybe their job, like

52:14 in a lot of tables, their job is to focus you onto a row.

52:17 Like if you notice that pattern's interesting, it pulls you onto that record, that row.

52:22 And then that lets you visit other columns to see like supplemental information that maybe

52:28 suggests what could be happening or like gives you information to look up more as you're doing

52:33 an analysis.

52:33 So it's kind of a neat, like visual pattern as a way to index onto interesting cases.

52:39 And then table columns have the power to give you more information about these cases.

52:44 Yeah, that's cool.

52:45 It seems like because you can put markdown into these pieces as well, you could have them

52:50 maybe jump to the full size plot or something more rich as you want to kind of expand

52:55 a row or something.

52:56 Yeah, totally.

52:57 I think there are interactive table libraries too.

53:01 So one we have ported to Python recently is called Reactable.

53:05 So we made a port called ReactablePy that offers expanding tables.

53:10 So if you're interested in a row, you can click it to expand down, say like more information

53:17 or even a lot of like detail and description.

53:21 But I think to your point, yeah, that's a really neat possibility of being able to like

53:25 click and either be taken somewhere or have the table like open up.

53:30 And yeah, that would be cool to have an inline capability.

53:33 Yeah, I do.

53:35 I do want to note one thing about this table that I think about a lot. I'm so glad that

53:40 Marco brought up.

53:40 It was useful because we spent a long time.

53:43 We spent, I think, a couple of weeks creating this table.

53:46 And I talk about this with open source developers sometimes that I think about this, like how

53:53 long we spent on this table, because this table's job was to illustrate like structuring, formatting

53:59 and styling in like what should be like your first encounter with Great Tables.

54:05 And so we really, in a sense, as we created this table, we were on the hunt for a table that

54:10 both introduced people to a range of activities, like looked nice, but also had code that we

54:17 could explain.

54:18 And it's like so hard.

54:20 I feel like as open source maintainers sometimes to find that right example: compelling,

54:26 shows off a range of activities, but also is approachable.

54:31 And I do remember like Marco, one thing that opened my eyes is he, I think he shared this

54:35 table on LinkedIn once, but he shortened it to like three or four rows.

54:39 And that I'm always really interested in seeing those types of activities because I think that

54:44 post really took off.

54:46 And it, I don't know, it always like teaches me the power and the value of really like getting

54:52 an example down and getting it.

54:55 So it's interesting, but also like short and compact is like such an art that some people

55:00 are so good at.

55:01 Yeah.

55:01 It's, it's very cool.

55:02 We don't have much time left.

55:03 So I want to focus on the last bit of this journey that data goes through in Great Tables.

55:09 And that is what happens, where does it go at the end?

55:12 You know, like what can I export?

55:15 Yeah.

55:16 So we talked about HTML and this interactive bit, but what are the different places that you

55:22 could send this to?

55:23 Well, I think the big one is a notebook.

55:24 You're just iterating through some data and you want a table.

55:27 Maybe you want to present that as well.

55:28 So notebook is like a really good example.

55:30 So maybe you put that actually just in a cell and just as part of the notebook, it just runs

55:35 there.

55:35 Yeah.

55:35 That's right.

55:36 That's right.

55:36 And another one is just like, you want it somewhere else and maybe you want it in a presentation.

55:40 You want a graphic.

55:41 So we have a facility for that.

55:42 It's the save method.

55:43 You just get an image file or even a PDF from the table.

55:47 And sometimes you just want to take that HTML and run with it somewhere else, maybe embedded somewhere,

55:51 maybe put it inside of an email.

55:52 So we have an as_raw_html method, which gives you an HTML string that you can just like pop

55:58 in somewhere else.

55:59 Nice.

55:59 And does that just give you like the worst possible HTML in the sense that like everything that

56:05 has a color has a style set straight on it and stuff like that?

56:08 You have two different stages.

56:10 You can have it with the style block, sort of like as one div with the table and the

56:14 style block embedded.

56:16 Or you can have all the styles in line, which is a new feature, which is actually useful for

56:19 emails.

56:19 Yeah, exactly.

56:21 Because as much as you want to write nice stuff, we can't have nice things because a lot of

56:26 the email clients forbid you from putting styles in there.

56:28 It's most notably Gmail.

56:30 If you put a style, even a solid style block, it'll just throw it away.

56:34 So you've got to just jam it onto everything.

56:36 If that thing can be bold, it's in a span and it has, you know, style, font weight bold

56:40 on it, you know?
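The two output styles described here, a style block embedded next to the table versus every style written directly onto each tag, can be sketched in plain Python. This is a hand-rolled illustration, not how Great Tables generates its HTML: the `STYLES` mapping and both function names are hypothetical.

```python
from html import escape

# Hypothetical class-to-rule mapping; a real table would carry many more rules.
STYLES = {
    "label": "text-align: left;",
    "num": "text-align: right; font-weight: bold;",
}


def table_with_style_block(rows):
    """One div holding a <style> block plus a table whose cells refer to classes."""
    css = "\n".join(f".{cls} {{ {rule} }}" for cls, rule in STYLES.items())
    body = "".join(
        f'<tr><td class="label">{escape(str(label))}</td>'
        f'<td class="num">{escape(str(value))}</td></tr>'
        for label, value in rows
    )
    return f"<div><style>\n{css}\n</style><table>{body}</table></div>"


def table_with_inline_styles(rows):
    """The same table with each rule copied onto the cell it applies to."""
    body = "".join(
        f'<tr><td style="{STYLES["label"]}">{escape(str(label))}</td>'
        f'<td style="{STYLES["num"]}">{escape(str(value))}</td></tr>'
        for label, value in rows
    )
    return f"<table>{body}</table>"


sample_rows = [("Widgets", 12), ("Gadgets", 7)]
print(table_with_style_block(sample_rows))
print(table_with_inline_styles(sample_rows))
```

The inline version is more repetitive, but it survives a mail client that discards the style block, which is the trade-off discussed above.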

56:41 When I first learned of that like years ago, because I'm in the email game as well.

56:44 I was like, really?

56:45 How does this exist in this year?

56:48 But it does.

56:49 It still exists because of certain clients.

56:52 So, yeah, but luckily there's lots of like great, and we have it here too, but there's

56:55 lots of people who have developed libraries to inline CSS into tags.

56:59 Yeah.

57:00 Yeah.

57:00 That's the right purpose.

57:02 I guess one more output, sorry, is LaTeX, right?

57:07 Yeah.

57:07 As the final destination.

57:08 Yeah.

57:09 That's a new thing.

57:10 So if you're writing a paper, you want a table in it, we now have some way to get the

57:14 table in LaTeX.

57:15 If you want your academic paper or your dissertation to have it, like, in it goes.
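For a sense of what a table looks like once it lands in a paper, here is a minimal sketch that turns a header and rows into a LaTeX `tabular`. It is independent of whatever Great Tables does internally: `latex_escape` and `to_latex_tabular` are hypothetical helpers, the data is made up, and only a handful of special characters are escaped.

```python
def latex_escape(text):
    """Escape the characters that most often break a LaTeX table cell."""
    replacements = {"&": r"\&", "%": r"\%", "_": r"\_", "#": r"\#", "$": r"\$"}
    return "".join(replacements.get(ch, ch) for ch in str(text))


def to_latex_tabular(header, rows):
    """Render a tabular: first column left-aligned, remaining columns right-aligned."""
    colspec = "l" + "r" * (len(header) - 1)
    lines = [rf"\begin{{tabular}}{{{colspec}}}", r"\hline"]
    lines.append(" & ".join(latex_escape(h) for h in header) + r" \\")
    lines.append(r"\hline")
    for row in rows:
        lines.append(" & ".join(latex_escape(cell) for cell in row) + r" \\")
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)


latex = to_latex_tabular(["Product", "Share"], [("A&B", "50%"), ("C_D", "50%")])
print(latex)
```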

57:19 Yeah.

57:20 Sorry, Michael, what were you going to say?

57:21 Oh, no, I think one interesting thing too, is I think .show, we have a really great contributor,

57:26 Ju Young, that prompted this.

57:29 I think because he was teaching a workshop to vision-impaired data scientists.

57:35 And so being able to call .show and have it open a browser with the table, I

57:40 think, was really useful.

57:42 And that was so helpful to have as an issue.

57:45 And I think that contribution to Great Tables is really huge.

57:50 Yeah.

57:51 Yeah.

57:51 Because then a screen reader could read it.

57:53 Exactly.
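The behavior described here, taking a table's HTML and opening it in a browser where a screen reader can reach it, can be approximated with the standard library. This is a sketch of the general technique, not the implementation behind .show; `write_html_page` and `show_in_browser` are hypothetical names.

```python
import tempfile
import webbrowser
from html import escape
from pathlib import Path


def write_html_page(table_html, title="Table preview"):
    """Wrap an HTML fragment in a minimal page and save it to a temporary file."""
    page = (
        '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head><body>{table_html}</body></html>"
    )
    path = Path(tempfile.mkdtemp()) / "table.html"
    path.write_text(page, encoding="utf-8")
    return path


def show_in_browser(table_html):
    """Save the page, then ask the default browser to open it."""
    path = write_html_page(table_html)
    webbrowser.open(path.as_uri())
    return path


# Writing the file has no side effects beyond the temp directory;
# call show_in_browser(...) instead to actually open a browser tab.
preview = write_html_page("<table><tr><th>Month</th><td>Jan</td></tr></table>")
print(preview)
```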

57:54 Yeah.

57:54 Yeah.

57:54 Yeah.

57:55 Facility for that.

57:55 But also it's kind of disappointing to work in the console and not have anything being shown.

58:00 It's like, what?

58:01 Yeah.

58:02 So that's really good.

58:04 And I see Carol noted Quarto, and actually the whole Great Tables website is built with

58:09 Quarto.

58:11 And that's a nice format.

58:13 So it uses a file called a QMD, which is a lot like Markdown, but it can run the code.

58:19 And Quarto is really convenient for building things like websites or HTML reports.

58:23 And it's also developed by Posit.

58:25 But we, yeah, we end up putting tables a lot in like Quarto documents or, like, the Great Tables

58:32 website.

58:32 Yeah.

58:33 Through Quarto.

58:34 And maybe he's too bashful to mention it, but he developed the library quartodoc,

58:37 which makes this site possible and makes those tables render.

58:41 Oh, that's awesome.

58:42 Like Michael.

58:42 That is.

58:42 Oh, yeah.

58:43 I guess for context.

58:44 So like some tools like Ibis, their API docs and the Great Tables docs.

58:49 Yeah.

58:50 We just created a small tool called quartodoc to let them put their like API references in

58:56 these websites so they can use Quarto.

58:57 Yeah.

58:58 That's excellent.

58:58 Yeah.

58:59 Very, very cool.

58:59 All right.

59:00 Well, I think we're pretty much out of time.

59:03 I guess give people a sense of where you're going, like roadmap type stuff.

59:07 Is anything they should be looking out for?

59:09 Yeah.

59:10 I think we want to port quite a bit more from the R package over to Great Tables.

59:14 That includes things like merging, concatenating values from different columns into single columns.

59:18 Things like adding footnotes to tables.

59:21 So the footer is put to better use.

59:24 And more refinements to formatters and additional formatters.

59:27 There's quite a few more things to go.

59:29 But even right now, it's pretty mature.

59:32 But there's probably things people will ask for that I haven't thought of.

59:35 So that's the nature of the game.

59:36 I think Anon is asking for Excel.

59:38 We'll put Excel in there, you know?

59:40 Yeah.

59:41 Yeah, I'll get to that.

59:42 The snake will eat its own tail.

59:43 Excel's a big one.

59:46 We'll put Great Tables in Excel so you can just go all the way to the bottom, wherever that is.

59:51 Well, I think one other thing to note is extensions. In R, tons of people have extended GT, the Great Tables for R.

59:59 And tons of helper packages.

01:00:01 Like if putting a bar in your table is something you want to do, there are a lot of these inside extra helper packages.

01:00:09 And so I think one nice thing would be we want to kind of create an example helper package just to give a feel for how people in Python could also create this kind of stuff for Great Tables.

01:00:20 If you want to extend, like, yeah, if you wanted to create your own little bar charts in Great Tables, it seems like people have done a lot of that in R.

01:00:30 And so it'd be cool to try to foster that kind of ecosystem and extension.

01:00:34 Absolutely.

01:00:35 Yeah, that's awesome.

01:00:36 All right.

01:00:36 Well, I'm going to leave everybody with one parting thought, and then I'll let you guys give a final call to action.

01:00:43 You're looking at some of these examples, especially those 10 ones that you called out, Rich, that I'll link in the show notes.

01:00:50 If you're doing a presentation to your company, on your blog, as a data scientist or just generally, it's so easy, I think, to just have the tables and then maybe somewhere you have a picture or something.

01:01:03 But if you put something together like that, that will get people's attention straight away, right?

01:01:07 It's just really another level of production.

01:01:11 And the fact that you do it with Python means it's reproducible.

01:01:14 You do it once and off it goes.

01:01:16 Like that's pretty excellent.

01:01:17 So if that sounds like something you all do, you should check out Great Tables.

01:01:20 It looks pretty excellent.

01:01:21 All right.

01:01:22 Michael, give a final call to action.

01:01:25 People wanting to get started with Great Tables, what do you tell them?

01:01:27 You know what?

01:01:28 Visit Great Tables, probably on GitHub.

01:01:31 Check out the examples.

01:01:34 I think the examples are such a helpful way to get started.

01:01:37 And then if you get any questions at all, we have a Discord or we love hearing from people in the issues.

01:01:44 You know, we want to hear from you.

01:01:45 Read it all.

01:01:46 We consider it pretty, you know, pretty carefully.

01:01:48 Not carefully, but we want to do people.

01:01:50 Like style pays.

01:01:52 Like you just got to churn out those beautiful tables.

01:01:55 You know, there's no other choice.

01:01:57 That's right.

01:01:58 I think also once you set up one of them, if you kind of do similar types of tables, like you can reuse that code in a lot of ways.

01:02:06 So unlike doing a design or something.

01:02:09 All right, Rich, final word.

01:02:11 Final word.

01:02:11 Contribute.

01:02:13 Like if you want more of a call to action, just get into the site, discuss things with us.

01:02:18 Even if your idea is like way in left field, we'll probably consider it and maybe even do it.

01:02:22 That's how eager we are to please when it comes to tables.

01:02:26 Awesome.

01:02:26 So PRs are accepted.

01:02:27 PRs, anything.

01:02:28 Issues, complaints, whatever.

01:02:30 Beautiful.

01:02:32 All right.

01:02:33 Thanks for coming on the show, guys.

01:02:34 Yeah.

01:02:35 Thanks for having us.

01:02:36 Yeah.

01:02:36 Congrats on the project.

01:02:37 See you later.

01:02:37 See you.

01:02:38 This has been another episode of Talk Python to Me.

01:02:42 Thank you to our sponsors.

01:02:43 Be sure to check out what they're offering.

01:02:45 It really helps support the show.

01:02:46 This episode is sponsored by Posit Connect from the makers of Shiny.

01:02:51 Publish, share, and deploy all of your data projects that you're creating using Python.

01:02:55 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards, and APIs.

01:03:02 Posit Connect supports all of them.

01:03:04 Try Posit Connect for free.

01:03:06 By going to talkpython.fm/posit.

01:03:09 P-O-S-I-T.

01:03:11 Want to level up your Python?

01:03:12 We have one of the largest catalogs of Python video courses over at Talk Python.

01:03:16 Our content ranges from true beginners to deeply advanced topics like memory and async.

01:03:21 And best of all, there's not a subscription in sight.

01:03:24 Check it out for yourself at training.talkpython.fm.

01:03:27 Be sure to subscribe to the show.

01:03:29 Open your favorite podcast app and search for Python.

01:03:32 We should be right at the top.

01:03:33 You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct

01:03:39 RSS feed at /rss on talkpython.fm.

01:03:43 We're live streaming most of our recordings these days.

01:03:45 If you want to be part of the show and have your comments featured on the air, be sure to

01:03:49 subscribe to our YouTube channel at talkpython.fm/youtube.

01:03:54 This is your host, Michael Kennedy.

01:03:55 Thanks so much for listening.

01:03:56 I really appreciate it.

01:03:58 Now get out there and write some Python code.

01:03:59 Bye.
