Interactive graphs with Bokeh and Python

Episode #222, published Fri, Jul 26, 2019, recorded Wed, Jul 24, 2019

Do you have data you want to visualize and share? It's easy enough to make a static graph of it. But what if you want to zoom in and highlight different sections? What if you need to rerun your ML model on selected data? Then you might want to consider working with Bokeh. It does this and much more. Join me on this episode where you'll meet Bryan Van de Ven who heads up the Bokeh project.
Bryan on Twitter: @bigreddot
Bokeh on Twitter: @BokehPlots
Bokeh: bokeh.org
Bokeh demos: demo.bokeh.org
Bokeh's Discourse: discourse.bokeh.org
Dask: dask.org
microscopium: github.com/microscopium
Chartify: github.com/spotify/chartify
Holoviews / panel: pyviz.org
Light Kurve: github.com/KeplerGO/lightkurve
PyOxidizer: gregoryszorc.com/blog
Episode transcripts: talkpython.fm

Episode Transcript

00:00 Do you have data you want to visualize and share? It's easy enough to make a static graph of it,

00:04 but what if you want to zoom in and highlight different sections? What if you need to rerun

00:09 your machine learning model on the selected data? Then you might want to consider working with Bokeh.

00:13 It does this and much more. Join me on this episode where you'll meet Bryan Van de Ven,

00:18 who heads up the Bokeh project. This is Talk Python to Me, episode 222, recorded July 24th, 2019.

00:25 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries,

00:43 the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter,

00:48 where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm,

00:52 and follow the show on Twitter via @talkpython.

00:55 This episode is brought to you by Ting and Linode. Please check out what they're offering

00:59 during their segments. It really helps support the show. Bryan, welcome to Talk Python.

01:03 Hi, thanks for having me.

01:04 Yeah, it's great to have you here. You know, I've often thought about ways in which I could

01:09 use some of these cool Python visualization libraries, and I haven't recently had some

01:14 great excuses to use them, so I haven't really covered them enough on the show, but I'm really

01:18 excited to talk about Bokeh with you this week.

01:19 Oh, I'm super excited to be here. I think Bokeh has really developed a lot over the last year or so

01:24 in particular, and so this is a great opportunity.

01:26 Yeah, absolutely. Before we get to it, though, let's start with your story. How'd you get into

01:29 programming in Python?

01:30 In Python? So I think the first version of Python I ever used was Python 1.4, actually way,

01:34 way back in the day, and I was doing some system administration kind of job, so there was a lot

01:39 of Perl, but I happened to get into using Python for a few things, and it was a lot of fun. Put it

01:44 down for a while, picked it up here and there, but I've been using it pretty extensively

01:47 probably since about 2005 or 2006.

01:51 Okay, yeah, those are pretty early days, Python 1, right? We don't have that debate about 1 versus

01:56 2 anymore. It's moved on to 2 versus 3.

01:58 Yeah, I don't think there was ever really much debate. Everyone was ready for Python 2 for sure.

02:03 Yeah, absolutely. So how'd you get into programming in the first place?

02:05 Let's see. The first thing I ever did was on a TRS-80 that was actually checked out from our

02:10 local library. They had a program to check out TRS-80s for two weeks, and there was a Logo cartridge

02:15 that came with it, so we could do Logo programming. A little bit later, we had some Commodore computers,

02:19 and so I did, you know, BASIC, and I think at one point I even got into like 6502 assembly,

02:23 you know, when I was getting to be a teenager or something, but yeah, you know, just 8-bit

02:28 programming way back in the day.

02:30 Yeah, how interesting. Yeah, that's funny with assembly language, like that's not a super

02:34 easy comparison. Like you've got BASIC on one side and assembly language on the other. Not

02:38 a whole lot in between, huh?

02:39 Well, there's not a lot of different ways to program on a Commodore 64.

02:44 You had to earn your programming stripes back in the early days, that's for sure.

02:48 Nice. Okay, so Bokeh is a very visual thing. For a long time, you were at Anaconda Inc.

02:55 So, is there a science background as well that got you sort of in that path, or how do you get

03:01 interested in all of these things? Yeah, I've had a pretty tortured academic path. I went to school

03:05 for computer science, then left for a while, and I worked in some research labs, and I realized,

03:09 hey, I want to go back to school, and so I actually ended up in graduate school for physics eventually,

03:12 and so I have a pretty strong, you know, mathematics, physics background, but ultimately,

03:16 I did decide to sort of go back into computer science, software engineering. I really like working

03:20 on software, though, that's in the service of analytical endeavors or science and that sort of thing,

03:24 and so this is why, you know, being able to work at Anaconda on all these tools has been really

03:28 fantastic. Yeah, it's got to be super rewarding to have so much impact on the science side. Are you

03:34 still at Anaconda? What are you doing these days? Like, what's a, what kind of programming and work

03:38 do you do day to day now? Yeah, no, I actually just recently left earlier in the year, so I was at

03:42 Anaconda from the beginning. I think I was the last original employee to leave, in fact, except for

03:46 Peter Wang, of course, who's still there, but, you know, eight years is a long time, and so it's just time for

03:51 me to go look for something different, and I actually went to go work at Microsoft, and that was really on the

03:54 strength of some interactions I'd had with folks at DevDiv and Microsoft around Python, around open

03:59 source. Everyone there has been really terrific and really supportive of Python and open source, and so I

04:03 think it's a very different company than when I, you know, thought about it 15 years ago, where I

04:08 probably would have used M dollar sign very sincerely on an angry forum post or something, but, you know,

04:12 everyone there has been really terrific. It's been a good experience, and day-to-day, I work on Azure SDK

04:16 for Python these days, which is, you know, a lot of PR reviewing, writing some code, and

04:21 helping move the direction there.

04:23 Oh, that's really interesting. You probably feel like you're bringing a little bit of the outside

04:27 to Microsoft, right? Like, it is a very different company. They're more open to external stuff, but,

04:33 you know, historically, it hasn't always been that way, so it's probably like, let me tell you about the

04:37 Python scientific stack, folks, things like that, yeah?

04:40 It's definitely interesting, and there's a lot of give and take. So I've actually learned,

04:43 I haven't been in an organization this large in a very long time, and so it's been a lot of personal growth

04:47 and learning for me, just to be in that kind of environment where, you know, people have to interact

04:51 in different ways, and that's been very gratifying and helpful for me, but definitely, I think I have a

04:55 pretty useful perspective to bring as well, especially in terms of, yeah, data science applications

04:59 in Python and that sort of thing.

05:01 Yeah, yeah, super cool. It sounds like a fun job. So let's start off this conversation talking about

05:06 Bokeh by kind of getting the, like, a big picture of, you know, making pictures with Python, right?

05:13 So if I have a graph, I want to do a map, if I want to do some kind of bar chart or some

05:18 visualization of data, what are my options nowadays?

05:23 There are a lot. So if people want to Google, there's actually a chart made by Jake VanderPlas,

05:29 who, you know, is very active in sort of the PyData, SciPy community. He tried to draw a map,

05:34 basically, of all the Python visualization landscape, and there are a lot of tools available. And some

05:38 people think this is really great, and there's a lot of choice, and then some people think that

05:42 there's just, you know, too many things, and they don't know what to deal with. But there are a lot

05:45 of tools. So obviously, you know, Matplotlib is the very big tool that's been around for a very long

05:49 time. It's a really fantastic tool, and all the devs there, you know, they work really hard. And

05:52 it's been great to see the sort of the strides that it's made in the last few years. In terms of,

05:57 like, web visualization, there's, you know, Bokeh, of course. Plotly is another offering that's out

06:02 there by the company Plotly. Altair is another tool that's been fairly recently added. Actually,

06:07 Jake VanderPlas and Brian Granger from Jupyter put that together. And so it's inspired by

06:12 the Vega plotting sort of toolkit that's available in browsers. And it's sort of a Python wrapper for

06:16 that.

06:17 Yeah, that's cool. I've heard a lot of good stuff about Altair, and that it's really quite nice as

06:21 well.

06:21 Yeah, I don't have a lot of experience with it. I mean, it's definitely intended for

06:24 very high level sort of exploratory data analysis. It's very, you know, useful, especially in notebooks

06:29 in particular. And so it looks very attractive from the things that I've seen. I just am so,

06:34 you know, I'm very involved in Bokeh. And it takes up so much of my time that I almost don't

06:38 have time to look at too many other things very often. But, you know, Jake's a fantastic guy.

06:41 Brian Granger, of course, is great and has made just amazing things for the PyData and SciPy

06:46 communities. So that's a great tool that they put together.

06:49 Cool. And, you know, there's always things like the JavaScript libraries, like D3 and stuff like

06:54 that. Is that really relevant? Or are we kind of got a handle with things like Bokeh and Plotly and so on?

06:58 You know, so people ask that a lot. Like, what's the difference between Python and D3? Or why would you use

07:02 one versus the other? And I think if you have people that are, you know, already using JavaScript

07:07 and they want to work on things with D3, D3 is an amazing tool and it can make incredible, you know,

07:12 output and really fantastic graphics. And there's probably things that are doable in D3 that maybe

07:17 would be more difficult in, you know, Bokeh, for instance. But where I think the sweet spot for Bokeh

07:21 is we've really tried to make it so that people who are already very productive in Python, they're doing,

07:25 you know, work in data science or science, who are using all these tools that are in the PyData stack,

07:30 you know, NumPy and SciPy and Pandas and scikit-learn and, you know, Dask and Numba and all

07:34 these tools, who are really productive with these in Python. We want to let them have access to very

07:38 interactive, powerful visualizations in the browser without having to reach for that JavaScript

07:42 and web tech and sort of be distracted from the actual work that they want to do. And so

07:46 in terms of productivity, I think if you're already working in Python, I think Bokeh is a great

07:50 choice, to put it that way.

07:51 Yeah. Well, Bokeh to me feels like I get a lot of the benefits of the rich JavaScript stuff,

07:56 but that I don't actually have to make it.

07:59 That's a very, yeah.

08:00 Yeah. A very succinct way to put it. Yeah.

08:02 Okay. Interesting. And then maybe we could talk really quickly about Plotly just as a compare and

08:06 contrast. So Plotly, like I don't fully understand Plotly. When I go to work with it, I feel like,

08:12 okay, I'm working with a library, but then it seems like it has like a backend that they provide that

08:17 I have to deal with. And then there's also a commercial version. Like what is Plotly? I don't really

08:21 know where it fits. So in terms of their business, I actually don't know a lot. To be honest, I don't,

08:26 I don't really follow that very closely. And so I think they've actually changed some of their

08:29 offerings from, from what they used to be. I think they used to, you know, sell Plotly and I think

08:32 they're not in that business anymore, but I can't really speak to that very carefully. But the main

08:36 similarities are it's a Python API that generates a declarative, you know, specification,

08:41 typically some kind of JSON that can be rendered by a front end library. Now, you know, Plotly is an

08:46 entire company centered around this. And so they have had some really nice resources for, I think,

08:50 developing both, you know, some things in ways that Bokeh hasn't had. Like I think, you know,

08:53 their front end is a little more polished and some of the, you know, design stuff is definitely

08:56 polished. And I'd love to get some help on the Bokeh side to sort of bring that up to speed.

09:00 But, you know, relatively speaking, I think we've done a pretty good job at, you know,

09:03 having the same set of features. They're very contemporaneous. They started at almost the same time,

09:07 you know, way back in sort of 2012 kind of era. They have a lot of similarities,

09:11 but definitely there's a little bit of difference. You know, my background comes from a lot of science

09:15 stuff. So I'm really familiar with folks that have use cases, for instance, around like dense

09:18 arrays, like, you know, big images. And so we've really focused on some things like having an

09:22 efficient array protocol for the Bokeh server that can transmit large arrays, you know, very efficiently.

09:27 And whereas I don't think they've sort of gone down that route, you know, again,

09:29 they've worked on some other features that are more around slick dashboarding and, you know,

09:34 that sort of thing.

09:35 Okay. Yeah. Interesting. You know, let's talk about the history of Bokeh. Like you're saying it that way,

09:40 and I guess I'm as well, I'm trying to anyway, the way the proper pronunciation is.

09:45 I usually say Bokeh, but Bokeh, I think is also fine. Amusingly, long ago when we were funded through

09:50 this DARPA X-Data initiative, there was some video that someone made, you know, unrelated to all the

09:55 actual projects, but they made about the projects. And I remember in that they were describing all the

09:59 projects that was under the X-Data initiative. And they mentioned the visual database Boke,

10:04 I think. That's the only wrong pronunciation.

10:05 Bokeh is not Bokeh. All right.

10:07 Okay. It's fine. Yeah.

10:08 All right. Excellent. And so where did it get started? It started out of research grants and

10:14 this DARPA funding. Is that where it came from?

10:16 No, the research grants really helped, but it started before then. So going back a little bit

10:20 further. So I've been interested in visualization for a long time. Actually, the first, one of the

10:24 first things I used in Python, in Python 1.4, was a plotting plugin for Apache. And it was just,

10:28 it amazed me that you could like take data and have a website just make a plot. It was incredible.

10:33 I worked a little bit on VTK here and there years ago. A few, my first open source contributions

10:37 were to VTK. But in the middle aughts, I guess, I went to work for a company with Peter Wang,

10:43 who eventually founded Anaconda. And I worked with him on a library called Chaco, which was a rich client

10:49 library for interactive visualization. So instead of in the browser, you would write like a Python

10:52 application that was using like, you know, Qt or, you know, GTK, you know, kind of real

10:58 application. And it was also contemporaneous with Matplotlib.

11:03 But ultimately, Matplotlib won sort of that battle. And the reason was pretty clear. It was

11:07 because, you know, while Chaco had all this really rich capability for interactivity, it was a very sort

11:11 of fiddly API, very detailed, kind of verbose. And so later on, when Peter was getting Anaconda

11:19 started, and I was on board to help with that, we had talked about wanting to update this idea of

11:23 Chaco and create a new library that supported interactive visualizations and in browsers,

11:28 because right, browsers is the right place for it to be in 2012, right?

11:30 Of course.

11:31 And so we had this idea, we started it. But getting the, you know, the funding through the

11:35 DARPA X-Data initiative in the early years of Anaconda, which was then called Continuum Analytics,

11:39 really helped, you know, mitigate business risk for us to put more resources into it. So it really

11:42 accelerated the development, I would say. But we have certainly, you know, we started talking about

11:46 the ideas behind Bokeh in sort of middle 2011, probably.

11:49 Okay. Yeah, that's, that's pretty interesting. It's been around for a while. I guess that's,

11:53 at that point, Ajax and interactive browser stuff's pretty well established, right? So it's pretty

11:58 clear that was the right place.

11:59 Yeah, I mean, we just knew that, like, the future of presentation and the future of getting, you know,

12:03 this content in front of people was going to be in browsers. So writing another rich client library

12:06 was not something that was really interesting to us. And so we definitely wanted to do the browser.

12:11 And then we definitely wanted to make it architected in a way that it was very flexible,

12:15 that it had this declarative specification that described what you wanted to visualize.

12:19 Because, you know, that affords a lot of possibilities. So we've talked about the Python

12:21 side of Bokeh, but you can actually have other languages drive Bokeh plots in the browser,

12:25 you know, there's an R Bokeh binding, there is a Scala Bokeh binding that hasn't been updated in a while.

12:30 I'm interested in actually reviving a Julia Bokeh binding. But that's all because there's this

12:35 JSON specification. And any language that can, you know, dump out the right JSON can create these

12:39 Bokeh plots in the browser. Now, we've spent a lot of effort investing in the Python bindings,

12:42 because, you know, Anaconda is a big Python shop. But certainly the possibility is there for other

12:46 languages as well.
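
To make the idea of a declarative JSON specification concrete, here is a minimal sketch using `json_item` from `bokeh.embed`, which serializes a plot into a plain dictionary that BokehJS can render. The data values and the `"myplot"` target id are made up for illustration, and the exact keys in the output may vary between Bokeh versions.

```python
import json

from bokeh.embed import json_item
from bokeh.plotting import figure

# Build a small plot with the Python API (the data is invented for this sketch).
plot = figure(title="Declarative spec demo")
plot.line([1, 2, 3, 4], [3, 1, 4, 2], line_width=2)

# json_item() turns the plot into a plain dict describing what to draw.
# "myplot" is a hypothetical id of a <div> on the page that will host the plot.
item = json_item(plot, "myplot")

# The dict serializes to ordinary JSON text, which is what BokehJS consumes.
spec = json.dumps(item)

print(sorted(item.keys()))
print(len(spec), "characters of JSON")
```

On the browser side, BokehJS can render such an item with `Bokeh.embed.embed_item(item)`. A binding in another language would need to emit JSON of the same shape.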

12:47 Okay, so maybe it's worth just touching on the architecture a little bit, and we can dive

12:51 into the details more later. So there's a, I guess there's a couple ways we can use this,

12:55 right? Like, there's a, probably the most straightforward way is we have a Bokeh server.

13:00 And then there's some front end stuff that is the rendering point, right? Like I want to put a,

13:06 some kind of graph in a browser, and the server handles all the data, and maybe it only prevents,

13:12 presents like a slice into that world of the data and things like that, right? Can you tell us about

13:18 that?

13:18 That's absolutely a great use case for the server. But I will say the server is,

13:21 in fact, optional. I would say most usage of Bokeh probably doesn't involve the server. So Bokeh can

13:25 generate this JSON, and it can send it to a web page that can be embedded in like a Flask app or a Django

13:30 app. It can be embedded in, you know, Jupyter notebook output cells. And it doesn't have to be

13:34 connected to the Bokeh server. Bokeh.js will take that JSON and render it. And you've got an interactive

13:39 plot that has panning and zooming. You can even have linked behaviors between plots. You can have custom JS

13:43 callbacks that, you know, do work whenever you make a selection or click a button. None of that

13:47 requires the Bokeh server. What the Bokeh server is really great for is when you want to connect all

13:51 those interactive features to real running Python code. Like you want to click a button and have a,

13:55 you know, a scikit-learn regression or a scikit-learn model run, or you want to, you know,

13:59 make a selection on a plot and compute a linear regression line through those selected points

14:03 with real Python code. That's what the Bokeh server is really great for, making that sort of two-way

14:07 connection between this front end and real running Python code. But you can use Bokeh very effectively

14:12 without the Bokeh server. And in fact, I guess, I think most usage is probably, we call it standalone

14:16 usage, where it's just generating this pile of JSON that is used to drive a Bokeh plot in a webpage

14:21 somewhere. This portion of Talk Python to Me is brought to you by Ting. Let me tell you about Ting,

14:27 a new mobile service available in the US that's targeted at developers and other technically savvy folks.

14:33 First of all, their average customer only pays $23 a month, but they're no discount provider.

14:38 Their service runs over T-Mobile's and Sprint's fast nationwide network.

14:42 If you don't use that much data because you're usually on Wi-Fi, like many of you are,

14:45 then Ting will save you a ton of cash. But don't worry, you can still use as much data as you like

14:50 for just $10 per gig. One mobile feature I use all the time is tethering. And with Ting,

14:55 you get unlimited tethering at the same data rate with your account. $6 a month for a phone line,

15:00 $10 a gig, $3 a month for text if you usually chat over iMessage or WhatsApp. Think about it,

15:06 no contracts and super clear and fair billing. Visit python.ting.com. That's python.ting.com and check

15:14 out their savings calculator. Enter your usage and see exactly what you'd pay. Use that link and you'll

15:19 get a $25 credit to try them as well. That's python.ting.com or just click the link in the show notes.

15:26 The server is mostly about the interactive bits if you want to add smarts to your plots.

15:32 Yeah, absolutely. If you want to connect all those PyData tools, NumPy, SciPy, Pandas, Dask,

15:37 Numba, OpenCV, any of those tools. If you want to connect those things directly to

15:42 these interactive visualizations with a minimal amount of fuss, that is what the Bokeh server is for.

15:47 Exactly.
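
As a rough illustration of that two-way connection, the sketch below is a small Bokeh server app that fits a regression line through whichever points are selected, using real Python code (NumPy here rather than scikit-learn). The data is randomly generated, and the names `fit_line` and `on_selection` are hypothetical helpers, not part of Bokeh.

```python
import numpy as np
from bokeh.io import curdoc
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Synthetic data for the sketch: a noisy line.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2.0 * x + 1.0 + rng.normal(0, 2, 200)

points = ColumnDataSource(data=dict(x=x, y=y))
fit = ColumnDataSource(data=dict(x=[], y=[]))  # starts empty, filled on selection

plot = figure(tools="box_select,lasso_select,reset",
              title="Select points to fit a line")
plot.scatter("x", "y", source=points)
plot.line("x", "y", source=fit, line_width=2, line_color="red")


def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    slope, intercept = np.polyfit(xs, ys, 1)
    return slope, intercept


def on_selection(attr, old, new):
    """Runs in Python on the server whenever the selected indices change."""
    if len(new) < 2:
        fit.data = dict(x=[], y=[])  # not enough points: clear the line
        return
    xs, ys = x[new], y[new]
    slope, intercept = fit_line(xs, ys)
    ends = np.array([xs.min(), xs.max()])
    fit.data = dict(x=ends, y=slope * ends + intercept)


# Wire the browser-side selection to the Python callback.
points.selected.on_change("indices", on_selection)
curdoc().add_root(plot)
```

Saved as, say, `app.py` (a hypothetical filename), this would be started with `bokeh serve app.py`. Without the server the plot still renders, but the Python callback never fires.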

15:47 Okay, cool. Before we move off the history and Anaconda Inc. and all that, when you created it,

15:52 it sounds like you tried to create it as a standalone project with its own fundraising and its own

15:58 outreach. What was the thinking there rather than just making it part of Anaconda?

16:02 Well, I mean, in the very early days, it was definitely a project that was started at Anaconda.

16:06 And the DARPA thing came along somewhat serendipitously. Not something we counted on or

16:11 knew about when the company started. And that was a big sort of funding for a long time. After that ran

16:15 out, because that was a fixed number of years sort of support, Anaconda very generously supported the

16:20 development of Bokeh. But ultimately, it was always the goal to try to create these tools as sort of

16:25 self-sufficient, self-governing, push them out in the open kind of projects. And so it took a long time

16:30 to get to that point. The first step in that was for Bokeh to become a NumFocus fiscally sponsored

16:34 project. But of late, we've really ramped up the self-governance and the self-sufficiency. So pretty

16:39 much at this point, I think the cord's been cut and Bokeh is really out there. It's managing its own CDN

16:43 resources. We're doing a lot of outreach and fundraising on our own right now that wasn't happening even six months ago. We just had or still having actually

16:50 a July fundraiser going on to try to help pay for some of our infrastructure costs. But we're also

16:55 ramping up some corporate engagements and trying to talk to corporations and see if they want to offer

16:58 support. But that all is pretty new. For a long time, Anaconda was the primary beneficiary or

17:03 benefactor, I should say, of Bokeh.

17:04 Sure. And it just makes Python and data science stronger, which is really the heart of Anaconda Inc.

17:11 anyway. So it seems pretty reasonable. You said that the project was a NumFocus

17:15 project. And we spoke a little bit earlier about NumFocus a bit. And I guess my understanding is a

17:22 little bit off. I saw NumFocus as a thing that kind of provides funding to these projects. And

17:30 that's not quite exactly right, is it? What is NumFocus and what did it do for you all?

17:33 Yeah, it's sort of yes and no. So NumFocus was started by Travis Oliphant, who was one of the other

17:38 co-founders of Anaconda. Of course, he's the original or he's the author of NumPy, building on previous work.

17:43 But NumFocus is this nonprofit, 501(c)(3). And its main role is to be an umbrella organization for

17:51 open source projects, right? So open source projects often are not legal entities. And so that actually

17:55 makes it quite difficult for them to accept money, right? It sort of gets very complicated with taxes.

17:58 And so they're this legal entity that can accept donations on behalf of projects and then handle all

18:04 the tax stuff. They also do a lot of outreach. They hold, you know, they support the PyData meetups and

18:10 PyData conferences around the world. So they're this organization that sort of helps and supports

18:13 open source. And they do fund these projects in the sense that, you know, they help the donations that

18:17 come in get back to the projects. And also sometimes they spread some of those funds around. They get

18:21 bigger donations that they can also use to give to projects that, you know, don't necessarily raise

18:24 their own money. But yeah, that's their main role. Just to be this umbrella organization, they help out

18:28 with the bureaucracy and take away that load from these open source projects.

18:32 Yeah, that's really interesting. I guess it took me a long time to realize that it's

18:35 actually hard for companies to give money.

18:39 It's really hard.

18:40 To these projects, right? Like it's not just a matter of put aside the point whether they should

18:44 put aside the point whether they fiscally could. Of course, they most of them can and they should,

18:49 right? But just the way that they're set up is they buy things. They exchange money for a service or a

18:58 good. Even giving to a charity like NumFocus probably is a little bit odd and hard for them to do.

19:03 That is exactly right. There is just an impedance mismatch. I mean, the sums of money that would help

19:07 a lot of open source projects, I think, are relatively small compared to the budgets of the

19:11 companies we're talking about that are using these projects. But yeah, there's just an impedance

19:16 mismatch for how do you actually it's not a purchase, right? And so it's not hiring someone. So what

19:21 exactly is it? And companies just, you know, right now don't know how to handle that and don't know how to

19:25 deal with it. And so there's some other efforts to make that easier. People are trying different

19:28 things. There's, you know, Travis's company, Quansight, is trying out some models for, you know,

19:32 getting support for open source projects by engaging with companies in various ways. Tidelift as well

19:37 is also trying that. GitHub is, you know, trying their new sponsorship sorts of things. So people are

19:41 trying to find novel ways to attack this problem. And I hope we get there, but it's definitely

19:44 something that will take some time.

19:46 Yeah, I hope so as well, because it would make a huge difference and it would be

19:49 a blip for these companies to make that contribution, right?

19:52 Yeah. I mean, as an example, our goal for the July fundraiser was to raise a thousand dollars,

19:56 right? And we have, you know...

19:57 That's very modest, right?

19:58 I think several hundred thousand users and we did, you know, I'm really happy to report that we did,

20:02 but, you know, I had to tweet out a lot every day to make it happen. It was, you know, sort of

20:06 knocking on doors and, but yeah, I'd love to be able to go to companies and try to be more effective

20:10 and efficient at fundraising, if that makes sense.

20:12 Sure. Well, while we're on the subject, let's talk about your fundraiser real quick.

20:15 If you're going to raise a thousand dollars, that's probably not money to pay developers,

20:19 right? It's something else.

20:21 Yeah. I mean, I actually raised this exact issue around the sustainability conference that NumFocus

20:26 is going to have later in the year. I think there's sort of two different buckets where things go into

20:29 there's, you know, paying people. And I hope someday we can figure that out really well. And we can

20:33 support, you know, people to be maintainers of open source software by actually paying them to do

20:37 that work directly as a sort of a job or a living. But there's also just the matter of a lot of open

20:43 source projects, I think, could use a fairly small amount of money just to cover expenses or help

20:48 take some strain and stress off the maintainers. Like in the case of Bokeh, we have to run this,

20:52 you know, CDN to deploy Bokeh.js. So all the users around the world can get Bokeh.js to display their

20:57 plots. That is run on, you know, AWS CloudFront. And so we have to pay for that. Someone has to pay for

21:02 that. Right. And so that's what this fundraiser was for. And so in the sense that it, you know,

21:06 sort of reduces my stress because it helps me know that this is sort of taken care of for the next year.

21:10 That's what that level of sort of funding is for. And there's other stuff too. There's,

21:13 you know, it's good to get developers face to face sometimes. And so this could help with that

21:16 as well as other infrastructure costs.

21:18 Right. Some having some like yearly meet up with the core developers. It's odd. There's a lot of

21:23 projects where the core developers have never actually met.

21:25 Definitely. For sure. I think I've met everyone at least once, but there's some folks that are pretty

21:30 scattered for Bokeh. But for sure, I have no doubt that's the case.

21:33 Yeah, for sure. So you said that Bokeh works with this JSON output and it can just basically render

21:39 anything that can generate the JSON. It can go out and render that. You know, I think when I think of

21:44 graphs, I mostly think of notebooks and Jupyter and things like that. But also we can just plug this

21:50 into whatever. Is that right? I can plug it into just a Flask site. Can I plug it into even like a,

21:55 some kind of command line app and like somehow pop it up?

21:58 Yeah, I don't see why not. I mean, anything that can run JavaScript basically, right? Anything that can

22:01 load Bokeh.js. So if that's like an Electron app, I think that would be feasible, certainly in the,

22:06 you know, in the Jupyter notebook. But, you know, there's a variety of ways. We have a whole bokeh.embed,

22:10 you know, API. And so there's a variety of different ways to embed Bokeh content. But if you're running a

22:13 Flask site, you can use one of those ways to pop up Bokeh content in the middle of your site and in

22:18 your template and sort of wherever you want to put it.
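As a rough illustration of the embedding options mentioned above, here is a minimal sketch using two functions from bokeh.embed: components, which returns a script and a div to drop into a template, and json_item, which returns the JSON representation for BokehJS to render. The plot contents, the target id "myplot", and the inline HTML string are placeholders chosen for this example; a Flask view would pass the script and div into its own template instead.

```python
import json

from bokeh.embed import components, json_item
from bokeh.plotting import figure

# A small placeholder plot; any Bokeh plot or layout would do.
plot = figure(title="Example plot")
plot.line([1, 2, 3, 4], [3, 1, 4, 2])

# Option 1: a <script> and a <div> to paste into a server-side template.
script, div = components(plot)

# Stand-in for a Flask/Jinja template. The real page must also load BokehJS.
page = f"<html><body><h1>My site</h1>{div}{script}</body></html>"

# Option 2: the JSON representation, for example returned from an API
# endpoint and rendered by BokehJS into the element with id "myplot".
item = json_item(plot, "myplot")
payload = json.dumps(item)
```

Which option fits best likely depends on whether the page is rendered on the server (components) or fetches plots on demand from JavaScript (json_item).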

22:20 When I think about the data science space, there's some libraries and data structures that are just used

22:27 over and over and over. You mentioned Travis before. So there's NumPy. There's,

22:31 of course, Pandas. There's a bunch of other stuff built on top of that. Is there special

22:36 integration for those types of libraries? Like if I already have some NumPy array or I've got some

22:42 Pandas data frame, is there a way to just like plug it into Bokeh?

22:45 Yeah. So Bokeh works really well with all those things. NumPy is an actual requirement,

22:49 runtime requirement for Bokeh. We tried to avoid that. Even in the beginning, we wanted to make Bokeh as

22:53 minimal and as accessible as possible. But NumPy is a requirement now. Pandas is not a hard

22:57 requirement, but Bokeh works really well with Pandas. If you have Pandas data frames,

23:00 you can basically plug them in anywhere a Bokeh column data source would go. It can automatically

23:05 sort of convert them or use them in a way that's useful, either data frames or group by objects.

23:09 And so we've tried to make it very easy to integrate with Pandas, but also make Pandas

23:13 not required. And so that's sort of the state where things are at. Anything that can sort of behave

23:17 like a list or an array or a Pandas series works pretty much out of the box with Bokeh.
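To make the Pandas integration described above concrete, here is a small sketch. The column names and values are invented for the example; the point is only that a DataFrame can be wrapped in a ColumnDataSource or handed to a glyph method directly.

```python
import pandas as pd

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Hypothetical example data.
df = pd.DataFrame({"day": [1, 2, 3, 4], "value": [10.0, 12.5, 9.0, 14.0]})

# Explicit conversion: each DataFrame column becomes a data source column.
source = ColumnDataSource(df)

plot = figure(title="Values by day")
plot.line(x="day", y="value", source=source)

# Glyph methods also accept the DataFrame itself and convert it for you.
plot.scatter(x="day", y="value", source=df, size=8)
```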

23:21 Okay. Are there other major libraries that I don't know to ask about?

23:25 I think Bokeh works really well with Dask, I assume, because Dask has a data frame-like API.

23:29 Matt Rocklin actually used Bokeh to develop the interactive dashboard that's sort of the cluster

23:34 monitor for Dask.

23:35 I think it's great as well. And I did have Matthew Rocklin on before to talk about Dask,

23:39 but I don't know if everyone's listened to that one. And also if they've seen the actual

23:45 visualization of that. So could you maybe describe that real quickly, what Dask is and then that

23:49 dashboard? Because when I saw that, that just like blew me away.

23:53 Yeah. Dask is a tool for basically parallel distributed programming. And it's trying to

23:57 do so in a way that is a very sort of Pythonic, very Pandas-like API, right? So there's other

24:02 tools that do this sort of thing, but they come from other languages originally. And so their APIs are

24:06 maybe not very Pythonic and they kind of don't fit in well with Python tools. But Dask is meant to be

24:11 a very Pythonic tool for this distributed computing task. And so to that end, it has this dashboard

24:16 that Matt Rocklin developed using Bokeh that can visualize everything that's going on around a

24:20 cluster that's doing computation all at once, right? So it can show you what

24:23 nodes are computing or waiting or they're transferring data in real time. And so that's

24:27 really helpful for diagnosing problems with parallel distributed computations. And so Matt has always

24:32 been very clear that Bokeh was great for him because he didn't have to write all this JavaScript to have

24:36 this really interactive dashboard. He could just write it in Python and connect it directly to his

24:40 telemetry that he was getting back from clusters and visualize it very quickly. And so I think it's

24:43 been a great tool for his users. I shouldn't say his users. It's a big project now. Dask has actually

24:47 grown up quite a bit. So I should say the Dask project's users. But we're happy to see that kind of thing happen.

24:52 You know, love to see Bokeh used in those kind of use cases.

24:54 Yeah. And it just looks so good and professional and, you know, live updating. It's really a nice

25:01 use case for Bokeh. And I think it's also a good testament to what you guys have built.

25:05 Well, it's also good to get feedback from real use cases like that. Nothing sharpens your tools

25:09 better than sort of having them honed against real problems, right? And so we love when people do

25:13 awesome things with Bokeh and tell us, you know, hey, this was great, but also this could be a little

25:16 bit better or easier. You know, this is how you could make my life, you know, simpler or here's some pain

25:20 points I had. That kind of feedback is really helpful from users.

25:23 Yeah. It's great to design something, but once it actually meets real users and real use cases,

25:29 like that's where it gets real. So you talked about Dask. Let's just touch on some of the other

25:35 things that is built upon Bokeh because Bokeh has been around since 2011, 2012, like you said,

25:41 and it's pretty stable for the most part.

25:43 It's a lot more stable recently. Yeah.

25:45 Yeah, that's great. So things are starting to build on top of it like Dask and just using it.

25:49 So what else is out there like that?

25:50 Well, there's a couple of different things in different classes of things, right? So first of

25:53 all, there's other libraries now that are starting to build on top of Bokeh. So there is Chartify,

25:57 which was created by the data labs at Spotify. And so it's their sort of high, very high level

26:02 sort of data science, opinionated data science API on top of Bokeh. There's a project called Pandas

26:06 Bokeh that just came out recently. That's sort of very tight integration with Pandas and using Bokeh to

26:10 generate interactive plots. There's also a set of tools created by some folks who are still at Anaconda,

26:14 sort of an effort called PyViz. And so there's a tool called HoloViews, which is a very

26:18 data centric API, and it can generate interactive visualizations using Bokeh and other tools as well.

26:24 But there's also, you know, some tools like Datashader, which are for, you know, very large data,

26:30 where you can finely control how they're rendered. And so you can combine Datashader with Bokeh, you know,

26:34 using HoloViews. So it can drive things at a very high level. So I'd love to see this effort where

26:37 people are building these things on top of Bokeh. And I'm also glad that now, you know, Bokeh was

26:41 a moving target for quite a while. And I very much appreciate the patience of all of our users who

26:45 sort of, you know, kept with us as we were figuring things out. But I would like, you know, we're trying

26:49 to be much more stable now. I think we've done a very good job since 1.0 was released at being,

26:53 you know, much more stable. That's very good. The other kind of things that get built on top of

26:58 Bokeh are more like applications or, you know, other projects. So there's a project called Microscopium

27:04 that is for, you know, sort of biosciences research. There's a tool called Lightkurve that a bunch of

27:09 astronomers put together, which uses Bokeh to let you drill down. So you can see like an image of,

27:13 you know, some star or something and hover over a single pixel and really drill down into something

27:17 about, you know, that image using, you know, all the tools that are in Bokeh. You know, there's

27:21 actually quite a, Dask, of course, is a great example. And then there's actually a bunch of other

27:25 ones on GitHub. And I'm sorry, I don't have a list at the top of my head, but there are a lot of

27:29 exo-bioscience projects that are built on top of Bokeh. And, you know, financial trading is another

27:34 thing that comes up. People have done drug discovery type work on Bokeh. It's interesting where things are

27:39 popping up, especially, you know, now, and now things are more stable. I think it's really a great time for

27:43 that to happen.

27:44 Yeah, that's super cool. The Lightkurve project looks amazing. I mean, to explore like data from the

27:50 Kepler and TESS telescopes, that's pretty cool for exoplanet discovery. I mean, that's,

27:57 it's really exciting.

27:58 Like the project that you worked on is helping scientists like actually look for exoplanets.

28:03 Like, that's incredible.

28:04 That is a, no, it's really gratifying. Like, that's exactly the sort of thing that, you know,

28:07 I'd say we wanted to be able to enable and, and, and help, help make happen. So it's really

28:11 gratifying when people are able to use Bokeh for those kinds of situations.

28:14 Yeah, absolutely. Do you see Bokeh being useful or appropriate for like real-time dashboard type

28:23 of scenarios? I mean, obviously it draws great graphs. And then, you know, we talked about Dask

28:28 and the real-time view of that. So like, let's imagine I'm building software for a

28:33 stock trading company or something like this. And I want to show in real time what the market's doing,

28:39 all the data the traders need. Would that be appropriate? Is it too high

28:43 latency, or what's the story? I think it could be. There's real time, and then there's quote,

28:46 unquote, real time. Right. So it's always depends on what exactly people mean when they say real time.

28:50 When I hear the word real time, I have images of like, you know, very low level, a certain kind

28:54 of system that's implemented that has very specific guarantees about its performance. And for that kind

29:00 of work, Bokeh is probably not suitable. Right. But if you're talking about real time, just in terms

29:03 of like streaming data coming in from a financial system or, you know, from IoT-type devices

29:09 or, you know, that sort of thing, then I think Bokeh has been useful. There's certainly

29:13 people who have come looking for support on our Discourse or on our other support forums,

29:16 talking about using Bokeh, connecting it to real time sensors, for instance. Right. So they're in a

29:21 factory or a warehouse and they've got data coming in. I want to visualize something. So people do do

29:24 that. So if you mean, you know, quote unquote, real time, and just in the sense that I've got data

29:28 coming in and I want to visualize it in a best-effort kind of way, then I think Bokeh is definitely a

29:32 good choice.

29:33 Yeah. And that's mostly what I meant. I mean, obviously not like we need seven millisecond response time or else

29:37 the plane crashes, something like that. Nothing like that. Right. But like, like when you're

29:42 thinking of graphs, right, like a human has to see the graph and interpret the graph. Right. So that's,

29:46 you know, how long is that? It can't be quicker than a hundred milliseconds. Right. Like the human

29:51 can't understand graphs that quick.

29:52 Well, and so real time is not even necessarily about quickness. It's really about predictability. It's

29:56 about specific kinds of guarantees. But, but yeah, so I should mention, yeah, Bokeh does have some APIs

30:00 for streaming specifically. Right. So, you know, if you've got data coming in or you want to update,

30:04 you know, just the newest point, you've got a, you know, a time series with a hundred thousand points,

30:08 you know, plotted and you've got new points coming in at the end. Bokeh can very efficiently just send

30:13 the new data, right. Without sending, you know, the entire data set. So it is very useful for that sort

30:16 of thing. Yeah. So then you load up the historical data and then you just take, you know, the update

30:21 every half a second or something. That sounds pretty doable. Yeah, absolutely. Okay. Even more often.
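The streaming API mentioned above is the stream method on ColumnDataSource. The following is a minimal sketch; the column names, the values, and the on_new_tick helper are made up for illustration.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Historical data loaded once (placeholder values).
source = ColumnDataSource(data={"t": [0, 1, 2], "price": [100.0, 101.5, 99.8]})

plot = figure(title="Streaming sketch")
plot.line(x="t", y="price", source=source)


def on_new_tick(t, price):
    """Append one new point to the existing columns."""
    # In a Bokeh server session only the appended data is sent to the browser.
    # rollover caps how many points are kept (here the most recent 1000).
    source.stream({"t": [t], "price": [price]}, rollover=1000)


on_new_tick(3, 100.7)
```

In a Bokeh server app, a function like this would typically be scheduled with curdoc().add_periodic_callback, for example every 500 milliseconds to match the half-second update discussed here.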

30:25 Yeah, yeah, yeah. Sure. That sounds, that sounds super interesting. One of the capabilities kind of

30:30 around that, that you talked about is being able to work with like quite large data, maybe having some

30:36 on the server or something like that. And then either interpreting that or running a machine learning

30:41 model against that or something like that. Maybe tell us some of the use cases there.

30:45 Yeah. There's a couple of different ways you could use Bokeh for that. So one is that, you know,

30:49 if you have a large data set, you know, you're not going to send a billion points into your browser,

30:51 right? Your browser will just fall over. So you've got to find some ways to sort of minimize the data.

30:55 And that can be done in a variety of ways. And one of the ways is for instance,

30:57 it's downsampling. So if you have a large data set and you've got some reasonable way that makes

31:01 sense for your use case to downsample it, the Bokeh server can just do that downsampling and then show

31:05 you the subset of data that's relevant, right? So that's one way that you can use the Bokeh server

31:09 to handle sort of large data sets. Another way is, I mentioned this tool Datashader, which is

31:13 actually specifically designed for being able to make, you know, very efficacious visualizations, images of,

31:19 you know, hundreds of millions or billions of points, right? It gives you very fine control over the way,

31:22 basically a more sophisticated version of alpha compositing happens. So you can actually try to

31:25 emphasize the things you want to emphasize in a meaningful way. And so you could use Datashader

31:30 to datashade those 100 million points. And then that just produces an image and then you can send

31:35 the image to Bokeh. And so that's a very fast operation. So Datashader is sort of a form of,

31:39 you know, data compression in that sense, but it's still very interactive because, you know,

31:44 you can use the events that Bokeh generates, if you resize the plot, to get a new image

31:48 generated. If you've, for instance, changed the bounds of the plot, if you pan or zoom,

31:52 you can get a new Datashader image based on those new dimensions. And in fact,

31:55 HoloViews sort of does that all automatically for you. You can do it by hand with Bokeh and Datashader,

32:00 or, you know, at a high level, HoloViews can sort of take care of that for you. That's another way in

32:04 which, you know, you can sort of reduce the amount of data that you're going to send into the browser.
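The two data-reduction ideas described above can be sketched with plain NumPy. This is not Datashader's API: the second approach below uses a simple 2D histogram as a crude stand-in for the kind of aggregation Datashader performs, and the data, bin counts, and sampling step are arbitrary choices for the example.

```python
import numpy as np

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = rng.normal(size=1_000_000)

# Approach 1: naive downsampling, keeping every 1000th point.
step = 1000
sampled = ColumnDataSource(data={"x": x[::step], "y": y[::step]})
scatter_plot = figure(title="Downsampled points")
scatter_plot.scatter(x="x", y="y", source=sampled, size=3)

# Approach 2: aggregate all points into a fixed-size grid of counts and
# send only that grid to the browser as an image.
counts, x_edges, y_edges = np.histogram2d(x, y, bins=(400, 300))
image = counts.T  # rows correspond to y, columns to x

image_plot = figure(title="Rasterized points")
image_plot.image(
    image=[image],
    x=float(x_edges[0]),
    y=float(y_edges[0]),
    dw=float(x_edges[-1] - x_edges[0]),
    dh=float(y_edges[-1] - y_edges[0]),
    palette="Viridis256",
)
```

Either way the browser receives a bounded amount of data (1,000 points or a 400 by 300 grid) regardless of how many points exist on the Python side.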

32:07 Coming up next year, though, I hope we can actually raise the ceiling of the number of points you can

32:11 send. We're going to try to do some work, hopefully, to better improve the WebGL support in Bokeh,

32:16 and maybe even just have Bokeh be based entirely on WebGL. In which case, I think we could,

32:20 you know, right now, Bokeh, you could send a few hundred thousand points to it,

32:23 and it's typically okay. But I think we could raise that sort of ceiling a little bit higher

32:27 once we are able to render completely in WebGL.

32:29 Yeah, that would be pretty amazing. Is it using like canvases or something now?

32:33 Yeah, exactly. So it uses the HTML canvas. There is currently some level of WebGL support.

32:37 And the person who maintained that and originally wrote that is just, he's moved on to other things.

32:40 And so that WebGL support is sort of, it needs a little work, a little love and care. And we'll

32:46 probably just go ahead and try to do things sort of from the beginning and re-found that in a cleaner,

32:51 better way going forward. But yeah, there's some WebGL support now, but most of the rendering happens

32:55 on HTML canvas.

32:56 Okay. So let's talk a little bit about some of the internal implementations of this. Like,

33:02 when most people interact with Bokeh, they're probably interacting with some Python API. And as far

33:10 as they're concerned, like that's the end of it, right? Like, I call these functions,

33:13 the plot comes up magic.

33:14 Yeah. And actually the API is actually quite light. So by and large, we've turned the problem

33:19 of creating an interactive data visualization web app into the Python problem of creating a bunch of

33:24 objects and setting their properties, right? So, you know, I mentioned this JSON representation and

33:28 that JSON representation actually mirrors on both sides, a set of objects and those objects being a

33:32 graph. So there's a set of objects like a plot, which has, you know, a bunch of renderers and has a

33:36 couple of axes and some ranges and some data sources. Maybe that's in a layout that also has some

33:40 buttons. So there's objects we call the models that represent all of those items. All of those get

33:45 turned into JSON. And then on the JavaScript side, there's a one-to-one correspondence basically of

33:49 objects that those get turned into JavaScript objects. The role of the Bokeh server is just to

33:53 keep those two sets of objects in sync bidirectionally, right? But in terms of what you use from Python,

33:57 you create this plot, maybe use the figure function, which sort of puts a lot of these objects

34:01 together for you in a convenient, meaningful way. And then you can twiddle their properties. You can

34:06 change, you know, the start and end of a range, or you can add the data to the data source, or you can

34:10 change various properties of a circle glyph because you want to change how it appears. And so all of

34:15 this is, you know, just setting these properties on these Bokeh models. And that is the main, the main

34:20 thing that people do, I think. Apart from that, you might be writing callbacks. If you're using the

34:24 Bokeh server, you can write callbacks in Python, you know, for if a button gets clicked or a selection

34:28 is made, you can run Python code. But you can also create JavaScript callbacks for the standalone

34:32 case where, you know, I don't have a Bokeh server, but I still want something to happen when a button

34:35 gets clicked or, you know, the selection is made. You can write a little snippet of JavaScript and that

34:39 will, you know, do that amount of work. And typically those callbacks, the end result of that is, again,

34:43 setting some properties on these objects, right? They might update the data source, which causes the

34:47 plot to update, or they might change the range bounds, which causes the plot to zoom out, that sort of

34:51 thing. So there is definitely an API. There are functions, you know, there are functions for embedding,

34:55 there are functions for showing things in notebooks, there are functions for creating plots to start with.

34:59 But most of the content of the Bokeh library is these objects, you know, we call models, and they all have

35:04 these typed properties that you can set values for. And that's the main interaction mode.
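The "create objects and set their properties" model described above might look like the following sketch. The data values, the button label, and the JavaScript snippet are invented for the example; the calls themselves (figure, scatter, CustomJS, js_on_event) are regular Bokeh API.

```python
from bokeh.layouts import column
from bokeh.models import Button, ColumnDataSource, CustomJS
from bokeh.plotting import figure

source = ColumnDataSource(data={"x": [1, 2, 3], "y": [4, 6, 5]})

# figure() assembles the plot, axes, ranges, and tools for us.
plot = figure(x_range=(0, 10), y_range=(0, 10))
renderer = plot.scatter(x="x", y="y", source=source, size=10)

# Everything after that is setting properties on models.
plot.x_range.end = 20  # change the end of a range
renderer.glyph.size = 15  # change how the glyph appears
renderer.glyph.fill_color = "firebrick"
source.data = {"x": [1, 2, 3, 4], "y": [4, 6, 5, 7]}  # replace the data

# Standalone interactivity: a JavaScript snippet that runs in the browser
# when the button is clicked and, again, just sets a property.
button = Button(label="Zoom out")
button.js_on_event(
    "button_click",
    CustomJS(args={"x_range": plot.x_range}, code="x_range.end = 100"),
)

layout = column(button, plot)
```

With a Bokeh server, the same button could instead call a Python function through button.on_click, and that function would usually also just set properties on these models.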

35:09 Yeah, so very declarative in that sense, right? You set the aspects or the features that you want,

35:15 and it just figures out how to make that interactive.

35:18 Yeah, exactly. Yeah.

35:19 Nice. So it sounds to me like, listening to you talk, there's a lot going on with JavaScript

35:24 here, even though the typical consumer user of it, the developer doesn't have to care or work with it.

35:31 What are you using there? Like, what was the history? Was that always just straight JavaScript?

35:35 Or what's what are you doing?

35:36 Yeah, it's actually never been straight JavaScript. But you're right, the bulk of the work of Bokeh is

35:40 actually in this library, BokehJS, right, which is a JavaScript library.

35:42 How big is it?

35:43 So minified, I think the main core library is about 600k. It's a pretty hefty library.

35:48 That's a pretty hefty library.

35:50 It is, right. We're looking to make things, you know, as optimized as we can. We definitely could

35:53 use help from, you know, more experienced JavaScript developers. So when Bokeh started, I mean, it was

35:58 started by me and a few other folks, and none of us, I think, had a lot of front end

36:02 experience. I didn't have any JavaScript experience when this project started. And so we actually chose

36:05 CoffeeScript at the time. And so that, I think, was maybe a good choice for the time, because it

36:10 allowed us to iterate very quickly and sort of make, you know, mistakes more quickly, I guess.

36:13 You try out things, you know, it's sort of Python looking like, you know, it's one of these

36:17 transpiled languages that turns into JavaScript. But ultimately, once the project grew very large,

36:22 it wasn't really suitable for that. And so we actually did a large effort. Most of that work

36:26 was done. Heavy lifting was done by one of our core contributors, Mateusz, to port Bokeh to TypeScript.

36:31 And that's been a huge win for the project. I mean, just in doing the port to TypeScript,

36:34 a lot of latent bugs and problems were uncovered. Certainly since it's been done, you know,

36:39 I've been prevented from checking in things that would have been an error, you know, by the TypeScript

36:42 compiler. So I'm a big fan of that and glad for that. There are certainly new contributors who find it a

36:47 little bit more difficult or daunting sometimes to work with TypeScript so that there is a barrier

36:50 to entry that's a little bit high for Bokeh. And that's actually just in general, been a problem

36:55 for us, I think, to attract sort of contributors on that side, right? Because Bokeh is targeted towards,

37:00 you know, Python developers with the promise that they really don't have to worry about JavaScript

37:03 if they don't want to. But all the work's actually in JavaScript. And so, you know,

37:07 we need JavaScript developers to come help make Bokeh better. And so for the most part,

37:11 and so that's been a challenge for us a little bit.

37:13 You have this bimodal distribution of skills and desires and stuff like the Python folks and the

37:19 JavaScript folks. And yeah, it's interesting.

37:21 So we're trying to make Bokeh itself like a, you know, there are people who use the Bokeh

37:24 JavaScript library just by itself as a JavaScript library. I would say that from my perspective,

37:28 quite a bit of work is needed to make that a serious sort of contender for something people

37:32 want to use. But we definitely would like to get that done. And we'd love to get help doing that.

37:35 I think making BokehJS sort of a first class JavaScript library in its own right would be very

37:39 helpful for our project. And certainly it'd be great to get a community around that as well.

37:43 But that's a longer term goal.

37:47 This portion of Talk Python To Me is brought to you by Linode. Are you looking for hosting that's

37:52 fast, simple, and incredibly affordable? Well, look past that bookstore and check out Linode at

37:57 talkpython.fm/Linode. That's L-I-N-O-D-E. Plans start at just $5 a month for a dedicated server

38:04 with a gig of RAM. They have 10 data centers across the globe. So no matter where you are or where your

38:09 users are, there's a data center for you. Whether you want to run a Python web app, host a private Git

38:14 server, or just a file server, you'll get native SSDs on all the machines, a newly upgraded 200

38:20 gigabit network, 24-7 friendly support, even on holidays, and a seven-day money-back guarantee.

38:26 Need a little help with your infrastructure? They even offer professional services to help you with

38:30 architecture, migrations, and more. Do you want a dedicated server for free for the next four months?

38:35 Just visit talkpython.fm/Linode.

38:38 It's interesting that you found TypeScript to be a nice way of working and whatnot. And I find it to be

38:46 pretty nice as well. Certainly, if I had to choose between CoffeeScript and TypeScript, I would

38:50 definitely choose TypeScript. You know, I think TypeScript is interesting in that it's a superset of

38:55 JavaScript. So all your regular JavaScript just works, but you can like typify it and make it have other

39:00 features and capabilities that that language brings. And that's a pretty interesting way to approach that

39:07 problem.

39:07 Oh, definitely. Yeah. And to be clear, I think CoffeeScript was the right choice in 2012. I don't,

39:11 it's not at all the right choice for anything, I don't think, in 2019. I think at the time,

39:15 Bokeh was one of the largest CoffeeScript libraries probably ever developed, which is interesting,

39:20 sort of a bit of trivia. But like I said, it let us move fast, especially not having a lot of

39:24 experience in front-end dev. But, you know, after time, we just needed something more,

39:29 a little more serious, for lack of a better word.

39:30 Yeah, sure. I'd just like to get your thoughts real quick. Like, so TypeScript is all about,

39:35 you know, sort of static typing and checking and whatnot of your code. And we kind of have that

39:42 in Python a little bit now, to the extent that people want to bring it in with mypy and type

39:47 annotations. But it's not really the main zen of the language of Python.

39:52 What are your thoughts of like working in these two languages, kind of side by side on the same

39:55 project?

39:55 Well, so this is a really interesting question. So there is actually a history of various projects

39:59 that add what's called, I think, manifest typing to Python. And so that goes back to,

40:03 there's definitely a project called Traits that David Morrill created that was, you know,

40:07 sort of, you could add types to classes, and those would get checked at runtime. And you could also

40:11 do things like reactive programming and event-based programming based off changes to those values.

40:15 Traits auto-created like Qt GUIs, I think, from classes as well, the panels.

40:20 And there's another one called Param. And I think there's now one called Struct. But Bokeh

40:24 also has its own property system. I mentioned these properties of models. Bokeh has its own

40:28 property system, which is rooted in a bunch of fun metaclass programming that lets you add these

40:33 declarative types. So the actual models I mentioned for Bokeh objects typically have no code in them.

40:38 They're just classes with these property definitions that say, oh, you know, my plot width is an int,

40:44 or my source property is an instance of a column data source, or the range has two floating point

40:50 values start and end. And so we're able to provide runtime feedback. If people try to set, you know,

40:55 the range.start equals some string value, we say, hey, that's not an appropriate value. It needs to

40:59 be an integer. And we also know what properties are on objects. So a feature people have really

41:03 complimented us about is, if people sort of fat-finger a property name, we'll actually give a suggestion

41:07 and say the nearest property names are named this. And so we-

41:10 Oh, that's nice.

41:10 Kind of a type system. Yeah, a really nice feature. It's sort of one of those simple things

41:13 you don't think about until you see it. But we've had a type system in Bokeh since the beginning,

41:17 right? And so it's a little interesting now that mypy is becoming more popular. We are interested in

41:22 looking to use mypy basically after Bokeh 2.0 comes out and we drop Python 2 support. We're

41:28 interested in trying to integrate mypy, you know, wherever we can. I think it's a useful tool.

41:31 I hadn't used it much until recently, but I have seen it used to good effect. And so I'd like to try to

41:35 improve that. I don't know how much we'll be able to use mypy to replace our existing

41:39 sort of type property system because that would be a huge endeavor because our properties,

41:43 they aren't just the type checking. They also plug into our documentation system so we can auto

41:47 generate our reference documentation. Wow.

41:49 And of course, all the auto synchronization is based on this too, right? A lot of the machinery for the

41:53 automatic synchronization and serialization is based off these property definitions, right?

41:57 Right. Like to notify that something has changed to people who are interested and things like that.

42:02 Yeah. So replacing it with mypy is not something I'm sure we can do for the properties,

42:06 but there's plenty of other places in the library where mypy would be a great benefit

42:10 to help us sort of tighten things up. And so we're looking at that after Bokeh 2.0.
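The runtime feedback from Bokeh's property system, described a little earlier, can be seen with a small sketch like the one below. It uses the Range1d model; the invalid value and the misspelled name "strat" are deliberate, and the exact wording of the error messages may differ between Bokeh versions.

```python
from bokeh.models import Range1d

view = Range1d(start=0, end=10)

type_error_message = None
typo_message = None

# Assigning a value of the wrong type is rejected at runtime.
try:
    view.start = "not a number"
except ValueError as error:
    type_error_message = str(error)

# A misspelled property name is rejected too, and the message points to
# property names that do exist on the model.
try:
    view.strat = 5
except AttributeError as error:
    typo_message = str(error)

print(type_error_message)
print(typo_message)
```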

42:13 Okay. Yeah. Yeah. Very cool. Maybe we could do a quick tour of some of the interesting graphs

42:18 or visualizations that you find, you know, like kind of interesting and worth talking about,

42:23 like over at demo.bokeh.org or just bokeh.org and just click on the gallery and demos and stuff.

42:29 There's a bunch of cool ones that has the source code. There's some interactive bits and so on.

42:33 You want to tell us about something you think are worth checking out?

42:35 Yeah. So for sure. So first off, if you go to demo.bokeh.org, these are all specifically

42:39 Bokeh server applications. So these are all backed by a running Python process. And when you click a

42:44 button or make a selection, that triggers real Python code. If you go to the gallery on the docs,

42:48 most of those are standalone. And so they don't, they aren't backed by a Bokeh server just to get

42:52 that distinction out of the way. But at demo.bokeh.org, there's a couple of interesting ones here.

42:55 The first one on the upper left is this movie data explorer. And this is actually a fairly direct

43:00 comparison, intentional on our part, to a tool called the Shiny Movie Explorer. So people have

43:05 asked for a long time, where, you know, where is Shiny for Python? So Shiny is this tool for creating

43:08 sort of interactive data visualization applications from the R language. People ask, where is Shiny for

43:13 Python? So we're trying to answer that question. And I think Bokeh is a pretty good, decent answer to

43:16 the question of where is Shiny for Python. But so we made that as a pretty direct comparison. So that's

43:20 one that's interesting. Right next to it, there's this selection histogram, which I think is pretty

43:23 cool. So it's got a couple of distributions of scatter points on a plot. And if you make a

43:27 selection, it shows the histograms on both axes. And if you make a selection across those points of,

43:31 you know, a subset of those points, it then highlights and shows you the histogram of just

43:35 the selected points. And then sort of in the opposite direction, it shows the histogram of

43:39 the unselected points and sort of a shadow faded out version. Wow. Yeah, that one's really cool.

43:43 That's a cute one. We've been working on that one for quite a while. It's gone through several

43:46 iterations that actually helped us uncover some problems with the Bokeh server early on. It was just sort

43:51 of behaving in a weird way and stuttering and realized that events were sort of boomeranging,

43:54 sort of making a boomerang effect. And so we had to sort of sort that out. But that was a great example

43:59 to help us figure out some of those problems. And we have a lot more things under a lot more rigorous

44:02 tests now. So that's good. But yeah, I like that example a lot. Another one we have is this

44:07 reproduction of the Gapminder demo. So, you know, Hans Rosling did this, you know,

44:12 famous TED talk where he showed all this data. And so we've reproduced that in Bokeh. We've also

44:16 embedded the YouTube video. We wanted to be able to show being able to use, you know,

44:19 a template to embed Bokeh content in a template with other content. So this also has this YouTube

44:23 video embedded.
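For a sense of what "a selection triggers real Python code" means in a Bokeh server app, here is a simplified sketch in the spirit of the selection histogram demo described above. It is not the demo's actual source: the data, the bin edges, and the update_histogram function are made up for illustration.

```python
import numpy as np

from bokeh.io import curdoc
from bokeh.layouts import row
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

x = np.array([0.5, 1.5, 1.7, 2.5, 3.2, 3.8])
y = np.array([1.0, 2.0, 1.5, 3.0, 2.5, 3.5])
edges = np.linspace(0, 4, 5)  # four bins of width 1

points = ColumnDataSource(data={"x": x, "y": y})
hist = ColumnDataSource(
    data={"left": edges[:-1], "right": edges[1:], "top": np.zeros(4)}
)

scatter = figure(tools="box_select,lasso_select,reset")
scatter.scatter(x="x", y="y", source=points, size=10)

bars = figure()
bars.quad(left="left", right="right", bottom=0, top="top", source=hist)


def update_histogram(attr, old, new):
    """Recompute the x histogram for the selected point indices in `new`."""
    counts, _ = np.histogram(x[new], bins=edges)
    hist.data = {"left": edges[:-1], "right": edges[1:], "top": counts}


# Under `bokeh serve`, this Python function runs when the selection changes.
points.selected.on_change("indices", update_histogram)

curdoc().add_root(row(scatter, bars))
```

Saved as a script and started with bokeh serve, the histogram would update whenever points are selected with the box or lasso tool.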

44:23 Yeah, that talk by Hans Rosling, you have the video there. It's really worth watching. Like

44:30 that guy really makes statistics and just data like relevant for humanity in a great way.

44:37 Absolutely. No, I'd recommend anyone to go watch the video, regardless of whether they look at the

44:42 Bokeh part. It's a great video. And I think it's a really compelling one. Tells a great story. So

44:46 I'd recommend anyone to go check that out. For sure. Let's see, lower left, there's actually a

44:50 financial chart. So here you can have time series from two sort of financial, you know,

44:54 data sets and you can do sort of a cross correlation between them. And you can see the

44:58 pandas sort of statistical summary there. And you can, you know, use the dropdown to choose

45:02 different time series and then the table updates and the data, you know, the plots update. So that's

45:06 a nice one as well. And then on the bottom right, there's kind of interesting one that's got this 3D

45:10 plot. And this is maybe confusing for some people. Bokeh itself is not a 3D plotting library and has

45:14 no inherent 3D capability built in. But Bokeh is very extensible. At some point, we realized that,

45:19 you know, lots of users have use cases that are eminently reasonable and, you know, really cool

45:24 that we're just not ever going to have the capability or resources to sort of do in the library. I mean,

45:28 you have to sort of limit the scope of the core library at some point. Yeah. So we work to make

45:32 Bokeh extensible. And so you can create these custom extensions that behave just like built-in

45:36 Bokeh models. And they plug in just like, you know, the plot object or a widget object, right,

45:41 into Bokeh content. And so this is an example of that. And so this is a custom extension that wraps

45:46 a little 3D JS library. And you use the standard Bokeh data sources and you update them. And then

45:51 this little 3D plot updates because basically the custom extension just wires together the Bokeh data

45:55 source with whatever this other library expects. And so it's a really neat example of that. And there's

46:00 other examples of extensions in the docs as well for different kinds of use cases. If you want to like,

46:04 if you have some really cool JavaScript widget, you want to connect to Bokeh content. If you actually,

46:07 if you have a cool JavaScript widget that you want to connect to, you know, all these PyData tools,

46:12 like you want to connect this JavaScript widget to, you know, scikit-learn or to Dask or to Numba or,

46:16 you know, pandas, Bokeh is a great bridge for that, right? You can just write a custom extension that wraps

46:21 the JavaScript component and then the Bokeh server can automatically connect it to all those tools.

46:25 Yeah. And get all the change notification and interactivity and everything. Yeah.
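
A rough, plain-Python sketch of the "wiring" idea described above, where an extension translates a data source into whatever a third-party widget expects and forwards change notifications. Every name here (`DataSource`, `Surface3D`, `connect`) is hypothetical and chosen for illustration; none of it is Bokeh's actual extension API, which also involves a JavaScript/TypeScript side.

```python
class DataSource:
    """Hypothetical stand-in for a column-oriented data source (not Bokeh's class)."""

    def __init__(self, **columns):
        self.data = dict(columns)
        self._callbacks = []

    def on_change(self, callback):
        """Register a function to call whenever the data changes."""
        self._callbacks.append(callback)

    def update(self, **columns):
        """Replace the given columns and notify every registered callback."""
        self.data.update(columns)
        for callback in self._callbacks:
            callback(self.data)


class Surface3D:
    """Hypothetical third-party widget that wants a list of (x, y, z) tuples."""

    def __init__(self):
        self.points = []

    def set_points(self, points):
        self.points = list(points)


def connect(source, widget):
    """The 'extension': translate the source's columns into the widget's format."""

    def push(data):
        widget.set_points(zip(data["x"], data["y"], data["z"]))

    source.on_change(push)
    push(source.data)  # send the initial data once


source = DataSource(x=[0, 1], y=[0, 1], z=[5, 6])
widget = Surface3D()
connect(source, widget)
initial_points = widget.points

source.update(z=[7, 8])  # the widget follows the change without further code
updated_points = widget.points
print(initial_points, updated_points)
```

The point of the design is that code updating the data source never needs to know the widget exists; only the small adapter does.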

46:29 That's super cool. Okay. Let's see. What else do you want to talk about? You all have Bokeh 2.0 on the

46:35 roadmap. What's going on with that? Yeah, absolutely. I would say we were targeting August,

46:39 but I think maybe a little more realistic at this point is September. We're always a little optimistic

46:43 in our estimates for our schedule. Welcome to software development, right? That's how it goes.

46:48 We're all like that way. I don't even want to speculate the first time we promised Bokeh 1.0 and

46:52 sort of stability. That was probably a couple of years too early, but we're a little more on track

46:57 for Bokeh 2.0. But the main thing about Bokeh 2.0 is just that we are dropping Python 2 support and

47:02 also Python 3.4 support. So Python 3.5 will be the minimum version. As long as we are doing a major

47:06 version bump, we're also going to take the time to clean up a few other minor things. So there's a few

47:10 minor changes that are coming. Hopefully nothing that's too disruptive for anyone. We're going to

47:14 be sure to outline and document all those in a migration guide. But that's the main thing is the

47:18 Python 2 support. And it gives us a chance to do some things like move to native coroutines. So we use

47:23 Tornado as the base for the Bokeh server. But if we move to Python 3.5 as a base, we can use native async and

47:29 await coroutines everywhere. It's still with Tornado, but it helps us clean up the code a lot. And just

47:33 in general, it'll help us clean up the code base and make it a lot more maintainable and sort of

47:36 shrink it. And it's always good to delete and shrink code for sure.
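
For readers unfamiliar with the term, here is a minimal sketch of native coroutines using only the standard library's `asyncio`. It is not Bokeh or Tornado code; `fetch_value` and the delays are made up for illustration. The `async`/`await` syntax arrived in Python 3.5, while `asyncio.run` as used here assumes Python 3.7 or later.

```python
import asyncio


async def fetch_value(delay, value):
    # A native coroutine: declared with "async def" and suspended with "await".
    await asyncio.sleep(delay)
    return value


async def main():
    # Run both coroutines concurrently; gather returns results in argument order.
    return await asyncio.gather(
        fetch_value(0.02, "plot"),
        fetch_value(0.01, "table"),
    )


results = asyncio.run(main())
print(results)
```

Before native coroutines, the same flow was typically written with generator-based coroutines and decorators, which is the kind of code that dropping older Python versions lets a project delete.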

47:39 Yeah. If you maintain it by deleting it, like you're good. Yeah, that's a good way to do it.

47:44 Do you think that'll help attract more maintainers to say like, hey, you could work on this cool asyncio,

47:48 async and await library rather than, you know, this thing called Tornado and these

47:53 coroutines?

47:54 Well, so it's still going to use Tornado. And Tornado is a really great tool, but I think it may.

47:57 It broadens the thing a little bit to hopefully some more developers. And there are, I stress that

48:02 there's a lot of work in BokehJS, but there's plenty of work on the Python side to do as well. And we'd

48:06 love to have contributors. And honestly, there's actually a lot of work that's not coding. I'd love

48:10 to get other contributors involved in all kinds of ways. And if I can speak a minute about that, I mean,

48:14 yeah, go for it.

48:14 Obviously, people talk about, hey, we need testing help and docs help,

48:18 and that's certainly true for us as well. But other maybe ways people don't think

48:22 about it is, you know, we'd love to get like designers, front end designers to come help

48:25 make our assets better, to come, you know, help us improve the visual appearance of Bokeh. Cause

48:29 you know, we've done okay, but we're not designers. And so it'd be great to get that kind of help.

48:34 We actually have a lot of infrastructure now on places like DigitalOcean and AWS, and it would be

48:39 great to get experienced people that know those systems and DevOps on those systems to

48:44 come help us optimize them for cost, optimize them for usage, you know, whatever.

48:47 Yeah. You guys are doing cool stuff with Docker, right?

48:50 Yeah, we do a couple of things with Docker. So the demo site is actually a Docker image that's

48:53 run on Elastic Beanstalk. And I actually just recently changed some of the instances that that

48:57 was running on to hopefully make them a little bit more cost effective for us. But we also just

49:01 recently had a spike in S3 usage on one of our buckets that I couldn't really explain just yet.

49:06 And so I'd love to get experienced people that can, you know, help with those sorts of things.

49:09 Outreach is another area. We're really trying to ramp up our outreach, both to the community in terms of,

49:13 you know, fundraising, but also talking to companies. And we've had a couple of people help with that.

49:17 And actually just offering support, right? We just moved our mailing list to a Discourse instance,

49:22 discourse.bokeh.org, which is infinitely better. I mean, Discourse is great for users because

49:27 there's a lot of features for code highlighting, for math text, just, you know, all kinds of things.

49:31 We could imagine maybe writing an extension to put actual Bokeh content into these Discourse posts.

49:35 But it's also great for us as maintainers because Discourse has a lot of information about what are

49:39 people searching for, you know, what topics are popular, that sort of thing. So that helps us know maybe

49:43 where attention needs to go. But just answering questions there, if people want to go offer support

49:47 and help other people use Bokeh. That is also a huge deal. I think Bokeh has been successful because

49:52 we've had a few people that have been able to put a lot of time into helping, you know, the community.

49:55 But as the community grows, that's got to scale. It needs to have more and more people helping each

49:59 other. And so that kind of thing would also be a great way to contribute to the project. And so

50:03 there's all kinds of ways people can plug in. And we'd love to, you know, engage with anyone,

50:07 really, about any of those tasks.

50:08 Right, right. If you're a designer and you want to make the website look shiny,

50:12 that'd be great. If you want to make the graphs look better, or maybe you're a visualization

50:17 expert and you've got a different kind of graph you want to bring, whatever, right?

50:20 Yeah, absolutely. Or just even making new examples for the docs, you know, making really cool uses of

50:25 Bokeh to show off, to tweet about, to put in our docs and our gallery. I mean, there's all kinds of

50:29 ways to make very valuable contributions to the project just because there's a lot of things to do. And

50:33 you know, presently not enough people to do them, probably never enough people to do them,

50:36 right. But obviously, the more help we can get, the better.

50:39 Sure. So if I could summarize, you're willing to accept contributors to the project.

50:43 Yeah, absolutely. If I hadn't made that clear, yes.

50:46 That's awesome. Yeah, it's a cool project. It would be fun to work on.

50:50 Yeah.

50:50 As part of this Bokeh 2.0 thing and the dropping of Python 2, which I like to refer to as legacy Python,

50:56 and Python 3 just as straight Python. But as part of dropping legacy Python, one of the things you did,

51:03 this is kind of a trend in the data science space, I haven't seen it as broadly adopted elsewhere,

51:06 and I'm not really sure why, you signed the Python 3 statement. You want to tell folks about that?

51:12 Yeah. So the Python 3 statement is just, you know, it's a GitHub repository where projects can go and

51:16 sort of make a PR to list themselves on this website. And it says, we're going to, we pledge to drop

51:20 Python 2 support, you know, by sort of this date or this timeframe, and support Python 3 going forward.

51:25 And so there are a lot of projects that have signed that. And it's interesting, I thought going in that

51:29 Bokeh was going to be maybe kind of a leader in this, I wanted to be fairly aggressive. But

51:33 all of a sudden, this year, a ton of projects have started releasing, you know, new releases that sort

51:37 of are cut off from that. And so like things like I think Matplotlib, and yeah, and I think,

51:42 you know, Dask, maybe and I forget what else, but there's all these sort of big projects that are

51:46 just suddenly, you know, we're now behind the curve, right? They've already dropped Python 2 support,

51:50 and we're sort of lagging behind. But I think it's time. I mean, Bokeh is definitely used a lot in,

51:54 you know, analytics space. And I think things do move a little bit faster there.

51:57 Part of that is because of Conda and Anaconda and conda-forge, they sort of push things forward.

52:02 I think also data scientists, you know, do a lot of exploratory work, and they're willing to sort of

52:06 move a little bit, you know, in that exploratory work, they're willing to sort of move and put up

52:09 with a little bit more change to get new features and to get better performance.

52:14 Once things get deployed, that's when things get a bit more sticky. And that's where you see a lot of

52:17 people still using Python 2 in, you know, finance and, you know, other venues like that.

52:21 Yeah, absolutely. I feel like with the data science exploration stuff and the models,

52:25 the underlying technology is changing so quick there, right? Like TensorFlow has come out and

52:31 pandas and all these things are just changing so quickly that if you're going to come back to it,

52:36 you may want to just move to something new or shiny or better anyway. And it's much easier to

52:41 stay on the later version of Python, whereas that website that the guy who used to

52:46 work here ran, that now we just have to keep running, nobody wants to touch that, right?

52:51 As soon as you touch it, it's your problem to fix it if it ever has a problem. And nobody wants that

52:55 puppy, right? Yeah, yeah. So some of the projects that signed the Python 3 statement,

53:00 that's python3statement.org: TensorFlow, requests, XGBoost,

53:06 NumPy, IPython, like that kind of stuff, right? Cython, Spyder. There's a ton of projects here.

53:12 Yeah, it's great.

53:13 A lot of those projects already have. I thought we'd be sort of leading the pack, but we're actually

53:17 behind the curve. And that's what made it very easy for us to say, okay, you know,

53:21 Q4, it's going to be really easy for us to drop Python 2 because all these other projects will

53:24 have already dropped Python 2.

53:26 Right, right. For example, Tornado is in there and you guys are built on that. So in a sense,

53:29 they're kind of calling your, not calling your bluff, but making sure you're going to have to

53:32 follow along anyway if you want to stay in the latest of that, right?

53:35 Yeah, even NumPy, right? I mean, you know, obviously, we could pin to a lower version of NumPy, but we don't want to do that.

53:40 Yeah, of course, you wouldn't want to do that. Interesting. So we're just about out of time,

53:43 but you want to talk about Portland real quick? Sure. Yeah. Yeah. So we're both in Portland,

53:48 right? You recently, somewhat recently, not super recently, but you're somewhat new here and you're

53:54 trying to get some stuff going in the data science space in Portland as well, right?

53:57 Yeah, absolutely. Yeah. So I've been here about a year and a half and it's been a really great

54:00 experience being here in Portland. I really love it. But I am trying to get a PyData meetup here

54:06 started. In fact, we have a first meetup scheduled for, I think, August 14th. And we're going to

54:11 alternate sort of between an east side and a west side, you know, downtown location, hopefully,

54:15 every other month. But me and a colleague of mine are getting that off the ground. And I'm really

54:20 excited about it. So PyData is this series of meetups slash conferences, if the meetups get big

54:24 enough, that is sort of sponsored by NumFOCUS. And, you know, I've been involved with NumFOCUS since the

54:28 beginning. I think it's a terrific, amazing organization. The people that are there are really

54:31 great. And I think the PyData meetups in particular have been really, really great, you know,

54:35 both the meetups and the PyData conferences are really good as well. So really excited to get that

54:39 started in Portland. I was almost kind of surprised that it wasn't here already. There's a, you know,

54:42 there's a PyData Seattle meetup. There's like 105 PyData meetups,

54:47 I think, around the world. So it was far past time that Portland gets one. So I'm really excited to

54:51 be helping get that off the ground. Yeah, that's awesome. I mean, that only really is interesting

54:55 to like 5% of the listeners, maybe. But it's still really cool that you're doing that here in

54:59 Portland. And, you know, other folks, they can create a PyData plus their city's airport acronym if

55:05 they want, right? Absolutely. I think most places don't use the airport acronym,

55:08 but I'm really fond of PDX. So I think PyData PDX is a good one.

55:11 Yeah, it's definitely a good one. All right. Well, you know, Bokeh is a really cool project,

55:15 and I'm glad you all have been working on it. And it's great to see all this progress and excitement

55:21 around it. It's a great, great one. So people should definitely check it out.

55:24 Yeah, yeah. Well, thank you very much for having me. I love to have the opportunity to sort of spread

55:27 the word and talk about Bokeh. And it's been really great.

55:30 Yeah, absolutely. Let me ask you the final two questions before you get out of here, though.

55:33 If you're going to write some code, probably Python code, but maybe JavaScript as well,

55:37 I guess. What editor do you use?

55:39 I have been won over by VS Code.

55:40 Okay.

55:41 Yeah, I still use Vi bindings. So I grew up using Vi and that's still in my fingers. And so I love

55:45 Vi bindings. But I used to use Sublime Text, but I moved to VS Code and haven't looked back.

55:49 Yeah, that seems a pretty straightforward choice to go from Sublime to VS Code.

55:53 Those are, you know, one has so much more energy and they're super similar in their sort of workflow.

55:58 And then notable PyPI package, I'll go ahead and throw out there Bokeh for you. And what else? If

56:04 there's something like, hey, I ran across this and people might not know about it, but it's really

56:08 amazing. What would you say?

56:09 Yeah, let's see. Well, I'll say PyPI or Conda, right? So don't forget Conda. Very important

56:13 to remember. But, you know, it's hard to say. I'm so focused on, you know, using and working on Bokeh.

56:21 That's like my day to day. Honestly, I have a little bit of tunnel vision, maybe to put it one

56:25 way. But, you know, I think a lot of the tools that are built on top of Bokeh are really interesting to

56:29 me. And so I like, you know, looking at what is happening with them and seeing what developments

56:32 are going. Obviously, I think all the tools in the PyData ecosystem are amazing. I think Numba in

56:37 particular is really interesting. So Numba is a compiler for Python, lets you really accelerate,

56:41 you know, certain kinds of code. And it was originally created again by Travis, you know,

56:45 Oliphant, but it's moved on since then. And it's actually grown really successful in certain

56:50 kinds of venues. So I think Numba is a pretty interesting use case. And I certainly, of course,

56:53 think Dask is really fantastic as well. Yeah, those are definitely good ones. All right,

56:57 Bryan, final call to action. People want to get started with Bokeh. What do they do?

57:01 Yeah, absolutely. Love to get people involved. So if you want to, you know, talk about development or

57:06 have questions about support, we have this Discourse, discourse.bokeh.org. If you want to just

57:09 get started from a very high level, just bokeh.org is a great one stop to get to a lot of other

57:14 resources like documentation, like the gallery, like the GitHub page, and just to see what's

57:19 going on. But in terms of like talking to us, yeah, the Discourse is a great spot to make a

57:23 post or topic there. And of course, GitHub is a great place if you have ideas for suggestions,

57:27 you know, or want to report problems, of course, you know, GitHub is a great place to contact us.

57:31 Yeah, for sure. And PRs are accepted.

57:32 PRs are always accepted. Yes.

57:34 Yeah, very cool. At least considered.

57:36 Considered.

57:37 At least considered for sure.

57:38 Cool. All right. Well, thanks so much for being on the show. It's good to talk with you.

57:41 Absolutely. Thank you so much, Michael.

57:42 Yeah, bye.

57:43 Bye.

57:43 This has been another episode of Talk Python to Me. Our guest on this episode was Bryan Van de Ven,

57:50 and it's been brought to you by Ting and Linode. Ting is the fast mobile network custom built for

57:55 technical folks. Use their savings calculator to see exactly what you'd pay. Visit python.ting.com

58:01 to get a $25 credit and get started without a contract. Linode is your go-to hosting for whatever

58:08 you're building with Python. Get four months free at talkpython.fm/linode. That's L-I-N-O-D-E.

58:14 Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps

58:21 course. Or if you're looking for something more advanced, check out our new async course that digs

58:27 into all the different types of async programming you can do in Python. And of course, if you're interested

58:31 in more than one of these, be sure to check out our everything bundle. It's like a subscription

58:35 that never expires. Be sure to subscribe to the show. Open your favorite podcatcher and search for

58:41 Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google

58:46 Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. This is your host,

58:53 Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write

58:57 some Python code.

59:01 We'll see you next time.
