#222: Interactive graphs with Bokeh and Python Transcript
00:00 Do you have data you want to visualize and share? It's easy enough to make a static graph of it,
00:04 but what if you want to zoom in and highlight different sections? What if you need to rerun
00:09 your machine learning model on the selected data? Then you might want to consider working with Bokeh.
00:13 It does this and much more. Join me on this episode where you'll meet Bryan Van de Ven,
00:18 who heads up the Bokeh project. This is Talk Python to Me, episode 222, recorded July 24th, 2019.
00:25 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries,
00:43 the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter,
00:48 where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm,
00:52 and follow the show on Twitter via @talkpython.
00:55 This episode is brought to you by Ting and Linode. Please check out what they're offering
00:59 during their segments. It really helps support the show. Bryan, welcome to Talk Python.
01:03 Hi, thanks for having me.
01:04 Yeah, it's great to have you here. You know, I've often thought about ways in which I could
01:09 use some of these cool Python visualization libraries, and I haven't recently had some
01:14 great excuses to use them, so I haven't really covered them enough on the show, but I'm really
01:18 excited to talk about Bokeh with you this week.
01:19 Oh, I'm super excited to be here. I think Bokeh has really developed a lot over the last year or so
01:24 in particular, and so this is a great opportunity.
01:26 Yeah, absolutely. Before we get to it, though, let's start with your story. How'd you get into
01:29 programming in Python?
01:30 In Python? So I think the first version of Python I ever used was Python 1.4, actually way,
01:34 way back in the day, and I was doing some system administration kind of job, so there was a lot
01:39 of Perl, but I happened to get into using Python for a few things, and it was a lot of fun. Put it
01:44 down for a while, picked it up here and there, but I've been using it pretty extensively
01:47 probably since about 2005 or 2006.
01:51 Okay, yeah, those are pretty early days, Python 1, right? We don't have that debate about 1 versus
01:56 2 anymore. It's moved on to 2 versus 3.
01:58 Yeah, I don't think there was ever really much debate. Everyone was ready for Python 2 for sure.
02:03 Yeah, absolutely. So how'd you get into programming in the first place?
02:05 Let's see. The first thing I ever did was on a TRS-80 that was actually checked out from our
02:10 local library. They had a program to check out TRS-80s for two weeks, and there was a Logo cartridge
02:15 that came with it, so we could do Logo programming. A little bit later, we had some Commodore computers,
02:19 and so I did, you know, BASIC, and I think at one point I even got into like 6502 assembly,
02:23 you know, when I was getting to be a teenager or something, but yeah, you know, just 8-bit
02:28 programming way back in the day.
02:30 Yeah, how interesting. Yeah, that's funny with assembly language, like that's not a super
02:34 easy comparison. Like you've got BASIC on one side and assembly language on the other. Not
02:38 a whole lot in between, huh?
02:39 Well, there's not a lot of different ways to program on a Commodore 64.
02:44 You had to earn your programming stripes back in the early days, that's for sure.
02:48 Nice. Okay, so Bokeh is a very visual thing. For a long time, you were at Anaconda Inc.
02:55 So, is there a science background as well that got you sort of in that path, or how do you get
03:01 interested in all of these things? Yeah, I've had a pretty tortured academic path. I went to school
03:05 for computer science, then left for a while, and I worked in some research labs, and I realized,
03:09 hey, I want to go back to school, and so I actually ended up in graduate school for physics eventually,
03:12 and so I have a pretty strong, you know, mathematics, physics background, but ultimately,
03:16 I did decide to sort of go back into computer science, software engineering. I really like working
03:20 on software, though, that's in the service of analytical endeavors or science and that sort of thing,
03:24 and so this is why, you know, being able to work at Anaconda on all these tools has been really
03:28 fantastic. Yeah, it's got to be super rewarding to have so much impact on the science side. Are you
03:34 still at Anaconda? What are you doing these days? Like, what's a, what kind of programming and work
03:38 do you do day to day now? Yeah, no, I actually just recently left earlier in the year, so I was at
03:42 Anaconda from the beginning. I think I was the last original employee to leave, in fact, except for
03:46 Peter Wang, of course, who's still there, but, you know, eight years is a long time, and so it's just time for
03:51 me to go look for something different, and I actually went to go work at Microsoft, and that was really on the
03:54 strength of some interactions I'd had with folks at DevDiv and Microsoft around Python, around open
03:59 source. Everyone there has been really terrific and really supportive of Python and open source, and so I
04:03 think it's a very different company than when I, you know, thought about it 15 years ago, where I
04:08 probably would have used M dollar sign very sincerely on an angry forum post or something, but, you know,
04:12 everyone there has been really terrific. It's been a good experience, and day-to-day, I work on Azure SDK
04:16 for Python these days, which is, you know, a lot of PR reviewing, writing some code, and
04:21 helping move the direction there.
04:23 Oh, that's really interesting. You probably feel like you're bringing a little bit of the outside
04:27 to Microsoft, right? Like, it is a very different company. They're more open to external stuff, but,
04:33 you know, historically, it hasn't always been that way, so it's probably like, let me tell you about the
04:37 Python scientific stack, folks, things like that, yeah?
04:40 It's definitely interesting, and there's a lot of give and take. So I've actually learned,
04:43 I haven't been in an organization this large in a very long time, and so it's been a lot of personal growth
04:47 and learning for me, just to be in that kind of environment where, you know, people have to interact
04:51 in different ways, and that's been very gratifying and helpful for me, but definitely, I think I have a
04:55 pretty useful perspective to bring as well, especially in terms of, yeah, data science applications
04:59 in Python and that sort of thing.
05:01 Yeah, yeah, super cool. It sounds like a fun job. So let's start off this conversation talking about
05:06 Bokeh by kind of getting the, like, a big picture of, you know, making pictures with Python, right?
05:13 So if I have a graph, I want to do a map, if I want to do some kind of bar chart or some
05:18 visualization of data, what are my options nowadays?
05:23 There are a lot. So if people want to Google, there's actually a chart made by Jake VanderPlas,
05:29 who, you know, is very active in sort of the PyData, SciPy community. He tried to draw a map,
05:34 basically, of all the Python visualization landscape, and there are a lot of tools available. And some
05:38 people think this is really great, and there's a lot of choice, and then some people think that
05:42 there's just, you know, too many things, and they don't know what to deal with. But there are a lot
05:45 of tools. So obviously, you know, Matplotlib is the very big tool that's been around for a very long
05:49 time. It's a really fantastic tool, and all the devs there, you know, they work really hard. And
05:52 it's been great to see the sort of the strides that it's made in the last few years. In terms of,
05:57 like, web visualization, there's, you know, Bokeh, of course. Plotly is another offering that's out
06:02 there by the company Plotly. Altair is another tool that's been fairly recently added. Actually,
06:07 Jake VanderPlas and Brian Granger from Jupyter put that together. And so it's inspired by
06:12 the Vega plotting sort of toolkit that's available in browsers. And it's sort of a Python wrapper for
06:16 that.
06:17 Yeah, that's cool. I've heard a lot of good stuff about Altair, and that it's really quite nice as
06:21 well.
06:21 Yeah, I don't have a lot of experience with it. I mean, it's definitely intended for
06:24 very high level sort of exploratory data analysis. It's very, you know, useful, especially in notebooks
06:29 in particular. And so it looks very attractive from the things that I've seen. I just am so,
06:34 you know, I'm very involved in Bokeh. And it takes up so much of my time that I almost don't
06:38 have time to look at too many other things very often. But, you know, Jake's a fantastic guy.
06:41 Brian Granger, of course, is great and has made just amazing things for the PyData and SciPy
06:46 communities. So that's a great tool that they put together.
06:49 Cool. And, you know, there's always things like the JavaScript libraries, like D3 and stuff like
06:54 that. Is that really relevant? Or have we kind of got it handled with things like Bokeh and Plotly and so on?
06:58 You know, so people ask that a lot. Like, what's the difference between Bokeh and D3? Or why would you use
07:02 one versus the other? And I think if you have people that are, you know, already using JavaScript
07:07 and they want to work on things with D3, D3 is an amazing tool and it can make incredible, you know,
07:12 output and really fantastic graphics. And there's probably things that are doable in D3 that maybe
07:17 would be more difficult in, you know, Bokeh, for instance. But where I think the sweet spot for Bokeh
07:21 is we've really tried to make it so that people who are already very productive in Python, they're doing,
07:25 you know, work in data science or science, who are using all these tools that are in the PyData stack,
07:30 you know, NumPy and SciPy and Pandas and Scikit-learn and, you know, Dask and Numba and all
07:34 these tools that are really productive with these in Python. We want to let them have access to very
07:38 interactive, powerful visualizations in the browser without having to reach for that JavaScript
07:42 and web tech and sort of be distracted from the actual work that they want to do. And so
07:46 in terms of productivity, I think if you're already working in Python, I think Bokeh is a great
07:50 choice, to put it that way.
07:51 Yeah. Well, Bokeh to me feels like I get a lot of the benefits of the rich JavaScript stuff,
07:56 but that I don't actually have to make it.
07:59 That's a very, yeah.
08:00 Yeah. A very succinct way to put it. Yeah.
08:02 Okay. Interesting. And then maybe we could talk really quickly about Plotly just as a compare and
08:06 contrast. So Plotly, like I don't fully understand Plotly. When I go to work with it, I feel like,
08:12 okay, I'm working with a library, but then it seems like it has like a backend that they provide that
08:17 I have to deal with. And then there's also a commercial version. Like what is Plotly? I don't really
08:21 know where it fits. So in terms of their business, I actually don't know a lot. To be honest, I don't,
08:26 I don't really follow that very closely. And so I think they've actually changed some of their
08:29 offerings from, from what they used to be. I think they used to, you know, sell Plotly and I think
08:32 they're not in that business anymore, but I can't really speak to that very carefully. But the main
08:36 similarities are it's a Python API that generates a declarative, you know, specification,
08:41 typically some kind of JSON that can be rendered by a front end library. Now, you know, Plotly is an
08:46 entire company centered around this. And so they have had some really nice resources for, I think,
08:50 developing both, you know, some things in ways that Bokeh hasn't had. Like I think, you know,
08:53 their front end is a little more polished and some of the, you know, design stuff is definitely
08:56 polished. And I'd love to get some help on the Bokeh side to sort of bring that up to speed.
09:00 But, you know, relatively speaking, I think we've done a pretty good job at, you know,
09:03 having the same set of features. They're very contemporaneous. They started at almost the same time,
09:07 you know, way back in sort of 2012 kind of era. They have a lot of similarities,
09:11 but definitely there's a little bit of difference. You know, my background comes from a lot of science
09:15 stuff. So I'm really familiar with folks that have use cases, for instance, around like dense
09:18 arrays, like, you know, big images. And so we've really focused on some things like having an
09:22 efficient array protocol for the Bokeh server that can transmit large arrays, you know, very efficiently.
09:27 And whereas I don't think they've sort of gone down that route, you know, again,
09:29 they've worked on some other features that are more around slick dashboarding and, you know,
09:34 that sort of thing.
09:35 Okay. Yeah. Interesting. You know, let's talk about the history of Bokeh. Like you're saying it that way,
09:40 and I guess I'm as well, I'm trying to anyway, the way the proper pronunciation is.
09:45 I usually say Bokeh, but Bokeh, I think is also fine. Amusingly, long ago when we were funded through
09:50 this DARPA X-Data initiative, there was some video that someone made, you know, unrelated to all the
09:55 actual projects, but they made about the projects. And I remember in that they were describing all the
09:59 projects that was under the X-Data initiative. And they mentioned the visual database Boke,
10:04 I think. That's the only wrong pronunciation.
10:05 Bokeh is not Bokeh. All right.
10:07 Okay. It's fine. Yeah.
10:08 All right. Excellent. And so where did it get started? It started out of research grants and
10:14 this DARPA funding. Is that where it came from?
10:16 No, the research grants really helped, but it started before then. So going back a little bit
10:20 further. So I've been interested in visualization for a long time. Actually, one of the
10:24 first things I used in Python, in Python 1.4, was a plotting plugin for Apache. And it was just,
10:28 it amazed me that you could like take data and have a website just make a plot. It was incredible.
10:33 I worked a little bit on VTK here and there years ago. A few of my first open source contributions
10:37 were to VTK. But in the middle aughts, I guess, I went to work for a company with Peter Wang,
10:43 who eventually founded Anaconda. And I worked with him on a library called Chaco, which was a rich client
10:49 library for interactive visualization. So instead of in the browser, you would write like a Python
10:52 application that was using like, you know, Qt or, you know, GTK, you know, a kind of real
10:58 application. And it was also contemporaneous with Matplotlib.
11:03 But ultimately, Matplotlib won sort of that battle. And the reason was pretty clear. It was
11:07 because, you know, while Chaco had all this really rich capability for interactivity, it was a very sort
11:11 of fiddly API, very detailed, kind of verbose. And so later on, when Peter was getting Anaconda
11:19 started, and I was on board to help with that, we had talked about wanting to update this idea of
11:23 Chaco and create a new library that supported interactive visualizations and in browsers,
11:28 because right, browsers is the right place for it to be in 2012, right?
11:30 Of course.
11:31 And so we had this idea, we started it. But getting the, you know, the funding through the
11:35 DARPA X-Data initiative in the early years of Anaconda, which was then called Continuum Analytics,
11:39 really helped, you know, mitigate business risk for us to put more resources into it. So it really
11:42 accelerated the development, I would say. But we have certainly, you know, we started talking about
11:46 the ideas behind Bokeh in sort of middle 2011, probably.
11:49 Okay. Yeah, that's, that's pretty interesting. It's been around for a while. I guess that's,
11:53 at that point, Ajax and interactive browser stuff's pretty well established, right? So it's pretty
11:58 clear that was the right place.
11:59 Yeah, I mean, we just knew that, like, the future of presentation and the future of getting, you know,
12:03 this content in front of people was going to be in browsers. So writing another rich client library
12:06 was not something that was really interesting to us. And so we definitely wanted to do the browser.
12:11 And then we definitely wanted to make it architected in a way that it was very flexible,
12:15 that it had this declarative specification that described what you wanted to visualize.
12:19 Because, you know, that affords a lot of possibilities. So we've talked about the Python
12:21 side of Bokeh, but you can actually have other languages drive Bokeh plots in the browser,
12:25 you know, there's an R Bokeh binding, there is a Scala Bokeh binding that hasn't been updated in a while.
12:30 I'm interested in actually reviving a Julia Bokeh binding. But that's all because there's this
12:35 JSON specification. And any language that can, you know, dump out the right JSON can create these
12:39 Bokeh plots in the browser. Now, we've spent a lot of effort investing in the Python bindings,
12:42 because, you know, Anaconda is a big Python shop. But certainly the possibility is there for other
12:46 languages as well.
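To make the JSON specification idea concrete, here is a minimal sketch using Bokeh's Python API. It assumes a reasonably recent Bokeh release is installed; the plot data and the "demo-plot" target id are made up for illustration and do not come from the conversation.

```python
import json

from bokeh.embed import json_item
from bokeh.plotting import figure

# Build a small plot with the Python API (the data is illustrative only).
plot = figure(title="Declarative spec demo")
plot.line([1, 2, 3, 4], [3, 1, 4, 2], line_width=2)

# json_item() returns a plain dict describing the plot. "demo-plot" is a
# hypothetical id of the <div> that the browser would render into.
item = json_item(plot, "demo-plot")

# The dict serializes to ordinary JSON text that can be sent to a page.
payload = json.dumps(item)

print(sorted(item.keys()))
print(len(payload), "characters of JSON")
```

The resulting string is the kind of payload BokehJS renders in the browser, which is why any language able to emit equivalent JSON could, in principle, drive a plot.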
12:47 Okay, so maybe it's worth just touching on the architecture a little bit, and we can dive
12:51 into the details more later. So there's a, I guess there's a couple ways we can use this,
12:55 right? Like, there's a, probably the most straightforward way is we have a Bokeh server.
13:00 And then there's some front end stuff that is the rendering point, right? Like I want to put a,
13:06 some kind of graph in a browser, and the server handles all the data, and maybe it only
13:12 presents like a slice into that world of the data and things like that, right? Can you tell us about
13:18 that?
13:18 That's absolutely a great use case for the server. But I will say the server is,
13:21 in fact, optional. I would say most usage of Bokeh probably doesn't involve the server. So Bokeh can
13:25 generate this JSON, and it can send it to a web page that can be embedded in like a Flask app or a Django
13:30 app. It can be embedded in, you know, Jupyter notebook output cells. And it doesn't have to be
13:34 connected to the Bokeh server. Bokeh.js will take that JSON and render it. And you've got an interactive
13:39 plot that has panning and zooming. You can even have linked behaviors between plots. You can have custom JS
13:43 callbacks that, you know, do work whenever you make a selection or click a button. None of that
13:47 requires the Bokeh server. What the Bokeh server is really great for is when you want to connect all
13:51 those interactive features to real running Python code. Like you want to click a button and have a,
13:55 you know, a scikit-learn regression or a scikit-learn model run, or you want to, you know,
13:59 make a selection on a plot and compute a linear regression line through those selected points
14:03 with real Python code. That's what the Bokeh server is really great for, making that sort of two-way
14:07 connection between this front end and real running Python code. But you can use Bokeh very effectively
14:12 without the Bokeh server. And in fact, I guess, I think most usage is probably, we call it standalone
14:16 usage, where it's just generating this pile of JSON that is used to drive a Bokeh plot in a webpage
14:21 somewhere. This portion of Talk Python to Me is brought to you by Ting. Let me tell you about Ting,
14:27 a new mobile service available in the US that's targeted at developers and other technically savvy folks.
14:33 First of all, their average customer only pays $23 a month, but they're no discount provider.
14:38 Their service runs over T-Mobile's and Sprint's fast nationwide network.
14:42 If you don't use that much data because you're usually on Wi-Fi, like many of you are,
14:45 then Ting will save you a ton of cash. But don't worry, you can still use as much data as you like
14:50 for just $10 per gig. One mobile feature I use all the time is tethering. And with Ting,
14:55 you get unlimited tethering at the same data rate with your account. $6 a month for a phone line,
15:00 $10 a gig, $3 a month for text if you usually chat over iMessage or WhatsApp. Think about it,
15:06 no contracts and super clear and fair billing. Visit python.ting.com. That's python.ting.com and check
15:14 out their savings calculator. Enter your usage and see exactly what you'd pay. Use that link and you'll
15:19 get a $25 credit to try them as well. That's python.ting.com or just click the link in the show notes.
15:26 The server is mostly about the interactive bits if you want to add smarts to your plots.
15:32 Yeah, absolutely. If you want to connect all those PyData tools, NumPy, SciPy, Pandas, Dask,
15:37 Numba, OpenCV, any of those tools. If you want to connect those things directly to
15:42 these interactive visualizations with a minimal amount of fuss, that is what the Bokeh server is for.
15:47 Exactly.
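As a rough illustration of the two-way connection described here, the sketch below wires a plot selection to a NumPy line fit that runs in Python. The synthetic data, the `refit` function name, and the `app.py` filename are assumptions made for this example; it is intended to be started with `bokeh serve --show app.py`.

```python
import numpy as np

from bokeh.io import curdoc
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Synthetic data (an assumption for illustration): a noisy straight line.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

points = ColumnDataSource(data={"x": x, "y": y})
fit = ColumnDataSource(data={"x": [], "y": []})

plot = figure(title="Select points to fit a line",
              tools="box_select,lasso_select,reset")
plot.scatter("x", "y", source=points)
plot.line("x", "y", source=fit, line_width=2, color="firebrick")


def refit(attr, old, new):
    """Recompute the regression line in Python from the selected indices."""
    if len(new) < 2:
        # Not enough points for a line: clear any previous fit.
        fit.data = {"x": [], "y": []}
        return
    xs, ys = x[new], y[new]
    slope, intercept = np.polyfit(xs, ys, 1)
    ends = np.array([xs.min(), xs.max()])
    fit.data = {"x": ends, "y": slope * ends + intercept}


# The server calls refit() whenever the selection changes in the browser.
points.selected.on_change("indices", refit)

curdoc().add_root(plot)
```

The same callback could just as easily call into scikit-learn or Pandas; the point is only that the handler is ordinary Python running next to the data.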
15:47 Okay, cool. Before we move off the history and Anaconda Inc. and all that, when you created it,
15:52 it sounds like you tried to create it as a standalone project with its own fundraising and its own
15:58 outreach. What was the thinking there rather than just making it part of Anaconda?
16:02 Well, I mean, in the very early days, it was definitely a project that was started at Anaconda.
16:06 And the DARPA thing came along somewhat serendipitously. Not something we counted on or
16:11 knew about when the company started. And that was a big sort of funding for a long time. After that ran
16:15 out, because that was a fixed number of years sort of support, Anaconda very generously supported the
16:20 development of Bokeh. But ultimately, it was always the goal to try to create these tools as sort of
16:25 self-sufficient, self-governing, push them out in the open kind of projects. And so it took a long time
16:30 to get to that point. The first step in that was for Bokeh to become a NumFOCUS fiscally sponsored
16:34 project. But of late, we've really ramped up the self-governance and the self-sufficiency. So pretty
16:39 much at this point, I think the cord's been cut and Bokeh is really out there. It's managing its own CDN
16:43 resources. We're doing a lot of outreach and fundraising on our own right now that wasn't happening even six months ago. We just had, or are still having actually,
16:50 a July fundraiser going on to try to help pay for some of our infrastructure costs. But we're also
16:55 ramping up some corporate engagements and trying to talk to corporations and see if they want to offer
16:58 support. But that all is pretty new. For a long time, Anaconda was the primary beneficiary or
17:03 benefactor, I should say, of Bokeh.
17:04 Sure. And it just makes Python and data science stronger, which is really the heart of Anaconda Inc.
17:11 anyway. So it seems pretty reasonable. You said that the project was a NumFOCUS
17:15 project. And we spoke a little bit earlier about NumFOCUS a bit. And I guess my understanding is a
17:22 little bit off. I saw NumFOCUS as a thing that kind of provides funding to these projects. And
17:30 that's not quite exactly right, is it? What is NumFOCUS and what did it do for you all?
17:33 Yeah, it's sort of yes and no. So NumFOCUS was started by Travis Oliphant, who was one of the other
17:38 co-founders of Anaconda. Of course, he's the original or he's the author of NumPy, building on previous work.
17:43 But NumFOCUS is this nonprofit, a 501(c)(3). And its main role is to be an umbrella organization for
17:51 open source projects, right? So open source projects often are not legal entities. And so that actually
17:55 makes it quite difficult for them to accept money, right? It sort of gets very complicated with taxes.
17:58 And so they're this legal entity that can accept donations on behalf of projects and then handle all
18:04 the tax stuff. They also do a lot of outreach. They hold, you know, they support the PyData meetups and
18:10 PyData conferences around the world. So they're this organization that sort of helps and supports
18:13 open source. And they do fund these projects in the sense that, you know, they help the donations that
18:17 come in get back to the projects. And also sometimes they spread some of those funds around. They get
18:21 bigger donations that they can also use to give to projects that, you know, don't necessarily raise
18:24 their own money. But yeah, that's their main role. Just to be this umbrella organization, they help out
18:28 with the bureaucracy and take away that load from these open source projects.
18:32 Yeah, that's really interesting. I guess it took me a long time to realize that it's
18:35 actually hard for companies to give money.
18:39 It's really hard.
18:40 To these projects, right? Like, put aside the point of whether they should,
18:44 put aside the point of whether they fiscally could. Of course, most of them can and they should,
18:49 right? But just the way that they're set up is they buy things. They exchange money for a service or a
18:58 good. Even giving to a charity like NumFOCUS probably is a little bit odd and hard for them to do.
19:03 That is exactly right. There is just an impedance mismatch. I mean, the sums of money that would help
19:07 a lot of open source projects, I think, are relatively small compared to the budgets of the
19:11 companies we're talking about that are using these projects. But yeah, there's just an impedance
19:16 mismatch for how you actually do it. It's not a purchase, right? And so it's not hiring someone. So what
19:21 exactly is it? And companies just, you know, right now don't know how to handle that and don't know how to
19:25 deal with it. And so there's some other efforts to make that easier. People are trying different
19:28 things. There's, you know, Travis's company, Quansight, is trying out some models for, you know,
19:32 getting support for open source projects by engaging with companies in various ways. Tidelift as well
19:37 is also trying that. GitHub is, you know, trying their new sponsorship sorts of things. So people are
19:41 trying to find novel ways to attack this problem. And I hope we get there, but it's definitely
19:44 something that will take some time.
19:46 Yeah, I hope so as well, because it would make a huge difference and it would be
19:49 a blip for these companies to make that contribution, right?
19:52 Yeah. I mean, as an example, our goal for the July fundraiser was to raise a thousand dollars,
19:56 right? And we have, you know...
19:57 That's very modest, right?
19:58 I think several hundred thousand users and we did, you know, I'm really happy to report that we did,
20:02 but, you know, I had to tweet out a lot every day to make it happen. It was, you know, sort of
20:06 knocking on doors and, but yeah, I'd love to be able to go to companies and try to be more effective
20:10 and efficient at fundraising, if that makes sense.
20:12 Sure. Well, while we're on the subject, let's talk about your fundraiser real quick.
20:15 If you're going to raise a thousand dollars, that's probably not money to pay developers,
20:19 right? It's something else.
20:21 Yeah. I mean, I actually raised this exact issue around the sustainability conference that NumFOCUS
20:26 is going to have later in the year. I think there's sort of two different buckets where things go.
20:29 There's, you know, paying people. And I hope someday we can figure that out really well. And we can
20:33 support, you know, people to be maintainers of open source software by actually paying them to do
20:37 that work directly as a sort of a job or a living. But there's also just the matter of a lot of open
20:43 source projects, I think, could use a fairly small amount of money just to cover expenses or help
20:48 take some strain and stress off the maintainers. Like in the case of Bokeh, we have to run this,
20:52 you know, CDN to deploy Bokeh.js. So all the users around the world can get Bokeh.js to display their
20:57 plots. That is run on, you know, AWS CloudFront. And so we have to pay for that. Someone has to pay for
21:02 that. Right. And so that's what this fundraiser was for. And so in the sense that it, you know,
21:06 sort of reduces my stress because it helps me know that this is sort of taken care of for the next year.
21:10 That's what that level of sort of funding is for. And there's other stuff too. There's,
21:13 you know, it's good to get developers face to face sometimes. And so this could help with that
21:16 as well as other infrastructure costs.
21:18 Right. Having some, like, yearly meetup with the core developers. It's odd. There's a lot of
21:23 projects where the core developers have never actually met.
21:25 Definitely. For sure. I think I've met everyone at least once, but there's some folks that are pretty
21:30 scattered for Bokeh. But for sure, I have no doubt that's the case.
21:33 Yeah, for sure. So you said that Bokeh works with this JSON output and it can just basically render
21:39 anything that can generate the JSON. It can go out and render that. You know, I think when I think of
21:44 graphs, I mostly think of notebooks and Jupyter and things like that. But also we can just plug this
21:50 into whatever. Is that right? I can plug it into just a Flask site. Can I plug it into even like a,
21:55 some kind of command line app and like somehow pop it up?
21:58 Yeah, I don't see why not. I mean, anything that can run JavaScript basically, right? Anything that can
22:01 load Bokeh.js. So if that's like an Electron app, I think that would be feasible, certainly in the,
22:06 you know, in the Jupyter notebook. But, you know, there's a variety of ways. We have a whole Bokeh.embed,
22:10 you know, API. And so there's a variety of different ways to embed Bokeh content. But if you're running a
22:13 Flask site, you can use one of those ways to pop up Bokeh content in the middle of your site and in
22:18 your template and sort of wherever you want to put it.
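A minimal sketch of the embedding approach mentioned here, assuming Bokeh is installed. The plot data and page layout are invented for illustration, and the f-string stands in for a real Flask or Django template.

```python
from bokeh.embed import components
from bokeh.plotting import figure
from bokeh.resources import CDN

# An illustrative plot; the data is made up.
plot = figure(title="Embedded plot")
plot.line([1, 2, 3, 4], [4, 6, 5, 7], line_width=2)

# components() returns a <script> block and a placeholder <div> as strings.
script, div = components(plot)

# A stand-in for a site template (hypothetical). In Flask or Django the same
# three pieces would be passed into the real template instead.
page = f"""<!DOCTYPE html>
<html>
  <head>
    <title>My site</title>
    {CDN.render()}
  </head>
  <body>
    <h1>Report</h1>
    {div}
    {script}
  </body>
</html>"""

print(page[:80])
```

`CDN.render()` supplies the tags that load BokehJS from the project's CDN, while the div and script can be placed wherever the plot should appear in the page.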
22:20 When I think about the data science space, there's some libraries and data structures that are just used
22:27 over and over and over. You mentioned Travis before. So there's NumPy. There's,
22:31 of course, Pandas. There's a bunch of other stuff built on top of that. Is there special
22:36 integration for those types of libraries? Like if I already have some NumPy array or I've got some
22:42 Pandas data frame, is there a way to just like plug it into Bokeh?
22:45 Yeah. So Bokeh works really well with all those things. NumPy is an actual requirement,
22:49 runtime requirement for Bokeh. We tried to avoid that. Even in the beginning, we wanted to make Bokeh as
22:53 minimal and as accessible as possible. But NumPy is a requirement now. Pandas is not a hard
22:57 requirement, but Bokeh works really well with Pandas. If you have Pandas data frames,
23:00 you can basically plug them in anywhere a Bokeh column data source would go. It can automatically
23:05 sort of convert them or use them in a way that's useful, either data frames or group by objects.
23:09 And so we've tried to make it very easy to integrate with Pandas, but also make Pandas
23:13 not required. And so that's sort of the state where things are at. Anything that can sort of behave
23:17 like a list or an array or a Pandas series works pretty much out of the box with Bokeh.
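A small sketch of the Pandas integration described here, assuming both Bokeh and Pandas are installed. The column names and values are hypothetical.

```python
import pandas as pd

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# A made-up data frame, for illustration only.
df = pd.DataFrame({
    "day": [1, 2, 3, 4],
    "temp_c": [12.5, 14.0, 13.2, 15.8],
})

plot = figure(title="Temperature by day")

# A DataFrame can be passed where a ColumnDataSource is expected;
# the glyph then refers to columns by name.
plot.line("day", "temp_c", source=df)

# The explicit conversion is also available when the source is shared
# between several glyphs or plots.
source = ColumnDataSource(df)
plot.scatter("day", "temp_c", source=source, size=8)

print(source.column_names)
```

Plain lists and NumPy arrays work the same way, so Pandas stays optional.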
23:21 Okay. Are there other major libraries that I don't know to ask about?
23:25 I think Bokeh works really well with Dask, I assume, because Dask has a data frame-like API.
23:29 Matt Rocklin actually used Bokeh to develop the interactive dashboard that's sort of the cluster
23:34 monitor for Dask.
23:35 I think it's great as well. And I did have Matthew Rocklin on before to talk about Dask,
23:39 but I don't know if everyone's listened to that one. And also if they've seen the actual
23:45 visualization of that. So could you maybe describe that real quickly, what Dask is and then that
23:49 dashboard? Because when I saw that, that just like blew me away.
23:53 Yeah. Dask is a tool for basically parallel distributed programming. And it's trying to
23:57 do so in a way that is a very sort of Pythonic, very Pandas-like API, right? So there's other
24:02 tools that do this sort of thing, but they come from other languages originally. And so their APIs are
24:06 maybe not very Pythonic and they kind of don't fit in well with Python tools. But Dask is meant to be
24:11 a very Pythonic tool for this distributed computing task. And so to that end, it has this dashboard
24:16 that Matt Rocklin developed using Bokeh that can visualize everything that's going on around a
24:20 cluster that's doing computation all at once, right? So it can show you what
24:23 nodes are computing or waiting or they're transferring data in real time. And so that's
24:27 really helpful for diagnosing problems with parallel distributed computations. And so Matt has always
24:32 been very clear that Bokeh was great for him because he didn't have to write all this JavaScript to have
24:36 this really interactive dashboard. He could just write it in Python and connect it directly to his
24:40 telemetry that he was getting back from clusters and visualize it very quickly. And so I think it's
24:43 been a great tool for his users. I shouldn't say his users. It's a big project now. Dask has actually
24:47 grown up quite a bit. So I should say the Dask project's users. But we're happy to see that kind of thing happen.
24:52 You know, love to see Bokeh used in those kind of use cases.
24:54 Yeah. And it just looks so good and professional and, you know, live updating. It's really a nice
25:01 use case for Bokeh. And I think it's also a good testament to what you guys have built.
25:05 Well, it's also good to get feedback from real use cases like that. Nothing sharpens your tools
25:09 better than sort of having them honed against real problems, right? And so we love when people do
25:13 awesome things with Bokeh and tell us, you know, hey, this was great, but also this could be a little
25:16 bit better or easier. You know, this is how you could make my life, you know, simpler or here's some pain
25:20 points I had. That kind of feedback is really helpful from users.
25:23 Yeah. It's great to design something, but once it actually meets real users and real use cases,
25:29 like that's where it gets real. So you talked about Dask. Let's just touch on some of the other
25:35 things that are built upon Bokeh because Bokeh has been around since 2011, 2012, like you said,
25:41 and it's pretty stable for the most part.
25:43 It's a lot more stable recently. Yeah.
25:45 Yeah, that's great. So things are starting to build on top of it like Dask and just using it.
25:49 So what else is out there like that?
25:50 Well, there's a couple of different things in different classes of things, right? So first of
25:53 all, there's other libraries now that are starting to build on top of Bokeh. So there is Chartify,
25:57 which was created by the data labs at Spotify. And so it's their sort of high, very high level
26:02 sort of data science, opinionated data science API on top of Bokeh. There's a project called Pandas
26:06 Bokeh that just came out recently. That's sort of very tight integration with Pandas and using Bokeh to
26:10 generate interactive plots. There's also a set of tools created by some folks who are still at Anaconda.
26:14 The effort is called PyViz. And so there's a tool called HoloViews, which is a very
26:18 data centric API, and it can generate interactive visualizations using Bokeh and other tools as well.
26:24 But there's also, you know, some tools like Datashader, which are for, you know, very large data,
26:30 where you can finely control how they're rendered. And so you can combine Datashader with Bokeh, you know,
26:34 using HoloViews. So it can drive things at a very high level. So I'd love to see this effort where
26:37 people are building these things on top of Bokeh. And I'm also glad that now, you know, Bokeh was
26:41 a moving target for quite a while. And I very much appreciate the patience of all of our users who
26:45 sort of, you know, kept with us as we were figuring things out. But I would like, you know, we're trying
26:49 to be much more stable now. I think we've done a very good job since 1.0 was released at being,
26:53 you know, much more stable. That's very good. The other kind of things that get built on top of
26:58 Bokeh are more like applications or, you know, other projects. So there's a project called Microscopium
27:04 that is for, you know, sort of biosciences research. There's a tool called Lightkurve that a bunch of
27:09 astronomers put together, which uses Bokeh to let you drill down. So you can see like an image of,
27:13 you know, some star or something and hover over a single pixel and really drill down into something
27:17 about, you know, that image using, you know, all the tools that are in Bokeh. You know, there's
27:21 actually quite a, Dask, of course, is a great example. And then there's actually a bunch of other
27:25 ones on GitHub. And I'm sorry, I don't have a list at the top of my head, but there are a lot of
27:29 exo-bioscience projects that are built on top of Bokeh. And, you know, financial trading is another
27:34 thing that comes up. People have done drug discovery type work on Bokeh. It's interesting where things are
27:39 popping up, especially, you know, now that things are more stable. I think it's really a great time for
27:43 that to happen.
27:44 Yeah, that's super cool. The Lightkurve project looks amazing. I mean, to explore like data from the
27:50 Kepler and TESS telescopes, that's pretty cool for exoplanet discovery. I mean, that's,
27:57 it's really exciting.
27:58 Like the project that you worked on is helping scientists like actually look for exoplanets.
28:03 Like, that's incredible.
28:04 That is a, no, it's really gratifying. Like, that's exactly the sort of thing that, you know,
28:07 I'd say we wanted to be able to enable and help make happen. So it's really
28:11 gratifying when people are able to use Bokeh for those kinds of situations.
28:14 Yeah, absolutely. Do you see Bokeh being useful or appropriate for like real-time dashboard type
28:23 of scenarios? I mean, obviously it draws great graphs. And then, you know, we talked about Dask
28:28 and the real-time view of that. So like, let's imagine I'm like building software for a
28:33 stock trading company or something like this. And I want to show in real time what the market's doing.
28:39 All the data the traders need. Would that be appropriate? Is it too high
28:43 latency or what's the story? I think it could be. There's real time, and then there's quote,
28:46 unquote, real time. Right. So it always depends on what exactly people mean when they say real time.
28:50 When I hear the word real time, I have images of like, you know, very low level, a certain kind
28:54 of system that's implemented that has very specific guarantees about its performance. And for that kind
29:00 of work, Bokeh is probably not suitable. Right. But if you're talking about real time, just in terms
29:03 of like streaming data coming in from a financial system or, you know, from IoT-type devices
29:09 or, you know, that sort of thing, then I think Bokeh has been useful. There's certainly
29:13 people who have come looking for support on our Discourse or on our other support forums,
29:16 talking about using Bokeh, connecting it to real time sensors, for instance. Right. So they're in a
29:21 factory or a warehouse and they've got data coming in. I want to visualize something. So people do do
29:24 that. So if you mean, you know, quote unquote, real time, and just in the sense that I've got data
29:28 coming in and I want to visualize it in a best-effort kind of way, then I think Bokeh is definitely a
29:32 good choice.
29:33 Yeah. And that's mostly what I meant. I mean, obviously not like we need seven millisecond response time or else
29:37 the plane crashes, something like that. Nothing like that. Right. But like, like when you're
29:42 thinking of graphs, right, like a human has to see the graph and interpret the graph. Right. So that's,
29:46 you know, how long is that? It can't be quicker than a hundred milliseconds. Right. Like the human
29:51 can't understand graphs that quick.
29:52 Well, and so real time is not even necessarily about quickness. It's really about predictability. It's
29:56 about specific kinds of guarantees. But, but yeah, so I should mention, yeah, Bokeh does have some APIs
30:00 for streaming specifically. Right. So, you know, if you've got data coming in or you want to update,
30:04 you know, just the newest point, you've got a, you know, a time series with a hundred thousand points,
30:08 you know, plotted and you've got new points coming in at the end. Bokeh can very efficiently just send
30:13 the new data, right. Without sending, you know, the entire data set. So it is very useful for that sort
30:16 of thing. Yeah. So then you load up the historical data and then you just take, you know, the update
30:21 every half a second or something. That sounds pretty doable. Yeah, absolutely. Okay. Even more often.
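The "send only the new points" idea can be sketched without Bokeh at all. The class below is a toy written for illustration, loosely modeled on the streaming style of API described above (Bokeh's `ColumnDataSource` has a `stream` method with a `rollover` argument). `StreamingSource` and the shape of the returned patch are assumptions, not Bokeh's wire format.

```python
class StreamingSource:
    """Toy column store that reports only what a client still needs to receive."""

    def __init__(self, data):
        self.data = {name: list(values) for name, values in data.items()}

    def stream(self, new_data, rollover=None):
        """Append new rows and return a small patch instead of the full data set."""
        if set(new_data) != set(self.data):
            raise ValueError("new_data must supply exactly the existing columns")
        if len({len(values) for values in new_data.values()}) > 1:
            raise ValueError("all streamed columns must have the same length")

        for name, values in new_data.items():
            column = self.data[name]
            column.extend(values)
            # Optionally keep only the newest `rollover` rows.
            if rollover is not None and len(column) > rollover:
                del column[: len(column) - rollover]

        # Only the new rows travel over the wire, however long the history is.
        return {
            "kind": "stream",
            "data": {name: list(values) for name, values in new_data.items()},
            "rollover": rollover,
        }


# Load the history once, then push small updates.
source = StreamingSource({"t": range(100_000), "price": [1.0] * 100_000})
patch = source.stream({"t": [100_000, 100_001], "price": [1.5, 1.6]})
```

Here the patch carries two rows while the source holds one hundred thousand and two, which is the efficiency being described.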
30:25 Yeah, yeah, yeah. Sure. That sounds, that sounds super interesting. One of the capabilities kind of
30:30 around that, that you talked about is being able to work with like quite large data, maybe having some
30:36 on the server or something like that. And then either interpreting that or running a machine learning
30:41 model against that or something like that. Maybe tell us some of the use cases there.
30:45 Yeah. There's a couple of different ways you could use Bokeh for that. So one is that, you know,
30:49 if you have a large data set, you know, you're not going to send a billion points into your browser,
30:51 right? Your browser will just fall over. So you've got to find some ways to sort of minimize the data.
30:55 And that can be done in a variety of ways. And one of the ways is for instance,
30:57 it's downsampling. So if you have a large data set and you've got some reasonable way that makes
31:01 sense for your use case to downsample it, the Bokeh server can just do that downsampling and then show
31:05 you the subset of data that's relevant, right? So that's one way that you can use the Bokeh server
31:09 to handle sort of large data sets. Another way is I mentioned this tool Datashader, which is
31:13 actually specifically designed for being able to make, you know, very efficacious visualizations, images of,
31:19 you know, hundreds of millions or billions of points, right? It gives you very fine control over the way,
31:22 basically the more sophisticated version of alpha compositing happens. So you can actually try to
31:25 emphasize the things you want to emphasize in a meaningful way. And so you could use Datashader
31:30 to datashade those 100 million points. And then that just produces an image and then you can send
31:35 the image to Bokeh. And so that's a very fast operation. So Datashader is sort of a form of,
31:39 you know, data compression in that sense, but it's still very interactive because, you know,
31:44 you can use the events that Bokeh generates to, if you resize the plot, get a new image
31:48 generated. If you've, for instance, changed the bounds of the plot, if you pan or zoom,
31:52 you can get a new Datashader image based on those new dimensions. And in fact,
31:55 HoloViews sort of does that all automatically for you. You can do it by hand with Bokeh and Datashader,
32:00 or, you know, at a high level, HoloViews can sort of take care of that for you. That's another way in
32:04 which, you know, you can sort of reduce the amount of data that you're going to send into the browser.
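The aggregate-then-send-an-image approach can be sketched in plain Python. This is a deliberately simplified stand-in for what a tool like Datashader does, not its actual algorithm or API: `rasterize` is a hypothetical function that counts the points falling in each cell of a fixed-size grid covering the current viewport, so the payload size depends on the grid and not on the number of points.

```python
def rasterize(xs, ys, x_range, y_range, width, height):
    """Count points per cell of a width x height grid covering the viewport."""
    x0, x1 = x_range
    y0, y1 = y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        if not (x0 <= x < x1 and y0 <= y < y1):
            continue  # the point is outside the current viewport
        # Map the coordinate to a cell index; clamp to guard against rounding.
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        grid[row][col] += 1
    return grid


xs = [0.1, 0.2, 0.9, 1.5]
ys = [0.1, 0.2, 0.9, 0.5]

# Full view: the whole unit square on a 2 x 2 grid.
overview = rasterize(xs, ys, (0.0, 1.0), (0.0, 1.0), width=2, height=2)

# After a zoom, recompute the same-size grid for the new, smaller range.
zoomed = rasterize(xs, ys, (0.0, 0.25), (0.0, 0.25), width=2, height=2)
```

Recomputing the grid whenever the plot ranges change is the by-hand version of the pan and zoom behavior described above.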
32:07 Coming up next year, I hope we can actually raise the ceiling, though, of the number of points you can
32:11 send. We're going to try to do some work, hopefully, to better improve the WebGL support in Bokeh,
32:16 and maybe even just have Bokeh be based entirely on WebGL. In which case, I think we could,
32:20 you know, right now, Bokeh, you could send a few hundred thousand points to it,
32:23 and it's typically okay. But I think we could raise that sort of ceiling a little bit higher
32:27 once we are able to render completely in WebGL.
32:29 Yeah, that would be pretty amazing. Is it using like canvases or something now?
32:33 Yeah, exactly. So it uses the HTML canvas. There is currently some level of WebGL support.
32:37 And the person who maintained that and originally wrote that is just, he's moved on to other things.
32:40 And so that WebGL support is sort of, it needs a little work, a little love and care. And we'll
32:46 probably just go ahead and try to do things sort of from the beginning and re-found that in a cleaner,
32:51 better way going forward. But yeah, there's some WebGL support now, but most of the rendering happens
32:55 on HTML canvas.
32:56 Okay. So let's talk a little bit about some of the internal implementations of this. Like,
33:02 when most people interact with Bokeh, they're probably interacting with some Python API. And as far
33:10 as they're concerned, like that's the end of it, right? Like, I call these functions,
33:13 the plot comes up magic.
33:14 Yeah. And actually the API is actually quite light. So by and large, we've turned the problem
33:19 of creating an interactive data visualization web app into the Python problem of creating a bunch of
33:24 objects and setting their properties, right? So, you know, I mentioned this JSON representation and
33:28 that JSON representation actually mirrors on both sides, a set of objects and those objects being a
33:32 graph. So there's a set of objects like a plot, which has, you know, a bunch of renderers and has a
33:36 couple of axes and some ranges and some data sources. Maybe that's in a layout that also has some
33:40 buttons. So there's objects we call the models that represent all of those items. All of those get
33:45 turned into JSON. And then on the JavaScript side, there's a one-to-one correspondence basically of
33:49 objects that those get turned into JavaScript objects. The role of the Bokeh server is just to
33:53 keep those two sets of objects in sync bidirectionally, right? But in terms of what you use from Python,
33:57 you create this plot, maybe use the figure function, which sort of puts a lot of these objects
34:01 together for you in a convenient, meaningful way. And then you can twiddle their properties. You can
34:06 change, you know, the start and end of a range, or you can add the data to the data source, or you can
34:10 change various properties of a circle glyph because you want to change how it appears. And so all of
34:15 this is, you know, just setting these properties on these Bokeh models. And that is the main, the main
34:20 thing that people do, I think. Apart from that, you might be writing callbacks. If you're using the
34:24 Bokeh server, you can write callbacks in Python, you know, for if a button gets clicked or a selection
34:28 is made, you can run Python code. But you can also create JavaScript callbacks for the standalone
34:32 case where, you know, I don't have a Bokeh server, but I still want something to happen when a button
34:35 gets clicked or, you know, the selection is made. You can write a little snippet of JavaScript and that
34:39 will, you know, do that amount of work. And typically those callbacks, the end result of that is, again,
34:43 setting some properties on these objects, right? They might update the data source, which causes the
34:47 plot to update, or they might change the range bounds, which causes the plot to zoom out, that sort of
34:51 thing. So there is definitely API. There are functions, you know, there are functions for embedding,
34:55 there are functions for showing things in notebooks, there are functions for creating plots to start with.
34:59 But most of the content of the Bokeh library is these objects, you know, we call models, and they all have
35:04 these typed properties that you can set values for. And that's the main interaction mode.
35:09 Yeah, so very declarative in that sense, right? You set the aspects or the features that you want,
35:15 and it just figures out how to make that interactive.
35:18 Yeah, exactly. Yeah.
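The "objects with properties, mirrored through JSON" design can be sketched with the standard library. Everything here is a toy for illustration: `Model`, `zoom_out` and `apply_patch` are hypothetical names, and the JSON layout is an assumption, not Bokeh's actual serialization format.

```python
import json


class Model:
    """Toy model: a named bag of properties that records every change."""

    def __init__(self, kind, **props):
        self.kind = kind
        self.props = dict(props)
        self.pending = []  # change events not yet sent to the other side

    def set(self, name, value):
        if name not in self.props:
            raise AttributeError(f"{self.kind} has no property {name!r}")
        if self.props[name] != value:
            self.props[name] = value
            self.pending.append({"model": self.kind, "attr": name, "new": value})

    def to_json(self):
        return json.dumps({"type": self.kind, "attributes": self.props}, sort_keys=True)


def zoom_out(range_model):
    """A callback: all it does is set properties on a model."""
    span = range_model.props["end"] - range_model.props["start"]
    range_model.set("start", range_model.props["start"] - span / 2)
    range_model.set("end", range_model.props["end"] + span / 2)


def apply_patch(mirror, event):
    """The receiving side applies the same property change to its copy."""
    mirror[event["attr"]] = event["new"]


x_range = Model("Range1d", start=0, end=10)
browser_copy = json.loads(x_range.to_json())["attributes"]  # initial one-to-one copy

zoom_out(x_range)  # e.g. triggered by a button click
for event in x_range.pending:
    apply_patch(browser_copy, event)
```

After the events are applied, both sides hold the same property values, which is the synchronization job described for the Bokeh server.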
35:19 Nice. So it sounds to me like, listening to you talk, there's a lot going on with JavaScript
35:24 here, even though the typical consumer user of it, the developer doesn't have to care or work with it.
35:31 What are you using there? Like, what was the history? Was that always just straight JavaScript?
35:35 Or what's what are you doing?
35:36 Yeah, it's actually never been straight JavaScript. But you're right, the bulk of the work of Bokeh is
35:40 actually in this library, BokehJS, right, which is a JavaScript library.
35:42 How big is it?
35:43 So minified, I think the main core library is about 600k. It's a pretty hefty library.
35:48 That's a pretty hefty library.
35:50 It is, right. We're looking to make things, you know, as optimized as we can. We definitely could
35:53 use help from, you know, more experienced JavaScript developers. So when Bokeh started, I mean, it was
35:58 started by me and a few other folks, and none of us, I think, had a lot of front end
36:02 experience. I didn't have any JavaScript experience when this project started. And so we actually chose
36:05 CoffeeScript at the time. And so that, I think, was maybe a good choice for the time, because it
36:10 allowed us to iterate very quickly and sort of make, you know, mistakes more quickly, I guess.
36:13 You try out things, you know, it's sort of Python looking like, you know, it's one of these
36:17 transpiled languages that turns into JavaScript. But ultimately, once the project grew very large,
36:22 it wasn't really suitable for that. And so we actually did a large effort. Most of that work,
36:26 the heavy lifting, was done by one of our core contributors, Mateusz, to port Bokeh to TypeScript.
36:31 And that's been a huge win for the project. I mean, just in doing the port to TypeScript,
36:34 a lot of latent bugs and problems were uncovered. Certainly since it's been done, you know,
36:39 I've been prevented from checking in things that would have been an error, you know, by the TypeScript
36:42 compiler. So I'm a big fan of that and glad for that. There are certainly new contributors who find it a
36:47 little bit more difficult or daunting sometimes to work with TypeScript, so there is a barrier
36:50 to entry that's a little bit high for Bokeh. And that's actually just in general, been a problem
36:55 for us, I think, to attract sort of contributors on that side, right? Because Bokeh is targeted towards,
37:00 you know, Python developers with the promise that they really don't have to worry about JavaScript
37:03 if they don't want to. But all the work's actually in JavaScript. And so, you know,
37:07 we need JavaScript developers to come help make Bokeh better. And so,
37:11 that's been a challenge for us a little bit.
37:13 You have this bimodal distribution of skills and desires and stuff like the Python folks and the
37:19 JavaScript folks. And yeah, it's interesting.
37:21 So we're trying to make Bokeh itself like a, you know, there are people who use the Bokeh
37:24 JavaScript library just by itself as a JavaScript library. I would say that from my perspective,
37:28 quite a bit of work is needed to make that a serious sort of contender for something people
37:32 want to use. But we definitely would like to get that done. And we'd love to get help doing that.
37:35 I think making BokehJS sort of a first class JavaScript library in its own right would be very
37:39 helpful for our project. And certainly it'd be great to get a community around that as well.
37:43 But that's a longer term goal.
37:47 This portion of Talk Python to Me is brought to you by Linode. Are you looking for hosting that's
37:52 fast, simple, and incredibly affordable? Well, look past that bookstore and check out Linode at
37:57 talkpython.fm/Linode. That's L-I-N-O-D-E. Plans start at just $5 a month for a dedicated server
38:04 with a gig of RAM. They have 10 data centers across the globe. So no matter where you are or where your
38:09 users are, there's a data center for you. Whether you want to run a Python web app, host a private Git
38:14 server, or just a file server, you'll get native SSDs on all the machines, a newly upgraded 200
38:20 gigabit network, 24-7 friendly support, even on holidays, and a seven-day money-back guarantee.
38:26 Need a little help with your infrastructure? They even offer professional services to help you with
38:30 architecture, migrations, and more. Do you want a dedicated server for free for the next four months?
38:35 Just visit talkpython.fm/Linode.
38:38 It's interesting that you found TypeScript to be a nice way of working and whatnot. And I find it to be
38:46 pretty nice as well. Certainly, if I had to choose between CoffeeScript and TypeScript, I would
38:50 definitely choose TypeScript. You know, I think TypeScript is interesting in that it's a superset of
38:55 JavaScript. So all your regular JavaScript just works, but you can like typify it and make it have other
39:00 features and capabilities that that language brings. And that's a pretty interesting way to approach that
39:07 problem.
39:07 Oh, definitely. Yeah. And to be clear, I think CoffeeScript was the right choice in 2012. I don't,
39:11 it's not at all the right choice for anything, I don't think, in 2019. I think at the time,
39:15 Bokeh was one of the largest CoffeeScript libraries probably ever developed, which is interesting,
39:20 sort of a bit of trivia. But like I said, it let us move fast, especially not having a lot of
39:24 experience in front-end dev. But, you know, after time, we just needed something more,
39:29 a little more serious, for lack of a better word.
39:30 Yeah, sure. I'd just like to get your thoughts real quick. Like, so TypeScript is all about,
39:35 you know, sort of static typing and checking and whatnot of your code. And we kind of have that
39:42 in Python a little bit now, to the extent that people want to bring it in with mypy and type
39:47 annotations. But it's not really the main zen of the language of Python.
39:52 What are your thoughts of like working in these two languages, kind of side by side on the same
39:55 project?
39:55 Well, so this is a really interesting question. So there is actually a history of various projects
39:59 that add what's called, I think, manifest typing to Python. And so that goes back to,
40:03 there's definitely a project called Traits that David Morrill created that was, you know,
40:07 sort of, you could add types to classes, and those would get checked at runtime. And you could also
40:11 do things like reactive programming and event-based programming based off changes to those values.
40:15 Traits auto-created like Qt GUIs, I think, from classes as well, the panels.
40:20 And there's another one called Param. And I think there's now one called Struct. But Bokeh
40:24 also has its own property system. I mentioned these properties of models. Bokeh has its own
40:28 property system, which is rooted in a bunch of fun metaclass programming that lets you add these
40:33 declarative types. So the actual models I mentioned for Bokeh objects typically have no code in them.
40:38 They're just classes with these property definitions that say, oh, you know, my plot width is an int,
40:44 or my source property is an instance of a column data source, or the range has two floating point
40:50 values start and end. And so we're able to provide runtime feedback. If people try to set, you know,
40:55 the range.start equals some string value, we say, hey, that's not an appropriate value. It needs to
40:59 be an integer. And we also know what properties are on objects. So a feature people have really
41:03 complimented us about is, if people sort of fat-finger a property name, we'll actually give a suggestion
41:07 and say the nearest property names are these. And so we-
41:10 Oh, that's nice.
41:10 Kind of a type system. Yeah, a really nice feature. It's sort of one of those simple things
41:13 you don't think about until you see it. But we've had a type system in Bokeh since the beginning,
41:17 right? And so it's a little interesting now that mypy is becoming more popular. We are interested in
41:22 looking to use mypy basically after Bokeh 2.0 comes out and we drop Python 2 support. We're
41:28 interested in trying to integrate mypy, you know, wherever we can. I think it's a useful tool.
41:31 I hadn't used it much until recently, but I have seen it used to good effect. And so I'd like to try to
41:35 improve that. I don't know how much we'll be able to use mypy to replace our existing
41:39 sort of type property system because that would be a huge endeavor because our properties,
41:43 they aren't just the type checking. They also plug into our documentation system so we can auto
41:47 generate our reference documentation. Wow.
41:49 And of course, all the auto synchronization is based on this too, right? A lot of the machinery for the
41:53 automatic synchronization and serialization is based off these property definitions, right?
41:57 Right. Like to notify that something has changed to people who are interested and things like that.
42:02 Yeah. So replacing it with mypy is not something I'm sure we can do for the properties,
42:06 but there's plenty of other places in the library where mypy would be a great benefit
42:10 to help us sort of tighten things up. And so we're looking at that after Bokeh 2.0.
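The declarative, runtime-checked properties described in this exchange can be sketched with Python descriptors. This is a simplified stand-in, not Bokeh's property system (which, as noted, relies on metaclass programming and also drives documentation and serialization). `Property`, `HasProps` here, and `Range` are names chosen for this illustration.

```python
import difflib


class Property:
    """Descriptor that checks the type of a value when it is assigned."""

    def __init__(self, *types, default=None):
        self.types = types
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        # bool is a subclass of int, so reject it unless explicitly allowed.
        unwanted_bool = isinstance(value, bool) and bool not in self.types
        if unwanted_bool or not isinstance(value, self.types):
            expected = " or ".join(t.__name__ for t in self.types)
            raise ValueError(f"{self.name} expects {expected}, got {value!r}")
        obj.__dict__[self.name] = value


class HasProps:
    """Base class: only declared properties may be set, with typo suggestions."""

    @classmethod
    def properties(cls):
        return sorted({
            name
            for klass in cls.__mro__
            for name, attr in vars(klass).items()
            if isinstance(attr, Property)
        })

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        known = self.properties()
        if name not in known:
            close = difflib.get_close_matches(name, known)
            hint = f"; similar properties: {', '.join(close)}" if close else ""
            raise AttributeError(
                f"unexpected attribute {name!r} on {type(self).__name__}{hint}"
            )
        super().__setattr__(name, value)


class Range(HasProps):
    # The model body is purely declarative: no code, only typed properties.
    start = Property(int, float, default=0.0)
    end = Property(int, float, default=1.0)
```

With these definitions, `Range(start=0, end=10)` works, assigning a string to `start` raises a `ValueError`, and a misspelling such as `strat` raises an `AttributeError` whose message suggests `start`.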
42:13 Okay. Yeah. Yeah. Very cool. Maybe we could do a quick tour of some of the interesting graphs
42:18 or visualizations that you find, you know, like kind of interesting and worth talking about,
42:23 like over at demo.bokeh.org or just bokeh.org and just click on the gallery and demos and stuff.
42:29 There's a bunch of cool ones that has the source code. There's some interactive bits and so on.
42:33 You want to tell us about some you think are worth checking out?
42:35 Yeah. So for sure. So first off, if you go to demo.bokeh.org, these are all specifically
42:39 Bokeh server applications. So these are all backed by a running Python process. And when you click a
42:44 button or make a selection, that triggers real Python code. If you go to the gallery on the docs,
42:48 most of those are standalone. And so they don't, they aren't backed by a Bokeh server just to get
42:52 that distinction out of the way. But at demo.bokeh.org, there's a couple of interesting ones here.
42:55 The first one on the upper left is this movie data explorer. And this is actually a fairly direct
43:00 comparison, intentional on our part, to a tool called the Shiny Movie Explorer. So people have
43:05 asked for a long time, where, you know, where is Shiny for Python? So Shiny is this tool for creating
43:08 sort of interactive data visualization applications from the R language. People ask, where is Shiny for
43:13 Python? So we're trying to answer that question. And I think Bokeh is a pretty good, decent answer to
43:16 the question of where is Shiny for Python. But so we made that as a pretty direct comparison. So that's
43:20 one that's interesting. Right next to it, there's this selection histogram, which I think is pretty
43:23 cool. So it's got a couple of distributions of scatter points on a plot. And if you make a
43:27 selection, it shows the histograms on both axes. And if you make a selection across those points of,
43:31 you know, a subset of those points, it then highlights and shows you the histogram of just
43:35 the selected points. And then sort of in the opposite direction, it shows the histogram of
43:39 the unselected points as sort of a shadow, faded-out version. Wow. Yeah, that one's really cool.
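The computation behind a selection histogram like this can be sketched in a few lines of standard-library Python. This is an illustration of the idea only, not the demo's source code: `histogram` and `split_histograms` are hypothetical helpers, and `selected` stands for the row indices a selection tool would report.

```python
from bisect import bisect_right


def histogram(values, edges):
    """Count values per bin; bins are [edge, next_edge), last bin is closed."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        index = bisect_right(edges, value) - 1
        if value == edges[-1]:
            index = len(counts) - 1  # include the right-most edge
        if 0 <= index < len(counts):
            counts[index] += 1
    return counts


def split_histograms(values, selected, edges):
    """Return (histogram of selected rows, histogram of unselected rows)."""
    chosen = set(selected)
    inside = [v for i, v in enumerate(values) if i in chosen]
    outside = [v for i, v in enumerate(values) if i not in chosen]
    return histogram(inside, edges), histogram(outside, edges)


values = [0.5, 1.5, 1.7, 2.5, 3.0]
edges = [0, 1, 2, 3]

# Pretend the user box-selected the points at rows 1 and 2.
highlighted, faded = split_histograms(values, selected=[1, 2], edges=edges)
```

In a server-backed app, a selection-change callback would rerun this split and assign the two results to the data behind the highlighted and faded bars.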
43:43 That's a cute one. We've been working on that one for quite a while. It's gone through several
43:46 iterations that actually helped us uncover some problems with the Bokeh server early on. It was just sort
43:51 of behaving in a weird way and stuttering and realized that events were sort of boomeranging,
43:54 sort of making a boomerang effect. And so we had to fix that. But that was a great example
43:59 to help us figure out some of those problems. And we have a lot more things under a lot more rigorous
44:02 tests now. So that's good. But yeah, I like that example a lot. Another one we have is this
44:07 reproduction of the Gapminder demo. So, you know, Hans Rosling did this, you know,
44:12 famous TED talk where he showed all this data. And so we've reproduced that in Bokeh. We've also
44:16 embedded the YouTube video. We wanted to be able to show being able to use, you know,
44:19 a template to embed Bokeh content in a template with other content. So this also has this YouTube
44:23 video embedded.
44:23 Yeah, that talk by Hans Rosling, you have the video there. It's really worth watching. Like
44:30 that guy really makes statistics and just data like relevant for humanity in a great way.
44:37 Absolutely. No, I'd recommend anyone to go watch the video, regardless of whether they look at the
44:42 Bokeh part. It's a great video. And I think it's a really compelling one. Tells a great story. So
44:46 I'd recommend anyone to go check that out. For sure. Let's see, lower left, there's actually a
44:50 financial chart. So here you can have time series from two sort of financial, you know,
44:54 data sets and you can do sort of a cross correlation between them. And you can see the
44:58 pandas sort of statistical summary there. And you can, you know, use the dropdown to choose
45:02 different time series and then the table updates and the data, you know, the plots update. So that's
45:06 a nice one as well. And then on the bottom right, there's kind of interesting one that's got this 3D
45:10 plot. And this is maybe confusing for some people. Bokeh itself is not a 3D plotting library and has
45:14 no inherent 3D capability built in. But Bokeh is very extensible. At some point, we realized that,
45:19 you know, lots of users have use cases that are eminently reasonable and, you know, really cool
45:24 that we're just not ever going to have the capability or resources to sort of do in the library. I mean,
45:28 you have to sort of limit the scope of the core library at some point. Yeah. So we work to make
45:32 Bokeh extensible. And so you can create these custom extensions that behave just like built-in
45:36 Bokeh models. And they plug in just like, you know, the plot object or a widget object, right,
45:41 into Bokeh content. And so this is an example of that. And so this is a custom extension that wraps
45:46 a little 3D JS library. And you use the standard Bokeh data sources and you update them. And then
45:51 this little 3D plot updates because basically the custom extension just wires together the Bokeh data
45:55 source with whatever this other library expects. And so it's really neat example of that. And there's
46:00 other examples of extensions in the docs as well for different kinds of use cases. If you want to like,
46:04 if you have some really cool JavaScript widget, you want to connect to Bokeh content. If you actually,
46:07 if you have a cool JavaScript widget that you want to connect to, you know, all these PyData tools,
46:12 like you want to connect this JavaScript widget to, you know, scikit-learn or to Dask or to Numba or,
46:16 you know, pandas, Bokeh is a great bridge for that, right? You can just write a custom extension that wraps
46:21 the JavaScript component and then the Bokeh server can automatically connect it to all those tools.
46:25 Yeah. And get all the change notification and interactivity and everything. Yeah.
46:29 That's super cool. Okay. Let's see. What else do you want to talk about? You all have Bokeh 2.0 on the
46:35 roadmap. What's going on with that? Yeah, absolutely. I would say we were targeting August,
46:39 but I think maybe a little more realistic at this point is September. We're always a little optimistic
46:43 in our estimates for our schedule. Welcome to software development, right? That's how it goes.
46:48 We're all like that way. I don't even want to speculate the first time we promised Bokeh 1.0 and
46:52 sort of stability. That was probably a couple of years too early, but we're a little more on track
46:57 for Bokeh 2.0. But the main thing about Bokeh 2.0 is just that we are dropping Python 2 support and
47:02 also Python 3.4 support. So Python 3.5 will be the minimum version. As long as we are doing a major
47:06 version bump, we're also going to take the time to clean up a few other minor things. So there's a few
47:10 minor changes that are coming. Hopefully nothing that's too disruptive for anyone. We're going to
47:14 be sure to outline and document all those in a migration guide. But that's the main thing is the
47:18 Python 2 support. And it gives us a chance to do some things like move to native coroutines. So we use
47:23 Tornado as the base for the Bokeh server. But if we move to Python 3.5 as a base, we can use native async and
47:29 await coroutines everywhere. It's still with Tornado, but it helps us clean up the code a lot. And just
47:33 in general, it'll help us clean up the code base and make it a lot more maintainable and sort of
47:36 shrink it. And it's always good to delete and shrink code for sure.
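For readers who have not used the native coroutines mentioned above, here is a minimal sketch of the `async`/`await` syntax that became available in Python 3.5. It is not Bokeh server code: the function names `fetch_value` and `run_demo` are made up for illustration, and `asyncio.sleep` stands in for real I/O. Tornado's older style wrote the same kind of logic as generator functions decorated with `tornado.gen.coroutine`.

```python
import asyncio


async def fetch_value(name, delay):
    # Hypothetical stand-in for real I/O (a socket read, a database call).
    # "await" hands control back to the event loop while we wait.
    await asyncio.sleep(delay)
    return name.upper()


async def main():
    # Run both coroutines concurrently. gather() returns results in the
    # order the awaitables were passed, not the order they finish.
    return await asyncio.gather(
        fetch_value("plot", 0.02),
        fetch_value("table", 0.01),
    )


def run_demo():
    # Explicit loop management keeps this runnable on Python 3.5+.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()


results = run_demo()
print(results)
```

Recent Tornado releases run on top of the asyncio event loop, so coroutines written this way can still be used from Tornado handlers.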
47:39 Yeah. If you maintain it by deleting it, like you're good. Yeah, that's a good way to do it.
47:44 Do you think that'll help attract more maintainers to say like, hey, you could work on this cool asyncio,
47:48 async and await library rather than, you know, this thing called Tornado and these
47:53 coroutines?
47:54 Well, so it's still going to use Tornado. And Tornado is a really great tool, but I think it may.
47:57 It broadens the thing a little bit to hopefully some more developers. And there are, I stress that
48:02 there's a lot of work in BokehJS, but there's plenty of work on the Python side to do as well. And we'd
48:06 love to have contributors. And honestly, there's actually a lot of work that's not coding. I'd love
48:10 to get other contributors involved in all kinds of ways. And if I can speak a minute about that, I mean,
48:14 yeah, go for it.
48:14 Obviously, people talk about, hey, we need testing help and docs help. And that's
48:18 certainly true for us as well. But other maybe ways people don't think
48:22 about it is, you know, we'd love to get like designers, front end designers to come help
48:25 make our assets better, to come, you know, help us improve the visual appearance of Bokeh. Cause
48:29 you know, we've done okay, but we're not designers. And so it'd be great to get that kind of help.
48:34 We actually have a lot of infrastructure now on places like DigitalOcean and AWS, and it would be
48:39 great to get experienced people that know those, those systems and those DevOps on those systems to
48:44 come help us optimize them for cost, optimize them for usage, you know, whatever.
48:47 Yeah. You guys are doing cool stuff with Docker, right?
48:50 Yeah, we do a couple of things with Docker. So the demo site is actually a Docker image that's
48:53 run on Elastic Beanstalk. And I actually just recently changed some of the instances that that
48:57 was running on to hopefully make them a little bit more cost effective for us. But we also just
49:01 recently had a spike in S3 usage on one of our buckets that I couldn't really explain just yet.
49:06 And so I'd love to get experienced people that can, you know, help with those sorts of things.
49:09 Outreach is another area. We're really trying to ramp up our outreach, both to the community in terms of,
49:13 you know, fundraising, but also talking to companies. And we've had a couple of people help with that.
49:17 And actually just offering support, right? We just moved our mailing list to a Discourse instance,
49:22 discourse.bokeh.org, which is infinitely better. I mean, Discourse is great for users because
49:27 there's a lot of features for code highlighting, for math text, just, you know, all kinds of things.
49:31 We could imagine maybe putting in an extension to put actual Bokeh content into these Discourse posts.
49:35 But it's also great for us as maintainers because Discourse has a lot of information about what are
49:39 people searching for, you know, what topics are popular, that sort of thing. So that helps us know maybe
49:43 where attention needs to go. But just answering questions there, people want to go offer support
49:47 and help other people use Bokeh. That is also a huge deal. I think Bokeh has been successful because
49:52 we've had a few people that have been able to put a lot of time into helping, you know, the community.
49:55 But as the community grows, that's got to scale. It needs to have more and more people helping each
49:59 other. And so that kind of thing would also be a great way to contribute to the project. And so
50:03 there's all kinds of ways people can plug in. And we'd love to, you know, engage with anyone,
50:07 really, about any of those tasks.
50:08 Right, right. If you're a designer and you want to make the website look shiny,
50:12 that'd be great. If you want to make the graphs look better, or maybe you're a visualization
50:17 expert and you've got a different kind of graph you want to bring, whatever, right?
50:20 Yeah, absolutely. Or just even making new examples for the docs, you know, making really cool uses of
50:25 Bokeh to show off, to tweet about, to put in our docs and our gallery. I mean, there's all kinds of
50:29 ways to make very valuable contributions to the project just because there's a lot of things to do. And
50:33 you know, presently not enough people do them, probably never enough people do them,
50:36 right. But obviously, the more help we can get, the better.
50:39 Sure. So if I could summarize, you're willing to accept contributors to the project.
50:43 Yeah, absolutely. If I hadn't made that clear, yes.
50:46 That's awesome. Yeah, it's a cool project. It would be fun to work on.
50:50 Yeah.
50:50 As part of this Bokeh 2.0 thing and the dropping of Python 2, which I like to refer to as legacy Python,
50:56 and Python 3 just as straight Python. But as part of dropping legacy Python, one of the things you did,
51:03 this is kind of a trend in the data science space, not, I haven't seen it as broadly adopted,
51:06 and I'm not really sure why, you signed the Python 3 statement. You want to tell folks about that?
51:12 Yeah. So the Python 3 statement is just, you know, it's a GitHub repository where projects can go and
51:16 sort of make a PR to list themselves on this website. And it says, we're going to, we pledge to drop
51:20 Python 2 support, you know, by sort of this date or this timeframe, and support Python 3 going forward.
51:25 And so there are a lot of projects that have signed that. And it's interesting, I thought going in that
51:29 Bokeh was going to be maybe kind of a leader in this, I wanted to be fairly aggressive. But
51:33 all of a sudden, this year, a ton of projects have started releasing, you know, new releases that sort
51:37 of are cut off from that. And so things like, I think, Matplotlib, and yeah, and I think,
51:42 you know, Dask, maybe and I forget what else, but there's all these sort of big projects that are
51:46 just suddenly, you know, we're now behind the curve, right? They've already dropped Python 2 support,
51:50 and we're sort of lagging behind. But I think it's time. I mean, Bokeh is definitely used a lot in,
51:54 you know, analytics space. And I think things do move a little bit faster there.
51:57 Part of that is because of Conda and Anaconda and Conda Forge, they sort of push things forward.
52:02 I think also data scientists, you know, do a lot of exploratory work, and they're willing to sort of
52:06 move a little bit, you know, in that exploratory work, they're willing to sort of move and put up with
52:09 a little bit more change to get new features and to get better performance.
52:14 Once things get deployed, that's when things get a bit more sticky. And that's where you see a lot of
52:17 people still using Python 2 in, you know, finance and, you know, other venues like that.
52:21 Yeah, absolutely. I feel like with the data science exploration stuff and the models,
52:25 like the underlying technology is changing so quick there, right? Like TensorFlow has come out and
52:31 pandas and all these things are just changing so quickly that if you're going to come back to it,
52:36 you may want to just move to something new or shiny or better anyway. And it's just much easier to
52:41 stay on the later version of Python, whereas like, that website that that guy that used to
52:46 work here ran that now we just have to keep running like nobody wants to touch that, right?
52:51 As soon as you touch it, it's your problem to fix it if it ever has a problem. And nobody wants that
52:55 puppy, right? Yeah, yeah. So some of the projects that signed the Python 3 statement,
53:00 that's python3statement.org: TensorFlow, requests, XGBoost,
53:06 NumPy, IPython, like that kind of stuff, right? Cython, Spyder. There's a ton of projects here.
53:12 Yeah, it's great.
53:13 A lot of those projects already have. I thought we'd be sort of leading the pack, but we're actually
53:17 behind the curve. And that's what made it very easy for us to say, okay, you know,
53:21 Q4, it's going to be really easy for us to drop Python 2 because all these other projects will
53:24 have already dropped Python 2.
53:26 Right, right. For example, Tornado is in there and you guys are built on that. So in a sense,
53:29 they're kind of calling your, not calling your bluff, but making sure you're going to have to
53:32 follow along anyway if you want to stay in the latest of that, right?
53:35 Yeah, even NumPy, right? I mean, you know, obviously, we could pin to a lower version of NumPy, but we don't want to do that.
53:40 Yeah, of course, you wouldn't want to do that. Interesting. So we're just about out of time,
53:43 but you want to talk about Portland real quick? Sure. Yeah. Yeah. So we're both in Portland,
53:48 right? You recently, somewhat recently, not super recently, but you're somewhat new here and you're
53:54 trying to get some stuff going in the data science space in Portland as well, right?
53:57 Yeah, absolutely. Yeah. So I've been here about a year and a half and it's been a really great
54:00 experience being here in Portland. I really love it. But I am trying to get a PyData meetup here
54:06 started. In fact, we have a first meetup scheduled for, I think, August 14th. And we're going to
54:11 alternate sort of between an east side and a west side, you know, downtown location, hopefully,
54:15 every other month. But me and a colleague of mine are getting that off the ground. And I'm really
54:20 excited about it. So PyData is this series of meetups slash conferences, if the meetups get big
54:24 enough, that is sort of sponsored by NumFocus. And, you know, I've been involved with NumFocus since the
54:28 beginning. I think it's a terrific, amazing organization. The people that are there are really
54:31 great. And I think the PyData meetups in particular have been really, really great, you know,
54:35 both meetups and also the PyData conferences are also really good as well. So really excited to get that
54:39 started in Portland. I was almost kind of surprised that it wasn't here already. There's a, you know,
54:42 there's a PyData Seattle meetup and there's PyData meetup. There's like 105 PyData meetups,
54:47 I think, around the world. So it was well past time that Portland got one. So I'm really excited to
54:51 be helping get that off the ground. Yeah, that's awesome. I mean, that only really is interesting
54:55 to like 5% of the listeners, maybe. But it's still really cool that you're doing that here in
54:59 Portland. And, you know, other folks, they can create a PyData, their city's airport acronym if
55:05 they want, right? Absolutely. I think most places don't use the airport acronym,
55:08 but I'm really fond of PDX. So I figured PyData PDX was a good one.
55:11 Yeah, it's definitely a good one. All right. Well, you know, Bokeh is a really cool project,
55:15 and I'm glad you all have been working on it. And it's great to see all this progress and excitement
55:21 around it. It's a great, great one. So people should definitely check it out.
55:24 Yeah, yeah. Well, thank you very much for having me. I love to have the opportunity to sort of spread
55:27 the word and talk about Bokeh. And it's been really great.
55:30 Yeah, absolutely. Let me ask you the final two questions before you get out of here, though.
55:33 If you're going to write some code, probably Python code, but maybe JavaScript as well,
55:37 I guess. What editor do you use?
55:39 I have been won over by VS Code.
55:40 Okay.
55:41 Yeah, I still use vi bindings. So I grew up using vi and that's still in my fingers. And so I love
55:45 vi bindings. But I used to use Sublime Text, but I moved to VS Code and haven't looked back.
55:49 Yeah, that seems a pretty straightforward choice to go from Sublime to VS Code.
55:53 Those are, you know, one has so much more energy and they're super similar in their sort of workflow.
55:58 And then notable PyPI package, I'll go ahead and throw out there Bokeh for you. And what else? If
56:04 there's something like, hey, I ran across this and people might not know about it, but it's really
56:08 amazing. What would you say?
56:09 Yeah, let's see. Well, I'll say PyPI or Conda, right? So don't forget Conda. Very important
56:13 to remember. But, you know, it's hard to say. I'm so focused on, you know, using and working on Bokeh.
56:21 That's like my day to day. Honestly, I have a little bit of tunnel vision, maybe to put it one
56:25 way. But, you know, I think a lot of the tools that are built on top of Bokeh are really interesting to
56:29 me. And so I like, you know, looking at what is happening with them and seeing what developments
56:32 are going. Obviously, I think all the tools in the PyData ecosystem are amazing. I think Numba in
56:37 particular is really interesting. So Numba is a compiler for Python, lets you really accelerate,
56:41 you know, certain kinds of code. And it was originally created again by Travis, you know,
56:45 Oliphant, but it's moved on since then. And it's actually grown really successful in certain
56:50 kinds of venues. So I think Numba is a pretty interesting use case. And I certainly, of course,
56:53 think Dask is really fantastic as well. Yeah, those are definitely good ones. All right,
56:57 Brian, final call to action. People want to get started with Bokeh. What do they do?
57:01 Yeah, absolutely. Love to get people involved. So if you want to, you know, talk about development or
57:06 have questions about support, we have this Discourse, discourse.bokeh.org. If you want to just
57:09 get started from a very high level, just bokeh.org is a great one stop to get to a lot of other
57:14 resources like documentation, like the gallery, like the GitHub page, and just to see what's
57:19 going on. But in terms of like talking to us, yeah, the Discourse is a great spot to make a
57:23 post or topic there. And of course, GitHub is a great place if you have ideas or suggestions,
57:27 you know, or want to report problems, of course, you know, GitHub is a great place to contact us.
57:31 Yeah, for sure. And PRs are accepted.
57:32 PRs are always accepted. Yes.
57:34 Yeah, very cool. At least considered.
57:36 Considered.
57:37 At least considered for sure.
57:38 Cool. All right. Well, thanks so much for being on the show. It's good to talk with you.
57:41 Absolutely. Thank you so much, Michael.
57:42 Yeah, bye.
57:43 Bye.
57:43 This has been another episode of Talk Python to Me. Our guest on this episode was Brian Vandeven,
57:50 and it's been brought to you by Ting and Linode. Ting is the fast mobile network custom built for
57:55 technical folks. Use their savings calculator to see exactly what you'd pay. Visit python.ting.com
58:01 to get a $25 credit and get started without a contract. Linode is your go-to hosting for whatever
58:08 you're building with Python. Get four months free at talkpython.fm/linode. That's L-I-N-O-D-E.
58:14 Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps
58:21 course. Or if you're looking for something more advanced, check out our new async course that digs
58:27 into all the different types of async programming you can do in Python. And of course, if you're interested
58:31 in more than one of these, be sure to check out our everything bundle. It's like a subscription
58:35 that never expires. Be sure to subscribe to the show. Open your favorite podcatcher and search for
58:41 Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google
58:46 Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. This is your host,
58:53 Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write
58:57 some Python code.
59:01 We'll see you next time.