#212: Python in Web Assembly with Pyodide Transcript

Recorded on Tuesday, Apr 23, 2019.

00:00 It's been said that JavaScript is the assembly language of the web, but should you be required to write an assembly language or even JavaScript if you don't want to?

00:08 Most platforms have a dizzying array of options for programming them.

00:12 Not the front-end web world, but that tide may be turning, and WebAssembly could be the key to making it happen.

00:17 With WebAssembly, we have a new compilation target for web browsers, and Michael Droettboom from Mozilla and his team have decided to help bring

00:25 the Python scientific stack to the front-end world with Pyodide.

00:29 Dive into Pyodide on this episode, 212, of Talk Python to Me, recorded April 23, 2019.

00:36 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities.

00:55 This is your host, Michael Kennedy.

00:58 Follow me on Twitter, where I'm @mkennedy.

01:00 Keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython.

01:06 This episode is brought to you by Microsoft.

01:08 Be sure to check out what they're offering during their segments.

01:11 It really helps support the show.

01:13 Michael, welcome to Talk Python.

01:16 Thanks. It's good to be here. Thanks for inviting me.

01:18 Oh, you have such an interesting topic and thing that you've been working on.

01:21 I was really excited when I heard about Pyodide a little bit, I don't know, maybe six months ago

01:26 or when I first heard about it and stuff.

01:28 And I was like, oh, this has some real possibilities.

01:30 So I'm really excited to have you here to dig into that.

01:33 Cool.

01:33 Yeah.

01:33 Now, before we do get into that, let's start at the beginning.

01:36 Let's start with your story.

01:37 How did you get into programming in Python?

01:39 So I've been programming almost as long as I can remember.

01:42 I think my parents brought home like an IBM XT sometime in the mid-80s.

01:47 And I learned, you know, BASIC on that as one does.

01:50 And just have been programming ever since.

01:53 I found Python, I think, in 1996 while I was at university.

01:56 And I sort of used it secretly in the background to like prototype my assignments that I had to

02:02 write in other languages.

02:03 I would kind of write these quick little hacks as I'm learning to program in Python because it

02:07 was a lot easier for me.

02:09 And then once I had figured out the problem, I'd convert it to LISP.

02:13 I can now solve the syntax problem to make this happen over here, right?

02:16 Exactly.

02:17 Yeah.

02:17 What languages were you using?

02:18 There was a lot of LISP where I went to school, as well as Java, and ML was a big thing then,

02:25 which, you know, sort of grew into OCaml and that family of languages.

02:28 Yeah.

02:29 My first CS language that I only took a couple of CS classes as a minor type of thing,

02:34 but my first one was in Scheme, which is a derivative of LISP, right?

02:39 And I felt that it was both very mind shifting and very interesting and also super not useful.

02:46 Like what was your feeling of studying LISP early on?

02:49 Like I can't go build anything with this.

02:51 What is this crazy language?

02:53 Yeah, it definitely is kind of mind bending.

02:56 And, you know, I think it enforces a lot of good habits, like functional programming ideas

03:01 that you can bring into any language that are probably good habits to have.

03:04 But yeah, like you say, it's not always the most practical.

03:08 I know this is a question that usually is at the end of your show, but I use Emacs.

03:13 And so occasionally to write Emacs extensions, I'll bring out LISP.

03:17 And what's fascinating about it is Emacs is basically built by monkey patching everything else.

03:24 That's Emacs extensions, basically monkey patching is the functional thing that you do,

03:30 which I find interesting.

03:31 That is interesting.

03:32 Wow.

03:32 I guess tends to happen in LISP a lot.

03:35 Well, that's the only big LISP code I'm really familiar with.

03:38 Yeah, Emacs is definitely one of the bigger projects.

03:40 Quite, quite interesting.

03:41 Cool.

03:42 Okay, so you sort of used it.

03:44 You got into it pretty early, right?

03:45 Like, what is that?

03:46 Four or five years into the existence of Python, which for Python, you know, it grew grassroots.

03:53 It wasn't like busted on the scene by Microsoft or Apple or whoever, right?

03:57 That's pretty early.

03:57 Yeah, I went to my first Python conference in 2001.

04:00 And at that point, I don't remember the exact number, but I would guess it was maybe around

04:05 200 people, not much larger than that.

04:07 It was one track.

04:08 Everybody could fit in one room.

04:10 It was small enough that as a grad student, I was able to just walk up to Guido and have

04:15 a talk with him because there wasn't that many people demanding his attention.

04:19 And of course, now you go to PyCon and, you know, forget anything like that happening.

04:23 There's just thousands of people.

04:24 And yeah, it's completely crazy.

04:25 Yeah.

04:26 I mean, it's really fun to see how big the community has grown.

04:29 On the other hand, I do miss the days of the sort of smaller conferences where you feel

04:32 like you got a hold of it and you feel like you got a sense of seeing a lot of the stuff.

04:36 But yeah, these are good problems to have, I guess.

04:38 I guess they are.

04:39 How does it feel to do Python now versus then?

04:43 It's got to be kind of different with pip install, anti-gravity and all that.

04:47 Yeah, I think definitely having good package management means you're no longer compelled

04:54 to like bring everything into your projects in the same way.

04:57 So I used to be the developer on Matplotlib.

05:00 And one of its sort of, I guess, technical debt, for lack of a better word, is the fact

05:05 that for the longest time it was if someone had a great idea for a new plot type, we'd just

05:09 say, yeah, let's include it in Matplotlib.

05:11 Because the alternative was forcing all the users to install a bunch of packages, which

05:16 was really hard at the time.

05:17 So we would make this one really big package.

05:20 And, you know, in hindsight, that's not the greatest thing.

05:23 It means a lot of code for the core developers of that project to maintain.

05:27 So it is nice that we live in this world now where we can have lots of little packages that

05:31 interact.

05:31 And that's not a huge burden on the user as it once was.

05:35 Yeah, that's pretty interesting.

05:36 I mean, maybe it would have been better to have a bunch of little Matplotlib dash extensions

05:41 that you include in your requirements files if you want to do these kind of graphs and stuff.

05:45 And you could have kept it a little more distributed in terms of support.

05:48 But yeah, it's the right architecture and patterns for the right time, right?

05:53 And that's what you needed, yeah?

05:54 Yeah.

05:54 Cool.

05:54 So you both work at a cool web company and do data science.

05:58 So you kind of do both of the things that Python is really good at simultaneously, right?

06:03 Tell us about that.

06:04 Yeah, so I'm a data engineer at Mozilla.

06:06 I've been there about a year and a half.

06:07 And I work on the team that manages the telemetry that comes from our products.

06:12 So from Firefox on desktop and on Android and iOS and all these things.

06:17 You know, and the telemetry goes into improving the product.

06:20 It helps us discover when things are going wrong, whether things are getting better with changes

06:24 or worse with changes and things like that.

06:26 And so there's a whole team that manages collecting that data, ingesting that data, and then providing

06:31 ways for people to analyze it at the end of the day.

06:34 And what's really exciting about how they do that at Mozilla is we have this document called

06:39 our Lean Data Practices, where we try really hard to not collect anything that we don't

06:44 need to collect, not collect anything that will invade people's privacy.

06:48 We really are just collecting what we need in order to improve the product.

06:52 And so it's really nice to come at it from that point of view, not just to sort of snarf

06:55 it all up and see what we can do later, but to really think upfront, do we need to do this

07:00 and do we need to collect this?

07:01 Yeah, that's super cool.

07:02 Like maybe you had for a while, Firefox had the address bar and then like a search box,

07:07 right?

07:08 Most of the browsers have given up on this idea.

07:10 But you could tell that from telemetry, right?

07:12 How people are using each part and so on, yeah?

07:14 Exactly.

07:15 Yeah.

07:15 We're able to sort of see how the search bar is getting used even now that it's a unified

07:19 thing.

07:19 It's a really fun place to work.

07:21 The engineering talent there is just beyond anything I've ever had the privilege of working

07:26 with.

07:27 So I can imagine that it's super awesome.

07:28 I'm personally a big fan of Firefox.

07:30 I generally just use Firefox if at all possible.

07:33 And it frustrates me to no end where I go to places and they're like, this page is only

07:38 available on Chrome or this site only streams this video on Safari.

07:43 And you know there's no good reason for it other than they're just lazy to do like the

07:48 tiny bit of effort to make it work, right?

07:50 Yeah.

07:51 It's unfortunate.

07:52 You know, I think one of Firefox's biggest challenges is that unfortunately there's like

07:58 a network or a snowball effect, right?

08:00 It's that the more websites that don't work on Firefox, the fewer people are going to use

08:04 it.

08:05 And therefore, the fewer websites are going to work on Firefox.

08:07 And so trying to break out of that cycle is sort of a constant battle for us that we're

08:13 tackling on a number of fronts.

08:14 Props to you guys.

08:15 You're doing great stuff, keeping the web open.

08:17 I really am a big fan of Mozilla.

08:19 And I guess our topic today is just one more reason why.

08:23 Cool.

08:23 Yeah, absolutely.

08:23 So we're going to talk about WebAssembly and Pyodide.

08:27 But I think it maybe makes sense to just sort of lay out a little bit of the history of like

08:34 what even led to WebAs, not necessarily what led to WebAssembly, but like what preceded WebAssembly,

08:40 you know?

08:40 Sure.

08:41 Maybe I'll take it up to WebAssembly and you can take it from there.

08:43 So we have way back in, I don't know, 1995, browsers were things that looked at documents

08:49 and people imagine like what if they could do more than just show documents, right?

08:53 So Netscape, that company that used to exist, I also was a fan of them.

08:59 Guys at Netscape came up with JavaScript and it was named JavaScript not because it had anything

09:03 to do Java, but because Java was cool and hot.

09:05 So this is the scripty cool language.

09:07 In 10 days, like it was created in 10 days, start to finish and shipped in Netscape Navigator,

09:13 like three months later or something.

09:15 And boom, it oddly becomes the most popular language in the world and sort of becomes like

09:20 this assembly language of the internet or maybe C, I don't know how you think about it, like

09:25 just the base language of the internet, right?

09:29 And then some other projects came along, asm.js and some other ones, and showed that you

09:35 can take C code, compile it down to JavaScript and do incredible stuff.

09:40 Like there's a super, super funny video by Gary Bernhardt called The Birth and Death of JavaScript.

09:46 Have you seen this video?

09:47 Yeah, it's fantastic.

09:48 It's like 15 minutes of beautiful history, but humor all mixed together.

09:53 It's really, really insightful.

09:55 And basically in there, he shows things like, I don't remember the exact one, like Chrome

10:00 running in JavaScript inside of Firefox or the other way around, just like really interesting

10:04 stuff like C games in the browser and all that stuff is amazing.

10:08 So it really like showed the possibility of, you know, like the web really can be, and JavaScript

10:15 can be the foundation of not just jQuery and Angular, but Doom or Quake or Firefox, right?

10:24 Like incredible power.

10:26 But it doesn't make a lot of sense to compile stuff to JavaScript, to ship it down, to then

10:31 reinterpret it, to like then compile it and JIT it and run it.

10:34 Like, wouldn't it be better if there was a binary way to like compile for the web, right?

10:40 And so that's my, I guess my introduction to this idea of WebAssembly, right?

10:45 Yeah, that's a great little history.

10:47 I mean, most of that predates my coming to WebAssembly.

10:50 I've only been using WebAssembly for about a year and a half when this whole project started.

10:54 So fortunately, Mozilla has a lot of people who do work on WebAssembly and have been there from the

10:59 beginning.

10:59 So it's really nice.

11:00 Did it originate with Mozilla?

11:01 It did.

11:02 Yeah.

11:03 I felt like Rust and WebAssembly and all that stuff kind of came from you guys.

11:07 Exactly.

11:07 Yeah.

11:08 It originated at Mozilla, but it is definitely an open standard that all the browsers are supporting

11:13 and all that stuff.

11:13 Yeah.

11:14 Super cool.

11:14 So, I mean, maybe give the elevator pitch of like, what is WebAssembly for folks who don't

11:20 necessarily know?

11:21 Yeah.

11:21 So it's basically a binary format.

11:23 It's designed to be something that your compiler writes to.

11:28 It's not something you would ever write by hand or people do.

11:31 It's a target for something like a C compiler or a Fortran compiler or a Rust compiler, right?

11:36 Any of these compiled languages.

11:38 And when that gets shipped to your browser, the browser then converts that into native machine

11:44 code that runs on your machine.

11:45 But it's also running inside of the same browser sandbox that runs your JavaScript.

11:49 So it has the same sort of security space and constraints as JavaScript.
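
To make the "binary format" point concrete, here is a minimal sketch in Python of the one part of a WebAssembly file that is easy to show by hand: the 8-byte preamble (a 4-byte magic number followed by a 4-byte little-endian version). The helper names `wasm_header` and `looks_like_wasm` are made up for illustration and are not part of any tool mentioned in the episode; everything after the preamble is what a compiler would normally generate.

```python
import struct

WASM_MAGIC = b"\x00asm"  # the bytes 0x00 0x61 0x73 0x6D
WASM_VERSION = 1         # version field used by the MVP binary format


def wasm_header() -> bytes:
    """Build the 8-byte preamble that starts every WebAssembly module."""
    return WASM_MAGIC + struct.pack("<I", WASM_VERSION)


def looks_like_wasm(data: bytes) -> bool:
    """Check only the preamble; this does not validate the rest of the module."""
    if len(data) < 8 or data[:4] != WASM_MAGIC:
        return False
    (version,) = struct.unpack("<I", data[4:8])
    return version == WASM_VERSION
```

The preamble on its own is also the smallest valid module (one with no sections), which is why a check like this can only say "this is probably WebAssembly", not "this is safe or correct".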

11:54 Right.

11:55 One of the concerns people have is like, well, JavaScript is safe because it's this sandbox

11:58 thing and it only has so much in the language.

12:00 If you start running arbitrary binary code, all bets are off.

12:04 But it's not just like, here's some machine instructions.

12:06 It's, oh, here's a binary thing that runs in the WebAssembly world, right?

12:10 Yeah.

12:11 Yeah.

12:12 So these are details I don't know too much about.

12:14 But they definitely are making all these assurances that the sort of typical things you can do

12:21 in C that become security flaws, you cannot do in WebAssembly.

12:24 Or if you do them, you don't break out of the browser, right?

12:28 Right.

12:28 You just get an exception or something, right?

12:30 Exactly.

12:30 Yeah.

12:30 Cool.

12:31 Dan Callahan, another Mozilla person, gave a pretty interesting call to action around WebAssembly

12:38 last year in one of the PyCon keynotes at US PyCon.

12:41 Were you there for that?

12:43 I wasn't there.

12:44 And actually, what's interesting about that is I had already been working on Pyodide for

12:48 a few months when he gave that talk.

12:50 And he and I were not aware of each other at all.

12:52 It just sort of explains how big a place Mozilla is.

12:54 Yeah, yeah, yeah.

12:55 For sure.

12:55 You know, nobody's fault at either end at all.

12:58 And you guys also have like PyPyJS, which I don't think is active anymore.

13:02 But there's like a lot of these little flowers blooming in that world, right?

13:05 Exactly.

13:06 Exactly.

13:07 And so it was really cool to see his talk and sort of realize we were thinking along the

13:11 same lines.

13:12 And we sort of check in with each other periodically on what's going on there, which is great.

13:16 That's cool.

13:17 So I guess the quick summary of that, I'll link to the whole half hour presentation.

13:21 But it was like, Python is amazing.

13:22 We love Python.

13:23 But the web is one of the most important places where code runs right now.

13:28 And running in the browser, Python is sadly absent for the most part.

13:33 I mean, we have Skulpt and a few of those other things like PyPyJS.

13:36 But they're always in some kind of like seven caveats and some little sliver of use case, right?

13:41 What I was hoping for when I watched this was like, okay, this is the buildup.

13:45 Please let this be an announcement.

13:47 Like, please let this be an announcement involving WebAssembly.

13:49 And it just turned out to be a community.

13:51 We need to work on this.

13:52 And I thought it was really awesome when I saw Pyodide come out.

13:54 I'm like, oh my gosh, they actually were working on something.

13:56 But I guess since you didn't know about each other, you couldn't really.

13:59 It's not like you could do the big reveal of like, and Pyodide's a good start or something, right?

14:03 Yeah, yeah.

14:04 It would have been a little bit of a different talk, I guess.

14:07 Yeah.

14:07 But still, I mean, the points he makes are great points in terms of, I think, you know,

14:12 the web is where so much computing happens these days that if you aren't playing in that space,

14:17 you know, you are becoming limited.

14:19 And yeah, and like you say, there's been a bunch of other projects to bring Python to the web browser.

14:24 Yeah.

14:24 The thing that makes Pyodide a little unique is that it tries to be as close to upstream as possible.

14:30 So it's using upstream CPython, the upstream versions of NumPy and SciPy and all these things,

14:35 and tries to change them as little as possible so that the effort on those projects contributes

14:40 into our effort directly rather than reinventing and then constantly having to keep up with that,

14:47 right?

14:47 Right.

14:47 Yeah.

14:47 Like we spoke earlier, there's so much benefit to having all these packages.

14:51 And every day there's just more to grow upon.

14:54 But if your job is like, we have to have our own copy and implementation of Matplotlib in JavaScript,

14:59 like nobody wants to do that job, right?

15:03 Yeah.

15:03 That's just so many years of effort that I think, you know, it would always be a poor imitation of the real thing, right?

15:09 So it's...

15:09 Yeah, it'd also be behind as well.

15:11 Like TensorFlow came out, but we haven't written the JavaScript TensorFlow yet, so forget that or whatever, you know?

15:16 Something like that.

15:17 Right.

15:18 And you see this a little bit even with like PyPy.

15:20 I mean, PyPy is incredibly impressive.

15:22 Really cool project.

15:23 But they're still at the 3.6 level of syntax because they're just always sort of following.

15:29 It's just sort of in the nature of what they're doing.

15:31 And I don't mean that as criticism.

15:32 But if you aren't tracking the leader, you're always going to be a little bit behind, right?

15:37 Right.

15:37 Right.

15:37 Right.

15:38 So I guess before we move off of just WebAssembly on its own and dig into Pyodide.

15:43 How well supported is it?

15:45 Like WebAssembly sounds like new and futuristic.

15:47 How well supported is this?

15:49 It's in all the major browsers, in the stable versions of all the major browsers right now.

15:54 So it's pretty easy to rely on it.

15:57 It's at what they're calling sort of this MVP level of WebAssembly.

16:00 They sort of decided which features were the most critical, and that's everywhere.

16:05 Then there's a bunch of ways in which WebAssembly is already planning to be improved that will eventually trickle down to browsers.

16:13 So things like threading are newer features that are coming down.

16:17 And garbage collection is going to be added.

16:19 So there's a bunch of things that are coming that you can't rely on yet.

16:23 But for the core stuff, and actually for most of the stuff we needed for Pyodide, it's already there.

16:29 This portion of Talk Python is sponsored by Microsoft and Visual Studio Code.

16:33 Visual Studio Code is a free, open-source, and lightweight code editor that runs on Mac, Linux, and Windows with rich Python support.

16:40 Download Visual Studio Code and install the Python extension to get coding with support for tools you love like Jupyter, Black formatting, Pylint, pytest, and more.

16:48 And just announced this month, you can now work with remote Python code bases using the new Visual Studio Code remote extensions.

16:55 Use the full power of Visual Studio Code when coding in containers, in Windows Subsystem for Linux, and over SSH connections.

17:03 Yep, that's right.

17:04 Autocompletions, debugging, the terminal, source control, your favorite extensions.

17:08 Everything works just right in the remote environment.

17:10 Get started with Visual Studio Code now at talkpython.fm/Microsoft.

17:17 So I opened up Can I Use, and I'll put a link to the WebAssembly report for Can I Use, which talks about which browsers support what.

17:25 So it looks like Edge, Firefox, Chrome, Safari, Opera, all those desktop browsers support it.

17:30 Safari on iOS supports it, and on Android, Chrome and Firefox support it there.

17:36 That kind of sounds like 99%, right?

17:39 Yeah, yeah.

17:39 I'm excited to see things like threading and stuff coming as well.

17:42 There's a possibility for more interesting stuff to come along.

17:46 Yeah.

17:46 Cool.

17:47 All right.

17:47 So that brings us to Pyodide.

17:49 Now, there's a couple of interesting projects that are saying WebAssembly plus other language plus interesting runtime means something in the browser.

17:58 What did it mean for you guys?

18:00 The reason Pyodide started is when I arrived at Mozilla, Hamilton Ulmer and Brendan Colloran, and also William Lachance and Teon Brooks, were working on this sort of internal skunkworks project for data science at Mozilla.

18:13 The idea was data scientists at Mozilla were using a lot of things like Jupyter Notebooks, or there's a tool called Databricks that's very similar.

18:23 And the problem with these tools was that sharing them is harder than maybe it needs to be because you have a front end on the web, but your actual computation is happening somewhere else.

18:35 It might be on the same machine, but often it's a remote kernel somewhere else.

18:39 Right.

18:39 So you need maybe access to that compute cluster, or if you want to run it locally, you've got to pip install a bunch of stuff, right?

18:46 You're like, oh, you can run this.

18:47 It's easy to run, except for you have to now set up a virtual environment.

18:50 You probably should use Miniconda.

18:52 And here's your stuff.

18:53 You're like, whoa, whoa, whoa.

18:53 I just want to look at the report.

18:55 What is this, right?

18:55 Like, for a lot of folks, that's super overwhelming.

18:57 Yeah, your choices are generally, you either require people to install, which, like you say, it's very difficult, or you have to sort of pay for some cloud computing resources somehow.

19:08 And so if you were to put that on, say, a public website to share your data science, you might end up with an unexpectedly huge bill, perhaps, right?

19:17 Right.

19:17 The worst case thing happens, exactly what you want, is a lot of people get interested in it.

19:21 Exactly.

19:22 Exactly.

19:22 And so the idea with Iodide was, let's move all the computation into the browser, and then the computation's all happening at the edges in people's clients, right?

19:30 Right.

19:31 And to be clear for people, it's Iodide, not Pyodide, right?

19:34 Exactly.

19:34 This is another project.

19:36 Yeah, okay.

19:36 Yeah.

19:37 Yeah.

19:37 So that's Iodide.

19:39 It's sort of this data science user interface front end, right?

19:42 And the first iteration of that, of course, was using JavaScript to do the data science.

19:49 And you start to realize, well, JavaScript's not a great language for that.

19:53 There's not this sort of mature ecosystem of libraries and tools you can just pull off the shelf like there is in Python or R or Julia, right?

20:01 And then there's even some kind of really nice to have language features that JavaScript doesn't have.

20:07 Like it doesn't have operator overloading, which is really handy when you're doing a lot of numerical computation.

20:12 Right.

20:12 And numbers are weird, right?

20:14 Like you can't have true integers, for example, and other stuff.
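
As a small illustration of the two language features just mentioned, here is a sketch in Python. The `Vec2` class is a made-up example, not something from Iodide or Pyodide. The second half contrasts Python's arbitrary-precision `int` with 64-bit floats, which is the representation JavaScript's `Number` type uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    """A tiny 2D vector that overloads + and * (hypothetical example type)."""
    x: float
    y: float

    def __add__(self, other: "Vec2") -> "Vec2":
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar: float) -> "Vec2":
        return Vec2(self.x * scalar, self.y * scalar)


# Operator overloading lets numeric code read like the math it expresses.
total = Vec2(1, 2) + Vec2(3, 4) * 2

# Python ints are exact at any size; 64-bit floats stop being exact past 2**53.
big = 2 ** 53
int_is_exact = (big + 1) != big
float_loses_precision = (float(big) + 1) == float(big)
```

Without overloading, the first expression would have to be spelled with method calls, something like `a.add(b.mul(2))`, which gets hard to read quickly in numerical code.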

20:17 Yeah.

20:17 Exactly.

20:18 And there's a lot of movement in that space trying to make that all better.

20:22 But that's kind of a big lift.

20:24 And definitely the Iodide project wants to encourage that.

20:26 And we're working on that kind of in one thread.

20:29 But in the meantime, we thought, why don't we try and come to data scientists where they are, which is in Python, and somehow bring Python to the browser.

20:38 And this seemed like a very crazy idea to me when it was first raised.

20:42 But fortunately, like I said, at Mozilla, we have a bunch of WebAssembly experts who got on a meeting with us and said, it can't be that hard.

20:50 Other people have done things much more difficult than that.

20:53 So why don't you just try it?

20:54 So I went off and found this project by a GitHub user named dgym called cpython-emscripten, which had done a lot of the initial footwork on this.

21:06 And starting with that, was able to get something going in probably a couple weeks.

21:11 That's kind of like that 80% in 20 weeks.

21:13 And then the remaining 20% is, sorry, in two weeks.

21:17 And then the remaining 20% takes forever.

21:18 But certainly getting to the proof of concept was pretty quick.

21:22 And realizing, hey, you can actually compile the real CPython interpreter and get that to run and then have the real NumPy loading in.

21:29 And all those things actually do kind of work.

21:31 It was pretty exciting.

21:32 Yeah, that's pretty amazing.

21:34 And how much of CPython, like how big is the CPython .wasm file or whatever it's called?

21:41 Like the core runtime bits you got to take down before you can start doing stuff, just roughly.

21:46 Yeah, roughly.

21:47 So I'm actually pulling up those numbers because they change all the time.

21:50 It's about 20 megabytes for Python core.

21:53 And then, of course, the libraries you're going to pull in will add to that.

21:57 So NumPy is around 8 megabytes.

21:59 It goes from there.

22:01 But one of the things that Pyodide does is it only downloads the libraries you actually import.

22:06 I see.

22:06 If you don't import Matplotlib or NumPy, like those are not things it has to go hit, right?

22:11 Exactly.

22:11 And also your browser will cache those things.

22:15 So the first time you'll pay the network penalty.

22:18 But once it's been done once, it's on your machine.

22:20 And it will actually recompile the WebAssembly each time, but it doesn't actually have to download it again.
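
The download behavior described here can be sketched as a toy model in Python. This is not Pyodide's actual loader: the function names are hypothetical, the sizes are just the rough figures quoted above (about 20 MB for the core, about 8 MB for NumPy), and the `cached` set stands in for the browser's HTTP cache.

```python
import ast

# Rough sizes in megabytes as quoted in the conversation; they change over time.
CORE_MB = 20
PACKAGE_MB = {"numpy": 8}


def imported_packages(source: str) -> set:
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def download_mb(source: str, cached: set) -> int:
    """Estimate megabytes fetched over the network for one page load.

    `cached` holds what is already on the machine ("core" or a package name)
    and is updated in place, so a second load of the same code fetches nothing.
    """
    wanted = {"core"} | (imported_packages(source) & set(PACKAGE_MB))
    total = 0
    for name in sorted(wanted - cached):
        total += CORE_MB if name == "core" else PACKAGE_MB[name]
    cached |= wanted
    return total
```

With an empty cache, code that imports NumPy would cost roughly 28 MB on the first visit and nothing on the second, while code with no heavy imports pays only for the core. The model deliberately ignores the recompilation step, which happens on every load regardless of caching.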

22:25 Cool.

22:26 So that seems like a really good opportunity for a CDN.

22:29 And the fewer CDNs that host this, the better, right?

22:32 Like the more it is shared, the better, honestly.

22:36 You know, when I think about these types of things, one thing when I speak to folks like you, because you actually are within a sphere of influence that could possibly have some kind of difference.

22:45 It's cool that you can put that on a CDN and you can download it and make it go.

22:49 But, you know, what if like JavaScript is baked into Firefox?

22:55 What if CPython in WebAssembly was part of Firefox?

23:01 Like when I update Firefox, I get that latest runtime and I don't have to download it ever.

23:06 I don't know if it's a silly idea or not.

23:08 I mean, I remember from, you know, probably at least 20 years ago, people have talked about JavaScript's not a great language.

23:15 Let's put a better language in the browser instead, right?

23:17 That's been a theme for a very long time.

23:20 And what's interesting is now that we have WebAssembly, we sort of no longer have to pick one.

23:25 Which is a nice situation to be in.

23:27 We can get any of them in there.

23:29 Are there ways that it could be done like more efficiently or more conveniently?

23:33 Certainly.

23:33 Like I think, you know, maybe there'll be some like web extension you download that gives you languages so that it just kind of will always update those in the background and you don't have to worry about this kind of stuff.

23:44 There's ways that you could make it almost like a browser just with a little bit of extra added to make.

23:49 Yeah, I'm not merely proposing Python in Firefox.

23:52 I'm proposing Firefox come with these preloaded or like you say in the background, like preload the latest one or something.

23:58 And then you could have Python.

24:00 You could have C Sharp.

24:01 You could have all the languages that are like building these runtimes in WebAssembly and make them available.

24:07 And just I think that would be super cool, actually.

24:10 Yeah, it definitely would open up the possibility for, say, writing web applications using this technology.

24:17 So one of the things I do warn people about Pyodide is the thing that's cool about Pyodide is you can actually see the Python in your web browser and run it there, which is great for the data science use case.

24:28 Right.

24:29 But if you don't have a use case where you need to show people the code, this is probably not how you want to write your web app.

24:36 Yeah, for sure.

24:37 You probably want to stay in a JavaScript world, which is a lot more efficient and a lot faster and all these things, right?

24:42 Yeah, so Pyodide is very much focused on basically what you do with Jupyter notebooks, but make that execution happen on the client side in the browser, not connected back to a Docker thing or some kernel elsewhere, right?

24:56 Like that is the use case and that's what it's built for and optimized.

25:00 Exactly.

25:00 That's the original use case that came out of Iodide.

25:03 What's interesting is since then, we've been talking with the WebAssembly folks again who are really pushing the idea of WebAssembly as a containerization technology.

25:15 So WebAssembly that doesn't actually run in a browser but would run on cloud computers, right?

25:22 Yes.

25:22 Because it provides a really nice sandboxed way of running arbitrary code, but it does it in a way that's actually a lot lighter weight than Docker, right?

25:31 So like Docker is maybe the industry standard for this right now, but Docker essentially you're taking a whole Linux distribution and a whole OS and shoving that in a container and passing that around.

25:41 Right.

25:41 And you require the kernels to match of the host and the container, right?

25:46 Whereas WebAssembly doesn't care.

25:48 You know, there are projects like Wasmer that are bringing this to Python already, so it's not a far-fetched idea.

25:54 No, it's not.

25:55 I think it has a lot of promise, and for our own little data science community, what excites us about it is we can use the browser as like a prototyping tool where the computation is happening locally and it's really fast, maybe while you're working on a small part of your data.

26:11 But then to be able to very smoothly say, I'm now going to run that on a cluster without having to change anything and having it still be built on the same technologies could be a really powerful thing.

26:22 This is all kind of in the pie-in-the-sky dreaming stage for us.

26:26 Yeah, but it's a sweet pie.

26:28 It looks really nice.

26:29 Yeah.

26:29 Exactly.

26:29 You could bring in things like Dask as well to help you do distributed computation without people really even knowing or caring that that's happening.

26:38 So there's a lot of stuff that could like expand, expand to the servers, expand to clusters, and so on.

26:42 Exactly.

26:43 Yeah.

26:43 Cool.

26:44 So I've seen some cool examples of this already working.

26:47 Like I pulled up the, was it LA data or some kind of like city map, the one that came from your article that you recently published, but there's a live example and a link to that.

26:55 It's pretty interesting.

26:56 It's doing real data science.

26:58 It's doing real computation.

26:59 So it takes like 15 seconds to load, right?

27:02 Yeah.

27:03 Even if you've already cached the download, it's like computing for 15 seconds.

27:06 So you got to be patient.

27:07 But then it comes up and it's got a really wonderful graph and visualizations.

27:11 Like how do you do the visualizations?

27:14 There's probably a lot of JavaScript and other tie-ins there, right?

27:17 Yeah.

27:17 So it actually depends.

27:19 There's a lot of different ways you can do it.

27:22 But that example you're talking about with the map and the call data, there we're using a JavaScript library called Regal, which is sort of a WebGL front end to do that 3D plotting.

27:33 And it's nice because it uses the hardware to do the 3D acceleration and works really well.

27:37 But you can just as easily use Matplotlib, like the exact, full Python Matplotlib, and do your plotting there.

27:45 And that all works.

27:46 There's only maybe 300 lines of code that kind of glue Matplotlib to a web canvas that had to be written to make that work.

27:54 But otherwise, that's pure Python doing your plotting.

27:57 But then we've also played around with tools like Plotly, which is a JavaScript plotting library.

28:03 One of the things that Pyodide does really well is it can share your data between Python and JavaScript without copying it.

28:09 So it's really fast to share the data back and forth.

28:12 So you can do like all your heavy lifting computation in Pandas in Python and then just ship that over to the JavaScript side for plotting and kind of mix and match.
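
As a rough illustration of what "sharing without copying" means, here is a minimal sketch that uses only the Python standard library. It is not Pyodide's actual mechanism: `memoryview` stands in for the JavaScript side seeing the same memory, and the variable names are made up for the example.

```python
import array

# A block of float64 values, standing in for the data behind a NumPy array.
data = array.array("d", [1.0, 2.0, 3.0])

# A view shares the same underlying memory: nothing is copied.
shared_view = memoryview(data)

# A list is an independent copy of the values at this moment.
independent_copy = data.tolist()

# Mutating the original is visible through the view, but not through the copy.
data[0] = 99.0
view_sees_change = shared_view[0] == 99.0
copy_sees_change = independent_copy[0] == 99.0
```

The point of the sketch is only the cost model: creating the view is independent of the data size, while the copy grows with it.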

28:23 So it's kind of exciting that like you no longer have to say, well, I'm working in Python.

28:27 So my choices for plotting are Matplotlib and maybe Bokeh and whatever the Python ones are.

28:31 You can choose from both Python and JavaScript ecosystems.

28:35 Maybe in this crazy future, you could also import some other WebAssembly based visualization thing that you don't even know what it's written in, right?

28:43 Like who knows?

28:43 Maybe it's in Swift or something crazy, right?

28:45 But yeah, interesting.

28:46 Yep.

28:46 Okay.

28:47 One thing I want to ask you is like if I'm a data scientist and I'm listening to this and I'm like super excited, should I be excited today or should I be excited in like a year and a half?

28:55 Is this something I can reasonably use now or is that a cool proof of concept or like what's the status?

29:00 I would say you have to have a little bit of patience still.

29:03 Okay.

29:04 You know, we certainly encourage people to come up and try to do the kind of things they're doing in Jupyter now in Iodide and Pyodide and sort of see where some of the rough edges are.

29:14 We do actually have it.

29:16 It is being used for real work within Mozilla for data science and people are using the Python parts of it and stuff.

29:22 So if you sort of know where the boundaries are and what you can get away with, it's already working.

29:27 It's sharp and jagged over there.

29:29 Don't go over there.

29:30 Exactly.

29:30 Yeah, yeah, yeah.

29:31 Right.

29:32 But I would be being disingenuous if I said it was ready for everything that people might want to do.

29:36 Sure.

29:37 I guess the way you find out and the way you get it ready is people try to use it and you're like, wait, everyone's trying to do this and it doesn't do that.

29:42 Well, maybe that's something it gets.

29:44 Exactly.

29:44 No, it's really helpful for us, actually, because we've been getting some really great bug reports.

29:49 We had a blog post last week that kind of brought a lot more traffic to the site and that turned into a lot of really great bug reports of things that seem obvious in hindsight, but we had never thought to check, is that going to work?

30:00 Yeah.

30:01 That's really helpful.

30:02 And also, it helps us prioritize what features need to be added.

30:07 If 10 out of the 20 people that show up all have problems with the same thing, well, that's a pretty good sign that that's where we should focus effort.

30:13 Yeah, that makes total sense.

30:15 Are there full-time employees working on it at Mozilla?

30:18 What is its status as a project for you all?

30:21 So there's probably a total of about three FTEs working on it divided among five people within Mozilla, and it's still primarily sort of an internally devoted project in that our internal users are kind of helping us prioritize what gets worked on and move forward.

30:39 We do, of course, have the public-facing website at iodide.io where anybody can come up and create notebooks, and we do look at that, too.

30:47 And it's all open source.

30:48 Yeah.

30:49 And we're just sort of hoping that if we can bootstrap it enough internally and prove it as a useful tool, then we can get, hopefully, some more resourcing to kind of make it something that will serve a broader community.

31:00 Yeah, that's super cool.

31:01 So if I am a person who maintains or is somehow in charge of a data science package that is not available in there, I'm guessing there are some that are not available.

31:11 Is that right?

31:12 Yeah.

31:12 Yeah.

31:13 Yeah.

31:13 So if I am, how do I get mine?

31:15 Like, I run, you know, whatever.

31:17 And I really want that available alongside NumPy.

31:20 How do I make that happen?

31:21 So right now, all our package building is built as part of kind of this monolithic tree of make files.

31:28 So to add a new package, you basically add it to the Pyodide source code, and then that causes it to automatically get built and then eventually distributed.

31:37 Where I'd really like to move is to something that works more like conda-forge, and maybe even literally use conda-forge if we can make that work, so that anybody could just walk up and contribute a package sort of in their own repo and that would automatically get picked up and then distributed.

31:53 So it would be a more like distributed build system than what it is now.

31:57 Oh, yeah.

31:58 That's pretty interesting.

31:59 Yeah.

31:59 That's a cool idea.

32:00 Yeah.

32:00 To just basically have approved sources for packages and you guys just continually pull them kind of like, like you said, like a CI system almost.

32:08 Exactly.

32:08 Yeah.

32:09 Yeah.

32:09 So you'd have individual package maintainers who could maintain their own package, but the sort of infrastructure that makes that all work would be centralized.

32:18 That's the idea.

32:19 Yeah.

32:19 Right.

32:19 Is it hard?

32:20 If I have something that has like some C section and some Python section, what's the overhead to get it working in this?

32:28 It varies.

32:29 So like NumPy that has C, but it's fairly straightforward.

32:33 C is not too bad.

32:36 One of the things we've really struggled with is SciPy because SciPy actually has a fair bit of Fortran.

32:42 And so Fortran is its own whole other set of problems for WebAssembly.

32:47 There's basically not a good compiler option for anything that's not Fortran 77 right now.

32:54 So we have to kind of push that forward somehow.

32:56 So there's kind of this range of easy to hard and it's hard to know maybe up front how hard something's going to be.

33:02 Pure Python stuff is very easy.

33:04 Pure Python.

33:05 There's actually even a little helper script where, if you run it with the name of the package on PyPI, it will automatically generate the make file needed to build it as part of Pyodide, and it automatically goes in.

33:17 You know, you submit that as a PR and you're good to go.

33:20 That's pretty cool.

33:21 So say, suppose I have a pure Python package on PyPI and I want to have it in there.

33:27 Does it get somehow compiled to WebAssembly?

33:31 Does WebAssembly itself just like become an interpreter and just use Python bytecode?

33:36 Like, do you know what the process there is?

33:38 Yeah.

33:39 So Pyodide is literally running the Python interpreter inside your browser.

33:44 So if you have a pure Python package, it's actually just shipping Python to that interpreter running in the browser.

33:51 And that's how it runs.

33:52 So it basically just works off PYC bytecode.

33:55 Exactly.

33:56 And it just feeds it off to the interpreter, which just happens to be executing not as native C but in WebAssembly.

34:00 Exactly.
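
The flow described here (source goes in, bytecode comes out, the interpreter executes it) can be sketched with the standard library alone. This is a minimal illustration rather than Pyodide's packaging code; the `mean` function and the `<pure_python_package>` file name are placeholders invented for the example.

```python
import marshal

# Stand-in for the source of a pure Python package.
SOURCE = """
def mean(values):
    return sum(values) / len(values)
"""

# Step 1: compile the source text into a code object (bytecode).
code = compile(SOURCE, "<pure_python_package>", "exec")

# Step 2: serialize and reload it, roughly what a .pyc cache file stores.
payload = marshal.dumps(code)
reloaded = marshal.loads(payload)

# Step 3: the interpreter executes the bytecode. Nothing in these steps
# depends on whether the interpreter was built for native code or WebAssembly.
namespace = {}
exec(reloaded, namespace)
result = namespace["mean"]([1, 2, 3, 4])
```

No per-package compilation to WebAssembly appears anywhere in the sketch, which matches the description above: only the interpreter itself has to be compiled.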

34:01 That was my first guess.

34:03 But I thought maybe there's some other magic like, oh, we had to add a JIT compiler to this or, you know, some weird thing.

34:08 No.

34:09 And so for that reason, what you're getting is something that performs pretty similar to CPython, right?

34:16 Something like PyPy.js has, of course, opportunities to JIT the Python itself and potentially get a lot more performance.

34:23 So we're not doing anything sophisticated like that in Pyodide at this point.

34:27 Yeah.

34:28 Yeah.

34:28 So because we have a really nice JIT sitting there in the browser, it's certainly there and could be used at some point.

34:34 It's certainly possible.

34:35 I know Brett Cannon worked on Pyjion, I think is what it was called.

34:39 Yeah.

34:39 Which was looking at some of the JavaScript JIT stuff and seeing how that could be applied back into the CPython runtime.

34:47 I don't know.

34:48 Maybe there's some synergy there.

34:49 Who knows?

34:49 Because it sounds like they're all kind of swimming in the same soup of ingredients or whatever.

34:54 Absolutely.

34:55 Yeah.

34:55 What does the performance of WebAssembly relative to JavaScript look like?

35:00 So if I had like my Quake example and I ran Quake over asm.js and I ran it in WebAssembly, what would I get?

35:08 I don't actually know those numbers.

35:10 I'm sure that WebAssembly at this point is quite a bit better than asm.js.

35:14 The numbers I do have is what happens to Python.

35:17 So like comparing Python running natively on a machine versus inside the browser.

35:23 And you generally get anywhere between like the same speed and 12 to 20 times slower.

35:30 Okay.

35:30 And what seems to matter.

35:32 So if your Python application is mainly just calling NumPy operations, which are at the bottom sort of C tight inner loops.

35:40 Yeah.

35:40 That stuff tends to be pretty much the same speed.

35:44 Right.

35:45 Those tight C loops tend to be pretty much the same in WebAssembly as out of WebAssembly.

35:49 If you're doing a lot of looping in Python or calling a lot of Python functions, those things tend to get quite a bit slower.

35:55 And the reason is that the Python interpreter is basically calling a lot of C function pointers all the time.

36:04 That's sort of kind of how it works.

36:05 It's calling other C code through C function pointers.

36:09 PyObject* all over the place.

36:10 Yeah.

36:11 Exactly.

36:11 Yeah.

36:12 And calling a C function pointer in WebAssembly is quite a bit slower than it is on native.

36:17 Okay.

36:18 For reasons that I don't know if I could fully articulate, but probably partly related to the security model.

36:24 Like there's just ways in which that's a lot slower.

36:27 And unfortunately, the way the Python interpreter is designed is making lots of C function pointer calls all over the place.
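
To make the distinction concrete, here is a minimal sketch of the two styles being contrasted: a loop driven by the interpreter, step by step, versus the same work pushed into a single builtin whose loop runs in C. The function names are made up for the example, and the timing helper reports whatever your machine measures; no particular ratio is claimed, in native Python or in the browser.

```python
import timeit


def total_with_python_loop(n):
    """Add up 0..n-1 with an explicit loop; the interpreter dispatches every step."""
    total = 0
    for i in range(n):
        total += i
    return total


def total_with_builtin(n):
    """Same result, but the loop itself runs inside the C-implemented sum()."""
    return sum(range(n))


def time_both(n=100_000, repeats=5):
    """Return (loop_seconds, builtin_seconds), taking the best of several runs."""
    loop_seconds = min(
        timeit.repeat(lambda: total_with_python_loop(n), number=1, repeat=repeats)
    )
    builtin_seconds = min(
        timeit.repeat(lambda: total_with_builtin(n), number=1, repeat=repeats)
    )
    return loop_seconds, builtin_seconds


if __name__ == "__main__":
    loop_seconds, builtin_seconds = time_both()
    print(f"python loop: {loop_seconds:.6f}s, builtin: {builtin_seconds:.6f}s")
```

Going by the discussion above, the first style is the one expected to pay the larger penalty under WebAssembly, because it spends its time in interpreter dispatch rather than in a tight C loop.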

36:33 Right.

36:33 But it might not matter.

36:34 It depends.

36:35 Right.

36:35 Exactly.

36:36 So five times slower might make it go from half a millisecond to, I don't know, 2.5 milliseconds.

36:43 And like the user doesn't perceive these or care about them.

36:46 Right.

36:46 But now all of a sudden you have this great new deployment story and this execution engine.

36:49 And like what it opens up is way more valuable than that amount of slowness potentially.

36:53 Exactly.

36:54 It's certainly within the realm of like I can live with that.

36:57 You know, it's not hundreds of times slower.

36:59 Right.

36:59 Yeah.

37:00 Yeah.

37:00 Interesting.

37:01 Cython.

37:01 Can I do some, like I've got some doubly nested loop and I know that that's the problem.

37:08 Can I Cythonize that puppy and then WebAssembly the result or something?

37:13 Yeah.

37:13 Yeah.

37:14 So Cython works just fine.

37:16 And in fact, like Pandas is largely written in Cython.

37:20 So in order to get that to work, that needed to work.

37:22 But all of that compilation happens ahead of time on the native machine before shipping it to the browser.

37:30 We don't actually have the ability to compile Cython code inside of the browser.

37:34 Right.

37:34 But as a package developer, maybe I could leverage that. So where there's a penalty in Python versus C now, the penalty may be worse in WebAssembly, but maybe the answer is Cython still.

37:46 Absolutely.

37:47 Yeah.

37:48 Okay.

37:48 Yeah.

37:48 Totally.

37:49 And then of course, people have gotten the Clang compiler to run in WebAssembly.

37:54 So theoretically, we could get, we could put that there as well.

37:59 And then we could send it Cython code and get something back and maybe run that right away.

38:04 Like that's a little bit getting into crazy territory maybe, but who knows?

38:09 Yeah.

38:09 Well, it's turtles all the way down or WebAssembly is all the way down or something like that.

38:12 Right.

38:13 Right.

38:13 Yeah.

38:13 Interesting.

38:14 Both having all of this experience and expertise in Rust and Rust being so built for WebAssembly, or being so well paired with WebAssembly.

38:25 And then also some projects trying to do, say, CPython's runtime in Rust.

38:32 I guess the question I'm rambling around is to say, if I take Rust and like rethink CPython, what do you think the possibilities are there in this context?

38:43 Yeah.

38:43 I mean, I think what's exciting about Rust is because it's a newer technology, unlike C, it's a lot easier to get into WebAssembly because there's just not as much baggage.

38:56 And that's kind of Rust's advantage on WebAssembly.

38:58 Plus, there's the fact that I think Rust and WebAssembly both coming out of Mozilla with a lot of people overlapping between those two communities has really helped make that story much smoother.

39:10 You know, for example, if I was going to write something from scratch that I wanted to run in WebAssembly, I would absolutely reach for Rust and not C in this day and age.

39:17 It just is going to be an easier experience.

39:19 So, like you say, there's a project, maybe more than one, I don't know, to rewrite the CPython interpreter in Rust, right?

39:28 So then that would help us build something like Pyodide a lot easier because it's in Rust.

39:33 We don't have to deal with a lot of the sort of niggly details we've had to deal with in C.

39:37 My worry there is historically, whenever people write an alternative Python interpreter, it's really hard for it to catch on because there's so much catch up to do.

39:48 You know, if they can, I think the sort of uphill battle for that project, and I think I've read about the project and the author is like clearly doing it for fun.

39:57 Right, exactly.

39:58 You don't have to have a better reason than for fun, and I'm not trying to say that that's not a valid reason.

40:03 But, like, I think for that to kind of take over from the CPython interpreter, which I know is not a goal, it would have to, like, convince that community that's currently maintaining the CPython interpreter that Rust is going to be a better way forward.

40:16 If they can do that and sort of replace it and become the leader, that would be an amazing outcome.

40:22 But I think that's a real uphill struggle.

40:25 It would be an interesting outcome, but certainly with Rust being so new and C being so, such a stalwart, right?

40:31 Like, it's the foundation of so many things.

40:34 It would be an interesting conversation for sure.

40:39 This portion of Talk Python to Me is brought to you by Microsoft and Azure Pipelines.

40:43 Azure Pipelines is a CI/CD service that supports Windows, Linux, and Mac.

40:47 It lets you run automatic builds and tests of your Python code on each commit or pull request.

40:53 It is fully integrated with GitHub, and it lets you define your continuous integration and delivery pipelines with a simple YAML file.

40:58 Azure Pipelines is free for individuals and small teams.

41:01 If you're maintaining an open source project, you'll even get unlimited build minutes and 10 concurrent pipelines.

41:07 Many Python projects are already using Azure Pipelines.

41:10 So get started for free at talkpython.fm/Microsoft.

41:16 I guess what I'm thinking also is, like, it's interesting that people have these, hey, I want to learn Rust, and let me do that by trying to rewrite CPython in Rust.

41:26 These are interesting and, like you say, super good goals, and people, I'm sure, are getting a lot out of it.

41:30 But I'm more thinking of, like, what if you rethought what it meant to be the runtime for Python specifically optimized for WebAssembly?

41:42 You know what I mean?

41:42 You try to make it, like, 99% compatible, and so you can bring in C libraries and stuff like you can through WebAssembly.

41:50 But is there an opportunity to, like, truly rethink CPython, not just rewrite it with the different syntax and compilers?

41:57 Yeah, that's a really good question.

41:58 I think...

41:59 I don't know the answer.

42:00 Yeah.

42:01 Just interesting to think about.

42:02 Yeah.

42:03 I think one of the things that a lot of the various projects rethinking the Python interpreter have worked on has sort of been, what if we can assume that there's a really good JIT around, right?

42:15 That's kind of what PyPy is, and, like, IronPython, when they built it on top of the .NET CLR and those sorts of things, they sort of go, if we have a really good JIT around, what can we do?

42:26 I think a lot of times they run up against the sort of really dynamic corner cases of the Python language that are really hard to deal with there.

42:34 Yeah.

42:34 And you end up with something that's slightly different from Python.

42:38 That, to me, always feels like where it gets a little stuck.

42:40 Right.

42:41 And if there's a way of unsticking that, like, unfortunately, the community has gone through this big transition from Python 2 to 3 already, and I don't know if there's a lot of appetite for another Python that would be slightly incompatible.

42:54 Exactly.

42:54 If you could live with something that's slightly incompatible, I think there's a lot of ways you could make it more performant with some of these newer technologies.

43:01 Yeah.

43:02 I mean, maybe there's an opportunity to do that.

43:05 And maybe Pyodide's not the right answer for that because you're not looking at, I don't know, like, so much of what happens there happens in C anyway.

43:13 Like, that's where the data science action really is, and Python is kind of the orchestration layer.

43:18 Maybe it's something like, what if I could write the equivalent of AngularJS or React with an Electron app with a nice UI, but I only touch Python?

43:26 You might be willing, like, in that context to say, well, I'd rather have a 99% compatible Python with packages than just switch 100% to JavaScript as a Python enthusiast, right?

43:37 Like, that might be a sell you would buy.

43:39 Yeah.

43:40 Yeah, absolutely.

43:41 Yeah.

43:41 And the file size doesn't matter in these, like, offline Electron apps and stuff because you're already downloading, like, a 60-meg Chrome binary.

43:49 Like, what's another 10 megs that's just zipped up in there anyway, right?

43:53 Right, exactly.

43:54 So who knows?

43:54 Maybe that's an interesting corner to explore.

43:56 Not necessarily for you, but for, like, anyone interested, right?

43:59 Yeah, absolutely.

44:01 Yeah, cool.

44:01 So I guess maybe tell us a little bit about where things are going.

44:05 Like, where are you and what's the future plans?

44:08 There's a bunch of things that don't work that we'd like to work on.

44:11 So, like, currently we don't support threading because when we started the project, WebAssembly didn't support threading.

44:17 Now WebAssembly does, so it would be good to go back and kind of build on top of that.

44:22 Is that true operating system threads or is that some kind of, like, preemptive threading?

44:27 Like, what does threading in WebAssembly mean?

44:28 It's based on the web worker technology in browsers.

44:34 My understanding is they are sort of true separate operating system threads, and they're probably even more isolated than you would think of with threads.

44:41 They're kind of their own little JavaScript interpreters.

44:44 Right.

44:44 They can kind of do message passing, and that's all they get for data sharing and whatnot.

44:47 Exactly.

44:48 Exactly.

44:48 So you can take advantage of that in WebAssembly now and pass things between web workers.

44:53 And there should be a way to kind of hopefully build the Python threading API on top of that.

44:59 It sounds almost like Python's multiprocessing more than Python's threading.

45:04 That's probably accurate.
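
The message-passing model described here can be sketched as follows. This is only an analogy: it uses a standard-library thread with two queues to mimic a worker that shares nothing and communicates solely through messages. It is not the web worker API and not how Pyodide would implement threading, and the `worker` and `run_worker` names are hypothetical.

```python
import queue
import threading


def worker(inbox, outbox):
    """Handle messages until a None sentinel arrives; reply once per message."""
    while True:
        message = inbox.get()
        if message is None:
            break
        outbox.put(sum(message))


def run_worker(chunks):
    """Send each chunk to the worker as a message and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for chunk in chunks:
        # Send a copy so the worker never touches the caller's own objects,
        # mimicking messages that are copied rather than shared.
        inbox.put(list(chunk))
    inbox.put(None)  # Sentinel: no more work.
    thread.join()
    return [outbox.get() for _ in chunks]


totals = run_worker([[1, 2, 3], [10, 20], []])
```

Because the only contact between the two sides is the pair of queues, the structure is closer to `multiprocessing` than to shared-memory threading, which is the comparison made above.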

45:05 Yeah, yeah, yeah.

45:06 But it's still, like, it's cool that it would be some parallelism that you can do regardless of how that happens, right?

45:11 Exactly.

45:12 Exactly.

45:13 Okay.

45:13 Another big sticking point is networking is obviously very different for Pyodide than it is for native Python, largely because of the sandbox, right?

45:22 You can't just open up a Unix socket and start writing things to it because that would be a big security hole.

45:28 So what that means is a lot of the libraries that come in the Python data science community, like Pandas, they have ways of fetching things over the network, and those don't currently work because they try to open a socket and they fail.

45:41 I see.

45:41 So building some kind of abstraction layer, maybe on top of WebSockets or maybe something else that would at least let basic things work from Python would be really useful.

45:53 Right now, generally, what you have to kind of do is do your networking in JavaScript using Fetch or whatever the tools are there, and then you can bring that into the Python side.

46:01 Right.

46:01 Axios or something nice, yeah.

46:03 Yeah, exactly.

46:04 Okay.

46:04 Yeah, but so, like, if I installed and imported requests and tried to use that, for example, that might not work?

46:11 That's definitely not going to work.

46:13 Yeah.

46:13 Okay.

46:14 Cool.

46:14 All right.

46:15 So what else?

46:16 You talked about this, like, conda-forge-like distributed build integration thing.

46:20 That's pretty cool.

46:21 What else?

46:21 There's a whole area of research we're doing around the data sharing.

46:26 So, like, I mentioned this before, like, you can have a NumPy array living in Python and pass that over to JavaScript without copying it, and that's pretty cool that that works.

46:35 But when you bring it over to JavaScript, you don't get, like, how many dimensions it has or what the shape is.

46:41 You just sort of get this one-dimensional thing because that's all that JavaScript really supports.

46:45 So we'd like to build something that kind of makes that more transparent and smooth.
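
To show what gets lost when only a one-dimensional buffer crosses the boundary, here is a minimal sketch of the bookkeeping a wrapper would need: given the flat data plus the shape, it recovers multi-dimensional indexing. It assumes a row-major (C-order) layout with strides counted in elements, and the helper names are invented for the example; it is not an existing Pyodide or NumPy API.

```python
def row_major_strides(shape):
    """Return how many elements to skip per step along each axis (C order)."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def flat_index(index, shape):
    """Map a multi-dimensional index onto a position in the flat buffer."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same number of axes")
    for i, dim in zip(index, shape):
        if not 0 <= i < dim:
            raise IndexError("index out of bounds for shape")
    return sum(i * s for i, s in zip(index, row_major_strides(shape)))


# The flat, one-dimensional buffer that arrives on the other side.
flat = list(range(12))
# The metadata that does not come along automatically.
shape = (3, 4)

# Row 2, column 1 of the original 3x4 array.
value = flat[flat_index((2, 1), shape)]
```

The sketch suggests that the missing piece is small (shape and layout metadata), which is why a shared, standardized description of the data is attractive.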

46:51 Would that be like a JavaScript wrapper that kind of has an API like Pandas a little bit?

46:56 Or what are you thinking there?

46:57 Yeah, something like that, I think.

46:59 There's a project called Apache Arrow, and its sort of purpose is to allow for sharing of these data structures in memory between different language runtimes.

47:10 And they've primarily been looking at, like, native runtime space.

47:15 But they have a JavaScript implementation.

47:17 They have a Python implementation.

47:19 It should be possible to kind of bring all that into the browser and use it there.

47:22 And then, again, we'd be building on top of work that's already been done and not have to build it ourselves.

47:27 But there's a lot of details there.

47:29 The other thing that's sort of exciting about Arrow for us is the sort of industry standard way to bring data in for data science computation is still the comma-separated value format, right?

47:42 They're not terribly efficient to read.

47:45 They're not very space-efficient or memory-efficient.

47:48 Whereas Apache Arrow provides this sort of nice, tight binary format that we could use.
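
The trade-off between text and binary encodings can be illustrated with the standard library. The sketch below is not the Arrow format: it uses plain packed 64-bit floats from the `array` module as a stand-in for "a tight binary format", and the sample values are arbitrary. The size difference depends on the data; short values such as small integers can be smaller as text.

```python
import array
import csv
import io

# Arbitrary sample data: mostly floats with long decimal expansions.
values = [i / 7 for i in range(1000)]

# Text route: every number becomes a string and must be parsed back.
text_buffer = io.StringIO()
csv.writer(text_buffer).writerow(values)
csv_bytes = text_buffer.getvalue().encode("utf-8")
row = next(csv.reader(io.StringIO(csv_bytes.decode("utf-8"))))
parsed_from_csv = [float(field) for field in row]

# Binary route: a fixed number of bytes per value, loaded without string parsing.
binary_bytes = array.array("d", values).tobytes()
parsed_from_binary = array.array("d")
parsed_from_binary.frombytes(binary_bytes)

bytes_per_value = array.array("d").itemsize
```

Both routes recover the same numbers here; the difference lies in how many bytes are moved and how much per-value parsing is done on the way in.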

47:52 Yeah.

47:53 And that would actually allow us to sort of shove more data into the browser, which is pretty memory-limited to begin with.

47:58 So anything that will let us kind of get away from CSVs is also on our roadmap.

48:03 Yeah, just parsing all those strings is going to be a slow thing wherever.

48:07 Yeah.

48:08 And most of the libraries that do it don't.

48:11 They assume tons and tons of memory.

48:13 So they don't necessarily do it in the most efficient way possible.

48:16 They don't necessarily stream it.

48:17 They might copy the whole thing.

48:18 And then, you know, so.

48:20 Right.

48:20 Yeah.

48:21 Okay.

48:21 Yeah.

48:21 That sounds pretty cool.

48:22 And then I was looking around the site.

48:24 I found like a pretty cool demo notebook that people can try out.

48:27 I guess, you know, there's that.

48:29 And I'll put a link to that.

48:30 And what else do you recommend for people just trying to play around with it?

48:32 There's a demo notebook that kind of goes through the language features that works also kind of as a tutorial for how to get started.

48:39 But then also linked on the blog post from last week.

48:42 And maybe we can link to that blog post.

48:44 In there, there's a bunch of other demo notebooks that kind of do more real world cool things, putting stuff together.

48:49 Like you say, the call data one is pretty fun.

48:52 There's another one that I used at Mozilla internally for figuring out how to time things in Firefox.

48:58 It's kind of fun.

48:59 So if you go to iodide.io, then all the notebooks that anybody has created on that public website, they're all public.

49:07 And so you can just kind of browse through there and see what interesting things other people are doing.

49:11 Yeah.

49:11 Cool.

49:11 There's always interesting stuff happening in the data science space.

49:14 Yeah.

49:14 Cool.

49:14 One thing that I just wanted to give a shout out to, I don't know if you've even looked at it or something.

49:19 I became aware of it basically like a week ago is Wasmer.

49:23 So W-A-S-M is often the extension for WebAssembly.

49:26 So W-A-S-M-E-R is like WebAssembly-er, right?

49:28 So are you familiar with this project?

49:30 Yeah.

49:31 Yeah, I am.

49:32 To me, my first impression is it's kind of a little bit like what Node.js did to JavaScript.

49:36 Like it used to be JavaScript ran in the browser and that was like its space.

49:39 And I guess you could open the console and like type and play with it if you want.

49:42 But then all of a sudden Node.js like sprung on the scene.

49:45 It's like, wait, I can just take my JavaScript code and run it in like a server process or doing other interesting stuff just in an, like on its own.

49:52 And Wasmer is kind of like that for Python.

49:55 Like it will let you run any WebAssembly code regardless of whether it's this Python code or just like random WebAssembly code.

50:02 And then directly import that into your Python code.

50:05 So it's kind of more like Node enabling than it is what you're working on, which is move Python to the browser.

50:12 It's like the opposite.

50:13 Move WebAssembly to Python.

50:14 Yeah, what excites me about it actually is it's going to make languages that aren't C a lot easier to integrate in Python.

50:21 So like Rust, for example, there is a way to integrate Rust in Python that's actually pretty good and works really well.

50:26 But if the story was compile whatever you have to WebAssembly and we can get to it from Python, I think that makes it a lot easier to have things written in whatever language is the most convenient or the most, you know, at hand.

50:39 And what also is potentially exciting from the Pyodide point of view is if this causes there to be a big sort of community of WASM packages that work with Python, we can use those in Pyodide basically for free.

50:52 Yeah, exactly.

50:53 It just grows the pie for everyone.

50:55 Yeah, exactly.

50:55 Yeah.

50:56 Yeah.

50:56 So I mean, I don't have a whole lot more to say than just like, hey, people, if that sounds interesting to you, check it out.

51:01 It looks like a really cool project.

51:02 And it's just one more sign that there's like this excitement of WebAssembly integrating with these non-JavaScript, non-browser traditional languages and ecosystems.

51:13 Yeah, absolutely.

51:14 Yeah.

51:15 I guess if I'm throwing out other stuff just kind of randomly here, one more I'll throw out.

51:19 It's really, really interesting.

51:21 I don't know if you've heard of this one at all.

51:22 Blazor.

51:23 Have you heard of that?

51:24 Yes, this is the C#.

51:26 Yeah.

51:26 Yeah.

51:27 So they had a totally different take, but they have gotten the .NET runtime, the CLR, all that stuff running on WebAssembly.

51:33 And now you can do C# in the browser.

51:35 Their take was to build an AngularJS-like framework that lets you write front end code in C# and then run it in the browser.

51:43 I don't know if that's a good idea or not, but it's, you know, it's kind of the other half of the story, I think, for Python, right?

51:49 Like right now we've got the data science, like with your work going really well.

51:52 But there's no story around like, what would I use in Python instead of Vue or Angular?

51:59 Not necessarily saying those are bad or you should, but like you could, you know, C# is showing the way like on that side of the story.

52:06 Yeah, I think there's a real advantage to having your implementation and your back end and your front end be in the same language.

52:14 I think that's kind of what Node has proven.

52:16 Right.

52:16 There's definitely an appeal there.

52:17 Yeah.

52:18 Yeah.

52:18 And so I think for Blazor, it's the same thing.

52:21 If you're a shop that's done your back end in C# for a long time, well, now you can have your front end in it too.

52:25 That's really nice.

52:27 One of the things that from talking to some Jupyter developers, one of the things they're really excited about with Pyodide is now they could potentially start to have some of their front end stuff written in Python as well as their back end that's currently in Python.

52:41 So because right now the world they live in is there's sort of these arbitrary lines that get drawn between what you would write in one language versus another.

52:49 And they're not always the right thing.

52:51 And sometimes you have to write the same thing in two different languages just so that you can put it in both places.

52:57 Right.

52:57 Validation or something.

52:58 Get rid of those.

52:59 Yeah.

53:00 Validation or even with Jupyter, it's certain kinds of computation that they need in the widget as well as in the back end and they need to match.

53:07 But one's written in Python.

53:11 It's not fun.

53:11 And it's just sort of this arbitrary speed bump that gets created because of the world we live in.

53:16 But if you imagine a world where all languages run everywhere, suddenly, hopefully, you're doing less work.
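To make the validation point concrete, here is a minimal sketch of one rule set written once in plain Python. Because it uses only the standard library, the same module could in principle be imported by a server-side app and by Python running in the browser. The function name, field names, and limits below are hypothetical and chosen purely for illustration; nothing here comes from Pyodide or Jupyter.

```python
import re
from typing import Dict, List

# Hypothetical rules: the field names and limits are illustrative assumptions.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MIN_PASSWORD_LENGTH = 8


def validate_signup(form: Dict[str, str]) -> List[str]:
    """Return a list of error messages; an empty list means the form is valid."""
    errors = []

    # Missing fields are treated as empty strings, so they fail the checks below.
    email = str(form.get("email", "")).strip()
    if not EMAIL_PATTERN.match(email):
        errors.append("email: not a valid address")

    password = str(form.get("password", ""))
    if len(password) < MIN_PASSWORD_LENGTH:
        errors.append("password: must be at least 8 characters")

    return errors


if __name__ == "__main__":
    print(validate_signup({"email": "ada@example.com", "password": "correct horse"}))
    print(validate_signup({"email": "not-an-email", "password": "short"}))
```

The design choice is simply to keep the rules free of any web framework or browser API, so there is a single definition to keep in sync rather than one per language.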

53:23 Yeah.

53:23 Yeah, absolutely.

53:24 And you can reuse stuff in a context where you wouldn't.

53:27 Like, until your work, it would have been kind of insane to say, well, let's reuse NumPy in the browser, right?

53:32 Yeah.

53:33 But now these doors are open.

53:35 So it just creates more synergy, I think.

53:37 It's pretty awesome.

53:38 All right.

53:38 Well, I think we're pretty much out of time there.

53:40 Yeah.

53:40 Definitely a fun conversation.

53:42 I think the future is bright.

53:44 What do you think?

53:44 Yeah, absolutely.

53:45 I'm really excited about all this stuff.

53:47 Yeah.

53:47 Same.

53:47 All right.

53:48 Now, before I let you out here, let me ask you the two final questions.

53:50 I kind of think I can guess this first one because of the way you opened the whole show.

53:56 But favorite editor for writing Python code?

53:58 Yeah.

53:59 Yeah.

53:59 So I use Emacs, but I actually use Spacemacs.

54:01 So it's kind of this weird Emacs/Vi hybrid.

54:05 But I find it works for me.

54:06 Yeah.

54:07 Right on.

54:07 Very cool.

54:08 And then notable PyPI package?

54:10 Oh, gosh.

54:11 I mean, the one I'm most familiar with is Matplotlib because I worked on it for years and years.

54:14 And, you know, if you're not familiar with it, go check it out.

54:17 It's the kitchen sink of plotting for Python.

54:21 Yeah, it absolutely is.

54:22 And, you know, something exciting, both silly but also kind of real in a sense,

54:26 is that I saw XKCD-style plots have now come to Matplotlib.

54:30 Did you see that?

54:31 Oh, yeah.

54:31 I implemented that actually.

54:33 You did?

54:34 How awesome.

54:36 Yeah.

54:36 Yeah.

54:36 How hard was that?

54:38 Not too bad.

54:39 Strangely, the infrastructure that was already there kind of made it easier than it might have been.

54:45 So.

54:45 Yeah.

54:45 I mean, it looks like those sort of cartoony, hand-drawn plots and stuff.

54:51 But it actually looks really hard to do because it's imprecise and it has these imperfections, right?

54:57 It seems like it would be hard to tell a computer to be imprecise in like a human way.

55:00 But well done.

55:01 And that looks great.

55:02 So fun.

55:03 Thanks.

55:03 Yeah, absolutely.

55:04 Awesome.

55:05 All right.

55:05 Final call to action.

55:06 People are excited about Iodide, Pyodide.

55:08 They want to check it out, maybe contribute.

55:09 What do you got for them?

55:11 Yeah.

55:12 Check out iodide.io where you can check out all the notebooks that people have created.

55:16 And then we have a GitHub site at github.com/iodide-project.

55:20 Nice.

55:21 And I'll put those links in the show notes.

55:23 Are you looking for contributors or people to work on it at all?

55:26 Or is it kind of still an internal project at the moment?

55:29 We're definitely looking for contributors.

55:30 Find us on Gitter if you have any great ideas.

55:32 We'd be glad to help you make them reality.

55:35 Super cool.

55:36 All right.

55:37 Well, I am very thrilled to see you all working on getting this take on Python in the browser.

55:43 I think the more attempts that we have here, the better.

55:46 And it's an exciting time.

55:48 And I think it'll take off.

55:49 Thanks a lot.

55:50 It was fun talking to you.

55:50 Yeah, you as well.

55:51 Thanks for being on the show.

55:52 Bye.

55:52 All right.

55:52 Take care.

55:53 Bye.

55:53 This has been another episode of Talk Python to Me.

55:57 Our guest on this episode was Michael Droettboom.

55:59 And it's been brought to you by Microsoft.

56:01 If you're a Python developer, Microsoft has you covered.

56:05 From VS Code and their modern editor plugins, to Azure Pipelines for continuous integration,

56:10 and serverless Python functions on Azure.

56:13 Check them out at talkpython.fm/Microsoft.

56:17 Want to level up your Python?

56:19 If you're just getting started, try my Python Jumpstart by Building 10 Apps course.

56:23 Or if you're looking for something more advanced, check out our new async course that digs into all the different types of async programming you can do in Python.

56:32 And of course, if you're interested in more than one of these, be sure to check out our everything bundle.

56:36 It's like a subscription that never expires.

56:39 Be sure to subscribe to the show.

56:40 Open your favorite podcatcher and search for Python.

56:43 We should be right at the top.

56:44 You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

56:53 This is your host, Michael Kennedy.

56:55 Thanks so much for listening.

56:57 I really appreciate it.

56:58 Now get out there and write some Python code.
