WEBVTT

00:00:00.001 --> 00:00:03.320
It's been said that JavaScript is the assembly language of the web,

00:00:03.320 --> 00:00:08.180
but should you be required to write an assembly language or even JavaScript if you don't want to?

00:00:08.180 --> 00:00:12.140
Most platforms have a dizzying array of options for programming them.

00:00:12.140 --> 00:00:15.220
Not the front-end web world, but that tide may be turning,

00:00:15.220 --> 00:00:17.760
and WebAssembly could be the key to making it happen.

00:00:17.760 --> 00:00:21.200
With WebAssembly, we have a new compilation target for web browsers,

00:00:21.200 --> 00:00:25.400
and Michael Droettboom from Mozilla and his team have decided to help bring

00:00:25.400 --> 00:00:29.180
the Python scientific stack to the front-end world with Pyodide.

00:00:29.680 --> 00:00:36.320
Dive into Pyodide on this episode, 212, of Talk Python To Me, recorded April 23, 2019.

00:00:36.320 --> 00:00:52.880
Welcome to Talk Python To Me, a weekly podcast on Python,

00:00:52.880 --> 00:00:55.920
the language, the libraries, the ecosystem, and the personalities.

00:00:55.920 --> 00:00:57.880
This is your host, Michael Kennedy.

00:00:58.060 --> 00:01:00.020
Follow me on Twitter, where I'm @mkennedy.

00:01:00.020 --> 00:01:03.760
Keep up with the show and listen to past episodes at talkpython.fm,

00:01:03.760 --> 00:01:06.180
and follow the show on Twitter via @talkpython.

00:01:06.180 --> 00:01:08.960
This episode is brought to you by Microsoft.

00:01:08.960 --> 00:01:11.640
Be sure to check out what they're offering during their segments.

00:01:11.640 --> 00:01:13.120
It really helps support the show.

00:01:13.120 --> 00:01:16.280
Michael, welcome to Talk Python.

00:01:16.280 --> 00:01:18.060
Thanks. It's good to be here. Thanks for inviting me.

00:01:18.060 --> 00:01:21.260
Oh, you have such an interesting topic and thing that you've been working on.

00:01:21.320 --> 00:01:26.440
I was really excited when I heard about Pyodide a little bit, I don't know, maybe six months ago

00:01:26.440 --> 00:01:28.000
or when I first heard about it and stuff.

00:01:28.000 --> 00:01:30.240
And I was like, oh, this has some real possibilities.

00:01:30.240 --> 00:01:33.220
So I'm really excited to have you here to dig into that.

00:01:33.220 --> 00:01:33.540
Cool.

00:01:33.540 --> 00:01:33.860
Yeah.

00:01:33.860 --> 00:01:36.600
Now, before we do get into that, let's start at the beginning.

00:01:36.600 --> 00:01:37.640
Let's start with your story.

00:01:37.640 --> 00:01:38.880
How did you get into programming in Python?

00:01:39.260 --> 00:01:42.300
So I've been programming almost as long as I can remember.

00:01:42.300 --> 00:01:47.340
I think my parents brought home like an IBM XT sometime in the mid-80s.

00:01:47.340 --> 00:01:50.760
And I learned, you know, BASIC on that as one does.

00:01:50.760 --> 00:01:53.280
And just have been programming ever since.

00:01:53.280 --> 00:01:56.940
I found Python, I think, in 1996 while I was at university.

00:01:56.940 --> 00:02:02.220
And I sort of used it secretly in the background to like prototype my assignments that I had to

00:02:02.220 --> 00:02:03.240
write in other languages.

00:02:03.240 --> 00:02:07.660
I would kind of write these quick little hacks as I'm learning to program in Python because it

00:02:07.660 --> 00:02:09.500
was a lot easier for me.

00:02:09.500 --> 00:02:13.140
And then once I had figured out the problem, I'd convert it to LISP.

00:02:13.140 --> 00:02:16.580
I can now solve the syntax problem to make this happen over here, right?

00:02:16.580 --> 00:02:17.080
Exactly.

00:02:17.080 --> 00:02:17.340
Yeah.

00:02:17.340 --> 00:02:18.440
What languages were you using?

00:02:18.440 --> 00:02:25.080
There was a lot of LISP where I went to school, as well as Java, and ML was a big thing then,

00:02:25.080 --> 00:02:28.720
which, you know, sort of grew into OCaml and that family of languages.

00:02:28.720 --> 00:02:29.060
Yeah.

00:02:29.060 --> 00:02:34.660
My first CS language that I only took a couple of CS classes as a minor type of thing,

00:02:34.780 --> 00:02:39.780
but my first one was in Scheme, which is a derivative of LISP, right?

00:02:39.780 --> 00:02:46.320
And I felt that it was both very mind shifting and very interesting and also super not useful.

00:02:46.320 --> 00:02:49.780
Like what was your feeling of studying LISP early on?

00:02:49.780 --> 00:02:51.800
Like I can't go build anything with this.

00:02:51.800 --> 00:02:52.800
What is this crazy language?

00:02:53.800 --> 00:02:56.180
Yeah, it definitely is kind of mind bending.

00:02:56.180 --> 00:03:01.020
And, you know, I think it enforces a lot of good habits, like functional programming ideas

00:03:01.020 --> 00:03:04.820
that you can bring into any language that are probably good habits to have.

00:03:04.820 --> 00:03:08.060
But yeah, like you say, it's not always the most practical.

00:03:08.060 --> 00:03:12.860
I know this is a question that usually is at the end of your show, but I use Emacs.

00:03:13.420 --> 00:03:17.200
And so occasionally to write Emacs extensions, I'll bring out LISP.

00:03:17.200 --> 00:03:24.380
And what's fascinating about it is Emacs is basically built by monkey patching everything else.

00:03:24.380 --> 00:03:30.220
That's Emacs extensions, basically monkey patching is the functional thing that you do,

00:03:30.220 --> 00:03:31.660
which I find interesting.

00:03:31.660 --> 00:03:32.680
That is interesting.

00:03:32.680 --> 00:03:32.980
Wow.

00:03:32.980 --> 00:03:35.000
I guess tends to happen in LISP a lot.

00:03:35.000 --> 00:03:37.780
Well, that's the only big LISP code I'm really familiar with.

00:03:38.000 --> 00:03:40.120
Yeah, Emacs is definitely one of the bigger projects.

00:03:40.120 --> 00:03:41.460
Quite, quite interesting.

00:03:41.460 --> 00:03:42.000
Cool.

00:03:42.000 --> 00:03:44.000
Okay, so you sort of used it.

00:03:44.000 --> 00:03:45.420
You got into it pretty early, right?

00:03:45.420 --> 00:03:46.460
Like, what is that?

00:03:46.460 --> 00:03:53.140
Four or five years into the existence of Python, which for Python, you know, it grew grassroots.

00:03:53.140 --> 00:03:57.060
It wasn't like busted on the scene by Microsoft or Apple or whoever, right?

00:03:57.060 --> 00:03:57.800
That's pretty early.

00:03:57.800 --> 00:04:00.920
Yeah, I went to my first Python conference in 2001.

00:04:00.920 --> 00:04:05.460
And at that point, I don't remember the exact number, but I would guess it was maybe around

00:04:05.460 --> 00:04:07.680
200 people, not much larger than that.

00:04:07.840 --> 00:04:08.760
It was one track.

00:04:08.760 --> 00:04:10.620
Everybody could fit in one room.

00:04:10.620 --> 00:04:15.900
It was small enough that as a grad student, I was able to just walk up to Guido and have

00:04:15.900 --> 00:04:19.100
a talk with him because there wasn't that many people demanding his attention.

00:04:19.100 --> 00:04:23.060
And of course, now you go to PyCon and, you know, forget anything like that happening.

00:04:23.060 --> 00:04:24.260
There's just thousands of people.

00:04:24.260 --> 00:04:25.980
And yeah, it's completely crazy.

00:04:25.980 --> 00:04:26.420
Yeah.

00:04:26.420 --> 00:04:29.180
I mean, it's really fun to see how big the community has grown.

00:04:29.180 --> 00:04:32.740
On the other hand, I do miss the days of the sort of smaller conferences where you feel

00:04:32.740 --> 00:04:36.540
like you got a hold of it and you feel like you got a sense of seeing a lot of the stuff.

00:04:36.800 --> 00:04:38.800
But yeah, these are good problems to have, I guess.

00:04:38.800 --> 00:04:39.620
I guess they are.

00:04:39.620 --> 00:04:43.880
How does it feel to do Python now versus then?

00:04:43.880 --> 00:04:47.560
It's got to be kind of different with pip install antigravity and all that.

00:04:47.560 --> 00:04:54.120
Yeah, I think definitely having good package management means you're no longer compelled

00:04:54.120 --> 00:04:57.720
to like bring everything into your projects in the same way.

00:04:57.720 --> 00:05:00.420
So I used to be the developer on Matplotlib.

00:05:00.420 --> 00:05:05.040
And one of its sort of, I guess, technical debt, for lack of a better word, is the fact

00:05:05.040 --> 00:05:09.380
that for the longest time it was if someone had a great idea for a new plot type, we'd just

00:05:09.380 --> 00:05:11.300
say, yeah, let's include it in Matplotlib.

00:05:11.380 --> 00:05:16.080
Because the alternative was forcing all the users to install a bunch of packages, which

00:05:16.080 --> 00:05:17.320
was really hard at the time.

00:05:17.320 --> 00:05:20.020
So we would make this one really big package.

00:05:20.020 --> 00:05:23.160
And, you know, in hindsight, that's not the greatest thing.

00:05:23.160 --> 00:05:27.040
It means a lot of code for the core developers of that project to maintain.

00:05:27.040 --> 00:05:31.500
So it is nice that we live in this world now where we can have lots of little packages that

00:05:31.500 --> 00:05:31.940
interact.

00:05:31.940 --> 00:05:35.400
And that's not a huge burden on the user as it once was.

00:05:35.460 --> 00:05:36.440
Yeah, that's pretty interesting.

00:05:36.440 --> 00:05:41.860
I mean, maybe it would have been better to have a bunch of little Matplotlib dash extensions

00:05:41.860 --> 00:05:45.340
that you include in your requirements files if you want to do these kind of graphs and stuff.

00:05:45.340 --> 00:05:48.640
And you could have kept it a little more distributed in terms of support.

00:05:48.640 --> 00:05:53.020
But yeah, it's the right architecture and patterns for the right time, right?

00:05:53.020 --> 00:05:54.380
And that's what you needed, yeah?

00:05:54.380 --> 00:05:54.620
Yeah.

00:05:54.620 --> 00:05:54.800
Cool.

00:05:54.800 --> 00:05:58.800
So you both work at a cool web company and do data science.

00:05:58.800 --> 00:06:03.240
So you kind of do both of the things that Python is really good at simultaneously, right?

00:06:03.240 --> 00:06:03.840
Tell us about that.

00:06:04.020 --> 00:06:06.140
Yeah, so I'm a data engineer at Mozilla.

00:06:06.140 --> 00:06:07.500
I've been there about a year and a half.

00:06:07.500 --> 00:06:12.800
And I work on the team that manages the telemetry that comes from our products.

00:06:12.800 --> 00:06:17.160
So from Firefox on desktop and on Android and iOS and all these things.

00:06:17.160 --> 00:06:20.140
You know, and the telemetry goes into improving the product.

00:06:20.140 --> 00:06:24.960
It helps us discover when things are going wrong, whether things are getting better with changes

00:06:24.960 --> 00:06:26.840
or worse with changes and things like that.

00:06:26.840 --> 00:06:31.900
And so there's a whole team that manages collecting that data, ingesting that data, and then providing

00:06:31.900 --> 00:06:34.500
ways for people to analyze it at the end of the day.

00:06:34.500 --> 00:06:39.460
And what's really exciting about how they do that at Mozilla is we have this document called

00:06:39.460 --> 00:06:44.800
our Lean Data Practices, where we try really hard to not collect anything that we don't

00:06:44.800 --> 00:06:48.260
need to collect, not collect anything that will invade people's privacy.

00:06:48.260 --> 00:06:51.800
We really are just collecting what we need in order to improve the product.

00:06:52.020 --> 00:06:55.880
And so it's really nice to come at it from that point of view, not just to sort of snarf

00:06:55.880 --> 00:07:00.100
it all up and see what we can do later, but to really think upfront, do we need to do this

00:07:00.100 --> 00:07:01.200
and do we need to collect this?

00:07:01.200 --> 00:07:02.380
Yeah, that's super cool.

00:07:02.380 --> 00:07:07.380
Like maybe you had for a while, Firefox had the address bar and then like a search box,

00:07:07.380 --> 00:07:08.140
right?

00:07:08.140 --> 00:07:10.180
Most of the browsers have given up on this idea.

00:07:10.180 --> 00:07:12.900
But you could tell that from telemetry, right?

00:07:12.900 --> 00:07:14.640
How people are using each part and so on, yeah?

00:07:14.640 --> 00:07:15.180
Exactly.

00:07:15.180 --> 00:07:15.700
Yeah.

00:07:15.700 --> 00:07:19.440
We're able to sort of see how the search bar is getting used even now that it's a unified

00:07:19.440 --> 00:07:19.920
thing.

00:07:19.920 --> 00:07:21.720
It's a really fun place to work.

00:07:21.720 --> 00:07:26.940
The engineering talent there is just beyond anything I've ever had the privilege of working

00:07:26.940 --> 00:07:27.180
with.

00:07:27.180 --> 00:07:28.780
So I can imagine that it's super awesome.

00:07:28.780 --> 00:07:30.400
I'm personally a big fan of Firefox.

00:07:30.400 --> 00:07:33.660
I generally just use Firefox if at all possible.

00:07:33.660 --> 00:07:38.500
And it frustrates me to no end where I go to places and they're like, this page is only

00:07:38.500 --> 00:07:43.600
available on Chrome or this site only streams this video on Safari.

00:07:43.600 --> 00:07:48.560
And you know there's no good reason for it other than they're just lazy to do like the

00:07:48.560 --> 00:07:50.800
tiny bit of effort to make it work, right?

00:07:50.800 --> 00:07:51.360
Yeah.

00:07:51.360 --> 00:07:52.480
It's unfortunate.

00:07:52.480 --> 00:07:58.520
You know, I think one of Firefox's biggest challenges is that unfortunately there's like

00:07:58.520 --> 00:08:00.940
a network or a snowball effect, right?

00:08:00.940 --> 00:08:04.980
It's that the more websites that don't work on Firefox, the fewer people are going to use

00:08:04.980 --> 00:08:05.140
it.

00:08:05.140 --> 00:08:07.480
And therefore, the fewer websites are going to work on Firefox.

00:08:07.480 --> 00:08:13.160
And so trying to break out of that cycle is sort of a constant battle for us that we're

00:08:13.160 --> 00:08:14.700
tackling on a number of fronts.

00:08:14.700 --> 00:08:15.600
Props to you guys.

00:08:15.660 --> 00:08:17.460
You're doing great stuff, keeping the web open.

00:08:17.460 --> 00:08:19.340
I really am a big fan of Mozilla.

00:08:19.340 --> 00:08:23.000
And I guess our topic today is just one more reason why.

00:08:23.000 --> 00:08:23.280
Cool.

00:08:23.280 --> 00:08:23.980
Yeah, absolutely.

00:08:23.980 --> 00:08:27.900
So we're going to talk about WebAssembly and Pyodide.

00:08:27.900 --> 00:08:34.000
But I think it maybe makes sense to just sort of lay out a little bit of the history of like

00:08:34.000 --> 00:08:40.520
what even led to WebAs, not necessarily what led to WebAssembly, but like what preceded WebAssembly,

00:08:40.520 --> 00:08:40.840
you know?

00:08:40.840 --> 00:08:41.160
Sure.

00:08:41.240 --> 00:08:43.620
Maybe I'll take it up to WebAssembly and you can take it from there.

00:08:43.620 --> 00:08:49.880
So we have way back in, I don't know, 1995, browsers were things that looked at documents

00:08:49.880 --> 00:08:53.780
and people imagine like what if they could do more than just show documents, right?

00:08:53.780 --> 00:08:59.000
So Netscape, that company that used to exist, I also was a fan of them.

00:08:59.000 --> 00:09:03.740
Guys at Netscape came up with JavaScript and it was named JavaScript not because it had anything

00:09:03.740 --> 00:09:05.520
to do with Java, but because Java was cool and hot.

00:09:05.520 --> 00:09:07.860
So this is the scripty cool language.

00:09:07.860 --> 00:09:13.820
In 10 days, like it was created in 10 days, start to finish and shipped in Netscape Navigator,

00:09:13.820 --> 00:09:15.840
like three months later or something.

00:09:15.840 --> 00:09:20.980
And boom, it oddly becomes the most popular language in the world and sort of becomes like

00:09:20.980 --> 00:09:25.840
this assembly language of the internet or maybe C, I don't know how you think about it, like

00:09:25.840 --> 00:09:27.880
just the base language of the internet, right?

00:09:29.340 --> 00:09:35.640
And then some other projects came along, asm.js, and some other ones show that you

00:09:35.640 --> 00:09:40.140
can take C code, compile it down to JavaScript and do incredible stuff.

00:09:40.140 --> 00:09:46.580
Like there's a super, super funny video by Gary Bernhardt called The Birth and Death of JavaScript.

00:09:46.580 --> 00:09:47.540
Have you seen this video?

00:09:47.540 --> 00:09:48.440
Yeah, it's fantastic.

00:09:48.440 --> 00:09:53.400
It's like 15 minutes of beautiful history, but humor all mixed together.

00:09:53.400 --> 00:09:54.880
It's really, really insightful.

00:09:55.220 --> 00:10:00.440
And basically in there, he shows things like, I don't remember the exact one, like Chrome

00:10:00.440 --> 00:10:04.860
running in JavaScript inside of Firefox or the other way around, just like really interesting

00:10:04.860 --> 00:10:08.940
stuff like C games in the browser and all that stuff is amazing.

00:10:08.940 --> 00:10:15.680
So it really like showed the possibility of, you know, like the web really can be, and JavaScript

00:10:15.680 --> 00:10:24.460
can be the foundation of not just jQuery and Angular, but Doom or Quake or Firefox, right?

00:10:24.460 --> 00:10:26.140
Like incredible power.

00:10:26.140 --> 00:10:31.060
But it doesn't make a lot of sense to compile stuff to JavaScript, to ship it down, to then

00:10:31.060 --> 00:10:34.200
reinterpret it, to like then compile it and JIT it and run it.

00:10:34.200 --> 00:10:40.360
Like, wouldn't it be better if there was a binary way to like compile for the web, right?

00:10:40.360 --> 00:10:45.700
And so that's my, I guess my introduction to this idea of WebAssembly, right?

00:10:45.700 --> 00:10:47.140
Yeah, that's a great little history.

00:10:47.140 --> 00:10:50.820
I mean, most of that predates my coming to WebAssembly.

00:10:50.820 --> 00:10:54.860
I've only been using WebAssembly for about a year and a half when this whole project started.

00:10:54.860 --> 00:10:59.420
So fortunately, Mozilla has a lot of people who do work on WebAssembly and have been there from the

00:10:59.420 --> 00:10:59.560
beginning.

00:10:59.560 --> 00:11:00.480
So it's really nice.

00:11:00.480 --> 00:11:01.540
Did it originate with Mozilla?

00:11:01.540 --> 00:11:02.780
It did.

00:11:02.780 --> 00:11:03.460
Yeah.

00:11:03.500 --> 00:11:07.280
I felt like Rust and WebAssembly and all that stuff kind of came from you guys.

00:11:07.280 --> 00:11:07.760
Exactly.

00:11:07.760 --> 00:11:08.060
Yeah.

00:11:08.060 --> 00:11:13.060
It originated at Mozilla, but it is definitely an open standard that all the browsers are supporting

00:11:13.060 --> 00:11:13.780
and all that stuff.

00:11:13.780 --> 00:11:14.220
Yeah.

00:11:14.220 --> 00:11:14.760
Super cool.

00:11:14.760 --> 00:11:20.720
So, I mean, maybe give the elevator pitch of like, what is WebAssembly for folks who don't

00:11:20.720 --> 00:11:21.300
necessarily know?

00:11:21.300 --> 00:11:21.860
Yeah.

00:11:21.920 --> 00:11:23.980
So it's basically a binary format.

00:11:23.980 --> 00:11:28.320
It's designed to be something that your compiler writes to.

00:11:28.320 --> 00:11:31.580
It's not something you would ever write by hand or people do.

00:11:31.580 --> 00:11:36.500
It's a target for something like a C compiler or a Fortran compiler or a Rust compiler, right?

00:11:36.500 --> 00:11:38.220
Any of these compiled languages.

00:11:38.220 --> 00:11:44.340
And when that gets shipped to your browser, the browser then converts that into native machine

00:11:44.340 --> 00:11:45.980
code that runs on your machine.

00:11:45.980 --> 00:11:49.920
But it's also running inside of the same browser sandbox that runs your JavaScript.

00:11:49.920 --> 00:11:54.980
So it has the same sort of security space and constraints as JavaScript.
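
NOTE Editor's sketch, not part of the spoken audio. The "binary format" described
above can be made concrete by looking at the first eight bytes of a module: the
magic bytes "\0asm" followed by a little-endian 32-bit version number (1 for the
MVP format). This is a minimal Python illustration under that assumption; the
names parse_wasm_header and EMPTY_MODULE are hypothetical and do not come from
the episode, Pyodide, or any browser API.
```python
import struct
# Every WebAssembly binary starts with the 4-byte magic "\0asm" followed by a
# little-endian uint32 version number.
WASM_MAGIC = b"\x00asm"
def parse_wasm_header(data: bytes) -> int:
    """Return the format version from a WebAssembly header, or raise ValueError."""
    if len(data) < 8:
        raise ValueError("too short to be a WebAssembly module")
    magic, version = struct.unpack("<4sI", data[:8])
    if magic != WASM_MAGIC:
        raise ValueError("missing WebAssembly magic bytes")
    return version
# The smallest valid module is just the header: magic plus version 1.
EMPTY_MODULE = WASM_MAGIC + struct.pack("<I", 1)
print(parse_wasm_header(EMPTY_MODULE))
```
A real module produced by a C, Fortran, or Rust toolchain would follow this
header with typed sections, which the browser validates before compiling.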

00:11:54.980 --> 00:11:55.240
Right.

00:11:55.240 --> 00:11:58.820
One of the concerns people have is like, well, JavaScript is safe because it's this sandbox

00:11:58.820 --> 00:12:00.620
thing and it only has so much in the language.

00:12:00.620 --> 00:12:04.340
If you start running arbitrary binary code, all bets are off.

00:12:04.340 --> 00:12:06.740
But it's not just like, here's some machine instructions.

00:12:06.740 --> 00:12:10.900
It's, oh, here's a binary thing that runs in the WebAssembly world, right?

00:12:10.900 --> 00:12:11.460
Yeah.

00:12:11.460 --> 00:12:12.300
Yeah.

00:12:12.300 --> 00:12:14.520
So these are details I don't know too much about.

00:12:14.720 --> 00:12:21.360
But they definitely are making all these assurances that the sort of typical things you can do

00:12:21.360 --> 00:12:24.720
in C that become security flaws, you cannot do in WebAssembly.

00:12:24.720 --> 00:12:28.220
Or if you do them, you don't break out of the browser, right?

00:12:28.220 --> 00:12:28.520
Right.

00:12:28.520 --> 00:12:30.160
You just get an exception or something, right?

00:12:30.160 --> 00:12:30.600
Exactly.

00:12:30.600 --> 00:12:30.920
Yeah.

00:12:30.920 --> 00:12:31.780
Cool.

00:12:31.780 --> 00:12:38.200
Dan Callahan, another Mozilla person, gave a pretty interesting call to action around WebAssembly

00:12:38.200 --> 00:12:41.800
last year in one of the PyCon keynotes at US PyCon.

00:12:41.800 --> 00:12:43.420
Were you there for that?

00:12:43.420 --> 00:12:44.420
I wasn't there.

00:12:44.420 --> 00:12:48.260
And actually, what's interesting about that is I had already been working on Pyodide for

00:12:48.260 --> 00:12:50.000
a few months when he gave that talk.

00:12:50.000 --> 00:12:52.400
And he and I were not aware of each other at all.

00:12:52.400 --> 00:12:54.720
It just sort of explains how big a place Mozilla is.

00:12:54.720 --> 00:12:55.260
Yeah, yeah, yeah.

00:12:55.260 --> 00:12:55.780
For sure.

00:12:55.780 --> 00:12:58.580
You know, nobody's fault at either end at all.

00:12:58.580 --> 00:13:02.400
And you guys also have like PyPyJS, which I don't think is active anymore.

00:13:02.560 --> 00:13:05.960
But there's like a lot of these little flowers blooming in that world, right?

00:13:05.960 --> 00:13:06.560
Exactly.

00:13:06.560 --> 00:13:07.180
Exactly.

00:13:07.180 --> 00:13:11.600
And so it was really cool to see his talk and sort of realize we were thinking along the

00:13:11.600 --> 00:13:12.100
same lines.

00:13:12.100 --> 00:13:16.700
And we sort of check in with each other periodically on what's going on there, which is great.

00:13:16.960 --> 00:13:17.320
That's cool.

00:13:17.320 --> 00:13:21.020
So I guess the quick summary of that, I'll link to the whole half hour presentation.

00:13:21.020 --> 00:13:22.960
But it was like, Python is amazing.

00:13:22.960 --> 00:13:23.880
We love Python.

00:13:23.880 --> 00:13:28.120
But the web is one of the most important places where code runs right now.

00:13:28.460 --> 00:13:33.000
And running in the browser, Python is sadly absent for the most part.

00:13:33.000 --> 00:13:36.480
I mean, we have Skulpt and a few of those other things like PyPyJS.

00:13:36.480 --> 00:13:41.520
But they're always in some kind of like seven caveats and some little sliver of use case, right?

00:13:41.520 --> 00:13:45.540
What I was hoping for when I watched this was like, okay, this is the buildup.

00:13:45.540 --> 00:13:47.240
Please let this be an announcement.

00:13:47.240 --> 00:13:49.680
Like, please let this be an announcement involving WebAssembly.

00:13:49.680 --> 00:13:51.560
And it just turned out to be a community.

00:13:51.560 --> 00:13:52.520
We need to work on this.

00:13:52.520 --> 00:13:54.680
And I thought it was really awesome when I saw Pyodide come out.

00:13:54.680 --> 00:13:56.860
I'm like, oh my gosh, they actually were working on something.

00:13:56.860 --> 00:13:59.340
But I guess since you didn't know about each other, you couldn't really.

00:13:59.340 --> 00:14:03.800
It's not like you could do the big reveal of like, and Pyodide's a good start or something, right?

00:14:03.800 --> 00:14:04.940
Yeah, yeah.

00:14:04.940 --> 00:14:07.320
It would have been a little bit of a different talk, I guess.

00:14:07.320 --> 00:14:07.560
Yeah.

00:14:07.560 --> 00:14:12.840
But still, I mean, the points he makes are great points in terms of, I think, you know,

00:14:12.840 --> 00:14:17.340
the web is where so much computing happens these days that if you aren't playing in that space,

00:14:17.340 --> 00:14:19.120
you know, you are becoming limited.

00:14:19.120 --> 00:14:24.320
And yeah, and like you say, there's been a bunch of other projects to bring Python to the web browser.

00:14:24.320 --> 00:14:24.640
Yeah.

00:14:24.780 --> 00:14:30.400
The thing that makes Pyodide a little unique is that it tries to be as close to upstream as possible.

00:14:30.400 --> 00:14:35.700
So it's using upstream CPython, the upstream versions of NumPy and SciPy and all these things,

00:14:35.700 --> 00:14:40.860
and tries to change them as little as possible so that the effort on those projects contributes

00:14:40.860 --> 00:14:47.020
into our effort directly rather than reinventing and then constantly having to keep up with that,

00:14:47.020 --> 00:14:47.360
right?

00:14:47.360 --> 00:14:47.640
Right.

00:14:47.740 --> 00:14:47.900
Yeah.

00:14:47.900 --> 00:14:51.080
Like we spoke earlier, there's so much benefit to having all these packages.

00:14:51.080 --> 00:14:54.360
And every day there's just more to grow upon.

00:14:54.360 --> 00:14:59.980
But if your job is like, we have to have our own copy and implementation of Matplotlib in JavaScript,

00:14:59.980 --> 00:15:03.140
like nobody wants to do that job, right?

00:15:03.140 --> 00:15:03.460
Yeah.

00:15:03.460 --> 00:15:09.760
That's just so many years of effort that I think, you know, it would always be a poor imitation of the real thing, right?

00:15:09.760 --> 00:15:09.980
So it's...

00:15:09.980 --> 00:15:11.920
Yeah, it'd also be behind as well.

00:15:11.920 --> 00:15:16.980
Like TensorFlow came out, but we haven't written the JavaScript TensorFlow yet, so forget that or whatever, you know?

00:15:16.980 --> 00:15:17.840
Something like that.

00:15:17.840 --> 00:15:18.120
Right.

00:15:18.120 --> 00:15:20.360
And you see this a little bit even with like PyPy.

00:15:20.360 --> 00:15:22.000
I mean, PyPy is incredibly impressive.

00:15:22.000 --> 00:15:23.360
Really cool project.

00:15:23.640 --> 00:15:29.280
But they're still at the 3.6 level of syntax because they're just always sort of following.

00:15:29.280 --> 00:15:31.520
It's just sort of in the nature of what they're doing.

00:15:31.520 --> 00:15:32.740
And I don't mean that as criticism.

00:15:32.740 --> 00:15:37.200
But if you aren't tracking the leader, you're always going to be a little bit behind, right?

00:15:37.200 --> 00:15:37.480
Right.

00:15:37.480 --> 00:15:37.960
Right.

00:15:37.960 --> 00:15:38.060
Right.

00:15:38.060 --> 00:15:43.280
So I guess before we move off of just WebAssembly on its own and dig into Pyodide,

00:15:43.280 --> 00:15:45.080
How well supported is it?

00:15:45.080 --> 00:15:47.600
Like WebAssembly sounds like new and futuristic.

00:15:47.600 --> 00:15:49.700
How well supported is this?

00:15:49.700 --> 00:15:54.740
It's in all the major browsers, in the stable versions of all the major browsers right now.

00:15:54.740 --> 00:15:57.100
So it's pretty easy to rely on it.

00:15:57.100 --> 00:16:00.840
It's at what they're calling sort of this MVP level of WebAssembly.

00:16:00.840 --> 00:16:05.780
They sort of decided which features were the most critical, and that's everywhere.

00:16:05.780 --> 00:16:13.000
Then there's a bunch of ways in which WebAssembly is already planning to be improved that will eventually trickle down to browsers.

00:16:13.000 --> 00:16:17.340
So things like threading are newer features that are coming down.

00:16:17.760 --> 00:16:19.960
And garbage collection is going to be added.

00:16:19.960 --> 00:16:23.340
So there's a bunch of things that are coming that you can't rely on yet.

00:16:23.340 --> 00:16:27.160
But for the core stuff, and actually for most of the stuff we needed for Pyodide, it's already there.

00:16:29.140 --> 00:16:33.240
This portion of Talk Python is sponsored by Microsoft and Visual Studio Code.

00:16:33.240 --> 00:16:40.540
Visual Studio Code is a free, open-source, and lightweight code editor that runs on Mac, Linux, and Windows with rich Python support.

00:16:40.540 --> 00:16:48.620
Download Visual Studio Code and install the Python extension to get coding with support for tools you love like Jupyter, Black formatting, pylint, pytest, and more.

00:16:48.820 --> 00:16:55.860
And just announced this month, you can now work with remote Python code bases using the new Visual Studio Code remote extensions.

00:16:55.860 --> 00:17:03.020
Use the full power of Visual Studio Code when coding in containers, in Windows Subsystem for Linux, and over SSH connections.

00:17:03.020 --> 00:17:04.060
Yep, that's right.

00:17:04.060 --> 00:17:08.040
Autocompletions, debugging, the terminal, source control, your favorite extensions.

00:17:08.040 --> 00:17:10.860
Everything works just right in the remote environment.

00:17:10.860 --> 00:17:15.500
Get started with Visual Studio Code now at talkpython.fm/Microsoft.

00:17:17.500 --> 00:17:25.140
So I opened up Can I Use, and I'll put a link to the WebAssembly report for Can I Use, which talks about which browsers support what.

00:17:25.140 --> 00:17:30.940
So it looks like Edge, Firefox, Chrome, Safari, Opera, all those desktop browsers support it.

00:17:30.940 --> 00:17:36.740
Safari on iOS supports it, and on Android, Chrome and Firefox support it.

00:17:36.740 --> 00:17:39.100
That kind of sounds like 99%, right?

00:17:39.100 --> 00:17:39.880
Yeah, yeah.

00:17:39.880 --> 00:17:42.660
I'm excited to see things like threading and stuff coming as well.

00:17:42.660 --> 00:17:46.380
There's a possibility for more interesting stuff to come along.

00:17:46.380 --> 00:17:46.860
Yeah.

00:17:46.980 --> 00:17:47.080
Cool.

00:17:47.080 --> 00:17:47.400
All right.

00:17:47.400 --> 00:17:49.600
So that brings us to Pyodide.

00:17:49.600 --> 00:17:58.320
Now, there's a couple of interesting projects that are saying WebAssembly plus other language plus interesting runtime means something in the browser.

00:17:58.320 --> 00:18:00.220
What did it mean for you guys?

00:18:00.420 --> 00:18:13.520
The reason Pyodide started is when I arrived at Mozilla, Hamilton Ulmer and Brendan Colloran, and also William Lachance and Teon Brooks, were working on this sort of internal skunkworks project for data science at Mozilla.

00:18:13.520 --> 00:18:23.160
The idea was data scientists at Mozilla were using a lot of things like Jupyter Notebooks, or there's a tool called Databricks that's very similar.

00:18:23.160 --> 00:18:35.940
And the problem with these tools was that sharing them is harder than maybe it needs to be because you have a front end on the web, but your actual computation is happening somewhere else.

00:18:35.940 --> 00:18:39.460
It might be on the same machine, but often it's a remote kernel somewhere else.

00:18:39.460 --> 00:18:39.660
Right.

00:18:39.660 --> 00:18:46.420
So you need maybe access to that compute cluster, or if you want to run it locally, you've got to pip install a bunch of stuff, right?

00:18:46.420 --> 00:18:47.580
You're like, oh, you can run this.

00:18:47.580 --> 00:18:50.980
It's easy to run, except for you have to now set up a virtual environment.

00:18:50.980 --> 00:18:52.440
You probably should use Minicon.

00:18:52.560 --> 00:18:53.040
And here's your stuff.

00:18:53.040 --> 00:18:53.780
You're like, whoa, whoa, whoa.

00:18:53.780 --> 00:18:55.100
I just want to look at the report.

00:18:55.100 --> 00:18:55.740
What is this, right?

00:18:55.740 --> 00:18:57.680
Like, for a lot of folks, that's super overwhelming.

00:18:57.680 --> 00:19:08.480
Yeah, your choices are generally, you either require people to install, which, like you say, it's very difficult, or you have to sort of pay for some cloud computing resources somehow.

00:19:08.480 --> 00:19:17.260
And so if you were to put that on, say, a public website to share your data science, you might end up with an unexpectedly huge bill, perhaps, right?

00:19:17.260 --> 00:19:17.560
Right.

00:19:17.560 --> 00:19:21.300
The worst case thing happens, exactly what you want, is a lot of people get interested in it.

00:19:21.300 --> 00:19:21.760
Exactly.

00:19:22.020 --> 00:19:22.380
Exactly.

00:19:22.380 --> 00:19:30.860
And so the idea with Iodide was, let's move all the computation into the browser, and then the computation's all happening at the edges in people's clients, right?

00:19:30.860 --> 00:19:31.100
Right.

00:19:31.100 --> 00:19:34.380
And to be clear for people, it's Iodide, not Pyodide, right?

00:19:34.380 --> 00:19:34.520
Exactly.

00:19:34.520 --> 00:19:36.120
This is another project.

00:19:36.120 --> 00:19:36.720
Yeah, okay.

00:19:36.720 --> 00:19:37.120
Yeah.

00:19:37.120 --> 00:19:37.740
Yeah.

00:19:37.740 --> 00:19:39.120
So that's Iodide.

00:19:39.120 --> 00:19:42.680
It's sort of this data science user interface front end, right?

00:19:42.680 --> 00:19:48.380
And the first iteration of that, of course, was using JavaScript to do the data science.

00:19:49.060 --> 00:19:53.400
And you start to realize, well, JavaScript's not a great language for that.

00:19:53.400 --> 00:20:01.220
There's not this sort of mature ecosystem of libraries and tools you can just pull off the shelf like there is in Python or R or Julia, right?

00:20:01.580 --> 00:20:07.140
And then there's even some kind of really nice to have language features that JavaScript doesn't have.

00:20:07.140 --> 00:20:12.340
Like it doesn't have operator overloading, which is really handy when you're doing a lot of numerical computation.

00:20:12.340 --> 00:20:12.700
Right.

00:20:12.700 --> 00:20:14.300
And numbers are weird, right?

00:20:14.360 --> 00:20:17.520
Like you can't have true integers, for example, and other stuff.

00:20:17.520 --> 00:20:17.740
Yeah.

00:20:17.740 --> 00:20:18.420
Exactly.

00:20:18.420 --> 00:20:22.900
And there's a lot of movement in that space trying to make that all better.

00:20:22.900 --> 00:20:24.500
But that's kind of a big lift.

00:20:24.500 --> 00:20:26.780
And definitely the Iodide project wants to encourage that.

00:20:26.780 --> 00:20:29.600
And we're working on that kind of in one thread.

00:20:29.600 --> 00:20:37.740
But in the meantime, we thought, why don't we try and come to data scientists where they are, which is in Python, and somehow bring Python to the browser.

00:20:38.140 --> 00:20:42.220
And this seemed like a very crazy idea to me when it was first raised.

00:20:42.220 --> 00:20:50.820
But fortunately, like I said, at Mozilla, we have a bunch of WebAssembly experts who got on a meeting with us and said, it can't be that hard.

00:20:50.820 --> 00:20:53.140
Other people have done things much more difficult than that.

00:20:53.140 --> 00:20:54.540
So why don't you just try it?

00:20:54.540 --> 00:21:06.480
So I went off and found this project by a GitHub user named dgym called cpython-emscripten, which had done a lot of the initial footwork on this.

00:21:06.620 --> 00:21:11.160
And starting with that, was able to get something going in probably a couple weeks.

00:21:11.160 --> 00:21:13.740
That's kind of like that 80% in 20 weeks.

00:21:13.740 --> 00:21:17.220
And then the remaining 20% is, sorry, in two weeks.

00:21:17.220 --> 00:21:18.960
And then the remaining 20% takes forever.

00:21:18.960 --> 00:21:22.160
But certainly getting to the proof of concept was pretty quick.

00:21:22.160 --> 00:21:29.900
And realizing, hey, you can actually compile the real CPython interpreter and get that to run and then have the real NumPy loading in.

00:21:29.900 --> 00:21:31.840
And all those things actually do kind of work.

00:21:31.840 --> 00:21:32.740
It was pretty exciting.

00:21:32.740 --> 00:21:34.020
Yeah, that's pretty amazing.

00:21:34.020 --> 00:21:41.780
And how much of the CPython, like how big is the CPython.wasm or whatever it's called?

00:21:41.780 --> 00:21:46.200
Like the core runtime bits you got to take down before you can start doing stuff, just roughly.

00:21:46.200 --> 00:21:47.240
Yeah, roughly.

00:21:47.240 --> 00:21:50.140
So I'm actually pulling up those numbers because they change all the time.

00:21:50.140 --> 00:21:53.820
It's about 20 megabytes for Python core.

00:21:53.820 --> 00:21:57.560
And then, of course, the libraries you're going to pull in will add to that.

00:21:57.680 --> 00:21:59.960
So NumPy is around 8 megabytes.

00:21:59.960 --> 00:22:01.400
It goes from there.

00:22:01.400 --> 00:22:06.520
But one of the things that Pyodide does is it only downloads the libraries you actually import.

00:22:06.520 --> 00:22:06.880
I see.

00:22:06.880 --> 00:22:11.100
If you don't import Matplotlib or NumPy, like those are not things it has to go hit, right?

00:22:11.100 --> 00:22:11.820
Exactly.

00:22:11.820 --> 00:22:15.140
And also your browser will cache those things.

00:22:15.140 --> 00:22:18.040
So the first time you'll pay the network penalty.

00:22:18.180 --> 00:22:20.200
But once it's been done once, it's on your machine.

00:22:20.200 --> 00:22:25.840
And it will actually recompile the WebAssembly each time, but it doesn't actually have to download it again.
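
NOTE
Editor's sketch, not part of the spoken conversation. It illustrates the idea that the first visit pays for the core runtime plus only the packages the code actually imports. The sizes are the rough figures quoted above (about 20 MB for the core, around 8 MB for NumPy); the function names are hypothetical and this is not how Pyodide itself discovers imports.
```python
# Hypothetical sketch: estimate a first-visit download from a notebook's imports.
import ast
# Approximate sizes in megabytes, taken from the figures quoted in the conversation.
CORE_MB = 20
PACKAGE_MB = {"numpy": 8}  # other packages omitted: their sizes are not given here
def imported_packages(source: str) -> set[str]:
    """Return the top-level package names that the source imports."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names
def first_visit_download_mb(source: str) -> int:
    """Core runtime plus only the known packages the code actually imports."""
    needed = [name for name in imported_packages(source) if name in PACKAGE_MB]
    return CORE_MB + sum(PACKAGE_MB[name] for name in needed)
print(first_visit_download_mb("import numpy as np\nprint(np.arange(3))"))  # 28
print(first_visit_download_mb("print('hello')"))  # 20
```
On later visits the browser cache would cover the download itself, so only the recompilation cost mentioned above remains.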

00:22:25.840 --> 00:22:26.140
Cool.

00:22:26.140 --> 00:22:29.420
So that seems like a really good opportunity for a CDN.

00:22:29.420 --> 00:22:32.680
And the fewer CDNs that host this, the better, right?

00:22:32.680 --> 00:22:35.160
Like the more it is shared, the better, honestly.

00:22:36.080 --> 00:22:45.900
You know, when I think about these types of things, one thing when I speak to folks like you, because you actually are within a sphere of influence that could possibly have some kind of difference.

00:22:45.900 --> 00:22:49.640
It's cool that you can put that on a CDN and you can download it and make it go.

00:22:49.640 --> 00:22:55.140
But, you know, what if like JavaScript is baked into Firefox?

00:22:55.440 --> 00:23:01.440
What if CPython in WebAssembly was part of Firefox?

00:23:01.440 --> 00:23:06.300
Like when I update Firefox, I get that latest runtime and I don't have to download it ever.

00:23:06.300 --> 00:23:08.220
I don't know if it's a silly idea or not.

00:23:08.220 --> 00:23:15.400
I mean, I remember from, you know, probably at least 20 years ago, people have talked about JavaScript's not a great language.

00:23:15.400 --> 00:23:17.860
Let's put a better language in the browser instead, right?

00:23:17.860 --> 00:23:20.340
That's been a theme for a very long time.

00:23:20.340 --> 00:23:25.120
And what's interesting is now that we have WebAssembly, we sort of no longer have to pick one.

00:23:25.120 --> 00:23:27.340
Which is a nice situation to be in.

00:23:27.340 --> 00:23:29.240
We can get any of them in there.

00:23:29.240 --> 00:23:33.060
Are there ways that it could be done like more efficiently or more conveniently?

00:23:33.060 --> 00:23:33.700
Certainly.

00:23:33.700 --> 00:23:44.540
Like I think, you know, maybe there'll be some like web extension you download that gives you languages so that it just kind of will always update those in the background and you don't have to worry about this kind of stuff.

00:23:44.540 --> 00:23:49.660
There's ways that you could make it almost like a browser just with a little bit of extra added to make.

00:23:49.660 --> 00:23:52.680
Yeah, I'm not nearly proposing Python in Firefox.

00:23:52.680 --> 00:23:58.500
I'm proposing Firefox come with these preloaded or like you say in the background, like preload the latest one or something.

00:23:58.500 --> 00:24:00.320
And then you could have Python.

00:24:00.320 --> 00:24:01.380
You could have C#.

00:24:01.380 --> 00:24:07.600
You could have all the languages that are like building these runtimes in WebAssembly and make them available.

00:24:07.600 --> 00:24:09.720
And just I think that would be super cool, actually.

00:24:10.020 --> 00:24:17.140
Yeah, it definitely would open up the possibility for, say, writing web applications using this technology.

00:24:17.140 --> 00:24:28.640
So one of the things I do warn people about Pyodide is the thing that's cool about Pyodide is you can actually see the Python in your web browser and run it there, which is great for the data science use case.

00:24:28.640 --> 00:24:28.940
Right.

00:24:29.880 --> 00:24:36.320
But if you don't have a use case where you need to show people the code, this is probably not how you want to write your web app.

00:24:36.320 --> 00:24:37.020
Yeah, for sure.

00:24:37.020 --> 00:24:42.720
You probably want to stay in a JavaScript world, which is a lot more efficient and a lot faster and all these things, right?

00:24:42.800 --> 00:24:56.680
Yeah, so Pyodide is very much focused on basically what you do with Jupyter notebooks, but make that execution happen on the client side in the browser, not connected back to a Docker thing or some kernel elsewhere, right?

00:24:56.680 --> 00:25:00.420
Like that is the use case and that's what it's built for and optimized.

00:25:00.420 --> 00:25:00.660
Exactly.

00:25:00.660 --> 00:25:03.460
That's the original use case that came out of Pyodide.

00:25:03.800 --> 00:25:15.960
What's interesting is since then, we've been talking with the WebAssembly folks again who are really pushing the idea of WebAssembly as a containerization technology.

00:25:15.960 --> 00:25:22.520
So WebAssembly that doesn't actually run in a browser but would run on cloud computers, right?

00:25:22.520 --> 00:25:22.900
Yes.

00:25:22.900 --> 00:25:31.120
Because it provides a really nice sandboxed way of running arbitrary code, but it does it in a way that's actually a lot lighter weight than Docker, right?

00:25:31.120 --> 00:25:41.600
So like Docker is maybe the industry standard for this right now, but Docker essentially you're taking a whole Linux distribution and a whole OS and shoving that in a container and passing that around.

00:25:41.600 --> 00:25:41.880
Right.

00:25:41.880 --> 00:25:46.360
And you require the kernels to match of the host and the container, right?

00:25:46.360 --> 00:25:48.060
Whereas WebAssembly doesn't care.

00:25:48.060 --> 00:25:54.240
You know, there are projects like Wasmer that are bringing this to Python already, so it's not a far-fetched idea.

00:25:54.240 --> 00:25:55.040
No, it's not.

00:25:55.040 --> 00:26:11.500
I think it has a lot of promise, and for our own little data science community, what excites us about it is we can use the browser as like a prototyping tool where the computation is happening locally and it's really fast, maybe while you're working on a small part of your data.

00:26:11.860 --> 00:26:22.300
But then to be able to very smoothly say, I'm now going to run that on a cluster without having to change anything and having it still be built on the same technologies could be a really powerful thing.

00:26:22.460 --> 00:26:26.920
This is all kind of in the pie-in-the-sky dreaming stage for us.

00:26:26.920 --> 00:26:28.300
Yeah, but it's a sweet pie.

00:26:28.300 --> 00:26:29.120
It looks really nice.

00:26:29.120 --> 00:26:29.360
Yeah.

00:26:29.360 --> 00:26:29.900
Exactly.

00:26:29.900 --> 00:26:38.040
You could bring in things like Dask as well to help you do distributed computation without people really even knowing or caring that that's happening.

00:26:38.040 --> 00:26:42.840
So there's a lot of stuff that could like expand, expand to the servers, expand to clusters, and so on.

00:26:42.840 --> 00:26:43.300
Exactly.

00:26:43.300 --> 00:26:43.820
Yeah.

00:26:43.820 --> 00:26:44.140
Cool.

00:26:44.140 --> 00:26:47.200
So I've seen some cool examples of this already working.

00:26:47.200 --> 00:26:55.480
Like I pulled up the, was it LA data or some kind of like city map, the one that came from your article that you recently published, but there's a live example and a link to that.

00:26:55.480 --> 00:26:56.800
It's pretty interesting.

00:26:56.800 --> 00:26:58.120
It's doing real data science.

00:26:58.120 --> 00:26:59.040
It's doing real computation.

00:26:59.040 --> 00:27:02.880
So it takes like 15 seconds to load, right?

00:27:02.880 --> 00:27:03.240
Yeah.

00:27:03.240 --> 00:27:06.920
Even if you've already cached the download, it's like computing for 15 seconds.

00:27:06.920 --> 00:27:07.920
So you got to be patient.

00:27:07.920 --> 00:27:11.920
But then it comes up and it's got a really wonderful graph and visualizations.

00:27:11.920 --> 00:27:13.880
Like how do you do the visualizations?

00:27:14.000 --> 00:27:17.340
There's probably a lot of JavaScript and other tie-ins there, right?

00:27:17.340 --> 00:27:17.740
Yeah.

00:27:17.740 --> 00:27:19.900
So it actually depends.

00:27:19.900 --> 00:27:22.020
There's a lot of different ways you can do it.

00:27:22.020 --> 00:27:33.520
But that example you're talking about with the map and the call data, there we're using a JavaScript library called Regal, which is sort of a WebGL front end to do that 3D plotting.

00:27:33.520 --> 00:27:37.540
And it's nice because it uses the hardware to do the 3D acceleration and works really well.

00:27:37.540 --> 00:27:45.700
But you can just as easily use Matplotlib, like the exact, full Python Matplotlib, and do your plotting there.

00:27:45.700 --> 00:27:46.980
And that all works.

00:27:46.980 --> 00:27:54.460
There's only maybe 300 lines of code that kind of glue Matplotlib to a web canvas that had to be written to make that work.

00:27:54.460 --> 00:27:57.040
But otherwise, that's pure Python doing your plotting.

00:27:57.040 --> 00:28:03.620
But then we've also played around with tools like Plotly, which is a JavaScript plotting library.

00:28:03.620 --> 00:28:09.660
One of the things that Pyodide does really well is it can share your data between Python and JavaScript without copying it.

00:28:09.900 --> 00:28:12.060
So it's really fast to share the data back and forth.

00:28:12.060 --> 00:28:23.000
So you can do like all your heavy lifting computation in Pandas in Python and then just ship that over to the JavaScript side for plotting and kind of mix and match.
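
NOTE
Editor's sketch, not part of the spoken conversation. It is only an analogy for sharing data without copying: two handles onto one block of memory, using the standard library's memoryview. It does not show Pyodide's actual Python-to-JavaScript bridge, and the variable names are illustrative.
```python
# Analogy only: two views over one buffer, standing in for Python and JavaScript.
from array import array
samples = array("d", [1.0, 2.0, 3.0, 4.0])  # the "Python side" owns the memory
shared_view = memoryview(samples)            # the "other side" gets a view, not a copy
copied = samples.tolist()                    # by contrast, this duplicates the data
samples[0] = 100.0                           # a write on one side...
print(shared_view[0])                        # ...is visible through the view: 100.0
print(copied[0])                             # the copy still holds the old value: 1.0
shared_view[3] = -1.0                        # writes through the view go back, too
print(samples[3])                            # -1.0
```
The point of the design is that handing over a view costs the same no matter how large the data is, whereas a copy grows with the data.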

00:28:23.000 --> 00:28:27.240
So it's kind of exciting that like you no longer have to say, well, I'm working in Python.

00:28:27.240 --> 00:28:31.780
So my choices for plotting are Matplotlib and maybe Bokeh and whatever the Python ones are.

00:28:31.780 --> 00:28:35.240
You can choose from both Python and JavaScript ecosystems.

00:28:35.680 --> 00:28:43.420
Maybe in this crazy future, you could also import some other WebAssembly based visualization thing that you don't even know what it's written in, right?

00:28:43.420 --> 00:28:43.940
Like who knows?

00:28:43.940 --> 00:28:45.600
Maybe it's in Swift or something crazy, right?

00:28:45.600 --> 00:28:46.680
But yeah, interesting.

00:28:46.680 --> 00:28:46.960
Yep.

00:28:46.960 --> 00:28:47.240
Okay.

00:28:47.240 --> 00:28:55.600
One thing I want to ask you is like if I'm a data scientist and I'm listening to this and I'm like super excited, should I be excited today or should I be excited in like a year and a half?

00:28:55.600 --> 00:29:00.120
Is this something I can reasonably use now or is that a cool proof of concept or like what's the status?

00:29:00.120 --> 00:29:03.720
I would say you have to have a little bit of patience still.

00:29:03.720 --> 00:29:04.480
Okay.

00:29:04.720 --> 00:29:14.080
You know, we certainly encourage people to come up and try to do the kind of things they're doing in Jupyter now in Iodide and Pyodide and sort of see where some of the rough edges are.

00:29:14.080 --> 00:29:16.100
We do actually have it.

00:29:16.100 --> 00:29:22.620
It is being used for real work within Mozilla for data science and people are using the Python parts of it and stuff.

00:29:22.620 --> 00:29:27.900
So if you sort of know where the boundaries are and what you can get away with, it's already working.

00:29:27.900 --> 00:29:29.380
It's sharp and jagged over there.

00:29:29.380 --> 00:29:30.040
Don't go over there.

00:29:30.040 --> 00:29:30.460
Exactly.

00:29:30.460 --> 00:29:31.700
Yeah, yeah, yeah.

00:29:31.700 --> 00:29:32.040
Right.

00:29:32.040 --> 00:29:36.780
But I would be being disingenuous if I said it was ready for everything that people might want to do.

00:29:36.780 --> 00:29:37.120
Sure.

00:29:37.120 --> 00:29:42.260
I guess the way you find out and the way you get it ready is people try to use it and you're like, wait, everyone's trying to do this and it doesn't do that.

00:29:42.260 --> 00:29:43.900
Well, maybe that's something it gets.

00:29:44.080 --> 00:29:44.400
Exactly.

00:29:44.400 --> 00:29:49.440
No, it's really helpful for us, actually, because we've been getting some really great bug reports.

00:29:49.440 --> 00:30:00.960
We had a blog post last week that kind of brought a lot more traffic to the site and that turned into a lot of really great bug reports of things that seem obvious in hindsight, but we had never thought to check, is that going to work?

00:30:00.960 --> 00:30:01.200
Yeah.

00:30:01.200 --> 00:30:02.120
That's really helpful.

00:30:02.120 --> 00:30:07.100
And also, it helps us prioritize what features need to be added.

00:30:07.100 --> 00:30:13.860
If 10 out of the 20 people that show up all have problems with the same thing, well, that's a pretty good sign that that's where we should focus effort.

00:30:13.860 --> 00:30:15.460
Yeah, that makes total sense.

00:30:15.460 --> 00:30:18.100
Are there full-time employees working on it at Mozilla?

00:30:18.340 --> 00:30:21.180
What is its status as a project for you all?

00:30:21.180 --> 00:30:39.200
So there's probably a total of about three FTEs working on it divided among five people within Mozilla, and it's still primarily sort of an internally devoted project in that our internal users are kind of helping us prioritize what gets worked on and move forward.

00:30:39.200 --> 00:30:47.860
We do, of course, have the public-facing website at iodide.io where anybody can come up and create notebooks, and we do look at that, too.

00:30:47.860 --> 00:30:48.980
And it's all open source.

00:30:48.980 --> 00:30:49.280
Yeah.

00:30:49.280 --> 00:31:00.220
And we're just sort of hoping that if we can bootstrap it enough internally and prove it as a useful tool, then we can get, hopefully, some more resourcing to kind of make it something that will serve a broader community.

00:31:00.220 --> 00:31:01.160
Yeah, that's super cool.

00:31:01.160 --> 00:31:11.520
So if I am a person who maintains or is somehow in charge of a data science package that is not available in there, I'm guessing there are some that are not available.

00:31:11.520 --> 00:31:12.540
Is that right?

00:31:12.540 --> 00:31:12.880
Yeah.

00:31:12.880 --> 00:31:13.080
Yeah.

00:31:13.080 --> 00:31:13.300
Yeah.

00:31:13.300 --> 00:31:15.640
So if I am, how do I get mine?

00:31:15.840 --> 00:31:17.620
Like, I run, you know, whatever.

00:31:17.620 --> 00:31:20.520
And I really want that available alongside NumPy.

00:31:20.520 --> 00:31:21.540
How do I make that happen?

00:31:21.860 --> 00:31:27.960
So right now, all our package building is built as part of kind of this monolithic tree of make files.

00:31:28.480 --> 00:31:36.840
So to add a new package, you basically add it to the Pyodide source code, and then that causes it to automatically get built and then eventually distributed.

00:31:37.100 --> 00:31:53.800
Where I'd really like to move is to something that works more like conda-forge, and maybe even literally to use conda-forge if we can make that work, so that anybody could just walk up and contribute a package sort of in their own repo and that would automatically get picked up and then distributed.

00:31:53.800 --> 00:31:57.960
So it would be a more like distributed build system than what it is now.

00:31:57.960 --> 00:31:58.260
Oh, yeah.

00:31:58.260 --> 00:31:59.240
That's pretty interesting.

00:31:59.240 --> 00:31:59.560
Yeah.

00:31:59.560 --> 00:32:00.380
That's a cool idea.

00:32:00.380 --> 00:32:00.600
Yeah.

00:32:00.600 --> 00:32:08.260
To just basically have approved sources for packages and you guys just continually pull them kind of like, like you said, like a CI system almost.

00:32:08.260 --> 00:32:08.760
Exactly.

00:32:08.760 --> 00:32:09.240
Yeah.

00:32:09.240 --> 00:32:09.640
Yeah.

00:32:09.640 --> 00:32:18.400
So you'd have individual package maintainers who could maintain their own package, but the sort of infrastructure that makes that all work would be centralized.

00:32:18.400 --> 00:32:19.380
That's the idea.

00:32:19.380 --> 00:32:19.720
Yeah.

00:32:19.720 --> 00:32:19.960
Right.

00:32:19.960 --> 00:32:20.660
Is it hard?

00:32:20.660 --> 00:32:28.340
If I have something that has like some C section and some Python section, what's the overhead to get it working in this?

00:32:28.340 --> 00:32:29.340
It varies.

00:32:29.340 --> 00:32:33.780
So like NumPy that has C, but it's fairly straightforward.

00:32:33.780 --> 00:32:36.500
C is not too bad.

00:32:36.880 --> 00:32:42.140
One of the things we've really struggled with is SciPy because SciPy actually has a fair bit of Fortran.

00:32:42.140 --> 00:32:47.620
And so Fortran is its whole other set of problems for WebAssembly.

00:32:47.620 --> 00:32:54.200
Basically, there's not a good compiling option for anything that's not Fortran 77 right now.

00:32:54.200 --> 00:32:56.220
So we have to kind of push that forward somehow.

00:32:56.220 --> 00:33:02.360
So there's kind of this range of easy to hard and it's hard to know maybe up front how hard something's going to be.

00:33:02.360 --> 00:33:04.600
Pure Python stuff is very easy.

00:33:04.600 --> 00:33:05.220
Pure Python.

00:33:05.440 --> 00:33:17.160
There's actually even a little helper script where, if you run it with the name of the package on PyPI, it will automatically generate the makefile needed to build it as part of Pyodide, and it automatically goes in.

00:33:17.160 --> 00:33:20.380
You know, you submit that as a PR and you're good to go.

00:33:20.380 --> 00:33:20.960
That's pretty cool.

00:33:21.180 --> 00:33:27.420
So when say my Python, suppose I have a pure Python package on PyPI and I want to have it in there.

00:33:27.420 --> 00:33:31.780
Does it get somehow compiled to WebAssembly?

00:33:31.780 --> 00:33:36.420
Does WebAssembly itself just like become an interpreter and just use Python bytecode?

00:33:36.420 --> 00:33:38.900
Like, do you know what the process there is?

00:33:38.900 --> 00:33:39.260
Yeah.

00:33:39.760 --> 00:33:44.680
So Pyodide is literally running the Python interpreter inside your browser.

00:33:44.680 --> 00:33:51.260
So if you have a pure Python package, it's actually just shipping Python to that interpreter running in the browser.

00:33:51.260 --> 00:33:52.600
And that's how it runs.

00:33:52.600 --> 00:33:55.400
So it basically just works off .pyc bytecode.

00:33:55.400 --> 00:33:55.680
Exactly.

00:33:56.240 --> 00:34:00.980
And just feeds it off to the interpreter, which just happens to be executing not in C but in WebAssembly.

00:34:00.980 --> 00:34:01.540
Exactly.
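
NOTE
Editor's sketch, not part of the spoken conversation. It walks through, by hand, what an ordinary CPython interpreter does with pure Python source: compile the text to bytecode, then execute it. The description above suggests the same steps happen in the browser, just with the interpreter itself compiled to WebAssembly. The source string and the "<shipped module>" label are made up for illustration.
```python
# Sketch: the same steps CPython takes with a pure Python module, done by hand.
import dis
source = "def double(x):\n    return x * 2\n"       # stands in for a package's .py file
code = compile(source, "<shipped module>", "exec")  # source text to bytecode
namespace = {}
exec(code, namespace)                 # the interpreter runs the bytecode
print(namespace["double"](21))        # 42
dis.dis(namespace["double"])          # shows the bytecode; output varies by version
```
Nothing in these steps is specific to the platform the interpreter runs on, which is why a pure Python package needs no separate compilation to WebAssembly.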

00:34:01.540 --> 00:34:03.500
That was my first guess.

00:34:03.500 --> 00:34:08.840
But I thought maybe there's some other magic like, oh, we had to add a JIT compiler to this or, you know, some weird thing.

00:34:08.840 --> 00:34:09.660
No.

00:34:09.660 --> 00:34:16.860
And so for that reason, what you're getting is something that performs pretty similar to CPython, right?

00:34:16.860 --> 00:34:23.740
Something like PyPy.js has, of course, opportunities to JIT the Python itself and potentially get a lot more performance.

00:34:23.740 --> 00:34:27.880
So we're not doing anything sophisticated like that in Pyodide at this point.

00:34:27.880 --> 00:34:28.120
Yeah.

00:34:28.120 --> 00:34:28.520
Yeah.

00:34:28.520 --> 00:34:34.760
So because we have a really nice JIT sitting there in the browser, it's certainly there and could be used at some point.

00:34:34.760 --> 00:34:35.500
It's certainly possible.

00:34:35.500 --> 00:34:39.400
I know Brett Cannon worked on Pyjion, I think is what it was called.

00:34:39.400 --> 00:34:39.580
Yeah.

00:34:39.580 --> 00:34:47.780
Which was looking at some of the JavaScript JIT stuff and seeing how that could be applied back into the CPython runtime.

00:34:47.780 --> 00:34:48.220
I don't know.

00:34:48.220 --> 00:34:49.420
Maybe there's some synergy there.

00:34:49.420 --> 00:34:49.900
Who knows?

00:34:49.900 --> 00:34:54.520
Because it sounds like they're all kind of swimming in the same soup of ingredients or whatever.

00:34:54.520 --> 00:34:55.060
Absolutely.

00:34:55.060 --> 00:34:55.580
Yeah.

00:34:55.580 --> 00:35:00.040
What does the performance of WebAssembly relative to JavaScript look like?

00:35:00.040 --> 00:35:08.340
So if I had like my Quake example and I ran Quake over asm.js and I ran it in WebAssembly, what would I get?

00:35:08.480 --> 00:35:10.140
I don't actually know those numbers.

00:35:10.140 --> 00:35:14.180
I'm sure that WebAssembly at this point is quite a bit better than asm.js.

00:35:14.180 --> 00:35:17.200
The numbers I do have is what happens to Python.

00:35:17.200 --> 00:35:23.340
So like comparing Python running natively on a machine versus inside the browser.

00:35:23.740 --> 00:35:30.600
And you generally get anywhere between like the same speed and 12 to 20 times slower.

00:35:30.600 --> 00:35:30.960
Okay.

00:35:30.960 --> 00:35:32.900
And what seems to matter.

00:35:32.900 --> 00:35:40.760
So if your Python application is mainly just calling NumPy operations, which are at the bottom sort of C tight inner loops.

00:35:40.760 --> 00:35:40.960
Yeah.

00:35:40.960 --> 00:35:44.180
That stuff tends to be pretty much the same speed.

00:35:44.180 --> 00:35:44.840
Right.

00:35:45.080 --> 00:35:49.600
Those tight C loops tend to be pretty much the same in WebAssembly as out of WebAssembly.

00:35:49.600 --> 00:35:55.420
If you're doing a lot of looping in Python or calling a lot of Python functions, those things tend to get quite a bit slower.

00:35:55.420 --> 00:36:04.100
And the reason is that the Python interpreter is basically calling a lot of C function pointers all the time.

00:36:04.100 --> 00:36:05.660
That's sort of kind of how it works.

00:36:05.660 --> 00:36:09.060
It's calling other C code through C function pointers.

00:36:09.060 --> 00:36:10.980
PyObject*, all over the place.

00:36:10.980 --> 00:36:11.180
Yeah.

00:36:11.180 --> 00:36:11.600
Exactly.

00:36:11.600 --> 00:36:12.400
Yeah.

00:36:12.680 --> 00:36:17.820
And calling a C function pointer in WebAssembly is quite a bit slower than it is on native.

00:36:17.820 --> 00:36:18.320
Okay.

00:36:18.320 --> 00:36:24.900
For reasons that I don't know if I could fully articulate, but probably partly related to the security model.

00:36:24.900 --> 00:36:27.860
Like there's just ways in which that's a lot slower.

00:36:27.860 --> 00:36:33.820
And unfortunately, the way the Python interpreter is designed is making lots of C function pointer calls all over the place.
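
NOTE
Editor's sketch, not part of the spoken conversation. It contrasts the two kinds of work described above: a loop driven by the interpreter with one Python-level function call per element, and the same reduction handed to a C-implemented built-in. Here the built-in sum stands in for a NumPy operation so the sketch needs only the standard library; that substitution is an assumption. No timings are quoted because they depend on the machine and on whether the code runs natively or in the browser.
```python
# Micro-benchmark sketch; absolute numbers depend on the machine and runtime.
import timeit
values = list(range(100_000))
def add(a, b):
    return a + b
def python_loop(data):
    """Loop in Python, with one Python-level function call per element."""
    total = 0
    for item in data:
        total = add(total, item)
    return total
def c_loop(data):
    """Delegate the whole loop to the built-in sum, which is implemented in C."""
    return sum(data)
assert python_loop(values) == c_loop(values)
slow = timeit.timeit(lambda: python_loop(values), number=20)
fast = timeit.timeit(lambda: c_loop(values), number=20)
print(f"Python-level loop: {slow:.4f}s, built-in sum: {fast:.4f}s")
```
Running the same script natively and in the browser would be one way to see which of the two cases carries more of the slowdown discussed here.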

00:36:33.820 --> 00:36:33.960
Right.

00:36:33.960 --> 00:36:34.740
But it might not matter.

00:36:34.740 --> 00:36:35.420
It depends.

00:36:35.420 --> 00:36:35.780
Right.

00:36:35.780 --> 00:36:36.280
Exactly.

00:36:36.600 --> 00:36:43.700
So it's five times slower might make it go from half a millisecond to, I don't know, 2.5 milliseconds.

00:36:43.700 --> 00:36:46.120
And like the user doesn't perceive these or care about them.

00:36:46.120 --> 00:36:46.300
Right.

00:36:46.300 --> 00:36:49.720
But now all of a sudden you have this great new deployment story and this execution engine.

00:36:49.720 --> 00:36:53.840
And like what it opens up is way more valuable than that amount of slowness potentially.

00:36:53.840 --> 00:36:54.440
Exactly.

00:36:54.660 --> 00:36:57.820
It's certainly within the realm of like I can live with that.

00:36:57.820 --> 00:36:59.500
You know, it's not hundreds of times slower.

00:36:59.500 --> 00:36:59.940
Right.

00:36:59.940 --> 00:37:00.160
Yeah.

00:37:00.160 --> 00:37:00.660
Yeah.

00:37:00.660 --> 00:37:01.000
Interesting.

00:37:01.000 --> 00:37:01.900
Cython.

00:37:01.900 --> 00:37:08.500
Can I do some, like I've got some doubly nested loop and I know that that's the problem.

00:37:08.500 --> 00:37:13.420
Can I Cythonize that puppy and then WebAssembly the result or something?

00:37:13.660 --> 00:37:13.960
Yeah.

00:37:13.960 --> 00:37:14.060
Yeah.

00:37:14.060 --> 00:37:16.300
So Cython works just fine.

00:37:16.300 --> 00:37:20.260
And in fact, like Pandas is largely written in Cython.

00:37:20.260 --> 00:37:22.700
So in order to get that to work, that needed to work.

00:37:22.700 --> 00:37:29.640
But all of that compilation happens ahead of time on the native machine before shipping it

00:37:29.640 --> 00:37:30.080
to the browser.

00:37:30.080 --> 00:37:34.320
We don't actually have the ability to compile Cython code inside of the browser.

00:37:34.320 --> 00:37:34.620
Right.

00:37:34.620 --> 00:37:40.120
But as a package developer, maybe I could leverage that to avoid the, like the really, so where

00:37:40.120 --> 00:37:43.480
there's a penalty in Python versus C now, the penalty may be worse.

00:37:43.480 --> 00:37:46.360
It's in WebAssembly, but maybe the answer is Cython still.

00:37:46.360 --> 00:37:47.340
Absolutely.

00:37:47.340 --> 00:37:48.060
Yeah.

00:37:48.060 --> 00:37:48.220
Okay.

00:37:48.220 --> 00:37:48.460
Yeah.

00:37:48.460 --> 00:37:49.080
Totally.

00:37:49.080 --> 00:37:54.980
And then of course, people have gotten the Clang compiler to run in WebAssembly.

00:37:54.980 --> 00:37:59.160
So theoretically, we could get, we could put that there as well.

00:37:59.160 --> 00:38:04.700
And then we could send it Cython code and get something back and maybe run that right away.

00:38:04.700 --> 00:38:09.160
Like that's a little bit getting into crazy territory maybe, but who knows?

00:38:09.160 --> 00:38:09.440
Yeah.

00:38:09.440 --> 00:38:12.920
Well, it's turtles all the way down or WebAssembly is all the way down or something like that.

00:38:12.920 --> 00:38:13.080
Right.

00:38:13.300 --> 00:38:13.620
Right.

00:38:13.620 --> 00:38:13.860
Yeah.

00:38:13.860 --> 00:38:14.220
Interesting.

00:38:14.220 --> 00:38:23.060
Both having all of this experience and expertise in Rust and Rust being so built for WebAssembly

00:38:23.060 --> 00:38:25.080
or being so well paired with WebAssembly.

00:38:25.080 --> 00:38:32.120
And then also some projects trying to do, say, CPython's runtime in Rust.

00:38:32.240 --> 00:38:39.420
I guess the question I'm rambling around is to say, if I take Rust and like rethink CPython,

00:38:39.420 --> 00:38:43.140
what do you think the possibilities are there in this context?

00:38:43.140 --> 00:38:43.580
Yeah.

00:38:43.580 --> 00:38:49.980
I mean, I think what's exciting about Rust is because it's a newer technology, unlike C, like,

00:38:49.980 --> 00:38:56.020
it's a lot easier to get into WebAssembly because they just, there's just not as much baggage.

00:38:56.020 --> 00:38:58.220
And that's kind of Rust's advantage on WebAssembly.

00:38:58.780 --> 00:39:09.460
Plus, there's the fact that I think Rust and WebAssembly both coming out of Mozilla, with a lot of people overlapping between those two communities, has really helped make that story much smoother.

00:39:10.100 --> 00:39:17.400
You know, for example, if I was going to write something from scratch that I wanted to run in WebAssembly, I would absolutely reach for Rust and not C in this day and age.

00:39:17.400 --> 00:39:19.360
It just is going to be an easier experience.

00:39:19.360 --> 00:39:28.220
So, like you say, there's a project, maybe more than one, I don't know, that's trying to rewrite the CPython interpreter in Rust, right?

00:39:28.640 --> 00:39:33.580
So then that would help us build something like Pyodide a lot easier because it's in Rust.

00:39:33.580 --> 00:39:37.640
We don't have to deal with a lot of the sort of niggly details we've had to deal with C.

00:39:37.640 --> 00:39:48.260
My worry there is historically, whenever people write an alternative Python interpreter, it's really hard for it to catch on because there's so much catch up to do.

00:39:48.260 --> 00:39:57.560
You know, if they can, I think the sort of uphill battle for that project, and I think I've read about the project and the author is like clearly doing it for fun.

00:39:57.560 --> 00:39:58.800
Right, exactly.

00:39:58.800 --> 00:40:03.360
You don't have to have a better reason than for fun, and I'm not trying to say that that's not a valid reason.

00:40:03.360 --> 00:40:16.660
But, like, I think for that to kind of take over from the CPython interpreter, which I know is not a goal, it would have to, like, convince that community that's currently maintaining the CPython interpreter that Rust is going to be a better way forward.

00:40:16.660 --> 00:40:22.580
If they can do that and sort of replace it and become the leader, that would be an amazing outcome.

00:40:22.580 --> 00:40:25.440
But I think that's a real uphill struggle.

00:40:25.440 --> 00:40:31.980
It would be an interesting outcome, but certainly with Rust being so new and C being so, such a stalwart, right?

00:40:31.980 --> 00:40:34.120
Like, it's the foundation of so many things.

00:40:34.120 --> 00:40:37.280
It would be an interesting conversation for sure.

00:40:39.280 --> 00:40:43.600
This portion of Talk Python To Me is brought to you by Microsoft and Azure Pipelines.

00:40:43.600 --> 00:40:47.940
Azure Pipelines is a CI/CD service that supports Windows, Linux, and Mac.

00:40:47.940 --> 00:40:53.000
It lets you run automatic builds and tests of your Python code on each commit or pull request.

00:40:53.000 --> 00:40:58.940
It is fully integrated with GitHub, and it lets you define your continuous integration and delivery pipelines with a simple YAML file.

00:40:58.940 --> 00:41:01.780
Azure Pipelines is free for individuals and small teams.

00:41:01.780 --> 00:41:07.120
If you're maintaining an open source project, you'll even get unlimited build minutes and 10 concurrent pipelines.

00:41:07.120 --> 00:41:10.200
Many Python projects are already using Azure Pipelines.

00:41:10.200 --> 00:41:14.280
So get started for free at talkpython.fm/Microsoft.

00:41:16.620 --> 00:41:26.020
I guess what I'm thinking also is, like, it's interesting that people have these, hey, I want to learn Rust, and let me do that by trying to rewrite CPython in Rust.

00:41:26.020 --> 00:41:30.760
These are interesting and, like you say, super good goals, and people, I'm sure, are getting a lot out of it.

00:41:30.960 --> 00:41:42.000
But I'm more thinking of, like, what if you rethought what it meant to be the runtime for Python specifically optimized for WebAssembly?

00:41:42.000 --> 00:41:42.700
You know what I mean?

00:41:42.700 --> 00:41:50.500
You try to make it, like, 99% compatible, and so you can bring in C libraries and stuff like you can through WebAssembly.

00:41:50.500 --> 00:41:57.360
But is there an opportunity to, like, truly rethink CPython, not just rewrite it with the different syntax and compilers?

00:41:57.360 --> 00:41:58.900
Yeah, that's a really good question.

00:41:58.900 --> 00:41:59.900
I think...

00:41:59.900 --> 00:42:00.580
I don't know the answer.

00:42:00.580 --> 00:42:01.540
Yeah.

00:42:01.540 --> 00:42:02.900
Just interesting to think about.

00:42:02.900 --> 00:42:03.320
Yeah.

00:42:03.320 --> 00:42:15.000
I think one of the things that a lot of various projects have kind of worked on rethinking the Python interpreter has sort of been, what if we can assume that there's a really good JIT around, right?

00:42:15.000 --> 00:42:26.420
That's kind of what PyPy is, and, like, IronPython, when they built it on top of the .NET CLR and those sorts of things, they sort of go, if we have a really good JIT around, what can we do?

00:42:26.760 --> 00:42:34.680
I think a lot of times they run up against the sort of really dynamic corner cases of the Python language that are really hard to deal with there.
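
NOTE
A minimal Python sketch (not part of the spoken conversation) of the kind of dynamic corner case mentioned here. The Point class and total function are hypothetical names chosen for illustration. Rebinding a method on a class at runtime changes what an existing call site does, so an optimizer cannot simply assume that p.scale() always resolves to the same code.
```python
class Point:
    def __init__(self, x):
        self.x = x
    def scale(self):
        return self.x * 2
def total(points):
    # This call site looks static, but the method is looked up on every call.
    return sum(p.scale() for p in points)
points = [Point(1), Point(2)]
before = total(points)
# Rebind the method on the class while the program is running.
Point.scale = lambda self: self.x * 10
after = total(points)
print(before, after)
```
A runtime that wants to stay fully compatible has to keep handling cases like this one, which is one reason alternative interpreters tend to either carry guards for them or drift slightly from the language.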

00:42:34.680 --> 00:42:34.960
Yeah.

00:42:34.960 --> 00:42:37.780
And you end up with something that's slightly different from Python.

00:42:38.440 --> 00:42:40.960
That, to me, always feels like where it gets a little stuck.

00:42:40.960 --> 00:42:41.400
Right.

00:42:41.400 --> 00:42:53.840
And if there's a way of unsticking that, like, unfortunately, the community has gone through this big transition from Python 2 to 3 already, and I don't know if there's a lot of appetite for another Python that would be slightly incompatible.

00:42:54.140 --> 00:42:54.580
Exactly.

00:42:54.580 --> 00:43:01.940
If you could live with something that's slightly incompatible, I think there's a lot of ways you could make it more performant with some of these newer technologies.

00:43:01.940 --> 00:43:02.320
Yeah.

00:43:02.320 --> 00:43:05.160
I mean, maybe there's an opportunity to do that.

00:43:05.160 --> 00:43:13.580
And maybe Pyodide's not the right answer for that because you're not looking at, I don't know, like, so much of what happens there happens in C anyway.

00:43:13.920 --> 00:43:18.140
Like, that's where the data science action really is, and Python is kind of the orchestration layer.

00:43:18.140 --> 00:43:26.940
Maybe it's something like, what if I could write the equivalent of AngularJS or React with an Electron app with a nice UI, but I only touch Python?

00:43:26.940 --> 00:43:37.720
You might be willing, like, in that context to say, well, I'd rather have a 99% compatible Python with packages than just switch 100% to JavaScript as a Python enthusiast, right?

00:43:37.720 --> 00:43:39.720
Like, that might be a sell you would buy.

00:43:39.720 --> 00:43:40.220
Yeah.

00:43:40.220 --> 00:43:41.040
Yeah, absolutely.

00:43:41.040 --> 00:43:41.560
Yeah.

00:43:41.620 --> 00:43:49.680
And the file size doesn't matter in these, like, offline Electron apps and stuff because you're already downloading, like, a 60-meg Chrome binary.

00:43:49.680 --> 00:43:53.340
Like, what's another 10 megs that's just zipped up in there anyway, right?

00:43:53.340 --> 00:43:54.040
Right, exactly.

00:43:54.040 --> 00:43:54.700
So who knows?

00:43:54.700 --> 00:43:56.880
Maybe that's an interesting corner to explore.

00:43:56.880 --> 00:43:59.700
Not necessarily for you, but for, like, anyone interested, right?

00:43:59.700 --> 00:44:01.180
Yeah, absolutely.

00:44:01.180 --> 00:44:01.900
Yeah, cool.

00:44:01.900 --> 00:44:05.500
So I guess maybe tell us a little bit about where things are going.

00:44:05.500 --> 00:44:08.560
Like, where are you and what's the future plans?

00:44:08.560 --> 00:44:11.960
There's a bunch of things that don't work that we'd like to work on.

00:44:11.960 --> 00:44:17.920
So, like, currently we don't support threading because when we started the project, WebAssembly didn't support threading.

00:44:17.920 --> 00:44:22.220
Now WebAssembly does, so it would be good to go back and kind of build on top of that.

00:44:22.220 --> 00:44:27.200
Is that true operating system threads or is that some kind of, like, preemptive threading?

00:44:27.200 --> 00:44:28.920
Like, what does threading in WebAssembly mean?

00:44:28.920 --> 00:44:34.020
It's based on the web worker technology in browsers.

00:44:34.020 --> 00:44:41.860
My understanding is they are sort of true separate operating system threads, and they're probably even more isolated than you would think of with threads.

00:44:41.860 --> 00:44:44.240
They're kind of their own little JavaScript interpreters.

00:44:44.240 --> 00:44:44.500
Right.

00:44:44.500 --> 00:44:47.520
They can kind of do message passing, and that's all they get for data sharing and whatnot.

00:44:47.760 --> 00:44:48.200
Exactly.

00:44:48.200 --> 00:44:48.680
Exactly.

00:44:48.680 --> 00:44:53.000
So you can take advantage of that in WebAssembly now and pass things between web workers.

00:44:53.000 --> 00:44:59.580
And there should be a way to kind of hopefully build the Python threading API on top of that.

00:44:59.580 --> 00:45:04.300
It sounds almost like Python's multiprocessing more than Python's threading.
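
NOTE
A minimal Python sketch (not part of the spoken conversation) of the message-passing style described here. It uses threading and queue.Queue from the standard library purely as a stand-in. Web workers are more isolated than Python threads, and nothing below is Pyodide's API. The worker function and the queue names are hypothetical.
```python
import queue
import threading
def worker(inbox, outbox):
    # The worker only acts on what arrives in its inbox and replies via its outbox.
    while True:
        message = inbox.get()
        if message is None:  # sentinel: no more work
            break
        outbox.put(message * message)
inbox = queue.Queue()
outbox = queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()
for n in (1, 2, 3):
    inbox.put(n)
inbox.put(None)
thread.join()
squares = [outbox.get() for _ in range(3)]
print(squares)
```
The design point is that the two sides never touch each other's variables directly. All coordination goes through messages, which is why the comparison to multiprocessing comes up.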

00:45:04.300 --> 00:45:05.420
That's probably accurate.

00:45:05.420 --> 00:45:06.420
Yeah, yeah, yeah.

00:45:06.420 --> 00:45:11.900
But it's still, like, it's cool that it would be some parallelism that you can do regardless of how that happens, right?

00:45:11.900 --> 00:45:12.380
Exactly.

00:45:12.380 --> 00:45:13.020
Exactly.

00:45:13.020 --> 00:45:13.300
Okay.

00:45:13.300 --> 00:45:22.200
Another big sticking point is networking is obviously very different for Pyodide than it is for native Python, largely because of the sandbox, right?

00:45:22.200 --> 00:45:27.800
You can't just open up a Unix socket and start writing things to it because that would be a big security hole.

00:45:28.460 --> 00:45:41.000
So what that means is a lot of the libraries that come in the Python data science community, like Pandas, they have ways of fetching things over the network, and those don't currently work because they try to open a socket and they fail.

00:45:41.000 --> 00:45:41.380
I see.

00:45:41.380 --> 00:45:53.100
So building some kind of abstraction layer, maybe on top of WebSockets or maybe something else that would at least let basic things work from Python would be really useful.

00:45:53.400 --> 00:46:01.160
Right now, generally, what you have to kind of do is do your networking in JavaScript using Fetch or whatever the tools are there, and then you can bring that into the Python side.

00:46:01.160 --> 00:46:01.860
Right.

00:46:01.860 --> 00:46:03.680
Axios or something nice, yeah.

00:46:03.680 --> 00:46:04.520
Yeah, exactly.

00:46:04.520 --> 00:46:04.780
Okay.

00:46:04.780 --> 00:46:11.800
Yeah, but so, like, if I installed and imported requests and tried to use that, for example, that might not work?

00:46:11.800 --> 00:46:13.660
That's definitely not going to work.

00:46:13.660 --> 00:46:13.980
Yeah.

00:46:13.980 --> 00:46:14.120
Okay.

00:46:14.120 --> 00:46:14.820
Cool.

00:46:14.820 --> 00:46:15.680
All right.

00:46:15.680 --> 00:46:16.420
So what else?

00:46:16.420 --> 00:46:20.520
You talked about this, like, conda-forge-like distributed build integration thing.

00:46:20.520 --> 00:46:21.220
That's pretty cool.

00:46:21.220 --> 00:46:21.700
What else?

00:46:21.900 --> 00:46:26.980
There's a whole area of research we're doing around the data sharing.

00:46:26.980 --> 00:46:35.900
So, like, I mentioned this before, like, you can have a NumPy array living in Python and pass that over to JavaScript without copying it, and that's pretty cool that that works.

00:46:35.900 --> 00:46:41.560
But when you bring it over to JavaScript, you don't get, like, how many dimensions it has or what the shape is.

00:46:41.560 --> 00:46:45.340
You just sort of get this one-dimensional thing because that's all that JavaScript really supports.
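
NOTE
A minimal Python sketch (not part of the spoken conversation) of why a flat buffer needs shape metadata to travel with it. It uses only the standard library (array and memoryview) as a stand-in for a NumPy array and a JavaScript typed array. The variable names are hypothetical and this is not Pyodide's conversion code.
```python
from array import array
# A 2x3 grid of doubles stored the way a typed buffer sees it: six values in a row.
flat = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
shape = (2, 3)  # metadata that has to be carried alongside the buffer
# Reinterpret the same memory with the shape attached. No data is copied:
# cast to bytes first, then back to doubles with the desired shape.
view = memoryview(flat).cast("B").cast("d", shape=shape)
rows = view.tolist()
# Writing through the original buffer is visible through the shaped view.
flat[0] = 42.0
first_after_write = view[0, 0]
print(rows, first_after_write)
```
Without the shape tuple, the receiving side only knows there are six numbers. It cannot tell a 2x3 grid from a 3x2 grid or a plain vector.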

00:46:45.860 --> 00:46:51.560
So we'd like to build something that kind of makes that more transparent and smooth.

00:46:51.560 --> 00:46:56.520
Would that be like a JavaScript wrapper that kind of has an API like Pandas a little bit?

00:46:56.520 --> 00:46:57.940
Or what are you thinking there?

00:46:57.940 --> 00:46:59.840
Yeah, something like that, I think.

00:46:59.980 --> 00:47:10.840
There's a project called Apache Arrow, and its sort of purpose is to allow for sharing of these data structures in memory between different language runtimes.

00:47:10.840 --> 00:47:15.060
And they've primarily been looking at, like, native runtime space.

00:47:15.060 --> 00:47:17.180
But they have a JavaScript implementation.

00:47:17.180 --> 00:47:18.780
They have a Python implementation.

00:47:19.400 --> 00:47:22.780
It should be possible to kind of bring all that into the browser and use it there.

00:47:22.780 --> 00:47:27.200
And then, again, building on top of work that's already been done and not have to build it ourselves.

00:47:27.200 --> 00:47:29.440
But there's a lot of details there.

00:47:29.440 --> 00:47:42.680
The other thing that's sort of exciting for Arrow for us is the sort of industry standard way to bring data in for data science computation is still the comma-separated value format, right?

00:47:42.680 --> 00:47:45.420
They're not terribly efficient to read.

00:47:45.420 --> 00:47:47.580
They're not very space-efficient or memory-efficient.

00:47:48.100 --> 00:47:52.920
Whereas Apache Arrow provides this sort of nice, tight binary format that we could use.
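
NOTE
A minimal Python sketch (not part of the spoken conversation) comparing a text encoding with a fixed-width binary encoding of the same numbers. This is not Apache Arrow's actual format. It only uses struct from the standard library to illustrate why a packed binary layout tends to be smaller and cheaper to read than comma-separated text. The sample values are arbitrary.
```python
import struct
values = [i / 7 for i in range(1, 1001)]
# Text encoding: one decimal string per line, as in a single-column CSV.
csv_bytes = "\n".join(repr(v) for v in values).encode("ascii")
# Binary encoding: every value is exactly 8 bytes (little-endian float64).
binary_bytes = struct.pack(f"<{len(values)}d", *values)
# Reading text means parsing each string; reading binary is a single unpack.
from_csv = [float(line) for line in csv_bytes.decode("ascii").split("\n")]
from_binary = list(struct.unpack(f"<{len(values)}d", binary_bytes))
print(len(csv_bytes), len(binary_bytes))
```
Both paths recover the same values, but the text form spends many bytes per number on digits and separators, and each one has to be parsed from a string.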

00:47:52.920 --> 00:47:53.140
Yeah.

00:47:53.140 --> 00:47:58.400
And that would actually allow us to sort of shove more data into the browser, which is pretty memory-limited to begin with.

00:47:58.400 --> 00:48:03.920
So anything that will let us kind of get away from CSVs is also on our roadmap.

00:48:03.920 --> 00:48:07.840
Yeah, just parsing all those strings is going to be a slow thing wherever.

00:48:07.840 --> 00:48:08.440
Yeah.

00:48:08.440 --> 00:48:11.020
And most of the libraries that do it don't.

00:48:11.020 --> 00:48:13.000
They assume tons and tons of memory.

00:48:13.000 --> 00:48:16.340
So they don't necessarily do it in the most efficient way possible.

00:48:16.340 --> 00:48:17.740
They don't necessarily stream it.

00:48:17.800 --> 00:48:18.960
They might copy the whole thing.

00:48:18.960 --> 00:48:20.540
And then, you know, so.

00:48:20.540 --> 00:48:20.780
Right.

00:48:20.780 --> 00:48:21.020
Yeah.

00:48:21.020 --> 00:48:21.320
Okay.

00:48:21.320 --> 00:48:21.800
Yeah.

00:48:21.800 --> 00:48:22.680
That sounds pretty cool.

00:48:22.680 --> 00:48:24.580
And then I was looking around the site.

00:48:24.580 --> 00:48:27.900
I found like a pretty cool demo notebook that people can try out.

00:48:27.900 --> 00:48:29.300
I guess, you know, there's that.

00:48:29.300 --> 00:48:30.220
And I'll put a link to that.

00:48:30.220 --> 00:48:32.980
And what else do you recommend for people just trying to play around with it?

00:48:32.980 --> 00:48:39.300
There's a demo notebook that kind of goes through the language features that works also kind of as a tutorial for how to get started.

00:48:39.540 --> 00:48:42.320
But then also linked on the blog post from last week.

00:48:42.320 --> 00:48:44.280
And maybe we can link to that blog post.

00:48:44.280 --> 00:48:49.540
In there, there's a bunch of other demo notebooks that kind of do more real world cool things, putting stuff together.

00:48:49.540 --> 00:48:52.360
Like you say, the call data one is pretty fun.

00:48:52.360 --> 00:48:58.120
There's another one that I used at Mozilla internally for figuring out how to time things in Firefox.

00:48:58.460 --> 00:48:59.100
It's kind of fun.

00:48:59.100 --> 00:49:07.300
So if you go to iodide.io, then all the notebooks that anybody has created on that public website, they're all public.

00:49:07.300 --> 00:49:11.220
And so you can just kind of browse through there and see what interesting things other people are doing.

00:49:11.220 --> 00:49:11.500
Yeah.

00:49:11.500 --> 00:49:11.780
Cool.

00:49:11.780 --> 00:49:14.280
There's always interesting stuff happening in the data science space.

00:49:14.280 --> 00:49:14.500
Yeah.

00:49:14.500 --> 00:49:14.920
Cool.

00:49:14.920 --> 00:49:19.520
One thing that I just wanted to give a shout out to, I don't know if you've even looked at it or something.

00:49:19.520 --> 00:49:23.480
I became aware of it basically like a week ago: Wasmer.

00:49:23.480 --> 00:49:26.540
So W-A-S-M is often the extension for WebAssembly.

00:49:26.780 --> 00:49:28.720
So W-A-S-M-E-R is like WebAssembly-er, right?

00:49:28.720 --> 00:49:30.740
So are you familiar with this project?

00:49:30.740 --> 00:49:31.060
Yeah.

00:49:31.060 --> 00:49:32.000
Yeah, I am.

00:49:32.000 --> 00:49:36.020
To me, my first impression is it's kind of a little bit like what Node.js did to JavaScript.

00:49:36.020 --> 00:49:39.340
Like it used to be JavaScript ran in the browser and that was like its space.

00:49:39.340 --> 00:49:42.720
And I guess you could open the console and like type and play with it if you want.

00:49:42.720 --> 00:49:45.300
But then all of a sudden Node.js like sprung on the scene.

00:49:45.300 --> 00:49:52.940
It's like, wait, I can just take my JavaScript code and run it in like a server process or doing other interesting stuff just in an, like on its own.

00:49:52.940 --> 00:49:55.040
And Wasmer is kind of like that for Python.

00:49:55.220 --> 00:50:02.920
Like it will let you run any WebAssembly code regardless of whether it's this Python code or just like random WebAssembly code.

00:50:02.920 --> 00:50:05.580
And then directly import that into your Python code.

00:50:05.580 --> 00:50:12.420
So it's kind of more like Node enabling than it is what you're working on, which is move Python to the browser.

00:50:12.420 --> 00:50:13.200
It's like the opposite.

00:50:13.200 --> 00:50:14.420
Move WebAssembly to Python.

00:50:14.680 --> 00:50:21.160
Yeah, what excites me about it actually is it's going to make languages that aren't C a lot easier to integrate in Python.

00:50:21.160 --> 00:50:26.440
So like Rust, for example, there is a way to integrate Rust in Python that's actually pretty good and works really well.

00:50:26.440 --> 00:50:32.320
But if the story was compile whatever you have to WebAssembly and we can get to it from Python,

00:50:32.320 --> 00:50:39.100
I think that makes it a lot easier to have things written in whatever language is the most convenient or the most, you know, at hand.

00:50:39.100 --> 00:50:52.780
And what also is potentially exciting from the Pyodide point of view is if this causes there to be a big sort of community of WASM packages that work with Python, we can use those in Pyodide basically for free.

00:50:52.960 --> 00:50:53.380
Yeah, exactly.

00:50:53.380 --> 00:50:55.260
It just grows the pie for everyone.

00:50:55.260 --> 00:50:55.960
Yeah, exactly.

00:50:55.960 --> 00:50:56.400
Yeah.

00:50:56.400 --> 00:50:56.720
Yeah.

00:50:56.720 --> 00:51:01.200
So I mean, I don't have a whole lot more to say than just like, hey, people, if that sounds interesting to you, check it out.

00:51:01.200 --> 00:51:02.640
It looks like a really cool project.

00:51:02.640 --> 00:51:13.740
And it's just one more sign that there's like this excitement of WebAssembly integrating with these non-JavaScript, non-browser traditional languages and ecosystems.

00:51:13.740 --> 00:51:14.640
Yeah, absolutely.

00:51:14.640 --> 00:51:15.000
Yeah.

00:51:15.000 --> 00:51:19.800
I guess if I'm throwing out other stuff that just kind of randomly to do here, one more I'll throw out.

00:51:19.800 --> 00:51:21.300
That's really, really interesting.

00:51:21.300 --> 00:51:22.780
I don't know if you've heard of this one at all.

00:51:22.780 --> 00:51:23.560
Blazor.

00:51:23.560 --> 00:51:24.200
Have you heard of that?

00:51:24.200 --> 00:51:26.180
Yes, this is the C#.

00:51:26.180 --> 00:51:26.720
Yeah.

00:51:26.720 --> 00:51:27.000
Yeah.

00:51:27.000 --> 00:51:33.560
So they had a totally different take, but they have gotten the .NET runtime, the CLR, all that stuff running on WebAssembly.

00:51:33.560 --> 00:51:35.280
And now you can do C# in the browser.

00:51:35.280 --> 00:51:43.100
Their take was to build an AngularJS like framework that lets you write front end code in C# and then run it in the browser.

00:51:43.100 --> 00:51:49.780
I don't know if that's a good idea or not, but it's, you know, it's kind of the other half of the story, I think, for Python, right?

00:51:49.840 --> 00:51:52.940
Like right now we've got the data science, like with your work going really well.

00:51:52.940 --> 00:51:59.360
But there's no story around like, what would I use in Python instead of Vue or Angular?

00:51:59.360 --> 00:52:06.100
Not necessarily saying those are bad or you should, but like you could, you know, C# is showing the way like on that side of the story.

00:52:06.360 --> 00:52:14.460
Yeah, I think there's a real advantage to having your implementation and your back end and your front end be in the same language.

00:52:14.460 --> 00:52:16.320
I think that's kind of what Node has proven.

00:52:16.320 --> 00:52:16.740
Right.

00:52:16.740 --> 00:52:17.920
There's definitely an appeal there.

00:52:17.920 --> 00:52:18.140
Yeah.

00:52:18.140 --> 00:52:18.640
Yeah.

00:52:18.760 --> 00:52:21.100
And so I think for Blazor, it's the same thing.

00:52:21.100 --> 00:52:25.820
If you're a shop that's done your back end in C# for a long time, well, now you can have your front end in it too.

00:52:25.820 --> 00:52:27.180
That's really nice.

00:52:27.180 --> 00:52:41.060
One of the things that from talking to some Jupyter developers, one of the things they're really excited about with Pyodide is now they could potentially start to have some of their front end stuff written in Python as well as their back end that's currently in Python.

00:52:41.400 --> 00:52:49.060
So because right now the world they live in is there's sort of these arbitrary lines that get drawn between what you would write in one language versus another.

00:52:49.060 --> 00:52:51.160
And they're not always the right thing.

00:52:51.160 --> 00:52:57.000
And sometimes you have to write the same thing in two different languages just so that you can put it in both places.

00:52:57.000 --> 00:52:57.480
Right.

00:52:57.480 --> 00:52:58.560
Validation or something.

00:52:58.560 --> 00:52:59.940
Get rid of those.

00:52:59.940 --> 00:53:00.240
Yeah.

00:53:00.240 --> 00:53:07.960
Validation or even with Jupyter, it's certain kinds of computation that they need in the widget as well as in the back end and they need to match.

00:53:07.960 --> 00:53:11.120
But one's written in Python.

00:53:11.120 --> 00:53:11.920
It's not fun.

00:53:11.920 --> 00:53:16.700
And it's just sort of this arbitrary speed bump that gets created because of the world we live in.

00:53:16.700 --> 00:53:23.540
But if you imagine a world where all languages run everywhere, suddenly, hopefully, you're doing less work.

00:53:23.540 --> 00:53:23.800
Yeah.

00:53:23.800 --> 00:53:24.680
Yeah, absolutely.

00:53:24.680 --> 00:53:27.400
And you can reuse stuff in a context where you wouldn't.

00:53:27.400 --> 00:53:32.800
Like, until your work, it would have been kind of insane to say, well, let's reuse NumPy in the browser, right?

00:53:32.800 --> 00:53:33.260
Yeah.

00:53:33.260 --> 00:53:35.740
But now these doors are open.

00:53:35.740 --> 00:53:37.480
So it just creates more synergy, I think.

00:53:37.480 --> 00:53:38.020
It's pretty awesome.

00:53:38.020 --> 00:53:38.680
All right.

00:53:38.680 --> 00:53:40.500
Well, I think we're pretty much out of time there.

00:53:40.840 --> 00:53:40.960
Yeah.

00:53:40.960 --> 00:53:42.520
Definitely a fun conversation.

00:53:42.520 --> 00:53:44.320
I think the future is bright.

00:53:44.320 --> 00:53:44.760
What do you think?

00:53:44.760 --> 00:53:45.580
Yeah, absolutely.

00:53:45.580 --> 00:53:47.240
I'm really excited about all this stuff.

00:53:47.240 --> 00:53:47.500
Yeah.

00:53:47.500 --> 00:53:47.860
Same.

00:53:47.860 --> 00:53:48.580
All right.

00:53:48.580 --> 00:53:50.940
Now, before I let you out here, let me ask you the two final questions.

00:53:50.940 --> 00:53:56.400
I kind of think I can guess this first one because of the way you opened the whole show.

00:53:56.400 --> 00:53:58.600
But favorite editor for writing Python code?

00:53:58.800 --> 00:53:59.000
Yeah.

00:53:59.000 --> 00:53:59.080
Yeah.

00:53:59.080 --> 00:54:01.560
So I use Emacs, but I actually use Spacemacs.

00:54:01.560 --> 00:54:05.160
So it's kind of this like weird Emacs VI hybrid.

00:54:05.160 --> 00:54:06.840
But I find it works for me.

00:54:06.840 --> 00:54:07.300
Yeah.

00:54:07.300 --> 00:54:07.760
Right on.

00:54:07.760 --> 00:54:08.160
Very cool.

00:54:08.160 --> 00:54:10.160
And then notable PyPI package?

00:54:10.160 --> 00:54:11.300
Oh, gosh.

00:54:11.580 --> 00:54:14.960
I mean, the one I'm most familiar with is Matplotlib because I worked on it for years and years.

00:54:14.960 --> 00:54:17.480
And, you know, if you're not familiar with it, go check it out.

00:54:17.480 --> 00:54:21.020
It's the kitchen sink of plotting for Python.

00:54:21.020 --> 00:54:22.120
Yeah, it absolutely is.

00:54:22.120 --> 00:54:26.640
And, you know, something exciting, in a both silly but also kind of real sense,

00:54:26.700 --> 00:54:30.720
is that I saw XKCD-style plots have now come to Matplotlib.

00:54:30.720 --> 00:54:31.360
Did you see that?

00:54:31.360 --> 00:54:31.960
Oh, yeah.

00:54:31.960 --> 00:54:33.960
I implemented that actually.

00:54:33.960 --> 00:54:34.720
You did?

00:54:34.720 --> 00:54:36.200
How awesome.

00:54:36.200 --> 00:54:36.740
Yeah.

00:54:36.740 --> 00:54:36.880
Yeah.

00:54:36.880 --> 00:54:38.020
How hard was that?

00:54:38.020 --> 00:54:39.900
Not too bad.

00:54:39.900 --> 00:54:45.300
Strangely, the infrastructure that was already there kind of made it easier than it might have been.

00:54:45.300 --> 00:54:45.640
So.

00:54:45.640 --> 00:54:45.920
Yeah.

00:54:45.920 --> 00:54:51.000
I mean, it looks like the sort of cartoony hand-drawn like plots and stuff.

00:54:51.000 --> 00:54:57.380
But it actually looks really hard to do because it's imprecise and it has these imperfections, right?

00:54:57.380 --> 00:55:00.840
It seems like it would be hard to tell a computer to be imprecise in like a human way.

00:55:00.840 --> 00:55:01.860
But well done.

00:55:01.860 --> 00:55:02.540
And that looks great.

00:55:02.540 --> 00:55:03.180
So fun.

00:55:03.180 --> 00:55:03.720
Thanks.

00:55:03.720 --> 00:55:04.340
Yeah, absolutely.

00:55:04.340 --> 00:55:05.320
Awesome.

00:55:05.320 --> 00:55:05.700
All right.

00:55:05.700 --> 00:55:06.420
Final call to action.

00:55:06.420 --> 00:55:08.380
People are excited about Iodide, Pyodide.

00:55:08.380 --> 00:55:09.960
They want to check it out, maybe contribute.

00:55:09.960 --> 00:55:11.660
What do you got for them?

00:55:11.660 --> 00:55:12.180
Yeah.

00:55:12.180 --> 00:55:16.860
Check out iodide.io where you can check out all the notebooks that people have created.

00:55:16.860 --> 00:55:20.900
And then we have a GitHub site at github.com/iodide-project.

00:55:20.900 --> 00:55:21.340
Nice.

00:55:21.340 --> 00:55:23.800
And I'll put those links in the show notes.

00:55:23.800 --> 00:55:26.640
Are you looking for contributors or people working on it any?

00:55:26.640 --> 00:55:29.060
Or is it kind of still an internal project at the moment?

00:55:29.060 --> 00:55:30.560
We're definitely looking for contributors.

00:55:30.560 --> 00:55:32.740
Find us on Gitter if you have any great ideas.

00:55:32.740 --> 00:55:35.880
We'd be glad to help you make them reality.

00:55:35.880 --> 00:55:36.440
Super cool.

00:55:36.440 --> 00:55:37.220
All right.

00:55:37.220 --> 00:55:43.220
Well, I am very thrilled to see you all working on getting this take on Python in the browser.

00:55:43.220 --> 00:55:46.040
I think the more attempts that we have here, the better.

00:55:46.040 --> 00:55:48.020
And it's an exciting time.

00:55:48.020 --> 00:55:49.500
And I think it'll take off.

00:55:49.500 --> 00:55:50.020
Thanks a lot.

00:55:50.100 --> 00:55:50.940
It was fun talking to you.

00:55:50.940 --> 00:55:51.500
Yeah, you as well.

00:55:51.500 --> 00:55:52.140
Thanks for being on the show.

00:55:52.140 --> 00:55:52.440
Bye.

00:55:52.440 --> 00:55:52.740
All right.

00:55:52.740 --> 00:55:53.160
Take care.

00:55:53.160 --> 00:55:53.460
Bye.

00:55:53.460 --> 00:55:57.060
This has been another episode of Talk Python To Me.

00:55:57.060 --> 00:55:59.800
Our guest on this episode was Michael Droettboom.

00:55:59.800 --> 00:56:01.800
And it's been brought to you by Microsoft.

00:56:01.800 --> 00:56:05.300
If you're a Python developer, Microsoft has you covered.

00:56:05.300 --> 00:56:10.380
From VS Code and their modern editor plugins, to Azure Pipelines for continuous integration,

00:56:10.380 --> 00:56:13.160
and serverless Python functions on Azure.

00:56:13.160 --> 00:56:16.820
Check them out at talkpython.fm/Microsoft.

00:56:17.580 --> 00:56:19.160
Want to level up your Python?

00:56:19.160 --> 00:56:23.940
If you're just getting started, try my Python Jumpstart by building 10 apps course.

00:56:23.940 --> 00:56:32.100
Or if you're looking for something more advanced, check out our new async course that digs into all the different types of async programming you can do in Python.

00:56:32.100 --> 00:56:36.740
And of course, if you're interested in more than one of these, be sure to check out our everything bundle.

00:56:36.740 --> 00:56:38.660
It's like a subscription that never expires.

00:56:39.220 --> 00:56:40.960
Be sure to subscribe to the show.

00:56:40.960 --> 00:56:43.360
Open your favorite podcatcher and search for Python.

00:56:43.360 --> 00:56:44.580
We should be right at the top.

00:56:44.580 --> 00:56:53.580
You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

00:56:53.580 --> 00:56:55.680
This is your host, Michael Kennedy.

00:56:55.680 --> 00:56:57.180
Thanks so much for listening.

00:56:57.180 --> 00:56:58.240
I really appreciate it.

00:56:58.240 --> 00:56:59.980
Now get out there and write some Python code.


