WEBVTT

00:00:00.001 --> 00:00:03.360
How do you build and maintain a complex suite of Python packages?

00:00:03.360 --> 00:00:06.040
Of course, you want to put them on PyPI.

00:00:06.040 --> 00:00:08.460
The best format there is as a wheel.

00:00:08.460 --> 00:00:10.760
This means that when developers use your code,

00:00:10.760 --> 00:00:14.520
it comes straight down and requires no local tooling to install and use.

00:00:14.520 --> 00:00:18.820
But if you have complex dependencies, such as C or Fortran,

00:00:18.820 --> 00:00:20.360
then you have a big challenge.

00:00:20.360 --> 00:00:24.720
How do you automatically compile and test against Linux, macOS,

00:00:24.720 --> 00:00:29.640
that's Intel and Apple Silicon, Windows, 32 and 64-bit,

00:00:30.040 --> 00:00:30.640
and so on?

00:00:30.640 --> 00:00:33.940
That's the problem solved by cibuildwheel.
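
NOTE
A minimal sketch, assuming the standard wheel filename layout (name, version, optional build tag, Python tag, ABI tag, platform tag, separated by dashes). The helper parse_wheel_filename and the example_pkg filenames are hypothetical; they only illustrate why each operating system and architecture mentioned above needs its own compiled wheel.
```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    elif len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    else:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return {"name": name, "version": version, "build": build,
            "python": python_tag, "abi": abi_tag, "platform": platform_tag}
# Hypothetical filenames for one release of one package.
wheels = [
    "example_pkg-1.0.0-cp39-cp39-manylinux_2_17_x86_64.whl",
    "example_pkg-1.0.0-cp39-cp39-macosx_11_0_arm64.whl",
    "example_pkg-1.0.0-cp39-cp39-win32.whl",
    "example_pkg-1.0.0-cp39-cp39-win_amd64.whl",
]
platforms = [parse_wheel_filename(w)["platform"] for w in wheels]
print(platforms)
```
Each platform tag corresponds to a separate compile-and-test job, and that set multiplies again for every supported Python version.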

00:00:33.940 --> 00:00:36.080
On this episode, you'll meet Henry Schreiner.

00:00:36.080 --> 00:00:39.700
He's developing tools for the next era of the Large Hadron Collider

00:00:39.700 --> 00:00:42.600
and is an admin of Scikit-HEP.

00:00:42.600 --> 00:00:46.300
Of course, cibuildwheel is central to that process.

00:00:46.300 --> 00:00:52.140
This is Talk Python To Me, episode 338, recorded October 14th, 2021.

00:00:52.140 --> 00:01:08.060
Welcome to Talk Python To Me, a weekly podcast on Python.

00:01:08.060 --> 00:01:09.780
This is your host, Michael Kennedy.

00:01:09.780 --> 00:01:12.060
Follow me on Twitter, where I'm @mkennedy,

00:01:12.060 --> 00:01:15.980
and keep up with the show and listen to past episodes at talkpython.fm.

00:01:15.980 --> 00:01:19.000
And follow the show on Twitter via @talkpython.

00:01:19.540 --> 00:01:22.620
We've started streaming most of our episodes live on YouTube.

00:01:22.620 --> 00:01:26.420
Subscribe to our YouTube channel over at talkpython.fm/youtube

00:01:26.420 --> 00:01:30.220
to get notified about upcoming shows and be part of that episode.

00:01:30.220 --> 00:01:34.220
Hey there, I have some exciting news to share before we jump into the interview.

00:01:34.220 --> 00:01:36.340
We have a new course over at Talk Python.

00:01:36.340 --> 00:01:40.540
HTMX plus Flask, modern Python web apps, hold the JavaScript.

00:01:40.540 --> 00:01:44.760
HTMX is one of the hottest properties in web development today,

00:01:44.760 --> 00:01:45.720
and for good reason.

00:01:45.720 --> 00:01:50.860
You might even remember all the stuff we talked about with Carson Gross back on episode 321.

00:01:50.860 --> 00:01:55.120
HTMX, along with the libraries and techniques we introduced in our new course,

00:01:55.120 --> 00:01:58.320
will have you writing the best Python web apps you've ever written.

00:01:58.320 --> 00:02:01.900
Clean, fast, and interactive, all without that front-end overhead.

00:02:01.900 --> 00:02:06.360
If you're a Python web developer that has wanted to build more dynamic, interactive apps,

00:02:06.360 --> 00:02:11.440
but don't want to or can't write a significant portion of your app in rich front-end JavaScript

00:02:11.440 --> 00:02:14.500
frameworks, you'll absolutely love HTMX.

00:02:14.500 --> 00:02:18.400
Check it out over at talkpython.fm/htmx,

00:02:18.400 --> 00:02:20.480
or just click the link in your podcast player's show notes.

00:02:20.480 --> 00:02:22.440
Now let's get on to that interview.

00:02:22.440 --> 00:02:25.580
Henry, welcome to Talk Python To Me.

00:02:25.580 --> 00:02:26.280
Thank you.

00:02:26.280 --> 00:02:27.780
Yeah, it's great to have you here.

00:02:27.780 --> 00:02:31.080
I'm always fascinated with cutting-edge physics

00:02:31.080 --> 00:02:34.760
with maybe both ends of physics, right?

00:02:34.820 --> 00:02:37.780
I'm really fascinated with astrophysics and the super large,

00:02:37.780 --> 00:02:39.900
and then also the very, very small.

00:02:39.900 --> 00:02:42.920
And we're going to probably tend a little bit towards the smaller,

00:02:42.920 --> 00:02:45.100
high-energy things this time around,

00:02:45.100 --> 00:02:48.120
but so much fun to talk about this stuff and how it intersects Python.

00:02:48.120 --> 00:02:50.680
Some of the smallest things you can measure

00:02:50.680 --> 00:02:52.960
and some of the largest amounts of data you can get out.

00:02:52.960 --> 00:02:57.080
Yeah, the data story is actually really, really crazy,

00:02:57.080 --> 00:02:58.860
and we're going to talk a bit about that.

00:02:58.860 --> 00:03:00.180
So neat, so much stuff.

00:03:00.180 --> 00:03:04.200
We used to think that atoms were as small as things could get, right?

00:03:04.200 --> 00:03:05.960
I remember learning that in elementary school.

00:03:05.960 --> 00:03:07.860
There are these things called atoms.

00:03:07.860 --> 00:03:10.600
They combine to form compounds and stuff,

00:03:10.600 --> 00:03:11.900
and that's as small as it gets.

00:03:11.900 --> 00:03:13.200
And yeah, not so much, right?

00:03:13.200 --> 00:03:15.640
Yeah, that was sort of what atom was supposed to mean.

00:03:15.640 --> 00:03:19.400
Exactly, the smallest bit, but nope.

00:03:19.400 --> 00:03:21.800
But that name got used up, so there we are.

00:03:21.800 --> 00:03:23.960
All right, well, before we get into all that stuff, though,

00:03:23.960 --> 00:03:25.300
let's start with your story.

00:03:25.300 --> 00:03:26.620
How did you get into programming in Python?

00:03:26.620 --> 00:03:30.320
Well, I started with a little bit of programming

00:03:30.320 --> 00:03:31.480
that my dad taught me.

00:03:31.480 --> 00:03:32.560
He was a physicist.

00:03:33.320 --> 00:03:37.580
And I remember it was C++ and sort of taught the way you would teach Java,

00:03:37.580 --> 00:03:39.500
you know, all objects and classes.

00:03:39.500 --> 00:03:39.700
Yeah.

00:03:39.700 --> 00:03:41.140
Just a little bit.

00:03:41.140 --> 00:03:43.800
And then when I started at college,

00:03:43.800 --> 00:03:45.860
then I wanted to take classes,

00:03:45.860 --> 00:03:48.980
and I took a couple classes again in C++.

00:03:48.980 --> 00:03:51.860
But I just really loved objects and classes.

00:03:51.860 --> 00:03:54.500
Unfortunately, the courses didn't actually cover that much,

00:03:54.500 --> 00:03:55.100
but the book did.

00:03:55.100 --> 00:03:56.920
So I really got into that.

00:03:57.240 --> 00:04:00.320
And then for Python, actually, right when I started college,

00:04:00.320 --> 00:04:02.460
I started using this program called Blender.

00:04:02.460 --> 00:04:03.420
Oh, yeah.

00:04:03.420 --> 00:04:03.780
Blender.

00:04:03.780 --> 00:04:04.460
I've heard of Blender.

00:04:04.460 --> 00:04:08.340
It's like a 3D animation tool, like Maya or something like that, right?

00:04:08.340 --> 00:04:10.820
And it's very Python-friendly, right?

00:04:10.820 --> 00:04:11.360
Yes.

00:04:11.360 --> 00:04:12.960
It has a built-in Python interpreter.

00:04:13.400 --> 00:04:15.540
So I knew it had this built-in language called Python,

00:04:15.540 --> 00:04:16.940
so that made me really want to learn Python.

00:04:16.940 --> 00:04:20.820
And then I went to an REU,

00:04:20.820 --> 00:04:22.500
a research experience for undergraduates

00:04:22.500 --> 00:04:24.700
at Northwestern University in Chicago.

00:04:25.240 --> 00:04:26.660
And when I was there,

00:04:26.660 --> 00:04:29.620
we had this cluster that we were working on.

00:04:29.620 --> 00:04:32.540
This was in solid-state physics, material physics.

00:04:32.540 --> 00:04:36.920
And we would launch these simulations on the cluster.

00:04:36.920 --> 00:04:39.620
And so I started using Python,

00:04:39.620 --> 00:04:42.520
and I was able to write a program that would go out,

00:04:42.520 --> 00:04:44.380
and it would create a bunch of threads,

00:04:44.380 --> 00:04:46.680
and it would watch all of the cluster,

00:04:46.680 --> 00:04:48.120
all the nodes in the cluster.

00:04:48.120 --> 00:04:49.740
And as soon as one became available, it would take it.

00:04:49.740 --> 00:04:50.500
So I could just,

00:04:50.500 --> 00:04:52.820
my simulation would just take the entire cluster.

00:04:52.820 --> 00:04:54.360
After a few hours, I would have everything.

00:04:54.480 --> 00:04:55.680
So at the end of that,

00:04:55.680 --> 00:04:56.740
everybody hated me,

00:04:56.740 --> 00:04:58.700
and everybody wanted my scripts.

00:04:58.700 --> 00:04:59.560
Exactly.

00:04:59.560 --> 00:04:59.980
They're like,

00:04:59.980 --> 00:05:01.020
this is horrible.

00:05:01.020 --> 00:05:02.900
I can't believe you did that to me,

00:05:02.900 --> 00:05:04.400
but I'll completely forgive you

00:05:04.400 --> 00:05:06.180
if you just give it to me and only to me,

00:05:06.180 --> 00:05:07.760
because I need that power.

00:05:07.760 --> 00:05:09.440
Yeah, that's fantastic.

00:05:09.440 --> 00:05:10.380
How neat.

00:05:10.380 --> 00:05:13.480
So I think that is one of the cool things about Python, right,

00:05:13.480 --> 00:05:18.420
is that it has this quick prototyping approachability.

00:05:18.420 --> 00:05:18.900
They're like,

00:05:18.900 --> 00:05:22.020
I'm just going to take over a huge amount of hardware, right?

00:05:22.020 --> 00:05:23.660
Like a huge cluster of servers,

00:05:23.720 --> 00:05:26.780
but it itself doesn't have to be like intense programming.

00:05:26.780 --> 00:05:28.840
It can be like this elegant little bit of code, right?

00:05:28.840 --> 00:05:30.500
You can sort of do things that,

00:05:30.500 --> 00:05:32.980
normally I think the programming gets in the way more,

00:05:32.980 --> 00:05:34.640
but Python tends to stay out.

00:05:34.640 --> 00:05:35.800
It looks more like pseudocode.

00:05:35.800 --> 00:05:38.060
So you can do more and learn more,

00:05:38.060 --> 00:05:40.640
and eventually you can go do it in C++ or something.

00:05:40.640 --> 00:05:41.300
Yeah.

00:05:41.300 --> 00:05:42.180
Yeah, absolutely.

00:05:42.180 --> 00:05:42.740
Great way to start.

00:05:43.120 --> 00:05:44.200
Or maybe not.

00:05:44.200 --> 00:05:47.120
Sometimes you do need to go do it in some other language,

00:05:47.120 --> 00:05:48.380
and sometimes you don't.

00:05:48.380 --> 00:05:54.540
I think the stuff at CERN and LHC has an interesting exchange between C++

00:05:54.540 --> 00:05:57.260
and maybe some more Python and whatnot,

00:05:57.260 --> 00:05:58.680
so that'll be fun to talk about.

00:05:59.240 --> 00:05:59.600
Yeah.

00:05:59.600 --> 00:06:01.760
We were C++ originally,

00:06:01.760 --> 00:06:04.860
but Python is really showing up in a lot more places,

00:06:04.860 --> 00:06:07.580
and there's been a lot of movement in that direction.

00:06:07.580 --> 00:06:09.940
And there have been some really interesting things that have come out.

00:06:09.940 --> 00:06:12.060
A lot of interesting things have come out of the LHC,

00:06:12.060 --> 00:06:13.500
computing-wise as well as physics.

00:06:13.500 --> 00:06:14.160
Awesome.

00:06:14.160 --> 00:06:14.480
Yeah.

00:06:15.020 --> 00:06:16.840
As a computing bit of infrastructure,

00:06:16.840 --> 00:06:18.320
there's a ton going on there.

00:06:18.320 --> 00:06:19.260
And as physics,

00:06:19.260 --> 00:06:22.840
it's kind of the center of the particle physics world, right?

00:06:22.840 --> 00:06:26.360
So it's got those two parallel things generating

00:06:26.360 --> 00:06:27.460
all sorts of cool stuff.

00:06:27.460 --> 00:06:29.740
I want to go back just really quickly to,

00:06:29.740 --> 00:06:30.000
you know,

00:06:30.000 --> 00:06:31.500
you talked about your dad teaching a little programming.

00:06:31.500 --> 00:06:34.400
If people are out there and they're the dad,

00:06:34.400 --> 00:06:36.200
they want to teach their kids a little bit of programming,

00:06:36.200 --> 00:06:38.480
I want to give a shout out to CodeCombat.com.

00:06:38.480 --> 00:06:39.760
Such a cool place.

00:06:39.760 --> 00:06:41.680
My daughter just yesterday was like,

00:06:41.680 --> 00:06:42.960
hey, dad, I want to do a little Python.

00:06:42.960 --> 00:06:45.600
Remember that game that taught me programming?

00:06:45.600 --> 00:06:46.440
Like, yeah, yeah, sure.

00:06:46.440 --> 00:06:47.140
So she's like,

00:06:47.140 --> 00:06:48.620
she logged in and started playing

00:06:48.620 --> 00:06:52.100
and basically solved a dungeon interactively by writing Python.

00:06:52.100 --> 00:06:53.900
And it's such an approachable way,

00:06:53.900 --> 00:06:55.960
but it's not the like draggy, droppy, fake stuff.

00:06:55.960 --> 00:06:57.060
You write real Python,

00:06:57.060 --> 00:06:59.380
which I think is cool to introduce kids that way.

00:06:59.380 --> 00:07:01.060
So anyway, shout out to them.

00:07:01.060 --> 00:07:02.300
I had them on the podcast before,

00:07:02.300 --> 00:07:05.500
but it's cool to see kids like taking to it in that way, right?

00:07:05.500 --> 00:07:06.240
Whereas you say it like,

00:07:06.240 --> 00:07:07.420
you could write a terminal app.

00:07:07.420 --> 00:07:08.620
They're like, I don't want to do that.

00:07:08.620 --> 00:07:10.260
But solve a dungeon.

00:07:10.260 --> 00:07:11.080
Yeah, they could do that.

00:07:11.080 --> 00:07:11.440
Yeah.

00:07:11.440 --> 00:07:12.580
I've actually played with a couple of those.

00:07:12.660 --> 00:07:13.600
They're actually really fun just to play.

00:07:13.600 --> 00:07:14.600
Yeah, they are.

00:07:14.600 --> 00:07:15.020
Exactly.

00:07:15.020 --> 00:07:17.760
I did like 40 dungeons along with my daughter.

00:07:17.760 --> 00:07:18.300
It was very cool.

00:07:18.300 --> 00:07:19.060
How about now?

00:07:19.060 --> 00:07:19.700
What do you do now?

00:07:19.700 --> 00:07:22.900
So I work in a lot of different areas

00:07:22.900 --> 00:07:24.060
and I jump around a lot.

00:07:24.060 --> 00:07:26.800
So I do a mix of coding.

00:07:26.800 --> 00:07:29.260
I do some work on websites

00:07:29.260 --> 00:07:31.980
because they just needed maintenance

00:07:31.980 --> 00:07:33.120
and somehow I got volunteered.

00:07:33.120 --> 00:07:35.800
And some writing.

00:07:35.800 --> 00:07:38.060
So less coding than I would like,

00:07:38.160 --> 00:07:40.020
but I definitely do get to do it,

00:07:40.020 --> 00:07:40.420
which is fun.

00:07:40.420 --> 00:07:40.860
Yeah.

00:07:40.860 --> 00:07:43.380
And this is at CERN or at your university

00:07:43.380 --> 00:07:44.500
or where is this?

00:07:44.840 --> 00:07:47.240
So now I'm at Princeton University

00:07:47.240 --> 00:07:49.900
and I'm part of a local group

00:07:49.900 --> 00:07:52.320
of RSEs, research software engineers.

00:07:52.320 --> 00:07:56.660
And I'm also part of IRIS-HEP,

00:07:56.660 --> 00:07:57.620
which we'll talk about a little bit.

00:07:57.620 --> 00:08:00.380
But that's sort of a very spread out group.

00:08:00.380 --> 00:08:02.280
Some of us are at CERN,

00:08:02.280 --> 00:08:04.080
a few are in some other places,

00:08:04.080 --> 00:08:05.720
a few are at Fermilab.

00:08:06.260 --> 00:08:08.400
And high-energy physicists are just used

00:08:08.400 --> 00:08:09.200
to working remote.

00:08:09.200 --> 00:08:10.920
The pandemic wasn't that big

00:08:10.920 --> 00:08:11.600
of a change for us.

00:08:11.600 --> 00:08:12.200
We were already doing

00:08:12.200 --> 00:08:12.960
all our meetings remote.

00:08:12.960 --> 00:08:14.280
We just eventually changed

00:08:14.280 --> 00:08:15.180
from Vidyo to Zoom.

00:08:15.180 --> 00:08:16.620
But other than that,

00:08:16.620 --> 00:08:17.540
it was pretty much the same.

00:08:17.540 --> 00:08:17.680
Exactly.

00:08:17.680 --> 00:08:19.600
It was real similar for me as well.

00:08:19.600 --> 00:08:20.180
That's interesting.

00:08:20.180 --> 00:08:22.220
Fermilab, that's in Chicago,

00:08:22.220 --> 00:08:23.120
outside Chicago, right?

00:08:23.120 --> 00:08:23.740
Yes.

00:08:23.740 --> 00:08:24.660
Is that still going?

00:08:24.660 --> 00:08:25.380
I got the sense that

00:08:25.380 --> 00:08:26.200
that was shutting down.

00:08:26.200 --> 00:08:27.820
They're big in neutrino physics.

00:08:27.820 --> 00:08:30.500
So they do a lot of neutrino things there.

00:08:30.500 --> 00:08:32.280
And then they're also very active

00:08:32.280 --> 00:08:34.080
just in the particle physics space.

00:08:34.080 --> 00:08:36.020
So you may be at Fermilab,

00:08:36.080 --> 00:08:37.520
but working on CERN data.

00:08:37.520 --> 00:08:37.900
I see.

00:08:37.900 --> 00:08:38.320
Okay.

00:08:38.320 --> 00:08:38.700
Interesting.

00:08:38.700 --> 00:08:39.040
Yeah.

00:08:39.040 --> 00:08:41.400
I got to tour that place a little bit

00:08:41.400 --> 00:08:42.680
and it's a really neat place.

00:08:42.680 --> 00:08:43.420
It is.

00:08:43.420 --> 00:08:44.320
CERN's a neat place too.

00:08:44.320 --> 00:08:46.200
I would love to tour CERN,

00:08:46.200 --> 00:08:49.300
but it wasn't 20 minutes down the street

00:08:49.300 --> 00:08:50.240
from where I happened to be.

00:08:50.240 --> 00:08:51.640
So I didn't make it there.

00:08:51.640 --> 00:08:53.720
Sadly, I hope to get back there someday.

00:08:53.720 --> 00:08:54.420
All right.

00:08:54.420 --> 00:08:57.800
Well, let's talk about

00:08:57.800 --> 00:09:01.560
sort of the Scikit-HEP side of things

00:09:01.560 --> 00:09:02.800
and how you got into

00:09:02.800 --> 00:09:04.900
maintaining all of these packages.

00:09:05.260 --> 00:09:06.880
So you found yourself in this place

00:09:06.880 --> 00:09:08.920
where you're working on tools

00:09:08.920 --> 00:09:10.720
that help other people build packages

00:09:10.720 --> 00:09:13.660
for the physicists and data scientists

00:09:13.660 --> 00:09:14.540
and so on, right?

00:09:14.540 --> 00:09:16.360
So where'd that all start?

00:09:16.860 --> 00:09:18.540
So with maintenance itself,

00:09:18.540 --> 00:09:20.380
the first thing I started maintaining

00:09:20.380 --> 00:09:23.840
was a package called Plumbum back in 2015.

00:09:23.840 --> 00:09:26.140
And at that point,

00:09:26.140 --> 00:09:29.280
I was starting to submit some PRs

00:09:29.280 --> 00:09:30.400
and the author came to me and said,

00:09:30.620 --> 00:09:32.920
I would like to have somebody do the releases.

00:09:32.920 --> 00:09:34.280
I need a release manager.

00:09:34.280 --> 00:09:35.460
I don't have time.

00:09:35.460 --> 00:09:36.760
And I said, sure, I'd be happy to do it.

00:09:36.760 --> 00:09:37.500
And it was exciting for me

00:09:37.500 --> 00:09:39.360
because it was the first package

00:09:39.360 --> 00:09:42.020
or like real package I got to join.

00:09:42.020 --> 00:09:45.580
And so I think on the page,

00:09:45.580 --> 00:09:47.320
it might even still have the original news item

00:09:47.320 --> 00:09:48.660
where it says, welcome to me.

00:09:48.660 --> 00:09:49.080
But-

00:09:49.080 --> 00:09:49.980
Nice.

00:09:50.360 --> 00:09:52.960
So that was the first thing I started maintaining.

00:09:52.960 --> 00:09:58.320
And then I was working on a high-energy physics tool

00:09:58.320 --> 00:10:00.000
called GooFit when I became a postdoc.

00:10:00.000 --> 00:10:03.600
And I worked on sort of really renovating that.

00:10:03.600 --> 00:10:05.920
It started out as a code written by physicists.

00:10:06.620 --> 00:10:09.820
And I worked on making it actually installable

00:10:09.820 --> 00:10:11.360
and packaged nicely

00:10:11.360 --> 00:10:14.280
and worked with a student to add Python bindings to it,

00:10:14.280 --> 00:10:14.820
things like that.

00:10:14.820 --> 00:10:15.880
And as part of that,

00:10:15.880 --> 00:10:18.420
I wrote a C++ package, CLI11.

00:10:18.420 --> 00:10:21.480
It was the first package I actually wrote

00:10:21.480 --> 00:10:22.240
and then maintained.

00:10:22.240 --> 00:10:24.100
And it's actually in C++.

00:10:24.100 --> 00:10:26.060
And that was written for GooFit,

00:10:26.060 --> 00:10:27.800
but now it's a fairly,

00:10:27.800 --> 00:10:30.740
I think it's done pretty well on its own.

00:10:30.740 --> 00:10:31.280
Nice.

00:10:31.280 --> 00:10:32.020
What's that one do?

00:10:32.020 --> 00:10:32.860
Microsoft Terminal uses it.

00:10:32.860 --> 00:10:33.660
Yeah.

00:10:33.660 --> 00:10:35.200
Microsoft Terminal uses it?

00:10:35.200 --> 00:10:35.600
Mm-hmm.

00:10:35.600 --> 00:10:36.360
Oh, nice.

00:10:36.420 --> 00:10:37.900
Yeah, I'm a big fan of Microsoft Terminal.

00:10:37.900 --> 00:10:39.900
I've for a while now

00:10:39.900 --> 00:10:42.080
kind of shied away from working on Windows

00:10:42.080 --> 00:10:44.140
because the terminal experience

00:10:44.140 --> 00:10:45.480
has been really crummy.

00:10:45.480 --> 00:10:48.280
You know, the cmd.exe command prompt style

00:10:48.280 --> 00:10:48.960
is just like,

00:10:48.960 --> 00:10:50.560
oh, why is it so painful?

00:10:50.560 --> 00:10:52.000
And people who work in that all day,

00:10:52.000 --> 00:10:53.120
they might not see it as painful.

00:10:53.120 --> 00:10:54.480
But if you get to work in something

00:10:54.480 --> 00:10:56.500
like a macOS terminal

00:10:56.500 --> 00:10:59.180
or even to not quite the same degree,

00:10:59.180 --> 00:11:00.720
but still in like a Linux one,

00:11:00.720 --> 00:11:01.580
then all of a sudden,

00:11:01.580 --> 00:11:02.820
yeah, it kind of gets there.

00:11:02.820 --> 00:11:05.840
But I'm kind of warming up to it again

00:11:06.220 --> 00:11:06.900
with Windows Terminal.

00:11:06.900 --> 00:11:08.840
Yeah, iTerm is one of the reasons

00:11:08.840 --> 00:11:09.220
I use,

00:11:09.220 --> 00:11:11.280
I really moved to Mac

00:11:11.280 --> 00:11:12.080
because I loved iTerm.

00:11:12.080 --> 00:11:14.720
And then Windows Terminal is amazing.

00:11:14.720 --> 00:11:15.460
Now it's a great,

00:11:15.460 --> 00:11:16.760
great team working on it,

00:11:16.760 --> 00:11:18.940
including the fact that they used my parser.

00:11:18.940 --> 00:11:22.520
But it's actually quite nice.

00:11:22.520 --> 00:11:23.760
The only problem I have in Windows 10

00:11:23.760 --> 00:11:24.900
is it's really hard to get the thing

00:11:24.900 --> 00:11:27.820
to show up instead of cmd prompt.

00:11:28.200 --> 00:11:28.360
Yeah.

00:11:28.360 --> 00:11:29.660
But Windows 11,

00:11:29.660 --> 00:11:31.260
I think it's supposed to be the only one.

00:11:31.260 --> 00:11:31.440
Yeah.

00:11:31.440 --> 00:11:33.380
I definitely think it's included now,

00:11:33.380 --> 00:11:34.060
which is great.

00:11:34.060 --> 00:11:36.200
So CLI11,

00:11:36.200 --> 00:11:39.560
this is a C++11 command line parser,

00:11:39.560 --> 00:11:39.820
right?

00:11:39.820 --> 00:11:41.040
Like Click or argparse

00:11:41.040 --> 00:11:41.740
or something like that,

00:11:41.740 --> 00:11:42.800
but for C++, right?

00:11:42.800 --> 00:11:43.380
Yes.

00:11:43.380 --> 00:11:44.960
It was designed off of

00:11:44.960 --> 00:11:46.600
the Plumbum command line parser.

00:11:46.600 --> 00:11:47.820
Plumbum is sort of a toolkit

00:11:47.820 --> 00:11:48.960
and it has several different things.

00:11:48.960 --> 00:11:50.500
I wish those things had been pulled out

00:11:50.500 --> 00:11:51.420
because I think on their own,

00:11:51.420 --> 00:11:51.880
they might have

00:11:51.880 --> 00:11:54.940
maybe even been popular on their own.

00:11:55.280 --> 00:11:56.440
It has a really nice parser,

00:11:56.440 --> 00:11:57.680
but it was sort of designed off of that

00:11:57.680 --> 00:11:58.300
and off Click.

00:11:58.300 --> 00:12:01.020
It has some similarities

00:12:01.020 --> 00:12:01.980
to both of those.
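
NOTE
A minimal sketch of what a command line parser of this kind does, using Python's standard argparse module rather than CLI11, Plumbum, or Click. The program name simulate and its options are hypothetical, chosen only for illustration.
```python
import argparse
def build_parser():
    """Declare the arguments once; parsing, type conversion and help text come for free."""
    parser = argparse.ArgumentParser(prog="simulate", description="Hypothetical simulation launcher")
    parser.add_argument("config", help="path to a configuration file")
    parser.add_argument("-n", "--nodes", type=int, default=1, help="number of nodes to use")
    parser.add_argument("-v", "--verbose", action="store_true", help="print extra output")
    return parser
# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(["run.cfg", "--nodes", "4", "-v"])
print(args.config, args.nodes, args.verbose)
```
The same declare-then-parse idea carries over to the C++ and Plumbum parsers discussed here, although their actual APIs differ.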

00:12:01.980 --> 00:12:02.500
Yeah.

00:12:02.500 --> 00:12:04.000
I think probably that's a challenge.

00:12:04.000 --> 00:12:04.340
I mean,

00:12:04.340 --> 00:12:06.440
we're going to get into Scikit-HEP

00:12:06.440 --> 00:12:07.360
with a whole bunch

00:12:07.360 --> 00:12:08.260
of these different packages,

00:12:08.260 --> 00:12:10.500
but finding the right granularity

00:12:10.500 --> 00:12:13.040
of what is a self-contained unit

00:12:13.040 --> 00:12:14.320
that you want to share with people

00:12:14.320 --> 00:12:15.840
or versus things like

00:12:15.840 --> 00:12:17.840
pulling out a command line parser

00:12:17.840 --> 00:12:19.660
rather than some other library, right?

00:12:19.660 --> 00:12:21.140
This is a careful balance.

00:12:21.140 --> 00:12:23.140
It's a bit challenging.

00:12:23.140 --> 00:12:24.040
I think in Python,

00:12:24.260 --> 00:12:26.940
there's a really strong emphasis

00:12:26.940 --> 00:12:28.240
to having the individual

00:12:28.240 --> 00:12:29.580
separate pieces and packages,

00:12:29.580 --> 00:12:31.540
especially in Python,

00:12:31.540 --> 00:12:32.300
partially because it has

00:12:32.300 --> 00:12:34.120
a really good packaging system.

00:12:34.120 --> 00:12:37.000
And being able to take things,

00:12:37.000 --> 00:12:38.380
have just pieces

00:12:38.380 --> 00:12:39.480
and be able to swap out one

00:12:39.480 --> 00:12:40.160
that you don't like

00:12:40.160 --> 00:12:41.200
is really, really nice.

00:12:41.200 --> 00:12:42.820
And that's one of the things

00:12:42.820 --> 00:12:44.000
we'll talk about the PyPA as well.

00:12:44.000 --> 00:12:44.940
And that's one of the things

00:12:44.940 --> 00:12:46.320
that they focus on

00:12:46.320 --> 00:12:48.200
is small individual packages

00:12:48.200 --> 00:12:49.560
that each do a job

00:12:49.560 --> 00:12:51.540
versus an all-in-one tool like Poetry.

00:12:51.540 --> 00:12:52.060
Yeah.

00:12:52.060 --> 00:12:53.820
Well, you'll have to

00:12:53.820 --> 00:12:54.980
do some checking

00:12:54.980 --> 00:12:56.840
or some fact-checking,

00:12:56.840 --> 00:12:57.320
balancing,

00:12:57.320 --> 00:12:58.660
modernizing for me.

00:12:58.660 --> 00:13:00.700
I did professional C++ development

00:13:00.700 --> 00:13:01.820
for a couple of years

00:13:01.820 --> 00:13:03.720
and I really enjoyed it

00:13:03.720 --> 00:13:05.000
until there were better options.

00:13:05.000 --> 00:13:06.100
And then I'm like,

00:13:06.100 --> 00:13:07.660
why am I still doing this?

00:13:07.660 --> 00:13:09.060
I would go work on those.

00:13:09.060 --> 00:13:11.520
But one of the things

00:13:11.520 --> 00:13:12.220
that struck me

00:13:12.220 --> 00:13:13.060
as a big difference

00:13:13.060 --> 00:13:14.400
to that world

00:13:14.400 --> 00:13:15.460
is basically

00:13:15.460 --> 00:13:16.980
the number of libraries

00:13:16.980 --> 00:13:17.660
you use,

00:13:17.780 --> 00:13:18.900
the granularity

00:13:18.900 --> 00:13:19.780
of the libraries you use,

00:13:19.780 --> 00:13:20.120
you know,

00:13:20.120 --> 00:13:22.320
the relative acceptance

00:13:22.320 --> 00:13:23.300
of things like pip

00:13:23.300 --> 00:13:23.980
and the ease

00:13:23.980 --> 00:13:25.540
of using another library,

00:13:25.540 --> 00:13:25.940
right?

00:13:25.940 --> 00:13:26.960
In C++,

00:13:26.960 --> 00:13:27.980
you've got the header

00:13:27.980 --> 00:13:30.520
and you've got the linked file

00:13:30.520 --> 00:13:31.600
and you've got the DLL

00:13:31.600 --> 00:13:32.220
and there's like

00:13:32.220 --> 00:13:33.060
all sorts of stuff

00:13:33.060 --> 00:13:33.840
that can like

00:13:33.840 --> 00:13:34.560
get out of sync

00:13:34.560 --> 00:13:35.300
and go crazy

00:13:35.300 --> 00:13:37.260
and like make weird crashes.

00:13:37.260 --> 00:13:38.560
Your app just goes away

00:13:38.560 --> 00:13:39.380
and that's not great.

00:13:39.380 --> 00:13:40.960
Is that still true?

00:13:40.960 --> 00:13:41.820
I feel like

00:13:41.820 --> 00:13:42.920
that difference

00:13:42.920 --> 00:13:43.720
is one of the things

00:13:43.720 --> 00:13:44.580
that allows for people

00:13:44.580 --> 00:13:45.520
to make these smaller

00:13:45.520 --> 00:13:46.880
composable pieces

00:13:46.880 --> 00:13:47.360
in Python.

00:13:47.360 --> 00:13:48.400
I think that has

00:13:48.400 --> 00:13:49.200
a lot to do with it.

00:13:49.200 --> 00:13:50.940
What has happened

00:13:50.940 --> 00:13:51.540
in C++

00:13:51.540 --> 00:13:52.580
is there's sort of

00:13:52.580 --> 00:13:53.200
a rise of a lot

00:13:53.200 --> 00:13:54.280
of header-only libraries

00:13:54.280 --> 00:13:55.500
and these libraries

00:13:55.500 --> 00:13:57.340
are a lot easier

00:13:57.340 --> 00:13:57.920
to just drop

00:13:57.920 --> 00:13:59.500
into your project

00:13:59.500 --> 00:14:00.880
because all you do

00:14:00.880 --> 00:14:01.700
is you put in the headers

00:14:01.700 --> 00:14:03.940
and there's no,

00:14:03.940 --> 00:14:05.040
you don't have to deal

00:14:05.040 --> 00:14:05.920
with a lot of the

00:14:05.920 --> 00:14:07.300
original issues.

00:14:07.300 --> 00:14:07.900
So a lot of these

00:14:07.900 --> 00:14:09.080
small standalone libraries

00:14:09.080 --> 00:14:10.120
are header-only

00:14:10.120 --> 00:14:11.860
and one of the next

00:14:11.860 --> 00:14:13.980
things that I picked up

00:14:13.980 --> 00:14:15.040
as a maintainer

00:14:15.040 --> 00:14:16.020
was pybind11,

00:14:16.440 --> 00:14:16.760
which,

00:14:16.760 --> 00:14:17.700
and I've sort of

00:14:17.700 --> 00:14:18.460
been in that space

00:14:18.460 --> 00:14:19.580
sort of between C++

00:14:19.580 --> 00:14:20.160
and Python

00:14:20.160 --> 00:14:21.680
for quite a bit.

00:14:21.680 --> 00:14:22.700
I kind of like being

00:14:22.700 --> 00:14:23.720
in that area,

00:14:23.720 --> 00:14:26.360
joining the two.

00:14:26.360 --> 00:14:27.580
I get a sense

00:14:27.580 --> 00:14:28.680
from listening to the things

00:14:28.680 --> 00:14:29.340
that you've worked on

00:14:29.340 --> 00:14:29.800
previously

00:14:29.800 --> 00:14:31.040
and things like this

00:14:31.040 --> 00:14:32.160
that you're interested

00:14:32.160 --> 00:14:32.900
in connecting

00:14:32.900 --> 00:14:33.600
and enabling,

00:14:33.600 --> 00:14:34.660
like piecing together,

00:14:34.660 --> 00:14:35.800
like here's my script

00:14:35.800 --> 00:14:36.680
that's going to pull together

00:14:36.680 --> 00:14:37.340
the compute

00:14:37.340 --> 00:14:38.120
on this cluster

00:14:38.120 --> 00:14:39.100
or here's this library

00:14:39.100 --> 00:14:39.800
that pulls together

00:14:39.800 --> 00:14:41.120
Python and C++

00:14:41.120 --> 00:14:41.820
and so on.

00:14:41.820 --> 00:14:42.260
Yes,

00:14:42.260 --> 00:14:43.380
making different things

00:14:43.380 --> 00:14:43.900
work together

00:14:43.900 --> 00:14:45.740
and combining things

00:14:45.740 --> 00:14:46.680
like C++ and Python

00:14:46.680 --> 00:14:47.460
or combining different

00:14:47.460 --> 00:14:48.260
packages in Python

00:14:48.260 --> 00:14:49.440
and piecing together

00:14:49.440 --> 00:14:49.860
a solution.

00:14:49.860 --> 00:14:50.920
I think that's one

00:14:50.920 --> 00:14:51.840
of Python's strengths

00:14:51.840 --> 00:14:52.700
versus something like

00:14:52.700 --> 00:14:53.100
MATLAB.

00:14:53.100 --> 00:14:53.800
I spent quite a bit

00:14:53.800 --> 00:14:54.500
of time in MATLAB

00:14:54.500 --> 00:14:55.720
early on

00:14:55.720 --> 00:14:57.040
and got to move

00:14:57.040 --> 00:14:57.520
a lot of stuff

00:14:57.520 --> 00:14:58.180
over to Python.

00:14:58.180 --> 00:14:59.000
Right on,

00:14:59.000 --> 00:14:59.680
that's awesome.

00:14:59.680 --> 00:15:00.480
It was really nice.

00:15:00.480 --> 00:15:01.240
We didn't have to have

00:15:01.240 --> 00:15:01.600
a license

00:15:01.600 --> 00:15:02.600
and things like that.

00:15:02.980 --> 00:15:03.320
I know,

00:15:03.320 --> 00:15:04.660
it's so expensive

00:15:04.660 --> 00:15:05.680
and then you get the,

00:15:05.680 --> 00:15:06.480
what are they called,

00:15:06.480 --> 00:15:07.060
toolkits,

00:15:07.060 --> 00:15:08.100
the add-on toolkits

00:15:08.100 --> 00:15:08.680
and they're like,

00:15:08.680 --> 00:15:09.600
each toolkit

00:15:09.600 --> 00:15:10.360
is the price

00:15:10.360 --> 00:15:12.580
of another $1,000 a year

00:15:12.580 --> 00:15:13.540
or $2,000 a year.

00:15:13.540 --> 00:15:14.260
It's ridiculous.

00:15:14.260 --> 00:15:17.780
So I know of CFFI,

00:15:17.780 --> 00:15:18.620
which is a way

00:15:18.620 --> 00:15:20.480
for Python and C

00:15:20.480 --> 00:15:21.440
to get clicked together

00:15:21.440 --> 00:15:23.140
in a simple way.

00:15:23.140 --> 00:15:26.520
How does pybind11

00:15:26.520 --> 00:15:27.280
fit into that?

00:15:27.280 --> 00:15:28.280
This is seamless

00:15:28.280 --> 00:15:28.960
interoperability

00:15:28.960 --> 00:15:30.200
between C++11

00:15:30.200 --> 00:15:30.960
and Python.

00:15:30.960 --> 00:15:32.100
How are they different?

00:15:32.480 --> 00:15:33.760
So CFFI,

00:15:33.760 --> 00:15:35.460
I teach like

00:15:35.460 --> 00:15:36.040
a little short course

00:15:36.040 --> 00:15:36.840
where I can go through

00:15:36.840 --> 00:15:38.120
some of the different

00:15:38.120 --> 00:15:38.700
binding tools

00:15:38.700 --> 00:15:40.120
and it usually ends

00:15:40.120 --> 00:15:40.520
with me saying

00:15:40.520 --> 00:15:41.400
pybind11 is my favorite.

00:15:41.400 --> 00:15:43.160
Yeah, cool.

00:15:43.160 --> 00:15:43.920
Give us an overview

00:15:43.920 --> 00:15:44.840
of what the options are

00:15:44.840 --> 00:15:45.160
and stuff.

00:15:45.160 --> 00:15:46.080
CFFI is closer

00:15:46.080 --> 00:15:47.220
to ctypes.

00:15:47.220 --> 00:15:47.960
It's more of,

00:15:47.960 --> 00:15:49.920
it's focused on C

00:15:49.920 --> 00:15:50.860
versus C++

00:15:50.860 --> 00:15:53.120
and it's actually

00:15:53.120 --> 00:15:53.580
the one I've used

00:15:53.580 --> 00:15:54.060
the least.

00:15:54.060 --> 00:15:54.800
I was just helping,

00:15:54.800 --> 00:15:56.460
we're just talking

00:15:56.460 --> 00:15:57.720
with the CFFI developer

00:15:57.720 --> 00:15:59.140
but I've used it

00:15:59.140 --> 00:15:59.860
the least of those

00:15:59.860 --> 00:16:01.900
but I think it

00:16:01.900 --> 00:16:02.980
basically parses

00:16:02.980 --> 00:16:04.920
your C headers

00:16:04.920 --> 00:16:06.040
and then sort of

00:16:06.040 --> 00:16:06.660
automates a lot

00:16:06.660 --> 00:16:07.680
of what you would

00:16:07.680 --> 00:16:08.140
have to manually

00:16:08.140 --> 00:16:08.820
do with ctypes,

00:16:08.820 --> 00:16:09.620
where you have to

00:16:09.620 --> 00:16:11.480
specify what

00:16:11.480 --> 00:16:12.540
symbol you want to call

00:16:12.540 --> 00:16:13.260
and what the arguments

00:16:13.260 --> 00:16:14.080
are and what the return

00:16:14.080 --> 00:16:15.040
type is and if one

00:16:15.040 --> 00:16:15.720
of those things is wrong

00:16:15.720 --> 00:16:16.420
you get a segfault

00:16:16.420 --> 00:16:17.260
and that sort of thing.

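NOTE
Editor's sketch, not from the episode: the manual ctypes declarations
described above, assuming a POSIX system where ctypes.CDLL(None) exposes
the C library's symbols. strlen and the helper name c_strlen are chosen
only for illustration.
```python
import ctypes
# Assumption: POSIX only. CDLL(None) opens the running process, which
# already has the C standard library loaded.
libc = ctypes.CDLL(None)
# With ctypes you name the symbol and declare its signature by hand.
# If these declarations do not match the real C function, the call can
# crash the interpreter with a segfault instead of raising an exception.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
def c_strlen(data: bytes) -> int:
    """Return the length of a NUL-terminated byte string via C strlen."""
    return libc.strlen(data)
print(c_strlen(b"wheel"))
```
CFFI and pybind11 exist largely to take this hand-written bookkeeping
off your plate.
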
00:16:17.260 --> 00:16:18.880
Whereas pybind11,

00:16:18.880 --> 00:16:20.000
this is about building

00:16:20.000 --> 00:16:21.340
modules, extension modules.

00:16:21.340 --> 00:16:22.740
So, and it's,

00:16:22.740 --> 00:16:23.260
and it's,

00:16:23.260 --> 00:16:24.040
the interesting thing

00:16:24.040 --> 00:16:24.860
about this

00:16:24.860 --> 00:16:26.060
is that it's written

00:16:26.060 --> 00:16:26.940
in pure C++.

00:16:26.940 --> 00:16:28.360
The other tools

00:16:28.360 --> 00:16:28.840
out there,

00:16:28.840 --> 00:16:30.380
so Cython can do this,

00:16:30.380 --> 00:16:30.740
it's not what

00:16:30.740 --> 00:16:31.620
it was designed for

00:16:31.620 --> 00:16:32.960
but it immediately

00:16:32.960 --> 00:16:33.680
became popular

00:16:33.680 --> 00:16:34.360
for doing this

00:16:34.360 --> 00:16:35.640
because Cython

00:16:35.640 --> 00:16:37.120
turned code,

00:16:37.120 --> 00:16:37.840
Python,

00:16:37.840 --> 00:16:38.980
Python-like code

00:16:38.980 --> 00:16:40.060
in a new language

00:16:40.060 --> 00:16:40.900
into,

00:16:40.900 --> 00:16:42.180
it transpiled it

00:16:42.180 --> 00:16:42.960
into C

00:16:42.960 --> 00:16:43.800
or C++

00:16:43.800 --> 00:16:44.520
that had a toggle

00:16:44.520 --> 00:16:45.080
you could change,

00:16:45.080 --> 00:16:45.860
has a toggle

00:16:45.860 --> 00:16:46.340
you can change

00:16:46.340 --> 00:16:47.260
and then

00:16:47.260 --> 00:16:48.640
when you're there

00:16:48.640 --> 00:16:49.340
you can now call

00:16:49.340 --> 00:16:50.160
C or C++

00:16:50.160 --> 00:16:52.200
but it's extremely verbose

00:16:52.200 --> 00:16:53.060
and you repeat yourself

00:16:53.060 --> 00:16:53.700
and you have to learn

00:16:53.700 --> 00:16:54.320
another language.

00:16:54.320 --> 00:16:55.820
This weird combined

00:16:55.820 --> 00:16:56.580
Python thing

00:16:56.580 --> 00:16:57.800
and just thinking

00:16:57.800 --> 00:16:58.760
in Cython is difficult

00:16:58.760 --> 00:16:59.500
because you have to think

00:16:59.500 --> 00:17:00.140
about well am I

00:17:00.140 --> 00:17:00.880
in Python

00:17:00.880 --> 00:17:02.560
or am I in Cython

00:17:02.560 --> 00:17:03.580
that can,

00:17:03.580 --> 00:17:04.440
that's going to be

00:17:04.440 --> 00:17:05.200
bound to Python

00:17:05.200 --> 00:17:06.040
or am I in Cython

00:17:06.040 --> 00:17:06.840
that's just going

00:17:06.840 --> 00:17:07.480
straight to C,

00:17:07.480 --> 00:17:08.020
C++

00:17:08.020 --> 00:17:09.160
or am I just in C++

00:17:09.160 --> 00:17:10.220
or C

00:17:10.220 --> 00:17:11.100
but I've actually used it.

00:17:11.100 --> 00:17:12.040
It's a lot of layers there,

00:17:12.040 --> 00:17:12.280
yeah.

00:17:12.280 --> 00:17:13.160
But pybind11

00:17:13.160 --> 00:17:14.620
is just C++

00:17:14.620 --> 00:17:15.780
and it's just,

00:17:15.780 --> 00:17:16.640
it's basically,

00:17:16.640 --> 00:17:18.620
it's like the,

00:17:18.620 --> 00:17:19.560
C API

00:17:19.560 --> 00:17:20.500
for Python

00:17:20.500 --> 00:17:21.920
but a C++ API.

00:17:21.920 --> 00:17:22.660
It's quite,

00:17:22.660 --> 00:17:23.460
it's quite natural

00:17:23.460 --> 00:17:24.660
and you don't have

00:17:24.660 --> 00:17:25.380
to learn a new language.

00:17:25.380 --> 00:17:26.400
It uses some fairly

00:17:26.400 --> 00:17:27.080
advanced C++

00:17:27.080 --> 00:17:28.140
but that's it.

00:17:28.140 --> 00:17:28.560
You're learning

00:17:28.560 --> 00:17:29.380
something useful anyway.

00:17:29.380 --> 00:17:29.920
Right.

00:17:29.920 --> 00:17:31.080
So do you do

00:17:31.080 --> 00:17:31.640
some sort of like

00:17:31.640 --> 00:17:32.560
template type thing

00:17:32.560 --> 00:17:33.540
and then say

00:17:33.540 --> 00:17:34.800
I'm going to expose

00:17:34.800 --> 00:17:36.300
this class to Python

00:17:36.300 --> 00:17:37.380
or something like that

00:17:37.380 --> 00:17:38.180
and then it figures out,

00:17:38.180 --> 00:17:38.840
does it write

00:17:38.840 --> 00:17:39.580
the Python code

00:17:39.580 --> 00:17:40.140
or what is it?

00:17:40.140 --> 00:17:41.420
It's writing the,

00:17:41.420 --> 00:17:43.640
build like .so files

00:17:43.640 --> 00:17:45.060
or what do you do here?

00:17:45.060 --> 00:17:45.820
It,

00:17:45.820 --> 00:17:47.160
it compiles

00:17:47.160 --> 00:17:48.560
into the C API calls

00:17:48.560 --> 00:17:49.480
and then that

00:17:49.480 --> 00:17:50.000
would compile

00:17:50.000 --> 00:17:50.680
into a .so

00:17:50.680 --> 00:17:51.800
and there's no

00:17:51.800 --> 00:17:52.400
separate step

00:17:52.400 --> 00:17:53.060
like Cython

00:17:53.060 --> 00:17:53.480
or SWIG

00:17:53.480 --> 00:17:53.880
or these,

00:17:53.880 --> 00:17:55.740
or these other tools

00:17:55.740 --> 00:17:56.420
because it's just

00:17:56.420 --> 00:17:56.980
C++.

00:17:56.980 --> 00:17:57.500
You compile it

00:17:57.500 --> 00:17:57.840
like you do

00:17:57.840 --> 00:17:58.540
any other C++

00:17:58.540 --> 00:18:00.120
but it's actually

00:18:00.120 --> 00:18:01.440
internally using

00:18:01.440 --> 00:18:02.600
the CPython API

00:18:02.600 --> 00:18:03.820
or PyPy's

00:18:03.820 --> 00:18:04.720
wrapper for it

00:18:04.720 --> 00:18:05.540
and the language

00:18:05.540 --> 00:18:06.480
looks a lot like Python

00:18:06.480 --> 00:18:06.980
but the names

00:18:06.980 --> 00:18:07.400
are similar.

00:18:07.400 --> 00:18:08.180
You just do a def

00:18:08.180 --> 00:18:09.320
to define a function

00:18:09.320 --> 00:18:10.280
and you give it

00:18:10.280 --> 00:18:10.660
the name

00:18:10.660 --> 00:18:11.200
and then you just

00:18:11.200 --> 00:18:12.420
pass it the pointer

00:18:12.420 --> 00:18:13.000
to the,

00:18:13.000 --> 00:18:14.540
the underlying thing.

00:18:14.540 --> 00:18:15.220
It can figure out

00:18:15.220 --> 00:18:15.920
things like types

00:18:15.920 --> 00:18:16.460
and stuff like that

00:18:16.460 --> 00:18:16.860
for you.

00:18:16.860 --> 00:18:17.700
Give it a docstring

00:18:17.700 --> 00:18:18.120
if you want.

00:18:18.120 --> 00:18:18.640
Give the arguments

00:18:18.640 --> 00:18:19.020
names.

00:18:19.020 --> 00:18:19.900
You can make it

00:18:19.900 --> 00:18:20.960
as Pythonic as you want.

00:18:20.960 --> 00:18:22.000
It's verbose

00:18:22.000 --> 00:18:23.720
but it's not overly verbose.

00:18:23.720 --> 00:18:24.820
Yeah,

00:18:24.820 --> 00:18:25.520
that's really neat.

00:18:25.520 --> 00:18:25.780
Nice.

00:18:25.780 --> 00:18:26.760
And for people

00:18:26.760 --> 00:18:27.380
who haven't used

00:18:27.380 --> 00:18:29.040
those kind of outputs,

00:18:29.040 --> 00:18:29.580
basically,

00:18:29.580 --> 00:18:30.660
it's just import

00:18:30.660 --> 00:18:31.700
module name

00:18:31.700 --> 00:18:32.800
whether it's a

00:18:32.800 --> 00:18:34.000
.py file

00:18:34.000 --> 00:18:34.540
or it's a

00:18:34.540 --> 00:18:36.620
.so file.

00:18:36.620 --> 00:18:37.640
PyTorch

00:18:37.640 --> 00:18:38.540
if you've used

00:18:38.540 --> 00:18:39.880
SciPy,

00:18:39.880 --> 00:18:41.140
if you've used

00:18:41.140 --> 00:18:42.080
any of those things

00:18:42.080 --> 00:18:42.360
you have,

00:18:42.360 --> 00:18:43.480
you've been importing

00:18:43.480 --> 00:18:44.780
some pybind11 code.

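NOTE
Editor's sketch, not from the episode: the import statement is the same
whether a module is a .py file or a compiled extension such as a .so.
This uses only the standard library importlib; the helper name
module_kind is hypothetical.
```python
import importlib.machinery
import importlib.util
def module_kind(name: str) -> str:
    """Classify an importable module as source, compiled extension, or other."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    origin = spec.origin or ""
    # EXTENSION_SUFFIXES lists the file endings of compiled modules on this
    # platform, for example ".so" on Linux or ".pyd" on Windows.
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        return "extension"
    if origin.endswith(tuple(importlib.machinery.SOURCE_SUFFIXES)):
        return "source"
    # Built-in and frozen modules have no file on disk.
    return "other"
print(module_kind("json"))
```
Calling module_kind on a package built with pybind11 would be expected
to report "extension" for its compiled submodule, though that depends on
how the package is laid out.
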
00:18:44.780 --> 00:18:46.140
So let's talk

00:18:46.140 --> 00:18:46.620
a little bit

00:18:46.620 --> 00:18:47.260
about

00:18:47.260 --> 00:18:48.460
Scikit-HEP.

00:18:48.460 --> 00:18:49.380
This is

00:18:49.380 --> 00:18:50.700
one of the projects

00:18:50.700 --> 00:18:52.340
that has

00:18:52.340 --> 00:18:52.920
a lot of these

00:18:52.920 --> 00:18:53.340
packages

00:18:53.340 --> 00:18:54.500
inside of it

00:18:54.500 --> 00:18:55.560
and your

00:18:55.560 --> 00:18:56.720
library

00:18:56.720 --> 00:18:58.600
CI

00:18:58.600 --> 00:18:59.420
Buildwheel

00:18:59.420 --> 00:19:01.360
is

00:19:01.360 --> 00:19:02.540
one of the things

00:19:02.540 --> 00:19:03.080
that is used

00:19:03.080 --> 00:19:03.560
to maintain

00:19:03.560 --> 00:19:03.960
and build

00:19:03.960 --> 00:19:04.240
a lot of

00:19:04.240 --> 00:19:04.700
those packages

00:19:04.700 --> 00:19:05.180
because I'm sure

00:19:05.180 --> 00:19:05.660
they have a lot

00:19:05.660 --> 00:19:06.380
of interesting

00:19:06.380 --> 00:19:07.720
and oddball

00:19:07.720 --> 00:19:08.300
dependencies,

00:19:08.300 --> 00:19:08.920
right?

00:19:08.920 --> 00:19:09.320
I mean,

00:19:09.320 --> 00:19:10.800
C++ is kind of

00:19:10.800 --> 00:19:11.100
standard,

00:19:11.100 --> 00:19:11.760
but there's probably

00:19:11.760 --> 00:19:12.420
others as well,

00:19:12.420 --> 00:19:12.660
right?

00:19:12.660 --> 00:19:14.040
It is.

00:19:14.040 --> 00:19:14.700
So one thing

00:19:14.700 --> 00:19:15.520
that is kind of

00:19:15.520 --> 00:19:16.860
somewhat unique

00:19:16.860 --> 00:19:17.900
to HEP

00:19:17.900 --> 00:19:18.560
is that we are

00:19:18.560 --> 00:19:19.360
very heavily invested

00:19:19.360 --> 00:19:19.940
in C++.

00:19:19.940 --> 00:19:21.420
So it's usually

00:19:21.420 --> 00:19:22.320
either you're going

00:19:22.320 --> 00:19:22.840
to see Python

00:19:22.840 --> 00:19:23.260
or you're going

00:19:23.260 --> 00:19:23.820
to see some sort

00:19:23.820 --> 00:19:24.320
of C++

00:19:24.320 --> 00:19:25.840
package of some sort.

00:19:25.840 --> 00:19:26.320
I mean,

00:19:26.320 --> 00:19:26.960
it could

00:19:26.960 --> 00:19:28.300
vary in size there,

00:19:28.300 --> 00:19:29.320
but it's mostly

00:19:29.320 --> 00:19:30.180
C++ or Python.

00:19:30.180 --> 00:19:30.840
We really

00:19:30.840 --> 00:19:32.380
haven't used

00:19:32.380 --> 00:19:32.940
other languages

00:19:32.940 --> 00:19:34.080
much since the

00:19:34.080 --> 00:19:35.880
early 90s or so.

00:19:35.880 --> 00:19:36.960
Is that inertia

00:19:36.960 --> 00:19:37.900
or is that

00:19:37.900 --> 00:19:39.100
by choice?

00:19:39.100 --> 00:19:39.620
You know,

00:19:39.620 --> 00:19:40.520
why is that?

00:19:40.520 --> 00:19:41.640
I think it's

00:19:41.640 --> 00:19:42.400
partially the community

00:19:42.400 --> 00:19:44.460
is a fairly

00:19:44.460 --> 00:19:46.780
cohesive community.

00:19:46.780 --> 00:19:47.320
We're really

00:19:47.320 --> 00:19:48.000
used to

00:19:48.000 --> 00:19:48.580
sort of working

00:19:48.580 --> 00:19:48.900
together.

00:19:48.900 --> 00:19:49.400
The experiments

00:19:49.400 --> 00:19:49.860
themselves

00:19:49.860 --> 00:19:51.240
are often,

00:19:51.240 --> 00:19:52.000
you know,

00:19:52.000 --> 00:19:52.860
might be a thousand

00:19:52.860 --> 00:19:54.060
or several thousand

00:19:54.060 --> 00:19:55.140
physicists working

00:19:55.140 --> 00:19:55.980
on a single experiment.

00:19:55.980 --> 00:19:57.980
And we have been

00:19:57.980 --> 00:19:58.880
fairly good about

00:19:58.880 --> 00:19:59.380
sort of meeting

00:19:59.380 --> 00:20:00.020
together and

00:20:00.020 --> 00:20:01.060
sort of deciding

00:20:01.060 --> 00:20:01.580
the direction

00:20:01.580 --> 00:20:02.060
that we want

00:20:02.060 --> 00:20:02.580
to go in

00:20:02.580 --> 00:20:03.180
and

00:20:03.180 --> 00:20:03.580
sort of sticking

00:20:03.580 --> 00:20:04.120
to that.

00:20:04.120 --> 00:20:06.200
So for C++,

00:20:06.200 --> 00:20:07.120
it was heavily

00:20:07.120 --> 00:20:07.620
ROOT,

00:20:07.620 --> 00:20:08.640
which is a

00:20:08.640 --> 00:20:10.280
giant C++

00:20:10.280 --> 00:20:11.400
framework.

00:20:11.400 --> 00:20:12.220
And it's got

00:20:12.220 --> 00:20:13.400
everything in it.

00:20:13.400 --> 00:20:14.200
And that was

00:20:14.200 --> 00:20:15.160
C++ and that's

00:20:15.160 --> 00:20:15.540
what everybody

00:20:15.540 --> 00:20:15.980
used.

00:20:15.980 --> 00:20:17.460
So ROOT is the

00:20:17.460 --> 00:20:17.820
library.

00:20:17.820 --> 00:20:18.540
If I was going

00:20:18.540 --> 00:20:19.240
to write code

00:20:19.240 --> 00:20:20.080
that would run

00:20:20.080 --> 00:20:21.100
and interact

00:20:21.100 --> 00:20:22.100
with like the

00:20:22.100 --> 00:20:23.380
grid computing

00:20:23.380 --> 00:20:24.660
or the data

00:20:24.660 --> 00:20:25.500
access and all

00:20:25.500 --> 00:20:25.860
that kind of

00:20:25.860 --> 00:20:27.220
stuff at LHC,

00:20:27.220 --> 00:20:28.060
I would use

00:20:28.060 --> 00:20:28.920
this ROOT library

00:20:28.920 --> 00:20:29.400
if I was doing

00:20:29.400 --> 00:20:29.920
that in C++,

00:20:29.920 --> 00:20:30.340
right?

00:20:30.720 --> 00:20:30.960
Yes.

00:20:30.960 --> 00:20:31.660
You might be

00:20:31.660 --> 00:20:32.300
using interpreted

00:20:32.300 --> 00:20:32.800
C++,

00:20:32.800 --> 00:20:33.460
which is something

00:20:33.460 --> 00:20:33.920
we invented.

00:20:33.920 --> 00:20:35.320
Oh, okay.

00:20:35.320 --> 00:20:37.260
This is interesting.

00:20:37.260 --> 00:20:38.040
Is this something

00:20:38.040 --> 00:20:38.740
people can use?

00:20:38.740 --> 00:20:39.500
Oh, yes.

00:20:39.500 --> 00:20:40.040
We actually,

00:20:40.040 --> 00:20:42.180
so CINT

00:20:42.180 --> 00:20:42.720
was the original

00:20:42.720 --> 00:20:43.140
interpreter

00:20:43.140 --> 00:20:43.960
and then it got

00:20:43.960 --> 00:20:44.920
replaced by

00:20:44.920 --> 00:20:45.760
Cling,

00:20:45.760 --> 00:20:46.320
which is built

00:20:46.320 --> 00:20:47.160
on the LLVM.

00:20:47.160 --> 00:20:49.000
And I think

00:20:49.000 --> 00:20:49.980
recently it was

00:20:49.980 --> 00:20:50.480
merged to

00:20:50.480 --> 00:20:51.420
mainline LLVM

00:20:51.420 --> 00:20:53.180
as

00:20:53.180 --> 00:20:53.880
clang-repl,

00:20:53.880 --> 00:20:54.400
I think it's

00:20:54.400 --> 00:20:55.140
called,

00:20:55.140 --> 00:20:55.980
but it's sort

00:20:55.980 --> 00:20:56.340
of a lightweight

00:20:56.340 --> 00:20:56.840
version.

00:20:56.840 --> 00:20:58.120
Yeah, it's

00:20:58.120 --> 00:20:58.780
a C++

00:20:58.780 --> 00:20:59.560
interpreter.

00:20:59.640 --> 00:20:59.980
You can actually

00:20:59.980 --> 00:21:00.500
get

00:21:00.500 --> 00:21:00.820
xeus-cling,

00:21:00.820 --> 00:21:03.960
which I think

00:21:03.960 --> 00:21:05.000
QuantStack has,

00:21:05.000 --> 00:21:06.280
but they package

00:21:06.280 --> 00:21:06.820
it as well,

00:21:06.820 --> 00:21:07.040
I think,

00:21:07.040 --> 00:21:08.000
xeus-cling.

00:21:08.000 --> 00:21:09.000
Okay, yeah,

00:21:09.000 --> 00:21:09.540
very interesting.

00:21:09.540 --> 00:21:10.620
It's not,

00:21:10.620 --> 00:21:12.100
C++ really wasn't

00:21:12.100 --> 00:21:12.740
designed for a

00:21:12.740 --> 00:21:13.300
notebook though.

00:21:13.300 --> 00:21:14.200
It does work,

00:21:14.200 --> 00:21:15.240
but you can't

00:21:15.240 --> 00:21:15.940
rerun a cell

00:21:15.940 --> 00:21:16.860
often because

00:21:16.860 --> 00:21:17.980
you can't

00:21:17.980 --> 00:21:18.800
redefine things.

00:21:18.800 --> 00:21:19.520
Python is just

00:21:19.520 --> 00:21:20.300
really natural

00:21:20.300 --> 00:21:20.840
in a notebook

00:21:20.840 --> 00:21:21.420
and C++

00:21:21.420 --> 00:21:21.940
is not.

00:21:21.940 --> 00:21:22.700
Yeah, especially

00:21:22.700 --> 00:21:23.140
if you change

00:21:23.140 --> 00:21:23.620
the type,

00:21:23.620 --> 00:21:24.420
you compile it

00:21:24.420 --> 00:21:24.820
as an int

00:21:24.820 --> 00:21:25.080
and then you're

00:21:25.080 --> 00:21:25.340
like, ah,

00:21:25.340 --> 00:21:25.660
that should be

00:21:25.660 --> 00:21:26.000
a string.

00:21:26.000 --> 00:21:27.040
Yeah, that's

00:21:27.040 --> 00:21:27.300
not going to

00:21:27.300 --> 00:21:27.660
be a string.

00:21:27.660 --> 00:21:28.200
It's compiled.

00:21:28.480 --> 00:21:29.020
Yeah, interesting.

00:21:29.020 --> 00:21:30.520
So it seems to

00:21:30.520 --> 00:21:31.080
me like the

00:21:31.080 --> 00:21:32.020
community at

00:21:32.020 --> 00:21:33.300
CERN has

00:21:33.300 --> 00:21:34.160
decided, look,

00:21:34.160 --> 00:21:34.600
we need some

00:21:34.600 --> 00:21:35.440
low-level stuff

00:21:35.440 --> 00:21:36.080
and there's some

00:21:36.080 --> 00:21:36.760
crazy low-level

00:21:36.760 --> 00:21:37.480
things that

00:21:37.480 --> 00:21:38.640
happen over there.

00:21:38.640 --> 00:21:39.260
People can check

00:21:39.260 --> 00:21:40.180
out a video,

00:21:40.180 --> 00:21:41.100
maybe I'll mention

00:21:41.100 --> 00:21:41.440
a little bit

00:21:41.440 --> 00:21:41.660
later.

00:21:41.660 --> 00:21:43.520
But for that

00:21:43.520 --> 00:21:44.180
use, they've

00:21:44.180 --> 00:21:44.980
sort of gravitated

00:21:44.980 --> 00:21:46.000
towards C and

00:21:46.000 --> 00:21:46.960
then for the

00:21:46.960 --> 00:21:47.640
other aspects,

00:21:47.640 --> 00:21:48.140
it sounds like

00:21:48.140 --> 00:21:49.480
Python is what

00:21:49.480 --> 00:21:50.080
everyone agreed

00:21:50.080 --> 00:21:50.400
to.

00:21:50.400 --> 00:21:51.000
It's like, hey,

00:21:51.000 --> 00:21:51.400
we want to

00:21:51.400 --> 00:21:52.080
visualize this,

00:21:52.080 --> 00:21:52.620
we want to do

00:21:52.620 --> 00:21:53.080
some notebook

00:21:53.080 --> 00:21:53.880
stuff, we want

00:21:53.880 --> 00:21:54.860
to piece things

00:21:54.860 --> 00:21:55.200
together,

00:21:55.200 --> 00:21:56.100
something like

00:21:56.100 --> 00:21:56.380
that, right?

00:21:56.380 --> 00:21:57.380
It's certainly

00:21:57.380 --> 00:21:58.900
moving that way.

00:21:58.900 --> 00:22:00.200
They definitely

00:22:00.200 --> 00:22:00.760
have sort of

00:22:00.760 --> 00:22:01.340
agreed that

00:22:01.340 --> 00:22:02.740
Python should

00:22:02.740 --> 00:22:03.600
be a first-class

00:22:03.600 --> 00:22:05.120
language and

00:22:05.120 --> 00:22:05.860
join C++.

00:22:05.860 --> 00:22:06.820
That was decided

00:22:06.820 --> 00:22:07.860
a few years ago.

00:22:07.860 --> 00:22:09.600
And I think

00:22:09.600 --> 00:22:10.360
that's been a

00:22:10.360 --> 00:22:10.920
great step in

00:22:10.920 --> 00:22:11.340
the right direction

00:22:11.340 --> 00:22:11.880
because what was

00:22:11.880 --> 00:22:12.440
happening, people

00:22:12.440 --> 00:22:13.240
were coming in

00:22:13.240 --> 00:22:14.020
with Python

00:22:14.020 --> 00:22:14.340
knowledge.

00:22:14.340 --> 00:22:14.860
They wanted to

00:22:14.860 --> 00:22:15.560
use Pandas.

00:22:15.560 --> 00:22:16.440
I came in

00:22:16.440 --> 00:22:17.240
that way as

00:22:17.240 --> 00:22:17.480
well.

00:22:17.480 --> 00:22:18.960
Pandas and

00:22:18.960 --> 00:22:19.480
Numba and all

00:22:19.480 --> 00:22:20.340
these tools were

00:22:20.340 --> 00:22:20.800
really, really

00:22:20.800 --> 00:22:21.280
nice.

00:22:21.860 --> 00:22:22.900
And we were

00:22:22.900 --> 00:22:23.360
basically just

00:22:23.360 --> 00:22:23.840
having to write

00:22:23.840 --> 00:22:24.140
them all

00:22:24.140 --> 00:22:25.000
ourselves in

00:22:25.000 --> 00:22:25.760
C++.

00:22:25.760 --> 00:22:26.760
It has a

00:22:26.760 --> 00:22:27.120
data frame,

00:22:27.120 --> 00:22:28.080
but why not

00:22:28.080 --> 00:22:30.800
just use Python,

00:22:30.800 --> 00:22:31.260
which is what

00:22:31.260 --> 00:22:31.640
people know

00:22:31.640 --> 00:22:32.000
anyway?

00:22:32.000 --> 00:22:33.040
Pandas exists.

00:22:33.040 --> 00:22:34.260
There's a ton

00:22:34.260 --> 00:22:34.860
of people already

00:22:34.860 --> 00:22:35.780
doing the work

00:22:35.780 --> 00:22:36.360
maintaining it

00:22:36.360 --> 00:22:36.820
for us.

00:22:36.820 --> 00:22:38.160
ROOT literally

00:22:38.160 --> 00:22:38.800
has a string

00:22:38.800 --> 00:22:39.280
class.

00:22:39.280 --> 00:22:41.240
Literally,

00:22:41.240 --> 00:22:42.160
they do

00:22:42.160 --> 00:22:42.560
everything.

00:22:42.560 --> 00:22:44.740
So the idea,

00:22:44.740 --> 00:22:45.240
and this is

00:22:45.240 --> 00:22:45.540
sort of the

00:22:45.540 --> 00:22:45.980
idea behind

00:22:45.980 --> 00:22:46.700
Scikit-HEP

00:22:46.700 --> 00:22:47.720
was to build

00:22:47.720 --> 00:22:49.040
this collection

00:22:49.040 --> 00:22:50.880
of packages

00:22:50.880 --> 00:22:51.400
that would just

00:22:51.400 --> 00:22:51.800
fill in the

00:22:51.800 --> 00:22:52.440
missing pieces,

00:22:52.440 --> 00:22:53.440
the things that

00:22:53.440 --> 00:22:54.120
energy physicists

00:22:54.120 --> 00:22:54.900
were used to

00:22:54.900 --> 00:22:55.880
and needed.

00:22:55.880 --> 00:22:57.240
And some of

00:22:57.240 --> 00:22:57.720
them are general

00:22:57.720 --> 00:22:58.780
and were just

00:22:58.780 --> 00:22:59.520
gaps in the

00:22:59.520 --> 00:23:00.200
data science

00:23:00.200 --> 00:23:00.620
ecosystem,

00:23:00.620 --> 00:23:01.200
and some

00:23:01.200 --> 00:23:01.500
things are

00:23:01.500 --> 00:23:02.100
very specific,

00:23:02.100 --> 00:23:03.000
high energy

00:23:03.000 --> 00:23:03.420
physics.

00:23:03.420 --> 00:23:04.480
Scikit-HEP

00:23:04.480 --> 00:23:05.140
actually sort

00:23:05.140 --> 00:23:07.120
of originated

00:23:07.120 --> 00:23:08.540
as a single

00:23:08.540 --> 00:23:08.920
package.

00:23:08.920 --> 00:23:10.400
It sort of

00:23:10.400 --> 00:23:11.100
looked like

00:23:11.100 --> 00:23:12.420
root red at

00:23:12.420 --> 00:23:12.740
first,

00:23:12.740 --> 00:23:13.740
and it was

00:23:13.740 --> 00:23:14.680
invented by

00:23:14.680 --> 00:23:15.160
someone called

00:23:15.160 --> 00:23:15.460
Eduardo

00:23:15.460 --> 00:23:15.940
Rodrigues,

00:23:15.940 --> 00:23:16.380
who was

00:23:16.380 --> 00:23:16.800
actually in

00:23:16.800 --> 00:23:17.300
my office

00:23:17.300 --> 00:23:18.120
at CERN,

00:23:18.120 --> 00:23:18.840
and we were

00:23:18.840 --> 00:23:19.320
office mates.

00:23:19.320 --> 00:23:20.620
But he did

00:23:20.620 --> 00:23:21.180
something I think

00:23:21.180 --> 00:23:21.740
really brilliant

00:23:21.740 --> 00:23:22.340
when he did

00:23:22.340 --> 00:23:22.680
this,

00:23:22.680 --> 00:23:23.100
and that is

00:23:23.100 --> 00:23:23.420
he created

00:23:23.420 --> 00:23:24.280
an organization

00:23:24.280 --> 00:23:25.160
called Scikit-HEP

00:23:25.160 --> 00:23:25.820
around it,

00:23:25.820 --> 00:23:26.420
and then he

00:23:26.420 --> 00:23:27.180
went out and

00:23:27.180 --> 00:23:27.620
spoke with

00:23:27.620 --> 00:23:28.540
people and

00:23:28.540 --> 00:23:28.980
got some of

00:23:28.980 --> 00:23:29.300
the other

00:23:29.300 --> 00:23:30.040
Python packages

00:23:30.040 --> 00:23:30.760
that existed

00:23:30.760 --> 00:23:31.400
at the time

00:23:31.400 --> 00:23:32.180
to join

00:23:32.180 --> 00:23:32.920
Scikit-HEP,

00:23:32.920 --> 00:23:33.560
moved them

00:23:33.560 --> 00:23:33.920
over and

00:23:33.920 --> 00:23:34.520
started building

00:23:34.520 --> 00:23:35.840
a collection of

00:23:35.840 --> 00:23:36.160
some of the

00:23:36.160 --> 00:23:36.660
most popular

00:23:36.660 --> 00:23:37.700
Python packages

00:23:37.700 --> 00:23:38.280
at the time.

00:23:38.280 --> 00:23:39.800
And I thought

00:23:39.800 --> 00:23:40.420
that was great,

00:23:40.420 --> 00:23:41.880
and I really

00:23:41.880 --> 00:23:42.500
wanted Scikit-HEP

00:23:42.500 --> 00:23:43.400
to become a

00:23:43.400 --> 00:23:44.120
collection of

00:23:44.120 --> 00:23:44.420
tools,

00:23:44.420 --> 00:23:45.200
separate tools,

00:23:45.200 --> 00:23:45.920
and for the

00:23:45.920 --> 00:23:46.360
Scikit-HEP

00:23:46.360 --> 00:23:47.080
package to just

00:23:47.080 --> 00:23:48.000
be sort of a

00:23:48.000 --> 00:23:48.700
meta package that

00:23:48.700 --> 00:23:49.620
just grabbed all

00:23:49.620 --> 00:23:50.020
the rest.

00:23:50.380 --> 00:23:50.700
And that's

00:23:50.700 --> 00:23:51.160
actually kind of

00:23:51.160 --> 00:23:51.720
where it is now.

00:23:51.720 --> 00:23:52.200
Right.

00:23:52.200 --> 00:23:53.140
I can pip install

00:23:53.140 --> 00:23:53.740
Scikit-HEP.

00:23:53.740 --> 00:23:54.400
Is that right?

00:23:54.400 --> 00:23:55.140
You can,

00:23:55.140 --> 00:23:55.920
and mostly,

00:23:55.920 --> 00:23:56.600
other than a few

00:23:56.600 --> 00:23:57.200
little things that

00:23:57.200 --> 00:23:57.840
are still in there

00:23:57.840 --> 00:23:58.620
that never got

00:23:58.620 --> 00:23:59.000
pulled out,

00:23:59.000 --> 00:24:00.200
that will mostly

00:24:00.200 --> 00:24:01.340
just install our

00:24:01.340 --> 00:24:01.820
most popular,

00:24:01.820 --> 00:24:03.340
maybe 15 or so

00:24:03.340 --> 00:24:03.780
packages,

00:24:03.780 --> 00:24:05.160
15 to 20 of our

00:24:05.160 --> 00:24:05.800
most popular

00:24:05.800 --> 00:24:06.200
packages.

00:24:06.200 --> 00:24:07.280
Yeah, so it

00:24:07.280 --> 00:24:08.160
probably doesn't

00:24:08.160 --> 00:24:08.840
really do anything

00:24:08.840 --> 00:24:09.380
other than,

00:24:09.380 --> 00:24:10.000
say, it depends

00:24:10.000 --> 00:24:11.280
on those packages

00:24:11.280 --> 00:24:12.160
or something like

00:24:12.160 --> 00:24:12.540
that, right?

00:24:12.540 --> 00:24:13.520
And then by

00:24:13.520 --> 00:24:14.320
virtue of installing

00:24:14.320 --> 00:24:14.820
it, it'll grab

00:24:14.820 --> 00:24:15.340
all the pieces.

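NOTE
Editor's sketch, not from the episode: a meta package is mostly a list of
declared dependencies, which the standard library importlib.metadata can
show for any installed distribution. The helper name
declared_dependencies is hypothetical, and the output depends on what is
installed in your environment.
```python
from importlib import metadata
def declared_dependencies(dist_name: str) -> list[str]:
    """Return the requirement strings an installed distribution declares."""
    try:
        return list(metadata.requires(dist_name) or [])
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return []
# "scikit-hep" is the meta package discussed above; the list is empty
# unless it has been installed with pip first.
for requirement in declared_dependencies("scikit-hep"):
    print(requirement)
```
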
00:24:15.340 --> 00:24:16.600
Yeah, yeah,

00:24:16.600 --> 00:24:17.020
that's a really

00:24:17.020 --> 00:24:17.960
cool idea and I

00:24:17.960 --> 00:24:18.360
like it.

00:24:18.360 --> 00:24:19.280
So maybe one of

00:24:19.280 --> 00:24:19.720
the things I

00:24:19.720 --> 00:24:20.060
thought would be

00:24:20.060 --> 00:24:21.320
fun is to go

00:24:21.320 --> 00:24:22.180
through some of

00:24:22.180 --> 00:24:23.540
the packages there

00:24:23.540 --> 00:24:24.240
to give people a

00:24:24.240 --> 00:24:25.260
sense of what's

00:24:25.260 --> 00:24:25.600
in here.

00:24:25.600 --> 00:24:26.680
Some of these are

00:24:26.680 --> 00:24:27.640
pretty particular

00:24:27.640 --> 00:24:28.580
and I don't think

00:24:28.580 --> 00:24:29.480
would find broad

00:24:29.480 --> 00:24:30.700
use outside of

00:24:30.700 --> 00:24:31.120
CERN.

00:24:31.120 --> 00:24:31.800
For example,

00:24:31.800 --> 00:24:32.980
conda-forge

00:24:32.980 --> 00:24:33.460
ROOT.

00:24:33.460 --> 00:24:35.280
It sounds like

00:24:35.280 --> 00:24:35.720
that's about

00:24:35.720 --> 00:24:36.480
building ROOT so

00:24:36.480 --> 00:24:37.260
I can install it

00:24:37.260 --> 00:24:37.880
as a dependency

00:24:37.880 --> 00:24:38.560
or something like

00:24:38.560 --> 00:24:38.960
that, right?

00:24:39.340 --> 00:24:40.320
Building ROOT is

00:24:40.320 --> 00:24:41.380
horrible and

00:24:41.380 --> 00:24:42.640
you actually now

00:24:42.640 --> 00:24:43.340
can get it as

00:24:43.340 --> 00:24:44.180
part of a

00:24:44.180 --> 00:24:45.480
Conda package

00:24:45.480 --> 00:24:46.260
which is just

00:24:46.260 --> 00:24:47.640
way better than

00:24:47.640 --> 00:24:48.200
anything that was

00:24:48.200 --> 00:24:49.000
available for

00:24:49.000 --> 00:24:49.680
attaching it to a

00:24:49.680 --> 00:24:50.560
specific version of

00:24:50.560 --> 00:24:51.100
Python because it

00:24:51.100 --> 00:24:52.600
has to compile

00:24:52.600 --> 00:24:53.400
against a very

00:24:53.400 --> 00:24:54.220
specific version of

00:24:54.220 --> 00:24:55.860
Python but that's

00:24:55.860 --> 00:24:56.500
what it does.

00:24:56.500 --> 00:24:57.160
So unless you want

00:24:57.160 --> 00:24:57.820
something in ROOT

00:24:57.820 --> 00:25:00.040
then that's very

00:25:00.040 --> 00:25:00.820
HEP specific.

00:25:00.820 --> 00:25:01.740
Yeah, absolutely.

00:25:01.740 --> 00:25:02.000
Some more general

00:25:02.000 --> 00:25:03.420
ones, probably

00:25:03.420 --> 00:25:05.180
our first, to briefly

00:25:05.180 --> 00:25:05.720
mention it, our very

00:25:05.720 --> 00:25:07.040
first package that I

00:25:07.040 --> 00:25:07.540
think was really

00:25:07.540 --> 00:25:08.220
popular among

00:25:08.220 --> 00:25:09.700
energy physicists

00:25:09.700 --> 00:25:12.000
that we actually

00:25:12.000 --> 00:25:13.920
produced was

00:25:13.920 --> 00:25:15.520
Uproot which was

00:25:15.520 --> 00:25:16.840
just a pure Python

00:25:16.840 --> 00:25:17.440
package so you

00:25:17.440 --> 00:25:17.940
didn't have to

00:25:17.940 --> 00:25:19.020
install it that

00:25:19.020 --> 00:25:19.720
read ROOT files.

00:25:19.720 --> 00:25:20.600
Again, very

00:25:20.600 --> 00:25:21.400
specific for

00:25:21.400 --> 00:25:23.240
somebody who was

00:25:23.240 --> 00:25:24.500
in high energy

00:25:24.500 --> 00:25:25.620
physics but you

00:25:25.620 --> 00:25:26.840
could actually read

00:25:26.840 --> 00:25:27.580
a ROOT file and

00:25:27.580 --> 00:25:28.920
get your data without

00:25:28.920 --> 00:25:29.780
installing ROOT and

00:25:29.780 --> 00:25:30.720
that was a game

00:25:30.720 --> 00:25:31.040
changer.

00:25:31.040 --> 00:25:32.900
So now you can

00:25:32.900 --> 00:25:33.500
actually install

00:25:33.500 --> 00:25:34.400
ROOT slightly easier

00:25:34.400 --> 00:25:35.120
but normally it's a

00:25:35.120 --> 00:25:36.340
multi-hour compile and

00:25:36.340 --> 00:25:38.200
it's

00:25:38.200 --> 00:25:38.680
gotten better but

00:25:38.680 --> 00:25:39.560
it's still a bit of

00:25:39.560 --> 00:25:40.260
a beast to compile

00:25:40.260 --> 00:25:40.960
especially for Python.

00:25:40.960 --> 00:25:41.340
Yeah, that does

00:25:41.340 --> 00:25:41.980
sound like a beast.

00:25:41.980 --> 00:25:42.560
Oh my gosh.

00:25:42.560 --> 00:25:43.220
And now you can

00:25:43.220 --> 00:25:43.900
just read in your

00:25:43.900 --> 00:25:44.200
files.

00:25:44.200 --> 00:25:46.280
Basically, Jim

00:25:46.280 --> 00:25:46.920
Pivarski basically

00:25:46.920 --> 00:25:47.880
just taught Python

00:25:47.880 --> 00:25:49.100
to understand the

00:25:49.100 --> 00:25:50.480
decompile the ROOT

00:25:50.480 --> 00:25:52.260
file structure and

00:25:52.260 --> 00:25:52.860
actually can write

00:25:52.860 --> 00:25:53.720
right now too but

00:25:53.720 --> 00:25:54.420
originally reading.

00:25:54.420 --> 00:25:55.900
But that actually

00:25:55.900 --> 00:25:56.120
was really...

00:25:56.120 --> 00:25:56.540
So this is like if

00:25:56.540 --> 00:25:58.520
I want to do, if I

00:25:58.520 --> 00:25:58.940
want to create a

00:25:58.940 --> 00:25:59.560
notebook and maybe

00:25:59.560 --> 00:26:00.400
visualize some of

00:26:00.400 --> 00:26:00.960
the data but I

00:26:00.960 --> 00:26:01.500
don't really need

00:26:01.500 --> 00:26:02.240
access to anything

00:26:02.240 --> 00:26:03.600
else, I shouldn't

00:26:03.600 --> 00:26:04.220
depend on this

00:26:04.220 --> 00:26:06.520
beast of almost its

00:26:06.520 --> 00:26:07.560
own operating system

00:26:07.560 --> 00:26:08.120
type of thing.

00:26:08.120 --> 00:26:09.260
Yeah, we were

00:26:09.260 --> 00:26:10.280
very close to being

00:26:10.280 --> 00:26:11.140
able to use all the

00:26:11.140 --> 00:26:12.360
data science tools in

00:26:12.360 --> 00:26:13.300
Python, pandas, things

00:26:13.300 --> 00:26:13.680
like that.

00:26:13.680 --> 00:26:15.140
For most data

00:26:15.140 --> 00:26:15.860
worked fine.

00:26:15.860 --> 00:26:17.240
You just had to get

00:26:17.240 --> 00:26:17.600
the data.

00:26:17.600 --> 00:26:19.800
And I mean, I've

00:26:19.800 --> 00:26:20.560
done this too where

00:26:20.560 --> 00:26:22.220
I had one special

00:26:22.220 --> 00:26:23.240
install of Python and

00:26:23.240 --> 00:26:24.580
ROOT together that I'd

00:26:24.580 --> 00:26:25.660
worked several hours on

00:26:25.660 --> 00:26:26.480
and it sat somewhere

00:26:26.480 --> 00:26:27.320
and I would convert

00:26:27.320 --> 00:26:27.980
data with it.

00:26:27.980 --> 00:26:29.240
I'd move it to HDF5

00:26:29.240 --> 00:26:31.360
and then I would do

00:26:31.360 --> 00:26:31.860
all the rest of the

00:26:31.860 --> 00:26:32.760
analysis in Python that

00:26:32.760 --> 00:26:33.620
didn't have it because

00:26:33.620 --> 00:26:34.060
then I could do

00:26:34.060 --> 00:26:35.040
virtual environments and all

00:26:35.040 --> 00:26:36.940
that read that HDF5

00:26:36.940 --> 00:26:37.580
format, right?

00:26:37.580 --> 00:26:37.940
Mm-hmm.

00:26:37.940 --> 00:26:38.440
Yeah.

00:26:38.440 --> 00:26:39.100
Right, okay.

00:26:39.100 --> 00:26:40.860
The first package we

00:26:40.860 --> 00:26:41.460
had that was really

00:26:41.460 --> 00:26:43.640
popular on its own was

00:26:43.640 --> 00:26:44.320
Awkward Array.

00:26:44.320 --> 00:26:44.780
Yeah.

00:26:44.780 --> 00:26:45.800
Awkward Arrays.

00:26:45.800 --> 00:26:47.300
I definitely heard

00:26:47.300 --> 00:26:47.960
about this one, yeah.

00:26:47.960 --> 00:26:49.000
Yeah, that was

00:26:49.000 --> 00:26:50.180
originally part of

00:26:50.180 --> 00:26:50.860
Uproot, sort of grew

00:26:50.860 --> 00:26:51.380
out of Uproot.

00:26:51.380 --> 00:26:52.280
When you're reading

00:26:52.280 --> 00:26:53.460
ROOT files, you end up

00:26:53.460 --> 00:26:54.960
with these jagged

00:26:54.960 --> 00:26:55.400
arrays.

00:26:55.400 --> 00:26:57.440
So that's an array that

00:26:57.440 --> 00:26:58.860
is not rectangular.

00:26:58.860 --> 00:27:00.080
So at least one

00:27:00.080 --> 00:27:01.860
dimension is jagged.

00:27:01.920 --> 00:27:02.820
It depends on the data

00:27:02.820 --> 00:27:04.260
and this shows up in

00:27:04.260 --> 00:27:05.360
all sorts of places

00:27:05.360 --> 00:27:07.040
and not just particle

00:27:07.040 --> 00:27:08.200
collisions or obviously

00:27:08.200 --> 00:27:09.080
shows up lots of places

00:27:09.080 --> 00:27:09.820
in particle collisions

00:27:09.820 --> 00:27:10.700
like how many hits

00:27:10.700 --> 00:27:11.780
got triggered in the

00:27:11.780 --> 00:27:12.100
detector.

00:27:12.100 --> 00:27:12.920
That's a variable length

00:27:12.920 --> 00:27:13.200
list.

00:27:13.200 --> 00:27:14.320
How many tracks are in

00:27:14.320 --> 00:27:14.620
an event?

00:27:14.620 --> 00:27:15.320
You know, that's a

00:27:15.320 --> 00:27:15.980
variable length list

00:27:15.980 --> 00:27:17.460
and can be a variable

00:27:17.460 --> 00:27:18.040
length list of

00:27:18.040 --> 00:27:18.580
structured data.

00:27:18.580 --> 00:27:20.180
And to store that

00:27:20.180 --> 00:27:21.220
compactly the same way

00:27:21.220 --> 00:27:23.600
you'd use NumPy was

00:27:23.600 --> 00:27:25.260
one thing, but you can

00:27:25.260 --> 00:27:26.420
Arrow and there's some

00:27:26.420 --> 00:27:27.100
other, there's some

00:27:27.100 --> 00:27:27.640
other things that do

00:27:27.640 --> 00:27:28.740
this, but Awkward Array

00:27:28.740 --> 00:27:30.640
also gives you NumPy

00:27:30.640 --> 00:27:33.160
like indexing and data

00:27:33.160 --> 00:27:33.740
manipulation.

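NOTE
Editor's sketch of the jagged-array idea described above: variable-length lists
stored compactly as one flat buffer plus row offsets. This is a plain-Python
illustration only, not Awkward Array's implementation or API; the class name
JaggedArray and its methods are hypothetical.
```python
# Hypothetical sketch: a jagged array as flat content plus row offsets.
from typing import List
class JaggedArray:
    """Variable-length rows stored as one flat list plus offsets."""
    def __init__(self, lists: List[List[float]]) -> None:
        self.content: List[float] = []
        self.offsets: List[int] = [0]
        for row in lists:
            self.content.extend(row)
            self.offsets.append(len(self.content))
    def __len__(self) -> int:
        # Number of rows (for example, events).
        return len(self.offsets) - 1
    def __getitem__(self, i: int) -> List[float]:
        # Row i (non-negative) is the slice between two neighbouring offsets.
        return self.content[self.offsets[i]:self.offsets[i + 1]]
    def counts(self) -> List[int]:
        # Entries per row, for example hits per event.
        return [b - a for a, b in zip(self.offsets, self.offsets[1:])]
hits = JaggedArray([[1.1, 2.2, 3.3], [], [4.4, 5.5]])
print(hits.counts())
print(hits[2])
```
The rows have lengths 3, 0 and 2, yet only two flat lists are stored.
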
00:27:33.740 --> 00:27:35.100
And that was the sort

00:27:35.100 --> 00:27:36.040
of the breakthrough

00:27:36.040 --> 00:27:36.740
thing here.

00:27:36.740 --> 00:27:38.340
It's like NumPy.

00:27:38.340 --> 00:27:39.120
The original one was

00:27:39.120 --> 00:27:40.060
built on top of NumPy.

00:27:40.060 --> 00:27:41.640
The new one actually

00:27:41.640 --> 00:27:43.240
has some pybind11

00:27:43.240 --> 00:27:44.680
compiled bits and pieces,

00:27:44.680 --> 00:27:47.120
but it makes working

00:27:47.120 --> 00:27:47.960
with that really well.

00:27:47.960 --> 00:27:48.780
In fact, Jim Pivarski

00:27:48.780 --> 00:27:50.580
has now got a grant

00:27:50.580 --> 00:27:52.300
to expand this to,

00:27:52.300 --> 00:27:53.280
I don't remember the

00:27:53.280 --> 00:27:54.180
number of different

00:27:54.180 --> 00:27:55.160
disciplines that he's

00:27:55.160 --> 00:27:56.240
working with, but lots

00:27:56.240 --> 00:27:57.000
of different areas,

00:27:57.140 --> 00:27:58.300
genomics and things

00:27:58.300 --> 00:27:59.160
like that have all

00:27:59.160 --> 00:28:00.720
use cases and he's

00:28:00.720 --> 00:28:02.040
adding things like

00:28:02.040 --> 00:28:03.080
complex numbers and

00:28:03.080 --> 00:28:03.560
things that weren't

00:28:03.560 --> 00:28:04.940
originally needed by

00:28:04.940 --> 00:28:05.860
energy physicists, but

00:28:05.860 --> 00:28:07.000
make it widely.

00:28:07.000 --> 00:28:08.920
Almost an evangelism,

00:28:08.920 --> 00:28:10.360
like dev evangelism

00:28:10.360 --> 00:28:11.860
type of role, right?

00:28:11.860 --> 00:28:13.040
Go talk to the other

00:28:13.040 --> 00:28:14.360
groups and say, hey,

00:28:14.360 --> 00:28:15.600
we think you should be

00:28:15.600 --> 00:28:16.140
using this.

00:28:16.140 --> 00:28:17.280
What is it missing for

00:28:17.280 --> 00:28:18.100
you to really love it?

00:28:18.100 --> 00:28:18.780
Something like that,

00:28:18.780 --> 00:28:19.000
right?

00:28:19.000 --> 00:28:20.140
How interesting.

00:28:20.140 --> 00:28:20.700
Yeah.

00:28:20.700 --> 00:28:22.440
So, yeah.

00:28:22.440 --> 00:28:22.960
Yeah.

00:28:22.960 --> 00:28:23.640
So looking at the

00:28:23.640 --> 00:28:25.080
Awkward Array page

00:28:25.080 --> 00:28:26.840
here says for a

00:28:26.840 --> 00:28:27.500
similar problem,

00:28:27.500 --> 00:28:28.600
10 million times

00:28:28.600 --> 00:28:29.340
larger than this

00:28:29.340 --> 00:28:30.260
example given above,

00:28:30.260 --> 00:28:31.840
which one above is

00:28:31.840 --> 00:28:32.620
not totally simple.

00:28:32.620 --> 00:28:33.400
So that's pretty

00:28:33.400 --> 00:28:33.680
crazy.

00:28:33.680 --> 00:28:34.860
It says Awkward

00:28:34.860 --> 00:28:36.760
Array, the one

00:28:36.760 --> 00:28:38.220
liner takes 4.6

00:28:38.220 --> 00:28:39.340
seconds to run and

00:28:39.340 --> 00:28:40.240
uses 2 gigs of

00:28:40.240 --> 00:28:40.500
memory.

00:28:40.500 --> 00:28:41.360
The equivalent

00:28:41.360 --> 00:28:42.340
Python lists and

00:28:42.340 --> 00:28:43.900
dictionaries takes over

00:28:43.900 --> 00:28:45.600
two minutes and uses

00:28:45.600 --> 00:28:46.800
10 times as much

00:28:46.800 --> 00:28:47.900
memory, 22 gigs.

00:28:47.900 --> 00:28:48.880
So, yeah, that's a

00:28:48.880 --> 00:28:50.300
pretty appealing value

00:28:50.300 --> 00:28:50.960
proposition there.

00:28:50.960 --> 00:28:51.400
Yeah.

00:28:51.400 --> 00:28:52.800
And it supports

00:28:52.800 --> 00:28:53.200
Numba.

00:28:53.200 --> 00:28:55.880
Jim works very closely

00:28:55.880 --> 00:28:56.520
with the Numba team

00:28:56.520 --> 00:28:58.040
and really is one of

00:28:58.040 --> 00:28:58.680
the experts on the

00:28:58.680 --> 00:28:59.360
Numba internals.

00:28:59.360 --> 00:29:01.720
So, yeah, it has full

00:29:01.720 --> 00:29:02.660
Numba support now and

00:29:02.660 --> 00:29:03.560
he's working on adding

00:29:03.560 --> 00:29:04.060
Dask.

00:29:04.060 --> 00:29:05.000
He's working with

00:29:05.000 --> 00:29:06.360
Anaconda on this grant

00:29:06.360 --> 00:29:08.360
and then working with

00:29:08.360 --> 00:29:09.920
adding GPU support.

00:29:09.920 --> 00:29:10.620
Very cool.

00:29:10.620 --> 00:29:11.840
Maybe not everyone out

00:29:11.840 --> 00:29:12.400
there knows what

00:29:12.400 --> 00:29:12.880
Numba is.

00:29:12.880 --> 00:29:14.640
Maybe give us a quick

00:29:14.640 --> 00:29:15.500
elevator pitch on

00:29:15.500 --> 00:29:15.720
Numba.

00:29:15.720 --> 00:29:15.820
Yeah.

00:29:15.820 --> 00:29:16.840
I hear it makes

00:29:16.840 --> 00:29:17.680
Python code fast,

00:29:17.680 --> 00:29:17.900
right?

00:29:18.300 --> 00:29:20.660
Yeah, it's a just-in-time

00:29:20.660 --> 00:29:23.020
compiler and it takes

00:29:23.020 --> 00:29:23.540
Python.

00:29:23.540 --> 00:29:24.600
It takes Python.

00:29:24.600 --> 00:29:25.360
It actually takes the

00:29:25.360 --> 00:29:27.660
bytecode and then it

00:29:27.660 --> 00:29:30.240
basically takes that

00:29:30.240 --> 00:29:31.780
back to something or it

00:29:31.780 --> 00:29:32.760
parses the bytecode

00:29:32.760 --> 00:29:34.420
and turns it into LLVM.

00:29:34.420 --> 00:29:35.640
So it works a lot like

00:29:35.640 --> 00:29:37.600
Julia except instead of

00:29:37.600 --> 00:29:38.400
a new language, it's

00:29:38.400 --> 00:29:39.400
actually reading Python

00:29:39.400 --> 00:29:40.680
bytecode, which is

00:29:40.680 --> 00:29:41.880
challenging because the

00:29:41.880 --> 00:29:43.300
Python bytecode is not

00:29:43.300 --> 00:29:44.180
something that stays

00:29:44.180 --> 00:29:45.420
static or is supposed to

00:29:45.420 --> 00:29:47.340
be a public detail.

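NOTE
Editor's sketch: the bytecode mentioned above can be inspected with the
standard-library dis module. This only lists CPython opcode names for a small
function; it is not Numba's pipeline, and the function add_squares is a
hypothetical example. The exact opcode names differ between Python versions,
which is the instability being discussed.
```python
import dis
def add_squares(a, b):
    # A tiny numeric function of the kind a JIT compiler might target.
    return a * a + b * b
# Collect the opcode names CPython generated for this function.
ops = [ins.opname for ins in dis.get_instructions(add_squares)]
print(ops)
```
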
00:29:47.340 --> 00:29:48.560
Yeah, there's no

00:29:48.560 --> 00:29:50.360
public promises about

00:29:50.360 --> 00:29:52.440
consistency of bytecode

00:29:52.440 --> 00:29:54.720
across versions because

00:29:54.720 --> 00:29:55.540
they play with that all

00:29:55.540 --> 00:29:56.540
the time to try to

00:29:56.540 --> 00:29:57.700
speed up things and

00:29:57.700 --> 00:29:58.760
they add bytecodes and

00:29:58.760 --> 00:29:59.520
they try to do little

00:29:59.520 --> 00:30:00.180
optimizations.

00:30:00.180 --> 00:30:01.900
Yeah, so every Python

00:30:01.900 --> 00:30:02.660
release breaks Numba.

00:30:02.660 --> 00:30:04.020
So they have to, they

00:30:04.020 --> 00:30:04.800
just know the next

00:30:04.800 --> 00:30:05.740
Python release will not

00:30:05.740 --> 00:30:06.460
support Numba and it

00:30:06.460 --> 00:30:07.620
usually takes a month

00:30:07.620 --> 00:30:07.940
or two.

00:30:07.940 --> 00:30:10.420
But it's very

00:30:10.420 --> 00:30:11.080
impressive though.

00:30:11.080 --> 00:30:12.160
It's the speedups,

00:30:12.160 --> 00:30:13.840
you do get full sort of

00:30:13.840 --> 00:30:15.660
C type speedups for

00:30:15.660 --> 00:30:16.100
something that looks

00:30:16.100 --> 00:30:16.900
just like Python.

00:30:16.900 --> 00:30:18.660
It compiles really fast

00:30:18.660 --> 00:30:20.460
for a small problem and

00:30:20.460 --> 00:30:22.500
it's as fast as anything

00:30:22.500 --> 00:30:23.220
else you can do.

00:30:23.220 --> 00:30:25.820
I've tried lots of

00:30:25.820 --> 00:30:26.900
these various

00:30:26.900 --> 00:30:28.000
programming problems and

00:30:28.000 --> 00:30:29.340
you just about can't

00:30:29.340 --> 00:30:29.720
beat Numba.

00:30:29.720 --> 00:30:30.680
It actually knows what

00:30:30.680 --> 00:30:32.040
your architecture is

00:30:32.040 --> 00:30:32.840
since it's just-in-time

00:30:32.840 --> 00:30:33.260
compiling.

00:30:33.260 --> 00:30:34.620
So you have to do

00:30:34.620 --> 00:30:35.320
which is an advantage

00:30:35.320 --> 00:30:37.020
over say like C, right?

00:30:37.020 --> 00:30:38.640
It can look exactly at

00:30:38.640 --> 00:30:40.200
what your platform is and

00:30:40.200 --> 00:30:41.060
your machine architecture

00:30:41.060 --> 00:30:41.640
and say we're going to

00:30:41.640 --> 00:30:43.080
target, you know, I see

00:30:43.080 --> 00:30:44.760
your CPU supports this

00:30:44.760 --> 00:30:46.220
special vectorized thing

00:30:46.220 --> 00:30:46.860
or whatever and it's

00:30:46.860 --> 00:30:47.580
going to build that in,

00:30:47.580 --> 00:30:47.760
right?

00:30:47.760 --> 00:30:49.140
and then what sort of Jim

00:30:49.140 --> 00:30:50.060
does with Awkward and

00:30:50.060 --> 00:30:50.800
we've done with some

00:30:50.800 --> 00:30:51.960
other things with Vector

00:30:51.960 --> 00:30:52.600
does this too.

00:30:52.600 --> 00:30:55.360
You can control what

00:30:55.360 --> 00:30:57.260
Python turns into what

00:30:57.260 --> 00:30:58.820
LLVM constructs any

00:30:58.820 --> 00:31:00.080
Python turns into because

00:31:00.080 --> 00:31:00.920
you can control that

00:31:00.920 --> 00:31:02.240
compile phase.

00:31:02.240 --> 00:31:03.220
That's incredibly

00:31:03.220 --> 00:31:04.620
powerful because you can

00:31:04.620 --> 00:31:05.640
say and it doesn't have

00:31:05.640 --> 00:31:06.700
to be the same thing but

00:31:06.700 --> 00:31:08.100
obviously you want it to

00:31:08.100 --> 00:31:08.820
behave the same way.

00:31:08.820 --> 00:31:09.760
You can say if you see

00:31:09.760 --> 00:31:11.840
this structure, this is

00:31:11.840 --> 00:31:12.520
what it turns into.

00:31:12.900 --> 00:31:16.160
in LLVM machine code

00:31:16.160 --> 00:31:17.160
which then gets compiled

00:31:17.160 --> 00:31:18.280
or machine language

00:31:18.280 --> 00:31:19.040
which then gets compiled

00:31:19.040 --> 00:31:20.420
into your native machine

00:31:20.420 --> 00:31:21.140
language.

00:31:21.140 --> 00:31:21.820
Interesting.

00:31:21.820 --> 00:31:22.320
Assembly.

00:31:22.320 --> 00:31:23.180
So if you have like a

00:31:23.180 --> 00:31:24.080
certain data structure

00:31:24.080 --> 00:31:25.060
that you know can be

00:31:25.060 --> 00:31:27.500
well represented or gets

00:31:27.500 --> 00:31:28.400
packed up in a certain

00:31:28.400 --> 00:31:29.440
way to be super efficient

00:31:29.440 --> 00:31:30.580
you can control that?

00:31:30.580 --> 00:31:32.400
Yeah, you can say that

00:31:32.400 --> 00:31:33.320
well this like this

00:31:33.320 --> 00:31:34.220
operation on this data

00:31:34.220 --> 00:31:35.100
structure, this is what

00:31:35.100 --> 00:31:36.280
this is what it should do

00:31:36.280 --> 00:31:38.000
and then that turns into

00:31:38.000 --> 00:31:38.940
LLVM and maybe it can

00:31:38.940 --> 00:31:40.480
get vectorized or things

00:31:40.480 --> 00:31:41.500
like that for you.

00:31:41.500 --> 00:31:42.440
Yeah, yeah.

00:31:42.700 --> 00:31:43.400
That's super neat.

00:31:43.400 --> 00:31:44.800
Another package in the

00:31:44.800 --> 00:31:45.860
list that I got to talk

00:31:45.860 --> 00:31:46.800
about because just the

00:31:46.800 --> 00:31:47.960
name and the graphic is

00:31:47.960 --> 00:31:49.880
fantastic is Aghast.

00:31:49.880 --> 00:31:51.120
What is Aghast?

00:31:51.120 --> 00:31:51.760
It's got like this

00:31:51.760 --> 00:31:53.200
The Scream.

00:31:53.200 --> 00:31:54.740
I forgot who was the

00:31:54.740 --> 00:31:55.880
artist of that but the

00:31:55.880 --> 00:31:58.100
Scream sort of look as

00:31:58.100 --> 00:31:59.160
part of the logo is good.

00:31:59.160 --> 00:32:01.420
About half of the logos

00:32:01.420 --> 00:32:02.760
come from Jim and I did

00:32:02.760 --> 00:32:04.100
about half and he did

00:32:04.100 --> 00:32:04.940
about half and then

00:32:04.940 --> 00:32:06.480
use other around or from

00:32:06.480 --> 00:32:07.160
the individual package

00:32:07.160 --> 00:32:07.480
authors.

00:32:07.480 --> 00:32:09.660
Aghast was, so this is

00:32:09.660 --> 00:32:10.200
sort of part of the

00:32:10.200 --> 00:32:11.440
histogramming area which

00:32:11.440 --> 00:32:12.240
is where sort of the

00:32:12.240 --> 00:32:13.100
area I work in,

00:32:13.100 --> 00:32:13.720
Scikit-HEP.

00:32:13.720 --> 00:32:15.280
But Jim actually wrote

00:32:15.280 --> 00:32:16.320
Aghast and the idea

00:32:16.320 --> 00:32:16.860
was that it would

00:32:16.860 --> 00:32:17.800
convert between

00:32:17.800 --> 00:32:19.040
histogram representations.

00:32:19.040 --> 00:32:20.260
I think it came up

00:32:20.260 --> 00:32:21.680
because Jim got tired of

00:32:21.680 --> 00:32:22.700
writing histogram libraries.

00:32:22.700 --> 00:32:23.800
I think he's written at

00:32:23.800 --> 00:32:24.280
least five.

00:32:24.280 --> 00:32:25.720
Yeah, one of the things

00:32:25.720 --> 00:32:27.960
I got the sense of by

00:32:27.960 --> 00:32:28.680
looking through all the

00:32:28.680 --> 00:32:29.480
Scikit-HEP stuff,

00:32:29.480 --> 00:32:30.340
there's a lot of

00:32:30.340 --> 00:32:31.280
histogram stuff happening

00:32:31.280 --> 00:32:31.740
over there.

00:32:32.800 --> 00:32:34.640
histograms are sort of the

00:32:34.640 --> 00:32:35.840
area that I was in and it

00:32:35.840 --> 00:32:36.920
ended up coming in in

00:32:36.920 --> 00:32:37.640
several pieces.

00:32:37.640 --> 00:32:39.060
But I think one of the

00:32:39.060 --> 00:32:40.080
important things was

00:32:40.080 --> 00:32:40.880
actually, and I think

00:32:40.880 --> 00:32:41.820
Aghast may not really

00:32:41.820 --> 00:32:43.300
matter, it may get

00:32:43.300 --> 00:32:44.240
archived at some point

00:32:44.240 --> 00:32:45.840
because instead of

00:32:45.840 --> 00:32:47.500
translating between

00:32:47.500 --> 00:32:49.160
different representations of

00:32:49.160 --> 00:32:50.460
histograms in memory,

00:32:50.460 --> 00:32:52.080
what you can do is define

00:32:52.080 --> 00:32:53.600
a static typing protocol

00:32:53.600 --> 00:32:56.460
and it can be checked by

00:32:56.460 --> 00:32:57.920
mypy that describes

00:32:57.920 --> 00:33:01.060
what an object needs

00:33:01.060 --> 00:33:01.660
to be called a

00:33:01.660 --> 00:33:01.980
histogram.

00:33:01.980 --> 00:33:03.300
And so I've defined that

00:33:03.300 --> 00:33:04.020
as a package called

00:33:04.020 --> 00:33:05.500
UHI, Universal Histogram

00:33:05.500 --> 00:33:06.200
Interface.

00:33:06.200 --> 00:33:07.380
And anything that

00:33:07.380 --> 00:33:08.600
implements UHI, it can

00:33:08.600 --> 00:33:09.860
be fully checked by

00:33:09.860 --> 00:33:11.960
mypy, will then be able

00:33:11.960 --> 00:33:14.040
to take any object

00:33:14.040 --> 00:33:16.060
from any library that

00:33:16.060 --> 00:33:17.040
implements UHI.

00:33:17.040 --> 00:33:19.620
And so all the libraries

00:33:19.620 --> 00:33:20.320
we have that produce

00:33:20.320 --> 00:33:21.260
histograms, so Uproot,

00:33:21.260 --> 00:33:22.140
when it reads a ROOT

00:33:22.140 --> 00:33:24.240
histogram, or Hist and

00:33:24.240 --> 00:33:25.400
boost-histogram, when they

00:33:25.400 --> 00:33:26.880
produce histograms, they

00:33:26.880 --> 00:33:27.680
don't need to depend on

00:33:27.680 --> 00:33:28.060
each other.

00:33:28.060 --> 00:33:28.900
They don't even depend on

00:33:28.900 --> 00:33:29.840
UHI, that's just a

00:33:29.840 --> 00:33:31.240
static dependency for

00:33:31.240 --> 00:33:32.360
mypy time.

00:33:32.360 --> 00:33:34.140
And then they can be

00:33:34.140 --> 00:33:37.220
plotted in mplhep or

00:33:37.220 --> 00:33:38.560
they can be printed to

00:33:38.560 --> 00:33:39.380
the terminal with

00:33:39.380 --> 00:33:40.700
histoprint and there's

00:33:40.700 --> 00:33:42.460
no dependencies there.

00:33:42.460 --> 00:33:43.420
One doesn't need the

00:33:43.420 --> 00:33:43.620
other.

00:33:43.620 --> 00:33:44.620
And that's sort of

00:33:44.620 --> 00:33:46.040
making Aghast somewhat

00:33:46.040 --> 00:33:47.780
unneeded because now

00:33:47.780 --> 00:33:48.560
it really doesn't matter.

00:33:48.560 --> 00:33:49.120
You don't have to

00:33:49.120 --> 00:33:49.900
convert between two

00:33:49.900 --> 00:33:51.400
because they both just

00:33:51.400 --> 00:33:51.680
work.

00:33:51.680 --> 00:33:52.820
They work on the same

00:33:52.820 --> 00:33:54.380
underlying structure

00:33:54.380 --> 00:33:54.960
basically, right?

00:33:54.960 --> 00:33:56.740
They work through the

00:33:56.740 --> 00:33:57.660
same interface.

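NOTE
Editor's sketch of the static typing protocol idea described above, using
typing.Protocol from the standard library. The names HistogramLike,
ListHistogram and render, and the two methods shown, are hypothetical and much
smaller than the real UHI protocol. The point is that the producer and the
consumer never import each other.
```python
from typing import Protocol, Sequence, runtime_checkable
@runtime_checkable
class HistogramLike(Protocol):
    # Hypothetical minimal interface; the real UHI protocol is richer.
    def values(self) -> Sequence[float]: ...
    def edges(self) -> Sequence[float]: ...
class ListHistogram:
    """A producer that never subclasses HistogramLike; it just has the methods."""
    def __init__(self, edges, values):
        self._edges, self._values = list(edges), list(values)
    def values(self):
        return self._values
    def edges(self):
        return self._edges
def render(h: HistogramLike) -> str:
    # A consumer that relies only on the protocol, not on a concrete class.
    rows = []
    for lo, hi, v in zip(h.edges(), h.edges()[1:], h.values()):
        rows.append(f"[{lo}, {hi}) " + "#" * int(v))
    return "\n".join(rows)
h = ListHistogram([0, 1, 2], [2, 3])
print(render(h))
```
A type checker such as mypy can verify that ListHistogram satisfies
HistogramLike structurally, with no runtime dependency between the two.
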
00:33:57.660 --> 00:33:58.280
Right.

00:33:58.280 --> 00:33:58.800
Yeah.

00:33:58.800 --> 00:34:00.560
So Aghast is a way to

00:34:00.560 --> 00:34:01.460
work with different

00:34:01.460 --> 00:34:02.920
histogramming libraries

00:34:02.920 --> 00:34:04.920
that kind of is the

00:34:04.920 --> 00:34:06.460
intermediary of that.

00:34:06.460 --> 00:34:07.800
It's like an abstraction

00:34:07.800 --> 00:34:08.360
layer on that.

00:34:08.360 --> 00:34:08.800
Okay.

00:34:08.800 --> 00:34:10.360
What are some other ones?

00:34:10.360 --> 00:34:11.440
Yeah.

00:34:11.440 --> 00:34:12.520
What are some other ones

00:34:12.520 --> 00:34:13.720
we should kind of give a

00:34:13.720 --> 00:34:14.040
shout out to?

00:34:14.040 --> 00:34:15.800
We talked about GooFit, which

00:34:15.800 --> 00:34:16.720
is there.

00:34:16.720 --> 00:34:17.860
It's an affiliated package.

00:34:17.860 --> 00:34:18.440
It's not part of

00:34:18.440 --> 00:34:19.440
scikit-hep, but it has.

00:34:19.440 --> 00:34:21.680
So we developed this idea

00:34:21.680 --> 00:34:22.740
of an affiliated package

00:34:22.740 --> 00:34:23.560
for sure things that

00:34:23.560 --> 00:34:24.300
didn't need to be moved

00:34:24.300 --> 00:34:26.940
in, but had at least one

00:34:26.940 --> 00:34:27.900
scikit-hep developer

00:34:27.900 --> 00:34:30.200
working with them.

00:34:30.200 --> 00:34:31.360
At least that's my

00:34:31.360 --> 00:34:31.760
definition.

00:34:31.760 --> 00:34:32.520
I was never able to

00:34:32.520 --> 00:34:33.660
actually get the rest to

00:34:33.660 --> 00:34:34.660
agree to exactly that

00:34:34.660 --> 00:34:36.200
definition, but that's my

00:34:36.200 --> 00:34:36.820
working definition.

00:34:37.380 --> 00:34:38.120
And so that's why

00:34:38.120 --> 00:34:39.200
pybind11 gets listed

00:34:39.200 --> 00:34:39.460
there.

00:34:39.460 --> 00:34:41.920
It's an affiliated package

00:34:41.920 --> 00:34:42.780
because we share a

00:34:42.780 --> 00:34:44.520
developer, me, with the

00:34:44.520 --> 00:34:45.480
pybind11 library.

00:34:45.480 --> 00:34:47.980
And we sort of have a

00:34:47.980 --> 00:34:49.620
say in how that is

00:34:49.620 --> 00:34:50.140
developed.

00:34:50.140 --> 00:34:52.000
And most importantly, if

00:34:52.000 --> 00:34:52.840
we have somebody come

00:34:52.840 --> 00:34:53.820
into scikit-hep, we want

00:34:53.820 --> 00:34:54.720
them to use pybind11

00:34:54.720 --> 00:34:55.860
over the other tools

00:34:55.860 --> 00:34:57.820
because that one we have

00:34:57.820 --> 00:34:58.760
a lot of experience with.

00:34:58.760 --> 00:34:59.400
Very cool.

00:34:59.400 --> 00:35:00.720
Another one I thought

00:35:00.720 --> 00:35:01.380
was interesting is

00:35:01.380 --> 00:35:02.200
hepunits.

00:35:02.200 --> 00:35:03.860
So this idea of

00:35:03.860 --> 00:35:05.380
representing units like

00:35:05.380 --> 00:35:06.740
the standard units,

00:35:07.020 --> 00:35:08.360
they're not enough for

00:35:08.360 --> 00:35:08.520
us.

00:35:08.520 --> 00:35:09.920
We have our own kind of

00:35:09.920 --> 00:35:12.420
things like molarity and

00:35:12.420 --> 00:35:14.840
stuff, but also luminosity

00:35:14.840 --> 00:35:16.060
and other stuff, right?

00:35:16.060 --> 00:35:16.620
Yeah.

00:35:16.620 --> 00:35:17.640
Different experiments

00:35:17.640 --> 00:35:20.180
can differ a bit.

00:35:20.180 --> 00:35:21.300
So there's a sort of a

00:35:21.300 --> 00:35:22.320
standard that got built up

00:35:22.320 --> 00:35:23.400
for units.

00:35:23.400 --> 00:35:25.280
And so this just sort of

00:35:25.280 --> 00:35:26.500
puts that together and

00:35:26.500 --> 00:35:29.660
has, and the unit that

00:35:29.660 --> 00:35:30.500
we've sort of decided on

00:35:30.500 --> 00:35:31.160
this should be the

00:35:31.160 --> 00:35:32.440
standard unit that's one

00:35:32.440 --> 00:35:32.920
and the rest are

00:35:32.920 --> 00:35:33.640
different scalers.

00:35:33.640 --> 00:35:34.800
It's a very tiny little

00:35:34.800 --> 00:35:35.140
library.

00:35:35.140 --> 00:35:36.340
It was the first one to be

00:35:36.340 --> 00:35:36.660
fully statically

00:35:36.660 --> 00:35:38.080
typed because it was

00:35:38.080 --> 00:35:38.340
tiny.

00:35:38.340 --> 00:35:39.660
That's easy to do.

00:35:39.660 --> 00:35:40.680
It was like, because

00:35:40.680 --> 00:35:41.900
mypy infers constants,

00:35:41.900 --> 00:35:43.100
there was like two

00:35:43.100 --> 00:35:43.860
functions or something

00:35:43.860 --> 00:35:44.520
and then it was done.

00:35:44.520 --> 00:35:45.060
Yeah.

00:35:45.060 --> 00:35:46.620
Probably a lot of floats.

00:35:46.620 --> 00:35:47.520
Mm-hmm.

00:35:47.520 --> 00:35:49.500
So, but, and that's,

00:35:49.500 --> 00:35:50.540
that's sort of what it is.

00:35:50.540 --> 00:35:51.540
So you can use that if,

00:35:51.540 --> 00:35:52.640
and the idea is that the

00:35:52.640 --> 00:35:53.560
rest of the libraries

00:35:53.560 --> 00:35:55.580
will, will adhere to

00:35:55.580 --> 00:35:57.260
that system of units.

00:35:57.260 --> 00:35:58.440
So then if you use this

00:35:58.440 --> 00:35:59.540
and then use that, the

00:35:59.540 --> 00:36:00.440
values it gives you,

00:36:00.440 --> 00:36:02.960
then you can have a nice

00:36:02.960 --> 00:36:03.960
human, human readable

00:36:03.960 --> 00:36:05.080
units and be sure of your

00:36:05.080 --> 00:36:05.380
units.

00:36:05.720 --> 00:36:05.940
Yeah.

00:36:05.940 --> 00:36:06.960
That's really neat.

00:36:06.960 --> 00:36:08.140
Have you heard of pint?

00:36:08.140 --> 00:36:09.360
Are you familiar with this

00:36:09.360 --> 00:36:09.500
one?

00:36:09.500 --> 00:36:10.420
Yes, I love pint.

00:36:10.420 --> 00:36:11.780
Oh gosh, I think

00:36:11.780 --> 00:36:13.260
pint is interesting as well.

00:36:13.260 --> 00:36:14.500
It takes the types

00:36:14.500 --> 00:36:15.780
through and I use pint

00:36:15.780 --> 00:36:17.640
some, but it actually

00:36:17.640 --> 00:36:19.260
gives you a quantity out

00:36:19.260 --> 00:36:21.580
or a NumPy quantity.

00:36:21.580 --> 00:36:23.760
Whereas hepunits just

00:36:23.760 --> 00:36:24.640
stays out of the way and

00:36:24.640 --> 00:36:26.280
it's a way to be more

00:36:26.280 --> 00:36:27.240
clear in your code, but it's

00:36:27.240 --> 00:36:27.960
not enforced.

00:36:27.960 --> 00:36:28.960
Pint is enforced, which I

00:36:28.960 --> 00:36:30.700
like enforcing, but it's

00:36:30.700 --> 00:36:31.760
also can slow down.

00:36:31.760 --> 00:36:32.780
You can't, these are not

00:36:32.780 --> 00:36:33.660
actual real numbers

00:36:33.660 --> 00:36:34.000
anymore.

00:36:34.000 --> 00:36:35.140
So you pay it.

00:36:35.140 --> 00:36:35.300
Yeah.

00:36:35.300 --> 00:36:36.180
So it's going to add a ton

00:36:36.180 --> 00:36:36.960
of overhead, right?

00:36:36.960 --> 00:36:37.760
But pint's interesting

00:36:37.760 --> 00:36:38.500
because you can do things

00:36:38.500 --> 00:36:40.520
like three times meter

00:36:40.520 --> 00:36:41.940
plus four times centimeter

00:36:41.940 --> 00:36:43.320
and you end up with

00:36:43.320 --> 00:36:44.440
3.04 meters.

00:36:44.440 --> 00:36:45.020
Yeah.

00:36:45.020 --> 00:36:45.900
Those are actually real

00:36:45.900 --> 00:36:46.340
quantities.

00:36:46.340 --> 00:36:47.180
They're actually a different

00:36:47.180 --> 00:36:49.300
object, which is the good

00:36:49.300 --> 00:36:50.120
thing about it, but it's

00:36:50.120 --> 00:36:50.920
also the reason that then

00:36:50.920 --> 00:36:51.820
it's not going to talk to

00:36:51.820 --> 00:36:52.800
say a C library that

00:36:52.800 --> 00:36:54.340
expects a regular number

00:36:54.340 --> 00:36:55.340
or something as well.

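NOTE
Editor's sketch of the "units that stay out of the way" approach described
above: each unit is just a float scale factor relative to a base unit, so
results remain ordinary numbers that a C library could accept. The constants
below are hypothetical and defined locally for illustration (assuming the
millimeter as the base unit); they are not imported from hepunits, and nothing
here is enforced the way Pint quantities are.
```python
# Hypothetical constants for illustration; the base unit has the value 1.
millimeter = 1.0
centimeter = 10.0 * millimeter
meter = 1000.0 * millimeter
# Multiply by a unit to enter a value, divide by a unit to read it back.
length = 3 * meter + 4 * centimeter   # a plain float, in base units
length_in_meters = length / meter
print(length_in_meters)
```
Because length is a plain float, nothing stops you from adding a length to a
time by mistake; that is the trade-off against an enforced system like Pint.
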
00:36:55.560 --> 00:36:55.800
Sure.

00:36:55.800 --> 00:36:56.440
Okay.

00:36:56.440 --> 00:36:58.240
Maybe one or two more and

00:36:58.240 --> 00:36:59.520
then we'll probably be out

00:36:59.520 --> 00:37:00.180
of time for these.

00:37:00.180 --> 00:37:01.180
What else should people

00:37:01.180 --> 00:37:02.380
maybe pay attention to that

00:37:02.380 --> 00:37:03.420
they could generally find

00:37:03.420 --> 00:37:04.020
useful over here?

00:37:04.020 --> 00:37:05.020
You mentioned Vector.

00:37:05.020 --> 00:37:05.680
It's a little bit newer,

00:37:05.680 --> 00:37:07.620
but it's certainly for

00:37:07.620 --> 00:37:08.460
general physics.

00:37:08.460 --> 00:37:10.880
I think it's useful because

00:37:10.880 --> 00:37:12.720
it's a library for 2D,

00:37:12.720 --> 00:37:14.740
3D and relativistic

00:37:14.740 --> 00:37:15.280
vectors.

00:37:15.280 --> 00:37:16.580
And there aren't really,

00:37:16.580 --> 00:37:19.000
it's a very common sort of

00:37:19.000 --> 00:37:20.160
learning example you see,

00:37:20.160 --> 00:37:21.080
but there aren't really

00:37:21.080 --> 00:37:22.300
very many libraries that do

00:37:22.300 --> 00:37:23.720
this, that actually have,

00:37:23.720 --> 00:37:24.900
if you want to take the

00:37:24.900 --> 00:37:27.360
magnitude of a vector in 3D

00:37:27.360 --> 00:37:28.820
space, there just isn't a

00:37:28.820 --> 00:37:29.720
nice library for that.

00:37:29.720 --> 00:37:31.060
So we wrote Vector to do

00:37:31.060 --> 00:37:31.280
that.

00:37:31.280 --> 00:37:33.840
And Vector is supported by

00:37:33.840 --> 00:37:34.180
Awkward.

00:37:34.180 --> 00:37:35.680
It has an Awkward backend.

00:37:35.680 --> 00:37:36.960
It has a Numba backend,

00:37:36.960 --> 00:37:38.600
a NumPy backend, and then

00:37:38.600 --> 00:37:39.560
plain object backend.

00:37:39.560 --> 00:37:40.860
Eventually we might work on

00:37:40.860 --> 00:37:41.060
more.

00:37:41.060 --> 00:37:42.120
And it even has a Numba

00:37:42.120 --> 00:37:42.420
Awkward backend.

00:37:42.420 --> 00:37:44.180
So you can use

00:37:44.180 --> 00:37:46.520
a vector inside an Awkward

00:37:46.520 --> 00:37:48.520
array inside a Numba

00:37:48.520 --> 00:37:50.320
JIT-compiled loop and still take

00:37:50.320 --> 00:37:51.400
magnitudes and do stuff like

00:37:51.400 --> 00:37:51.600
that.

00:37:51.600 --> 00:37:52.560
That's really cool.

00:37:52.560 --> 00:37:53.660
That integration there.

00:37:53.660 --> 00:37:54.260
Yeah.

00:37:54.260 --> 00:37:55.560
vectors because we have a lot

00:37:55.560 --> 00:37:56.240
of those in physics.

00:37:56.240 --> 00:37:56.840
Sure.

00:37:56.840 --> 00:37:58.740
And you can, you can do things

00:37:58.740 --> 00:38:00.300
like ask if one vector is

00:38:00.300 --> 00:38:01.720
close to another vector and

00:38:01.720 --> 00:38:03.020
things like that, even in

00:38:03.020 --> 00:38:04.500
different coordinate systems, it looks like:

00:38:04.500 --> 00:38:05.840
one in polar coordinates and

00:38:05.840 --> 00:38:07.080
one in, you know,

00:38:07.080 --> 00:38:08.040
Cartesian or something like

00:38:08.040 --> 00:38:08.260
that.

00:38:08.260 --> 00:38:09.860
It has different unit systems

00:38:09.860 --> 00:38:11.860
and it can actually, it

00:38:11.860 --> 00:38:12.960
actually stores the vector in

00:38:12.960 --> 00:38:13.140
that.

00:38:13.140 --> 00:38:14.840
So you don't waste memory or

00:38:14.840 --> 00:38:15.060
something.

00:38:15.060 --> 00:38:15.760
If that's, that's the

00:38:15.760 --> 00:38:17.140
representation you have, that

00:38:17.140 --> 00:38:18.420
was a feature from ROOT

00:38:18.420 --> 00:38:19.420
that we wanted to make sure we,

00:38:19.420 --> 00:38:20.020
we got.

00:38:20.020 --> 00:38:21.280
And it also has sort of the

00:38:21.280 --> 00:38:22.780
idea of momentums too

00:38:22.780 --> 00:38:23.920
and stuff for the

00:38:23.920 --> 00:38:24.840
relativistic stuff.

00:38:24.840 --> 00:38:26.280
We end up with a lot of that.

00:38:26.280 --> 00:38:27.840
And then maybe just mention the,

00:38:27.840 --> 00:38:29.000
since we mentioned the

00:38:29.000 --> 00:38:29.960
histogramming stuff and that's

00:38:29.960 --> 00:38:30.940
the area, that's the ones that I

00:38:30.940 --> 00:38:31.700
really work on.

00:38:31.700 --> 00:38:33.480
The ones I specifically work on

00:38:33.480 --> 00:38:34.340
that are general purpose.

00:38:34.340 --> 00:38:36.860
boost-histogram is a wrapper for

00:38:36.860 --> 00:38:38.820
the C++ Boost.Histogram library.

00:38:38.820 --> 00:38:40.680
Boost is sort of the big

00:38:40.680 --> 00:38:43.220
C++ library, just one step below

00:38:43.220 --> 00:38:44.560
the standard library.

00:38:45.120 --> 00:38:47.240
And right at the time I was

00:38:47.240 --> 00:38:49.080
starting at Princeton,

00:38:49.080 --> 00:38:50.960
I met the author of Boost

00:38:50.960 --> 00:38:51.860
Histogram, who's from

00:38:51.860 --> 00:38:54.660
physics and he was in the

00:38:54.660 --> 00:38:55.920
process, I believe, of getting

00:38:55.920 --> 00:38:57.060
this accepted into boost.

00:38:57.060 --> 00:38:58.480
And it got accepted after that.

00:38:58.480 --> 00:39:00.700
But one of the things that he

00:39:00.700 --> 00:39:02.000
decided to do is pull out his

00:39:02.000 --> 00:39:03.400
initial Python bindings that

00:39:03.400 --> 00:39:06.420
were written in Boost.Python,

00:39:06.420 --> 00:39:08.260
which is actually very similar to

00:39:08.260 --> 00:39:10.040
pybind11, but requires Boost

00:39:10.040 --> 00:39:11.640
instead of not requiring anything.

00:39:11.900 --> 00:39:13.180
But the design is intentionally

00:39:13.180 --> 00:39:13.720
very similar.

00:39:13.720 --> 00:39:16.340
And so I proposed that I

00:39:16.340 --> 00:39:17.840
would work on boost-histogram

00:39:17.840 --> 00:39:20.500
and write the

00:39:20.500 --> 00:39:22.080
Python bindings for it inside

00:39:22.080 --> 00:39:22.640
Scikit-HEP.

00:39:22.640 --> 00:39:23.860
And that would be sort of the

00:39:23.860 --> 00:39:25.460
main project I started on when I

00:39:25.460 --> 00:39:26.380
started at Princeton.

00:39:26.380 --> 00:39:28.700
And that's, you know, that's

00:39:28.700 --> 00:39:29.020
what I did.

00:39:29.020 --> 00:39:30.760
boost-histogram is an extremely

00:39:30.760 --> 00:39:32.240
powerful histogramming library.

00:39:32.240 --> 00:39:33.620
So it's a histogram as an

00:39:33.620 --> 00:39:35.320
object, rather than like in

00:39:35.320 --> 00:39:36.480
NumPy, where there's a

00:39:36.480 --> 00:39:37.660
histogram function and you give

00:39:37.660 --> 00:39:38.900
it an array and then it spits a

00:39:38.900 --> 00:39:40.420
couple of arrays back out at

00:39:40.420 --> 00:39:40.580
you.

00:39:40.820 --> 00:39:43.680
You now have to

00:39:43.680 --> 00:39:44.300
manage these.

00:39:44.300 --> 00:39:45.220
They don't have any special

00:39:45.220 --> 00:39:45.580
meaning.

00:39:45.580 --> 00:39:46.940
Whereas boost histogram,

00:39:46.940 --> 00:39:48.220
histograms really are much more

00:39:48.220 --> 00:39:49.360
natural as an object, just like a

00:39:49.360 --> 00:39:50.640
data frame is more natural as an

00:39:50.640 --> 00:39:51.840
object where you tie that

00:39:51.840 --> 00:39:52.660
information together.

00:39:52.660 --> 00:39:54.540
A histogram's really natural that

00:39:54.540 --> 00:39:55.460
way, where you still have the

00:39:55.460 --> 00:39:56.840
information about what the data

00:39:56.840 --> 00:39:58.140
actually was on the axes.

00:39:58.140 --> 00:40:00.440
If you have labels, you want to

00:40:00.440 --> 00:40:02.300
keep those attached to those, to

00:40:02.300 --> 00:40:03.280
that, to that data.

00:40:03.280 --> 00:40:04.700
And you may need to fill again,

00:40:04.700 --> 00:40:05.760
which is one of the main things

00:40:05.760 --> 00:40:06.580
that high

00:40:06.580 --> 00:40:07.960
energy physicists really wanted

00:40:07.960 --> 00:40:09.140
because we tend to fill

00:40:09.140 --> 00:40:09.940
histograms and then keep

00:40:09.940 --> 00:40:11.540
filling them or rebinning them

00:40:11.540 --> 00:40:12.800
or doing operations on them.

00:40:12.800 --> 00:40:13.980
And you can do all those very

00:40:13.980 --> 00:40:14.360
naturally.
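
NOTE
Editor's sketch of the "histogram as an object" idea described above, using only the Python standard library. ToyHistogram and its method names are hypothetical and are NOT the boost-histogram or hist API; the sketch only shows axis information staying attached to the counts, and a histogram being filled more than once.
```python
from dataclasses import dataclass, field
@dataclass
class ToyHistogram:
    """Hypothetical toy: regular 1D binning with the axis info kept next to the counts."""
    bins: int
    start: float
    stop: float
    label: str = ""
    counts: list = field(init=False)
    def __post_init__(self):
        self.counts = [0] * self.bins
    def fill(self, values):
        """Add values to the existing counts; safe to call repeatedly."""
        width = (self.stop - self.start) / self.bins
        for value in values:
            if self.start <= value < self.stop:
                index = min(int((value - self.start) / width), self.bins - 1)
                self.counts[index] += 1
        return self
h = ToyHistogram(bins=4, start=0.0, stop=4.0, label="energy")
h.fill([0.5, 1.5, 1.7])
h.fill([3.2, 9.9])  # a second fill; 9.9 is outside the axis range and is ignored here
print(h.label, h.counts)
```
The design point is that the label and bin edges travel with the counts, instead of living in separate arrays the caller has to keep in sync.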

00:40:14.360 --> 00:40:15.800
And boost-histogram is the

00:40:15.800 --> 00:40:18.260
actual C++ wrapper in

00:40:18.260 --> 00:40:20.660
pybind11. And

00:40:20.660 --> 00:40:22.140
I actually got involved in

00:40:22.140 --> 00:40:22.960
cibuildwheel because of

00:40:22.960 --> 00:40:24.420
boost-histogram, because one of the

00:40:24.420 --> 00:40:25.400
things I wanted was to just make sure

00:40:25.400 --> 00:40:26.060
it worked everywhere.

00:40:26.060 --> 00:40:27.760
And it obviously requires C++.

00:40:27.760 --> 00:40:29.600
It requires compilation.

00:40:30.080 --> 00:40:31.600
And then hist is a nice

00:40:31.600 --> 00:40:32.700
wrapper on top of that that just

00:40:32.700 --> 00:40:33.780
makes it a lot more friendly to,

00:40:33.780 --> 00:40:35.880
to use because the original

00:40:35.880 --> 00:40:37.220
boost histogram author wants to

00:40:37.220 --> 00:40:38.680
keep this, Hans Dembinski wants

00:40:38.680 --> 00:40:40.420
to keep this quite pure and

00:40:40.420 --> 00:40:40.680
clean.

00:40:40.680 --> 00:40:42.740
So hist is the more

00:40:42.740 --> 00:40:43.080
natural.

00:40:43.080 --> 00:40:44.140
And even if you're not in HEP,

00:40:44.140 --> 00:40:45.020
I think that's still the more

00:40:45.020 --> 00:40:45.820
natural one to use.

00:40:45.820 --> 00:40:46.340
Yeah.

00:40:46.340 --> 00:40:48.280
Gold.plot and plot.

00:40:48.280 --> 00:40:48.920
Right, right.

00:40:48.920 --> 00:40:50.060
There's a lot of people who do,

00:40:50.060 --> 00:40:52.200
who use histograms across all

00:40:52.200 --> 00:40:52.840
sorts of disciplines.

00:40:52.840 --> 00:40:54.460
So that would definitely be one of

00:40:54.460 --> 00:40:56.060
those that is generally useful.

00:40:56.060 --> 00:40:56.580
All right.

00:40:56.580 --> 00:40:59.260
So I think that brings us to

00:40:59.260 --> 00:40:59.940
cibuildwheel.

00:40:59.940 --> 00:41:01.880
Let's talk a bit

00:41:01.880 --> 00:41:02.260
about that.

00:41:02.260 --> 00:41:04.120
And I mean, maybe the place to

00:41:04.120 --> 00:41:05.720
start here is, you want

00:41:05.720 --> 00:41:06.460
wheels, right?

00:41:06.460 --> 00:41:08.240
The first sentence

00:41:08.240 --> 00:41:09.480
describing it is: Python wheels

00:41:09.480 --> 00:41:10.540
are great. Building them across

00:41:10.540 --> 00:41:11.840
Mac, Linux, Windows, on

00:41:11.840 --> 00:41:13.240
multiple versions of Python,

00:41:13.240 --> 00:41:14.120
not so much.

00:41:14.120 --> 00:41:15.560
So no description.

00:41:15.560 --> 00:41:17.400
Yeah, exactly.

00:41:17.400 --> 00:41:19.400
Well, wheels are good.

00:41:19.400 --> 00:41:21.340
There's times when there are no

00:41:21.340 --> 00:41:23.140
wheels and things install slower.

00:41:23.140 --> 00:41:24.520
They might not install at all.

00:41:24.520 --> 00:41:26.020
It's generally a bad thing if

00:41:26.020 --> 00:41:27.720
you don't have a wheel, but,

00:41:27.720 --> 00:41:29.240
they're not easy

00:41:29.240 --> 00:41:29.640
to make.

00:41:29.640 --> 00:41:29.980
Right.

00:41:29.980 --> 00:41:31.500
So tell us what is a wheel and

00:41:31.500 --> 00:41:32.680
then let's talk about why maybe

00:41:32.680 --> 00:41:33.600
building them across all these

00:41:33.600 --> 00:41:35.360
platforms and this, cross

00:41:35.360 --> 00:41:37.240
product along with like versions

00:41:37.240 --> 00:41:38.360
of Python and whatnot.

00:41:38.360 --> 00:41:39.280
It's a mess.

00:41:39.280 --> 00:41:40.720
When you distribute Python, you

00:41:40.720 --> 00:41:41.580
have several options.

00:41:41.580 --> 00:41:43.260
The most common one and most

00:41:43.260 --> 00:41:45.040
packages have at least an

00:41:45.040 --> 00:41:46.700
sdist, which is basically just a

00:41:46.700 --> 00:41:48.480
tarball of the source.

00:41:48.480 --> 00:41:49.040
Right.

00:41:49.040 --> 00:41:50.280
When you pip install it,

00:41:50.280 --> 00:41:51.480
it basically, you're missing

00:41:51.480 --> 00:41:52.520
some things or adding some

00:41:52.520 --> 00:41:53.100
things, but right.

00:41:53.100 --> 00:41:53.800
Otherwise it's mostly

00:41:53.800 --> 00:41:54.160
unzips.

00:41:54.160 --> 00:41:54.700
Yeah.

00:41:54.700 --> 00:41:55.920
It unzips your source and puts

00:41:55.920 --> 00:41:56.220
it somewhere.

00:41:56.220 --> 00:41:57.060
Python will find it.

00:41:57.120 --> 00:41:58.060
And then that's that.

00:41:58.060 --> 00:41:58.460
Yeah.

00:41:58.460 --> 00:42:00.080
So it runs your build system.

00:42:00.080 --> 00:42:01.640
So setuptools traditionally,

00:42:01.640 --> 00:42:02.760
that's become a lot more

00:42:02.760 --> 00:42:04.940
powerful recently, but it

00:42:04.940 --> 00:42:06.320
has to run the build system to

00:42:06.320 --> 00:42:07.340
figure out what, what do you do

00:42:07.340 --> 00:42:07.540
with it?

00:42:07.540 --> 00:42:08.540
This is just a bunch of files.

00:42:08.900 --> 00:42:10.540
And then it puts it together in a

00:42:10.540 --> 00:42:12.380
particular structure

00:42:12.380 --> 00:42:14.220
on your computer.

00:42:14.220 --> 00:42:16.940
And so a wheel is a package where

00:42:16.940 --> 00:42:19.020
everything is

00:42:19.020 --> 00:42:19.880
already in place.

00:42:19.880 --> 00:42:21.200
So it's already in a particular

00:42:21.200 --> 00:42:21.660
structure.

00:42:21.660 --> 00:42:23.580
It knows the structure, and

00:42:23.580 --> 00:42:25.700
all Python has to do for a pure

00:42:25.700 --> 00:42:26.940
Python wheel, one that does not

00:42:26.940 --> 00:42:29.900
have any binary pieces in

00:42:29.900 --> 00:42:30.020
it.

00:42:30.020 --> 00:42:33.660
It just grabs the contents inside and

00:42:33.660 --> 00:42:35.780
dumps them, following a specific

00:42:35.840 --> 00:42:38.240
set of rules, into places in your

00:42:38.240 --> 00:42:39.160
site-packages.

00:42:39.160 --> 00:42:39.720
Right.

00:42:39.720 --> 00:42:40.520
So then you now have something

00:42:40.520 --> 00:42:40.920
installed.

00:42:40.920 --> 00:42:42.980
There's no setup.py in your wheel.

00:42:42.980 --> 00:42:45.300
There's no pyproject.toml.

00:42:45.300 --> 00:42:46.900
Those sorts of things are

00:42:46.900 --> 00:42:47.680
not in the wheel.

00:42:47.680 --> 00:42:48.920
The wheel's already there.

00:42:48.920 --> 00:42:51.020
It can't run arbitrary code.
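
NOTE
Editor's sketch of why a pure Python wheel needs no build step: a wheel is a zip archive whose contents are already laid out for site-packages. The archive below is built by hand with a made-up name (toy_pkg) and is deliberately incomplete; real wheels also carry WHEEL and RECORD files, and real installers do more than extract (scripts, metadata checks).
```python
import tempfile
import zipfile
from pathlib import Path
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    # Hypothetical file name and contents, for illustration only.
    wheel_path = tmp_path / "toy_pkg-1.0-py3-none-any.whl"
    with zipfile.ZipFile(wheel_path, "w") as wheel:
        wheel.writestr("toy_pkg/__init__.py", "ANSWER = 42\n")
        wheel.writestr("toy_pkg-1.0.dist-info/METADATA", "Name: toy-pkg\nVersion: 1.0\n")
    # "Installing" in this sketch is just unpacking into a stand-in site-packages directory.
    site_packages = tmp_path / "site-packages"
    with zipfile.ZipFile(wheel_path) as wheel:
        names = sorted(wheel.namelist())
        wheel.extractall(site_packages)
    installed = (site_packages / "toy_pkg" / "__init__.py").read_text()
print(names)
print(installed.strip())
```
Nothing from the package is executed at any point, which is the safety property being discussed.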

00:42:51.020 --> 00:42:51.840
Yeah, exactly.

00:42:51.840 --> 00:42:52.940
That was one of the points I was

00:42:52.940 --> 00:42:53.920
going to make is one of the

00:42:53.920 --> 00:42:55.700
things that can be scary about

00:42:55.700 --> 00:42:57.840
installing packages is just by

00:42:57.840 --> 00:42:59.620
virtue of installing them, you're

00:42:59.620 --> 00:43:01.240
running arbitrary code because

00:43:01.240 --> 00:43:03.820
often that executes, you know,

00:43:03.820 --> 00:43:06.980
python setup.py

00:43:06.980 --> 00:43:08.180
install or something

00:43:08.180 --> 00:43:08.540
like that.

00:43:08.540 --> 00:43:09.400
And like, whatever that thing

00:43:09.400 --> 00:43:11.060
does, that's what happens when

00:43:11.060 --> 00:43:11.780
you pip install.

00:43:11.780 --> 00:43:12.240
Right.

00:43:12.240 --> 00:43:13.380
But not with wheels, as you said,

00:43:13.380 --> 00:43:14.880
it comes down in a binary blob

00:43:14.880 --> 00:43:16.100
and just like, boom, here it is.

00:43:16.100 --> 00:43:18.760
Obviously the thinking is we have

00:43:18.760 --> 00:43:20.540
this package delivered to a

00:43:20.540 --> 00:43:21.260
million computers.

00:43:21.260 --> 00:43:22.560
Why do we need to have each of the

00:43:22.560 --> 00:43:23.880
million computers run all the

00:43:23.880 --> 00:43:24.240
steps?

00:43:24.240 --> 00:43:25.840
Why don't we just run it once and

00:43:25.840 --> 00:43:26.500
then go here?

00:43:26.500 --> 00:43:28.120
And then also that saves you a

00:43:28.120 --> 00:43:28.660
ton of time.

00:43:28.660 --> 00:43:29.160
Right.

00:43:29.180 --> 00:43:31.580
Like, I just installed uWSGI

00:43:31.580 --> 00:43:32.800
and it took, I don't

00:43:32.800 --> 00:43:34.740
know, 30 seconds, 45 seconds to

00:43:34.740 --> 00:43:36.040
install because it didn't have a

00:43:36.040 --> 00:43:36.200
wheel.

00:43:36.200 --> 00:43:37.160
So it sat there and it just

00:43:37.160 --> 00:43:39.440
grinded away compiling it, you

00:43:39.440 --> 00:43:39.600
know?

00:43:39.600 --> 00:43:40.060
Yeah.

00:43:40.060 --> 00:43:41.340
So there's two possibilities.

00:43:41.340 --> 00:43:44.260
For a pure Python package, a wheel is

00:43:44.260 --> 00:43:46.260
still superior because of the not

00:43:46.260 --> 00:43:47.360
running arbitrary code.

00:43:47.360 --> 00:43:49.600
Pip will actually go ahead and

00:43:49.600 --> 00:43:51.480
compile all your .pyc files.

00:43:51.480 --> 00:43:53.920
That goes ahead and makes

00:43:53.920 --> 00:43:55.180
the bytecode for all those,

00:43:55.180 --> 00:43:56.680
if it's a wheel. If it's an sdist, if

00:43:56.680 --> 00:43:58.080
it's a tarball, it doesn't do

00:43:58.080 --> 00:43:58.320
that.

00:43:58.960 --> 00:44:00.100
If it doesn't pass through the

00:44:00.100 --> 00:44:00.820
wheel stage anyway.

00:44:00.820 --> 00:44:03.260
And then, when you open

00:44:03.260 --> 00:44:04.100
the file,

00:44:04.100 --> 00:44:05.460
the first time, it's going to have

00:44:05.460 --> 00:44:06.980
to make that bytecode.

00:44:06.980 --> 00:44:07.820
So it'll be a little slower the

00:44:07.820 --> 00:44:08.560
first time you open it.
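
NOTE
Editor's sketch of the bytecode point above: compiling a module to a .pyc ahead of time with the standard library's py_compile, similar in spirit to what an installer does after unpacking a wheel. The module name and temporary paths are made up for the example; this is not pip's actual code path.
```python
import py_compile
import tempfile
from pathlib import Path
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "example_module.py"  # hypothetical module
    source.write_text("VALUE = 1\n")
    target = Path(tmp) / "example_module.pyc"
    # Compile now, so the first import does not have to generate the bytecode.
    compiled = Path(py_compile.compile(str(source), cfile=str(target)))
    created = compiled.exists()
    size = compiled.stat().st_size
print(created, size > 0)
```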

00:44:08.560 --> 00:44:10.300
There's a variety of

00:44:10.300 --> 00:44:10.840
reasons.

00:44:10.840 --> 00:44:12.420
I think it's pythonwheels.com,

00:44:12.420 --> 00:44:13.260
something like that.

00:44:13.260 --> 00:44:15.720
That describes why you should use

00:44:15.720 --> 00:44:16.060
wheels.

00:44:16.060 --> 00:44:18.240
That's maybe that's not it, but

00:44:18.240 --> 00:44:18.860
I think it is.

00:44:18.860 --> 00:44:19.320
Yes.

00:44:19.320 --> 00:44:19.780
Python wheels.

00:44:19.780 --> 00:44:20.960
So they have like a list of

00:44:20.960 --> 00:44:23.000
advantages there, but.

00:44:23.000 --> 00:44:23.420
Yeah.

00:44:23.420 --> 00:44:25.180
They also have a little, like,

00:44:25.180 --> 00:44:25.740
checklist.

00:44:25.740 --> 00:44:28.300
It says, how are we doing for the

00:44:28.300 --> 00:44:30.260
top 360 packages?

00:44:30.260 --> 00:44:32.220
And apparently 342 of them have

00:44:32.220 --> 00:44:32.580
wheels.

00:44:32.580 --> 00:44:34.320
And it shows you for your popular

00:44:34.320 --> 00:44:37.020
packages, which ones like click does,

00:44:37.020 --> 00:44:38.880
but future doesn't, for example, and so

00:44:38.880 --> 00:44:39.060
on.

00:44:39.060 --> 00:44:39.260
So.

00:44:39.260 --> 00:44:41.080
Future's been there for a long time.

00:44:41.080 --> 00:44:42.200
Yeah.

00:44:42.200 --> 00:44:44.980
But, but yeah, so wheels are really

00:44:44.980 --> 00:44:45.340
good.

00:44:45.340 --> 00:44:47.720
And they actually replaced an older

00:44:47.720 --> 00:44:49.320
mechanism that was trying to do

00:44:49.320 --> 00:44:50.680
something somewhat similar called

00:44:50.680 --> 00:44:52.740
eggs, but I avoid talking about

00:44:52.740 --> 00:44:53.000
those.

00:44:53.000 --> 00:44:54.320
I don't really understand.

00:44:54.320 --> 00:44:55.380
Let it live in the past.

00:44:55.380 --> 00:44:56.240
Let it live in the past.

00:44:56.240 --> 00:44:59.000
Wheels also are a great way if you

00:44:59.000 --> 00:45:01.160
have compilation that

00:45:01.160 --> 00:45:01.420
happens.

00:45:01.420 --> 00:45:04.000
So if you compile some code as part of

00:45:04.000 --> 00:45:07.280
your build, then

00:45:07.280 --> 00:45:08.340
that of course is much slower.

00:45:08.780 --> 00:45:10.380
If you have the, if you just have

00:45:10.380 --> 00:45:11.300
the example.

00:45:11.300 --> 00:45:11.740
Yeah.

00:45:11.740 --> 00:45:13.660
It's like it was doing GCC or something

00:45:13.660 --> 00:45:13.720
forever.

00:45:13.720 --> 00:45:14.320
And if you don't have a compiler, it

00:45:14.320 --> 00:45:15.000
won't even work.

00:45:15.000 --> 00:45:15.480
Right.

00:45:15.480 --> 00:45:15.780
Exactly.

00:45:15.780 --> 00:45:17.340
You have to have some setup, at least

00:45:17.340 --> 00:45:18.060
a little setup.

00:45:18.060 --> 00:45:19.700
You have to have a compiler setup at

00:45:19.700 --> 00:45:20.160
the very minimum.

00:45:20.160 --> 00:45:20.320
Right.

00:45:20.320 --> 00:45:22.340
How many Windows users have seen

00:45:22.340 --> 00:45:24.140
cannot find vcvarsall.bat?

00:45:24.140 --> 00:45:25.240
Right.

00:45:25.240 --> 00:45:26.120
Like what is this?

00:45:26.120 --> 00:45:26.640
I don't want this.

00:45:26.640 --> 00:45:27.500
On Windows you have to be in

00:45:27.500 --> 00:45:28.680
the environment, or you have to have

00:45:28.680 --> 00:45:30.140
the right script sourced.

00:45:30.140 --> 00:45:30.440
Yes.

00:45:30.920 --> 00:45:35.420
So wheels also can contain binary

00:45:35.420 --> 00:45:37.520
components like .so's and things.

00:45:37.520 --> 00:45:39.400
And they have a tag as part of their

00:45:39.400 --> 00:45:39.700
name.

00:45:39.700 --> 00:45:41.500
They have a very special naming scheme

00:45:41.500 --> 00:45:43.820
for wheels and the tag is stored in

00:45:43.820 --> 00:45:44.200
the wheel too.

00:45:44.200 --> 00:45:47.320
And they can tell you what Python version

00:45:47.320 --> 00:45:51.000
they're good for, what platform they

00:45:51.000 --> 00:45:52.420
are supported on.

00:45:52.420 --> 00:45:54.960
They have a build number and then they

00:45:54.960 --> 00:45:57.100
have the Python part, which is actually in two

00:45:57.100 --> 00:45:57.400
pieces.

00:45:57.400 --> 00:46:00.640
There's the ABI and the interpreter.

00:46:00.640 --> 00:46:01.640
Yeah.

00:46:01.640 --> 00:46:03.560
You can see there's some huge long name

00:46:03.560 --> 00:46:04.740
that with a bunch of underscores

00:46:04.740 --> 00:46:05.460
separating it.

00:46:05.460 --> 00:46:07.400
And basically, when you try to

00:46:07.400 --> 00:46:09.460
install it, sorry, go ahead.

00:46:09.460 --> 00:46:10.520
I was saying it's also one of the

00:46:10.520 --> 00:46:11.820
reasons that names are normalized.

00:46:11.820 --> 00:46:13.300
There's no difference between a dash

00:46:13.300 --> 00:46:13.800
and underscore.

00:46:13.800 --> 00:46:15.520
It's because that special wheel name

00:46:15.520 --> 00:46:16.740
has dashes in it.

00:46:16.740 --> 00:46:18.880
So the package name at that point in

00:46:18.880 --> 00:46:20.140
the, in the file name has to be

00:46:20.140 --> 00:46:20.540
underscores.
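
NOTE
Editor's sketch of the naming scheme described above: a wheel file name is a set of dash-separated fields (name, version, optional build tag, Python tag, ABI tag, platform tag), which is why a dash in the package name becomes an underscore. The helper and the file name are made up for illustration, and the sketch skips the validation a real tool would do.
```python
def parse_wheel_filename(filename):
    """Hypothetical helper: split a wheel file name into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("expected 5 or 6 dash-separated fields: " + filename)
    return {"name": name, "version": version, "build": build_tag, "python": python_tag, "abi": abi_tag, "platform": platform_tag}
# Made-up file name; note the underscore in the package name.
info = parse_wheel_filename("example_pkg-1.0-cp39-cp39-manylinux_2_17_x86_64.whl")
print(info)
```
An installer compares these tags against the running interpreter and platform to decide whether a given wheel is usable.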

00:46:20.540 --> 00:46:21.080
Yeah.

00:46:21.080 --> 00:46:22.440
And so basically when you pip install,

00:46:22.440 --> 00:46:24.720
it builds up that

00:46:24.720 --> 00:46:26.880
name and says, do you have this as a

00:46:26.880 --> 00:46:27.220
binary?

00:46:27.220 --> 00:46:27.940
Give it to me, right?

00:46:27.940 --> 00:46:28.620
Something like this.

00:46:28.620 --> 00:46:29.060
Yeah.

00:46:29.060 --> 00:46:30.500
It knows how to pick out the, it

00:46:30.500 --> 00:46:31.480
looks for the right one.

00:46:31.480 --> 00:46:32.740
If it finds a binary, it'll just

00:46:32.740 --> 00:46:34.580
download it depending slightly on the

00:46:34.580 --> 00:46:36.460
system and how new your pip is.

00:46:36.460 --> 00:46:37.280
Right.

00:46:37.280 --> 00:46:38.360
And this is one of the main

00:46:38.360 --> 00:46:40.960
innovations, ideas, or philosophies

00:46:40.960 --> 00:46:44.180
behind Conda and Anaconda,

00:46:44.180 --> 00:46:44.420
right?

00:46:44.420 --> 00:46:45.800
It's like, let's just take that and

00:46:45.800 --> 00:46:47.500
make sure that we build all of these

00:46:47.500 --> 00:46:48.700
things in a really clear way.

00:46:48.700 --> 00:46:50.100
And then sort of package up

00:46:50.100 --> 00:46:52.500
the testing and compilation

00:46:52.500 --> 00:46:54.380
and distributing all

00:46:54.380 --> 00:46:54.840
that together.

00:46:54.840 --> 00:46:55.140
Right.

00:46:55.460 --> 00:46:55.820
Yes.

00:46:55.820 --> 00:46:57.280
This is very similar to this.

00:46:57.280 --> 00:46:58.780
This came, I think, I'm pretty sure it

00:46:58.780 --> 00:46:59.760
came after Conda.

00:46:59.760 --> 00:47:01.100
I think they were still in

00:47:01.100 --> 00:47:03.020
eggs when Conda was invented and

00:47:03.020 --> 00:47:05.200
then sort of building up wheels was

00:47:05.200 --> 00:47:05.660
challenging.

00:47:05.660 --> 00:47:08.020
Building a wheel was, was challenging.

00:47:08.020 --> 00:47:09.620
Yeah, cibuildwheel has

00:47:09.620 --> 00:47:10.400
really changed that.

00:47:10.400 --> 00:47:11.960
If you want pure Python, it's

00:47:11.960 --> 00:47:12.440
really easy.

00:47:12.440 --> 00:47:14.600
Today, you should be using the

00:47:14.600 --> 00:47:15.780
build tool, which I'm also a

00:47:15.780 --> 00:47:17.260
maintainer of.

00:47:17.260 --> 00:47:19.900
But build just builds an

00:47:19.900 --> 00:47:22.020
sdist for you, or it builds a wheel.

00:47:22.020 --> 00:47:24.160
And so you would say something like

00:47:24.160 --> 00:47:27.640
python setup.py bdist or

00:47:27.640 --> 00:47:28.200
something like that.

00:47:28.200 --> 00:47:29.420
And then boom, I shouldn't be doing

00:47:29.420 --> 00:47:29.800
that anymore.

00:47:29.800 --> 00:47:30.280
Please don't.

00:47:30.280 --> 00:47:31.740
But that is how you do it.

00:47:31.740 --> 00:47:31.980
Yeah.

00:47:31.980 --> 00:47:32.780
How would I do it?

00:47:32.780 --> 00:47:33.460
Tell me the right way.

00:47:33.460 --> 00:47:34.580
The best.

00:47:34.760 --> 00:47:37.840
Well, you could do

00:47:37.840 --> 00:47:40.060
pip install build and then python -m

00:47:40.060 --> 00:47:42.400
build, and that will build both an

00:47:42.400 --> 00:47:44.400
sdist and a wheel, and it'll build the

00:47:44.400 --> 00:47:45.520
wheel from the sdist.

00:47:45.520 --> 00:47:47.380
If you use pipx, which I would

00:47:47.380 --> 00:47:49.020
recommend, then you can just say pipx

00:47:49.020 --> 00:47:50.540
run build and you don't have to do

00:47:50.540 --> 00:47:52.400
anything. That'll download build

00:47:52.400 --> 00:47:54.040
into a virtual environment for you.

00:47:54.040 --> 00:47:54.780
It'll do it.

00:47:54.780 --> 00:47:55.860
And then eventually it'll throw away

00:47:55.860 --> 00:47:57.300
the virtual environment, after a

00:47:57.300 --> 00:47:57.460
week.

00:47:57.460 --> 00:47:57.980
Interesting.

00:47:57.980 --> 00:47:58.420
Okay.

00:47:58.420 --> 00:48:00.120
So we could just use the build.

00:48:00.120 --> 00:48:01.840
We should be using the build.

00:48:01.840 --> 00:48:03.280
You should be using the build tool.

00:48:03.280 --> 00:48:04.440
For an sdist,

00:48:04.520 --> 00:48:06.840
there's a big benefit to this.

00:48:06.840 --> 00:48:09.580
And that is, it will use

00:48:09.580 --> 00:48:10.800
your pyproject.toml.

00:48:10.800 --> 00:48:13.200
And if you say you require NumPy,

00:48:13.200 --> 00:48:15.700
like you're using the

00:48:15.700 --> 00:48:17.580
NumPy headers, the C headers, then

00:48:17.580 --> 00:48:19.780
when

00:48:19.780 --> 00:48:21.920
it's building the sdist, it will make the

00:48:21.920 --> 00:48:25.160
PEP 517 virtual environment.

00:48:25.160 --> 00:48:27.060
It'll install NumPy, anything that's in

00:48:27.060 --> 00:48:29.860
your requires in

00:48:29.860 --> 00:48:30.960
your pyproject.toml.

00:48:30.960 --> 00:48:34.100
And then it will run the setup.py

00:48:34.100 --> 00:48:35.220
inside that environment.

00:48:35.220 --> 00:48:37.200
So you can now import numpy directly in

00:48:37.200 --> 00:48:37.480
there.

00:48:37.480 --> 00:48:39.180
And it'll work even when you're

00:48:39.180 --> 00:48:40.040
building an sdist.

00:48:40.040 --> 00:48:42.740
If you do python sdist or python

00:48:42.740 --> 00:48:45.620
setup.py stuff, you can't do that because

00:48:45.620 --> 00:48:47.900
you're literally running Python, giving it

00:48:47.900 --> 00:48:49.560
setup.py: import numpy,

00:48:49.560 --> 00:48:50.100
and now it's broken.

00:48:50.100 --> 00:48:51.060
Right.

00:48:51.060 --> 00:48:54.520
Nothing triggers that

00:48:54.520 --> 00:49:00.180
call to the pyproject.toml to see what you need. For a wheel,

00:49:00.180 --> 00:49:01.980
the best way to do it is with pip.

00:49:01.980 --> 00:49:04.440
Or, the original way to do it was with pip wheel.

00:49:04.440 --> 00:49:08.780
Because pip has to be able to build wheels in order to install things.

00:49:08.780 --> 00:49:13.260
That got added to pip before build existed.

00:49:13.520 --> 00:49:16.340
But now the best way to do it would be with build --wheel.

00:49:16.340 --> 00:49:17.780
And that's actually, it's doing the right thing.

00:49:17.780 --> 00:49:19.700
It's actually trying to build the wheel you want.

00:49:19.700 --> 00:49:23.220
Whereas pip wheel is actually just building a wheelhouse.

00:49:23.220 --> 00:49:27.680
So if you depend on NumPy and NumPy doesn't have wheels, which they did

00:49:27.680 --> 00:49:28.720
better with Python 3.10.

00:49:28.720 --> 00:49:33.220
So I'm not going to complain about NumPy for Python 3.10, but for 3.9, they didn't

00:49:33.220 --> 00:49:33.960
have wheels for a while.

00:49:33.960 --> 00:49:37.640
So it'll build the wheels there and it'll build your wheels and it'll dump them all in the

00:49:37.640 --> 00:49:39.800
wheelhouse or whatever the output is.

00:49:39.800 --> 00:49:43.120
So you'll be building NumPy wheels, which you definitely don't want to try to upload.

00:49:43.120 --> 00:49:43.660
Yeah.

00:49:43.660 --> 00:49:44.520
Yeah, definitely not.

00:49:44.520 --> 00:49:45.040
All right.

00:49:45.040 --> 00:49:46.160
Well, that's, that's really cool.

00:49:46.160 --> 00:49:47.200
And I definitely learned something.

00:49:47.200 --> 00:49:51.320
I will start using build instead of doing it the other way.

00:49:51.320 --> 00:49:53.200
You can now delete your setup.py too.

00:49:53.200 --> 00:49:53.800
Yeah.

00:49:53.800 --> 00:49:54.720
That's the big thing, right?

00:49:54.720 --> 00:49:56.600
So you don't have to run that kind of stuff, right?

00:49:56.600 --> 00:49:57.140
Yeah.

00:49:57.260 --> 00:50:01.440
They're trying to move away from any commands to setup.py because you don't

00:50:01.440 --> 00:50:02.260
even need one anymore.

00:50:02.260 --> 00:50:04.900
And you can't control that environment.

00:50:04.900 --> 00:50:06.620
It's very much an internal detail.

00:50:06.620 --> 00:50:12.340
Like wrapping up this segment of the conversation, we want a wheel because that's best.

00:50:12.340 --> 00:50:15.300
It installs without requiring the compiler tools on our system.

00:50:15.300 --> 00:50:16.960
It installs faster.

00:50:16.960 --> 00:50:19.020
It's built just for our platform.

00:50:19.020 --> 00:50:26.120
The challenge is when you become a maintainer, you got to solve this, this matrix of different

00:50:26.120 --> 00:50:28.920
Python versions that are supported and different platforms.

00:50:28.920 --> 00:50:33.360
Like, for example, there's macOS Intel, there's macOS M1, Apple Silicon.

00:50:33.360 --> 00:50:35.680
There's multiple versions of Windows.

00:50:35.680 --> 00:50:38.240
There's different versions of Linux, right?

00:50:38.240 --> 00:50:41.920
Like ARM Linux versus AMD64 Linux.

00:50:41.920 --> 00:50:42.800
Yeah.

00:50:42.800 --> 00:50:45.980
And now musl, musl Linux versus the other Linux varieties.

00:50:45.980 --> 00:50:46.560
Yeah.
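
NOTE
Editor's sketch of the build matrix being described: every supported Python version crossed with every target platform is one wheel to compile and test. The version and platform lists are hypothetical placeholders, not any project's real configuration.
```python
from itertools import product
# Hypothetical lists chosen only for illustration; a real project picks its own.
python_versions = ["3.7", "3.8", "3.9", "3.10"]
platforms = ["macos-x86_64", "macos-arm64", "windows-32", "windows-64", "manylinux-x86_64", "manylinux-aarch64", "musllinux-x86_64"]
matrix = list(product(python_versions, platforms))
print(len(matrix), "wheels to build and test")
for version, platform in matrix[:3]:
    print(version, platform)
```
Adding one more Python version or one more platform grows the total multiplicatively, which is the maintenance burden this part of the conversation is about.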

00:50:46.620 --> 00:50:51.280
So one of the challenges with a wheel is making it distributable.

00:50:51.280 --> 00:50:54.320
So if you just go out and you build a wheel and then you try to give it to someone else,

00:50:54.320 --> 00:50:54.940
it may not work.

00:50:54.940 --> 00:51:00.600
Certainly on Linux, pretty much, if you do that, it just won't

00:51:00.600 --> 00:51:00.840
work.

00:51:00.840 --> 00:51:02.580
Because the systems are going to be different.

00:51:02.780 --> 00:51:07.600
On macOS, it'll only work on the version you compiled it on and not anything older.

00:51:07.600 --> 00:51:13.160
And you'll even see people trying to compile on macOS 10.14 because

00:51:13.160 --> 00:51:16.160
they want their wheels to work in as many places as possible.

00:51:16.160 --> 00:51:17.600
Well, you can use the latest one.

00:51:17.600 --> 00:51:18.600
There's ways to fix that.

00:51:19.020 --> 00:51:20.200
Well, exactly.

00:51:20.200 --> 00:51:20.540
It's fine.

00:51:20.540 --> 00:51:24.260
The jankiest, like, I've got a Mac mini from 2009.

00:51:24.260 --> 00:51:25.900
We're building on that thing.

00:51:25.900 --> 00:51:27.060
Cause it will work for most people.

00:51:27.060 --> 00:51:27.320
Right.

00:51:27.320 --> 00:51:30.780
I think that's how they actually build the official Python binaries.

00:51:30.780 --> 00:51:31.700
Interesting.

00:51:31.700 --> 00:51:32.420
I'm not sure.

00:51:32.420 --> 00:51:37.860
But then Apple went in like last year around this time, they threw a big spanner in the

00:51:37.860 --> 00:51:38.800
works and said, you know what?

00:51:38.800 --> 00:51:41.220
We're going to completely switch to ARM and our own silicon.

00:51:41.220 --> 00:51:43.760
And you've got to compile for something different now.

00:51:43.760 --> 00:51:44.200
Yeah.

00:51:44.200 --> 00:51:46.300
And cross compiling has always been a challenge.

00:51:46.300 --> 00:51:47.600
yeah.

00:51:47.620 --> 00:51:49.780
And then Windows is actually the easiest of all of them.

00:51:49.780 --> 00:51:53.360
You're most likely on Windows to be able to compile something that you can give to someone

00:51:53.360 --> 00:51:53.640
else.

00:51:53.640 --> 00:51:57.460
That is one of the things that Microsoft's been really pretty good

00:51:57.460 --> 00:51:58.640
at is backwards compatibility.

00:51:58.640 --> 00:52:03.420
I get that it holds them back in other ways, but yeah, typically you can run an app from 20 years ago

00:52:03.420 --> 00:52:04.400
and it'll still run.

00:52:04.400 --> 00:52:04.860
Yeah.

00:52:04.860 --> 00:52:09.060
And there are a few caveats, but not many, at least compared to the other

00:52:09.060 --> 00:52:09.400
systems.

00:52:09.400 --> 00:52:14.240
Apple's really good, but you do have to understand how to do it. You do have

00:52:14.240 --> 00:52:17.540
to set your minimum version and you have to get a Python that had that

00:52:17.540 --> 00:52:19.680
minimum version set when it was compiled.

00:52:19.680 --> 00:52:21.160
But if you do that, it works really well.
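
NOTE
Editor's sketch, not from the episode: the minimum macOS version discussed here ends up
in the wheel's platform tag (for example macosx_10_9_x86_64), and the usual knob for it is
the MACOSX_DEPLOYMENT_TARGET environment variable. The filenames and both helper functions
below are hypothetical and only illustrate how that tag can be read back out.
```python
import re
def macos_requirement(wheel_filename):
    """Return ((major, minor), arch) parsed from a macOS wheel's platform tag."""
    platform_tag = wheel_filename[: -len(".whl")].split("-")[-1]
    match = re.fullmatch(r"macosx_(\d+)_(\d+)_(\w+)", platform_tag)
    if match is None:
        raise ValueError(f"not a macOS wheel: {wheel_filename}")
    major, minor, arch = match.groups()
    return (int(major), int(minor)), arch
def runs_on(wheel_filename, system_version):
    """True if the wheel's minimum macOS version is at or below system_version."""
    minimum, _arch = macos_requirement(wheel_filename)
    return minimum <= system_version
# Hypothetical filenames: one built with a low deployment target, one without.
portable = "example_pkg-1.0.0-cp39-cp39-macosx_10_9_x86_64.whl"
narrow = "example_pkg-1.0.0-cp39-cp39-macosx_10_15_x86_64.whl"
print(runs_on(portable, (10, 14)), runs_on(narrow, (10, 14)))
```
This ignores the architecture part of the tag, which a real installer also checks.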

00:52:21.160 --> 00:52:26.460
So what I actually started with in Scikit-HEP: I was

00:52:26.460 --> 00:52:29.360
building boost-histogram, which needed to be able to run anywhere.

00:52:29.360 --> 00:52:30.700
That was something I absolutely wanted.

00:52:30.700 --> 00:52:33.200
It had to be pip install boost-histogram and it just worked no matter what.

00:52:33.480 --> 00:52:37.360
And also we had several other compiled packages at the time, several we had inherited.

00:52:37.360 --> 00:52:41.660
iminuit was compiled, and that was quite popular.

00:52:41.660 --> 00:52:45.860
We had a couple of specific ones, and we had a couple more that ended up

00:52:45.860 --> 00:52:47.520
becoming interested in that.

00:52:47.620 --> 00:52:51.200
In fact, during this sort of period is when Awkward started compiling pieces.

00:52:51.520 --> 00:52:55.240
And so what I started with was building my own system to do this.

00:52:55.240 --> 00:53:00.360
It was called azure-wheel-helpers, which, as you can guess by the name,

00:53:00.360 --> 00:53:04.140
was basically a set of Azure DevOps scripts.

00:53:04.140 --> 00:53:05.480
It was right after Azure had come out.

00:53:05.740 --> 00:53:09.700
And I wrote a series of blog posts on this and described the exact process.

00:53:09.700 --> 00:53:13.560
and sort of the things I'd found out about how you build a compatible wheel.

00:53:13.560 --> 00:53:20.580
On macOS, you have to make sure you get the most compatible CPython from

00:53:20.580 --> 00:53:21.880
python.org itself.

00:53:21.880 --> 00:53:25.420
You can't use brew or something like that because those are

00:53:25.420 --> 00:53:27.780
going to be compiled for whatever system they were targeting.

00:53:27.780 --> 00:53:33.220
And on Linux, you have to run the manylinux system and you should run auditwheel,

00:53:33.220 --> 00:53:35.520
and actually on Mac, you should run delocate.

00:53:35.580 --> 00:53:39.040
delocate-wheel, although I might be getting that wrong. I think it's delocate-wheel.

00:53:39.040 --> 00:53:41.820
So there's this series of things that you have to do.

00:53:41.820 --> 00:53:47.520
And I started maintaining this multi-hundred-line set of scripts to do this.

00:53:47.520 --> 00:53:51.240
And I was also being limited by Azure at the time.

00:53:51.240 --> 00:53:53.200
They didn't have all the templates and stuff they have now.

00:53:53.200 --> 00:53:57.420
So everything had to be managed through git subtree because it couldn't be a separate

00:53:57.420 --> 00:53:58.200
repository.

00:53:58.200 --> 00:54:03.620
And then when Jim started working on Awkward, he went and just rewrote the

00:54:03.620 --> 00:54:05.200
whole thing, 'cause he thought,

00:54:05.420 --> 00:54:09.800
He wanted it to look simpler for him and took a couple of things out that were needed and

00:54:09.800 --> 00:54:11.420
suddenly made it two separate things.

00:54:11.420 --> 00:54:13.460
Now I had to help maintain that.

00:54:13.460 --> 00:54:17.400
So when Python 3.8 or whatever it was came out, now I had a completely different

00:54:17.400 --> 00:54:18.780
set of changes I had to make for that one.

00:54:18.880 --> 00:54:21.320
And it was really just not working out.

00:54:21.320 --> 00:54:22.960
It was not very easy to maintain.

00:54:22.960 --> 00:54:25.300
And I was watching cibuildwheel.

00:54:25.300 --> 00:54:27.620
It was this package.

00:54:27.620 --> 00:54:30.840
It was a Python package that would do this.

00:54:30.840 --> 00:54:34.100
And it didn't matter what CI system you were on because it was written in Python.

00:54:34.100 --> 00:54:40.760
And it followed nice Python principles for good package design and had unit tests and all that sort of stuff.

00:54:40.760 --> 00:54:41.960
So it looked really good.

00:54:41.960 --> 00:54:43.220
There were a couple of things that were missing.

00:54:43.220 --> 00:54:47.960
I came in, I made PRs for the things that I'd come up with that it didn't have.

00:54:47.960 --> 00:54:48.760
And they got accepted.

00:54:48.760 --> 00:54:52.840
And there was a shared maintainer between pybind11 and cibuildwheel as well.

00:54:52.840 --> 00:54:55.200
I think that's one of the reasons that I sort of had heard about it.

00:54:55.200 --> 00:54:55.980
I was really watching it.

00:54:55.980 --> 00:54:57.880
And I finally decided just to make the switch.

00:54:57.880 --> 00:55:02.520
And at some point, a little later, I actually became a maintainer of cibuildwheel.

00:55:02.760 --> 00:55:09.580
But I think I started doing the switch before that. It made it really easy, once I was a maintainer, to say, oh, this is a package that, you know, we have some control over.

00:55:09.580 --> 00:55:09.920
It's okay.

00:55:09.920 --> 00:55:10.360
Let's just.

00:55:10.360 --> 00:55:10.880
Right.

00:55:10.880 --> 00:55:13.160
Your package is a choice to depend upon this.

00:55:13.160 --> 00:55:14.360
Cause we have a say.

00:55:14.360 --> 00:55:16.480
It just took out all of that maintenance.

00:55:16.480 --> 00:55:20.520
And now Dependabot does all the maintenance for us.

00:55:20.520 --> 00:55:23.060
It moves the pin for the pinned cibuildwheel.

00:55:23.060 --> 00:55:23.460
And that's it.

00:55:23.460 --> 00:55:23.980
Nice.

00:55:23.980 --> 00:55:32.740
So if I want to accomplish, if I'm a package developer owner, and I want to share that package with everybody,

00:55:32.740 --> 00:55:37.740
we've already determined we would ideally want to have a wheel, but getting that wheel is hard.

00:55:37.740 --> 00:55:43.180
So cibuildwheel will let you integrate it as the name indicates into your continuous integration.

00:55:43.180 --> 00:55:46.660
And one of those steps of CI could be build the wheel, right?

00:55:46.660 --> 00:55:54.320
It reduces it down to pretty much that: there's a step in your CI that says, you know, run cibuildwheel.

00:55:54.320 --> 00:55:59.360
And then cibuildwheel is designed to really integrate nicely with the build matrix.

00:55:59.360 --> 00:56:06.920
So for a fairly simple package, or for many packages, you can really just have Mac, Windows, and Linux in the same job.

00:56:06.920 --> 00:56:07.940
In GitHub Actions,

00:56:07.940 --> 00:56:09.000
It's easy to do the same job.

00:56:09.000 --> 00:56:11.620
And then call cibuildwheel.

00:56:11.620 --> 00:56:13.080
And that's about it.

00:56:13.080 --> 00:56:16.540
It just goes through all the different versions of Python that are supported.

00:56:16.540 --> 00:56:21.000
It just goes through and makes a wheel for each.
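
NOTE
Editor's sketch, not from the episode: the "matrix" can be pictured as the product of
Python versions and platforms, one wheel per cell. The identifier style (cp39-macosx_arm64
and so on) follows cibuildwheel's build identifiers, but the particular selection below is
a hypothetical example, and not every combination exists in practice.
```python
from itertools import product
# Hypothetical selection; a real project would pick its own versions and platforms.
PYTHONS = ["cp37", "cp38", "cp39", "cp310"]
PLATFORMS = ["manylinux_x86_64", "musllinux_x86_64", "macosx_x86_64", "macosx_arm64", "win32", "win_amd64"]
def build_identifiers(pythons, platforms):
    """Return one '<python>-<platform>' identifier per cell of the matrix."""
    return [f"{py}-{plat}" for py, plat in product(pythons, platforms)]
identifiers = build_identifiers(PYTHONS, PLATFORMS)
print(len(identifiers), "wheels to build, starting with", identifiers[0])
```
Even this small selection gives two dozen wheels, which is why doing it by hand does not scale.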

00:56:21.000 --> 00:56:26.860
And in fact, it even has one feature that was really nice, something I'd always struggled with a bit, which is testing.

00:56:27.080 --> 00:56:32.040
So if you give it a test command, it will even take your package and install it in a new environment.

00:56:32.040 --> 00:56:33.920
That's not, you know, in a different directory.

00:56:33.920 --> 00:56:35.520
That's not related to your build at all.

00:56:35.520 --> 00:56:38.240
And make sure it works and passes whatever test you give it.

00:56:38.240 --> 00:56:38.880
And,

00:56:38.880 --> 00:56:40.320
We'll do that across the platforms.

00:56:40.320 --> 00:56:42.840
We'll do like a macOS test and a Windows test.

00:56:42.840 --> 00:56:43.280
Yeah.

00:56:43.280 --> 00:56:43.840
For each.

00:56:43.840 --> 00:56:47.960
So cibuildwheel really just sees the platform it's sitting on because it's inside the build matrix.

00:56:47.960 --> 00:56:49.460
And so it's run for each.

00:56:49.460 --> 00:56:52.100
And yeah, it does it for each one.

00:56:52.100 --> 00:56:55.400
It will run that test.

00:56:55.400 --> 00:56:57.520
And the simplest test is just echo.

00:56:57.520 --> 00:56:58.920
And that will just make sure it installs.

00:56:58.920 --> 00:57:03.240
'Cause it won't try to install your wheel unless there's something in that test command.
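
NOTE
Editor's sketch, not from the episode: one way to pass the test command is through the
CIBW_TEST_COMMAND and CIBW_TEST_REQUIRES environment variables. The helper name
cibw_test_env and the tests/ directory are hypothetical; {project} is cibuildwheel's
placeholder for the project root.
```python
import os
def cibw_test_env(test_command="echo", test_requires=""):
    """Return a copy of the environment with cibuildwheel's test options set."""
    env = dict(os.environ)
    env["CIBW_TEST_COMMAND"] = test_command
    if test_requires:
        env["CIBW_TEST_REQUIRES"] = test_requires
    return env
# Smoke test: only checks that the wheel and its dependencies install.
smoke = cibw_test_env()
# Fuller test: assumes the project keeps a pytest suite under tests/.
full = cibw_test_env("pytest {project}/tests", "pytest")
print(smoke["CIBW_TEST_COMMAND"])
print(full["CIBW_TEST_COMMAND"])
```
Either dictionary could be passed as env= to subprocess.run when launching cibuildwheel.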

00:57:03.240 --> 00:57:04.740
Even that's useful sometimes.

00:57:04.740 --> 00:57:09.100
Even that's broken sometimes because of NumPy not supporting one of those things in that matrix.

00:57:09.100 --> 00:57:09.720
Yeah.

00:57:09.720 --> 00:57:11.080
It can't install the dependencies.

00:57:11.080 --> 00:57:12.660
So that step fails or something.

00:57:12.660 --> 00:57:17.120
So it says it currently supports GitHub Actions, Azure Pipelines,

00:57:17.400 --> 00:57:20.380
which I don't know how long those are going to be two separate things.

00:57:20.380 --> 00:57:23.820
Maybe they'll always be separate, but Microsoft owning GitHub be like,

00:57:23.820 --> 00:57:26.100
they're saying do stuff in Azure pipelines.

00:57:26.100 --> 00:57:27.380
And then they're kind of moving.

00:57:27.380 --> 00:57:28.520
Like, yeah, I think they're similar.

00:57:28.520 --> 00:57:29.660
The runners are the same.

00:57:29.660 --> 00:57:31.280
They actually have the same environments.

00:57:31.280 --> 00:57:35.460
so I think they'll exist just as two different interfaces probably.

00:57:35.460 --> 00:57:39.500
And Azure is not so tied to GitHub and it has more of an enterprise type.

00:57:39.500 --> 00:57:40.300
Yeah, for sure.

00:57:40.300 --> 00:57:41.780
It definitely has a different focus.

00:57:41.780 --> 00:57:44.900
It was just a rewrite and a better rewrite in most cases of it.

00:57:44.900 --> 00:57:45.680
I got to learn.

00:57:45.680 --> 00:57:46.180
Yeah.

00:57:46.300 --> 00:57:47.720
I think GitHub Actions came second.

00:57:47.720 --> 00:57:48.040
All right.

00:57:48.040 --> 00:57:53.800
So then Travis CI, AppVeyor, CircleCI, and GitLab CI, at least all of those, right?

00:57:53.800 --> 00:57:56.160
At least those are the ones we test on.

00:57:56.160 --> 00:57:58.480
And then, it runs locally.

00:57:58.480 --> 00:58:00.880
there are some limitations to running it locally.

00:58:00.880 --> 00:58:06.220
If you target Linux, any system that has Docker can target

00:58:06.220 --> 00:58:08.560
Linux. You can just ask to build Linux wheels.

00:58:08.560 --> 00:58:11.400
You can actually run it from, like, my Mac or from Windows.
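
NOTE
Editor's sketch, not from the episode: a local Linux-only run boils down to one command,
provided Docker can be launched. The helper name linux_build_command is hypothetical; the
--platform and --output-dir options are cibuildwheel command-line options. The command is
only assembled here, not executed.
```python
import shutil
def linux_build_command(project_dir=".", output_dir="wheelhouse"):
    """Build the argv for a local Linux-only cibuildwheel run (not executed here)."""
    return ["cibuildwheel", "--platform", "linux", "--output-dir", output_dir, project_dir]
command = linux_build_command()
# The Linux builds run inside manylinux containers, so Docker has to be reachable.
docker_available = shutil.which("docker") is not None
print(" ".join(command))
print("docker found on PATH:", docker_available)
```
To run it for real, hand the list to subprocess.run(command, check=True).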

00:58:11.560 --> 00:58:12.520
I assume from a Windows machine.

00:58:12.520 --> 00:58:15.640
I tried Windows with Docker and, Windows.

00:58:15.640 --> 00:58:20.160
It does install to a standard location, C:\cibuildwheel.

00:58:20.160 --> 00:58:22.320
But other than that, it's safe to run it there.

00:58:22.320 --> 00:58:25.560
And macOS, it will install to your macOS system.

00:58:25.560 --> 00:58:27.080
It installs system versions of Python.

00:58:27.220 --> 00:58:29.060
So that's something we haven't solved yet.

00:58:29.060 --> 00:58:30.300
Might be able to someday.

00:58:30.300 --> 00:58:34.840
So it's not a good idea unless you really are okay with installing every version of Python

00:58:34.840 --> 00:58:37.400
that ever existed into your system.

00:58:37.400 --> 00:58:40.160
maybe get a VM of your Mac.

00:58:40.160 --> 00:58:41.720
The Python.org Python.

00:58:41.720 --> 00:58:42.540
Yeah.

00:58:42.540 --> 00:58:42.760
Yeah.

00:58:42.760 --> 00:58:43.840
I mean, it's somewhat safe.

00:58:43.840 --> 00:58:49.900
If you're on Windows, you could use Windows Subsystem for Linux, WSL, as well.

00:58:50.220 --> 00:58:51.740
In addition to Docker, I suspect.

00:58:51.740 --> 00:58:53.020
Although I haven't tried that.

00:58:53.020 --> 00:58:57.900
manylinux has to run. You could, I'm sure, as long as you can launch Docker. The thing

00:58:57.900 --> 00:59:02.000
that you have to be able to do is launch Docker because you have to use the manylinux

00:59:02.000 --> 00:59:05.140
Docker images, or you should use that or a derivative of that.

00:59:05.140 --> 00:59:09.880
There's lots of rules to exactly what can be in the environment and things like that.

00:59:09.880 --> 00:59:12.080
And PyPA maintains that.

00:59:12.080 --> 00:59:17.320
One thing that also helps is that the main manylinux maintainer is also a

00:59:17.320 --> 00:59:18.220
cibuildwheel maintainer.

00:59:18.340 --> 00:59:22.060
So that's one reason that those things fit well together.

00:59:22.060 --> 00:59:25.080
Features tend to match and come out at the same time.

00:59:25.080 --> 00:59:27.620
Like musllinux, which is a big thing recently.

00:59:27.620 --> 00:59:30.640
It's not actually in a released version of cibuildwheel yet.

00:59:30.640 --> 00:59:31.700
What is musllinux?

00:59:31.700 --> 00:59:37.260
So normal Linux is based on glibc, and that's actually what controls it.

00:59:37.260 --> 00:59:39.520
It's one of two things that controls manylinux.

00:59:39.520 --> 00:59:43.600
So, can you download the binary wheel or do you have to build?

00:59:43.600 --> 00:59:48.100
If you have an old version of pip... They had to teach pip about each

00:59:48.100 --> 00:59:48.980
version of manylinux.

00:59:48.980 --> 00:59:50.340
That was a mess.

00:59:50.340 --> 00:59:53.520
So they eventually switched to a standard numbering system.

00:59:53.520 --> 00:59:55.040
That is your glibc number.

00:59:55.040 --> 00:59:56.200
And now pip come.

00:59:56.200 --> 00:59:56.760
Doesn't.

00:59:56.760 --> 01:00:00.680
And the current pip will be able to install a future manylinux as long as your system's compatible.

01:00:00.680 --> 01:00:02.280
But, that was a big problem.

01:00:02.280 --> 01:00:05.700
So pip 9 can only install manylinux1.

01:00:05.700 --> 01:00:06.780
It can't install a later manylinux.

01:00:06.780 --> 01:00:08.840
Even if your glibc is fine for it.

01:00:08.840 --> 01:00:14.600
So the other thing is the glibc version, and manylinux1 was based on

01:00:14.600 --> 01:00:16.420
CentOS 5, Red Hat 5.

01:00:16.420 --> 01:00:21.980
manylinux2010 was CentOS 6; manylinux2014 was CentOS 7.

01:00:22.220 --> 01:00:26.680
And then now they switched to Debian because of CentOS sort of switching to the

01:00:26.680 --> 01:00:27.560
stream model.

01:00:27.560 --> 01:00:32.980
So manylinux_2_24 is glibc 2.24.

01:00:32.980 --> 01:00:36.020
And that's Debian 8 or something like that.
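
NOTE
Editor's sketch, not from the episode: the "standard numbering system" means the tag
itself names the glibc version it needs, so an installer can compare numbers instead of
knowing each tag by name. The glibc numbers for the older tags (2.5, 2.12, 2.17) are not
stated in the episode; they are the legacy aliases from PEP 600. Function names are
hypothetical.
```python
import re
# Legacy aliases defined by PEP 600.
LEGACY_ALIASES = {"manylinux1": "manylinux_2_5", "manylinux2010": "manylinux_2_12", "manylinux2014": "manylinux_2_17"}
def required_glibc(tag):
    """Return the (major, minor) glibc version a manylinux tag requires."""
    tag = LEGACY_ALIASES.get(tag, tag)
    match = re.fullmatch(r"manylinux_(\d+)_(\d+)", tag)
    if match is None:
        raise ValueError(f"not a manylinux tag: {tag}")
    return int(match.group(1)), int(match.group(2))
def glibc_allows(tag, system_glibc):
    """True if a system with this glibc version can use wheels carrying the tag."""
    return required_glibc(tag) <= system_glibc
# A hypothetical system with glibc 2.17, as on CentOS 7.
for tag in ["manylinux1", "manylinux2010", "manylinux2014", "manylinux_2_24"]:
    print(tag, glibc_allows(tag, (2, 17)))
```
A musllinux tag is deliberately rejected here, since it is not glibc-based at all.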

01:00:36.020 --> 01:00:38.440
But that's glibc-based.

01:00:38.440 --> 01:00:43.900
There are distributions that are not glibc-based, most notably Alpine, very used

01:00:43.900 --> 01:00:44.480
Alpine.

01:00:44.480 --> 01:00:47.100
it's this tiny, tiny little Docker image.

01:00:47.100 --> 01:00:51.480
It's a really fun distribution to use if you're on Docker. It actually sounds fun to

01:00:51.480 --> 01:00:56.120
install, but I've never tried it without Docker. But it's these five-megabyte Docker

01:00:56.120 --> 01:00:58.660
wheels... or, Docker doesn't do wheels.

01:00:58.660 --> 01:01:00.420
Docker images, Docker images.

01:01:00.420 --> 01:01:00.620
Yeah.

01:01:00.620 --> 01:01:02.960
But that doesn't use glibc.

01:01:02.960 --> 01:01:03.780
That uses musl.

01:01:03.780 --> 01:01:07.020
And so musllinux will run on Alpine.

01:01:07.020 --> 01:01:07.620
Okay.

01:01:07.620 --> 01:01:08.060
Got it.

01:01:08.060 --> 01:01:12.040
So if you're building for the platform Alpine and similar ones, right.

01:01:12.040 --> 01:01:14.060
So anything.

01:01:14.060 --> 01:01:15.380
Yeah.

01:01:15.380 --> 01:01:17.280
And you said, I can run this locally as well.

01:01:17.280 --> 01:01:22.600
I know I would use it in CI cause I'm trying, I've got that matrix of all the versions

01:01:22.600 --> 01:01:28.320
of CPython and PyPy, P-Y-P-Y, and then all the platforms, and I want to check as many

01:01:28.320 --> 01:01:30.800
of those boxes as possible to put wheels in it.

01:01:30.800 --> 01:01:31.040
Right.

01:01:31.040 --> 01:01:31.780
yeah.

01:01:31.780 --> 01:01:37.080
Suppose I'm on my Mac and I want to make use of this to fill in, maybe do some testing,

01:01:37.080 --> 01:01:38.600
at least on some of these columns.

01:01:38.600 --> 01:01:39.880
Like, how do I do that?

01:01:39.880 --> 01:01:41.040
What's the benefit there?

01:01:41.040 --> 01:01:42.800
Well, I can tell you the case where it happened.

01:01:42.800 --> 01:01:50.620
So we were shipping CMake, and the scikit-build organization ran out of Travis credits

01:01:50.620 --> 01:01:52.900
and, they were being built.

01:01:52.900 --> 01:01:57.000
We hadn't switched them over to being emulated builds on GitHub Actions yet.

01:01:57.100 --> 01:01:58.200
And it just ran out.

01:01:58.200 --> 01:01:59.320
We couldn't, we couldn't build them.

01:01:59.320 --> 01:02:02.160
And one of them had been missed and we also weren't waiting to upload.

01:02:02.160 --> 01:02:07.020
So we had uploaded everything, but we had one set, or maybe it was all of the emulated

01:02:07.020 --> 01:02:07.360
builds.

01:02:07.360 --> 01:02:09.460
I think it was one set that didn't, didn't work.

01:02:09.660 --> 01:02:13.260
And so we wanted to go ahead and upload those, those missing wheels.

01:02:13.260 --> 01:02:16.360
And I tried, but I couldn't actually get emulation,

01:02:16.360 --> 01:02:20.320
Docker QEMU, Q-E-M-U, emulation.

01:02:20.320 --> 01:02:20.960
What?

01:02:20.960 --> 01:02:22.340
I couldn't get that working on my Mac.

01:02:22.340 --> 01:02:29.660
So the manylinux maintainer used his Linux machine, and he had QEMU emulation

01:02:29.660 --> 01:02:34.740
on it, and he built the emulated images. It took a few hours, but he just built them locally

01:02:34.740 --> 01:02:36.980
and then uploaded them, filling in the missing wheels.

01:02:36.980 --> 01:02:43.460
So if I'm maintaining a package, I've got some package I'm putting on PyPI, and I want

01:02:43.460 --> 01:02:44.500
to test it.

01:02:44.500 --> 01:02:47.760
Does it make sense to do it locally, or does it just make sense to put it on some,

01:02:47.760 --> 01:02:48.820
Some CI system.

01:02:49.180 --> 01:02:54.280
For cibuildwheel, usually I do some local testing, but I'm also developing cibuildwheel.

01:02:54.280 --> 01:02:59.040
But, you know, usually it's probably fine to do this just in your CI, and you

01:02:59.040 --> 01:03:00.960
usually don't want to run the full thing

01:03:00.960 --> 01:03:04.060
every time. Usually you have your regular unit tests, but cibuildwheel is going to be

01:03:04.060 --> 01:03:08.040
a lot slower because it's going through and it's making each set of wheels and launching

01:03:08.040 --> 01:03:09.280
Docker images and things like that.

01:03:09.280 --> 01:03:13.380
And it's installing Python each time for macOS and Windows.

01:03:13.380 --> 01:03:17.400
So usually... If you have a fairly quick build, I've seen some

01:03:17.400 --> 01:03:19.600
people just run cibuildwheel as part of their test suite.

01:03:19.600 --> 01:03:22.380
But usually you just run it, say, right before release.

01:03:22.380 --> 01:03:25.480
I usually do it once before the release and then on the release.

01:03:25.480 --> 01:03:26.000
Right.

01:03:26.000 --> 01:03:26.320
Exactly.

01:03:26.320 --> 01:03:26.560
Okay.

01:03:26.560 --> 01:03:27.120
That makes sense.

01:03:27.120 --> 01:03:29.680
Cause it's a pretty heavyweight type of operation.

01:03:30.000 --> 01:03:35.820
So when I look at all these different platforms, I see macOS Intel, macOS Apple Silicon, different

01:03:35.820 --> 01:03:37.180
bitnesses of Windows.

01:03:37.180 --> 01:03:43.340
And then I think about CI systems, you know, what CI systems can I use that support all these

01:03:43.340 --> 01:03:43.540
things?

01:03:43.540 --> 01:03:48.520
Like, does GitHub Actions support both versions of macOS, for example, plus Windows?

01:03:48.620 --> 01:03:52.480
GitHub Actions is by far our most popular platform.

01:03:52.480 --> 01:03:53.380
It switched very quickly.

01:03:53.380 --> 01:03:54.180
It used to be Travis.

01:03:54.180 --> 01:03:56.660
Travis was a challenge 'cause they didn't do Windows very well.

01:03:56.660 --> 01:03:57.800
They still don't do Windows very well.

01:03:57.800 --> 01:04:02.380
And it's a challenge for us because we actually can't run our macOS tests on them anymore.

01:04:02.380 --> 01:04:07.340
Because once we joined the PyPA, the billing became an issue and we basically just

01:04:07.340 --> 01:04:09.940
lost macOS running for it.

01:04:09.940 --> 01:04:16.700
But CircleCI, I think, Azure, and GitHub Actions, I think they do all three.

01:04:16.700 --> 01:04:18.860
And you can always split things up.

01:04:18.860 --> 01:04:22.220
I always do Travis for the Linux and then AppVeyor for Windows.

01:04:22.220 --> 01:04:24.060
You can do it that way.

01:04:24.060 --> 01:04:29.220
One of the big things that I had developed for cibuildwheel was the pyproject.toml,

01:04:29.220 --> 01:04:34.740
or any TOML, configuration for cibuildwheel.

01:04:34.740 --> 01:04:40.620
That way you can get your cibuildwheel configuration out of your YAML files.

01:04:40.620 --> 01:04:41.940
That way it works locally.

01:04:41.940 --> 01:04:45.900
Which is one of the main things I was after, but also you can just do it and

01:04:45.900 --> 01:04:47.440
then run on several different systems.

01:04:47.440 --> 01:04:51.700
Like, you might like the fact that Travis is, I think, the only one that does the

01:04:51.700 --> 01:04:53.860
native strange architectures.

01:04:53.860 --> 01:04:57.980
You have to emulate it other places, which is a lot slower, five times slower or something.

01:04:58.200 --> 01:04:58.400
Yeah.

01:04:58.400 --> 01:05:02.760
So kind of split that up, get the definition and then create maybe multiple

01:05:02.760 --> 01:05:04.440
CI jobs.

01:05:04.440 --> 01:05:05.840
Your CI scripts are really simple.

01:05:05.840 --> 01:05:05.960
Yeah.

01:05:05.960 --> 01:05:06.520
Yeah.

01:05:06.520 --> 01:05:07.120
Yeah.

01:05:07.120 --> 01:05:07.480
Very cool.

01:05:07.480 --> 01:05:09.700
The example script is just a few lines.

01:05:09.700 --> 01:05:12.220
It doesn't, it does not take much to do this comparing.

01:05:12.220 --> 01:05:12.800
Oh yeah.

01:05:12.800 --> 01:05:14.440
Hundreds of lines it used to take.

01:05:14.440 --> 01:05:15.340
Yeah, sure.

01:05:15.340 --> 01:05:16.360
And I didn't even scroll down here.

01:05:16.360 --> 01:05:22.200
You've got a nice grid on github.com/pypa/cibuildwheel that shows on GitHub Actions,

01:05:22.200 --> 01:05:24.680
which is supported on Azure pipelines.

01:05:24.680 --> 01:05:25.400
What's supported.

01:05:25.400 --> 01:05:26.240
It's not right.

01:05:26.240 --> 01:05:27.680
Circle CI doesn't do this.

01:05:27.680 --> 01:05:29.660
No, but yeah.

01:05:29.660 --> 01:05:34.220
AppVeyor, Travis, Azure, and GitHub do.

01:05:34.220 --> 01:05:36.780
Travis does macOS, but we can't test it.

01:05:36.780 --> 01:05:38.360
Theoretically, it does it.

01:05:38.360 --> 01:05:39.300
Gotcha.

01:05:39.300 --> 01:05:46.100
And then, yeah, I wonder about the M1, the Apple Silicon ARM versions versus the Intel versions.

01:05:46.240 --> 01:05:49.620
I don't know how, how well that's permeated into the world yet.

01:05:49.620 --> 01:05:52.040
but the fact that they have Mac at all is kind of impressive.

01:05:52.040 --> 01:05:53.900
Nobody has an M1 runner yet.

01:05:53.900 --> 01:05:59.600
there are a few places I think now that you can purchase time on one, but no runners.

01:05:59.600 --> 01:06:03.580
I mean, last I checked GitHub Actions, you couldn't even run it yourself on the M1.

01:06:03.960 --> 01:06:05.880
that may be, that may have changed.

01:06:05.880 --> 01:06:06.280
I don't know.

01:06:06.280 --> 01:06:07.080
That was a while back.

01:06:07.080 --> 01:06:07.380
Yeah.

01:06:07.380 --> 01:06:10.620
I mean, there are some crazy, places out there.

01:06:10.620 --> 01:06:13.040
I think there's one called Mac mini Colo.

01:06:13.040 --> 01:06:14.040
I think that's what it's called.

01:06:14.040 --> 01:06:16.900
Let me see if that's, yeah, I think that's it.

01:06:17.120 --> 01:06:17.520
Yeah.

01:06:17.520 --> 01:06:24.780
So you can get, you can go to these places like Mac mini Colo and get, get a whole bunch

01:06:24.780 --> 01:06:28.120
of Mac minis and put them into this crazy data center.

01:06:28.120 --> 01:06:35.080
But you know, that's not the same as I upload a text file into GitHub that says run on Azure

01:06:35.080 --> 01:06:36.480
or on GitHub Actions.

01:06:36.480 --> 01:06:37.540
And then that's the end of it.

01:06:37.540 --> 01:06:37.700
Right.

01:06:37.700 --> 01:06:42.660
You probably got to set up your whole, like some whole build system into a set of minis.

01:06:42.660 --> 01:06:45.340
And like, that doesn't sound very practical for most people.

01:06:45.700 --> 01:06:49.340
Ideally what you could do is, I mean, you just need one mini, and then you set up

01:06:49.340 --> 01:06:53.140
a GitHub actions, hosted runner, a locally hosted runner.

01:06:53.140 --> 01:06:57.560
And other systems do that too. GitLab CI was big on that.

01:06:57.560 --> 01:06:59.720
You can do anything on GitLab CI.

01:06:59.720 --> 01:07:01.760
We just haven't tested that because they don't have those publicly.

01:07:01.760 --> 01:07:04.920
But, if you, if you have your own, you can do that.

01:07:04.920 --> 01:07:09.140
I know somebody who does this, basically with ROOT, and has a Mac mini

01:07:09.140 --> 01:07:11.000
and runs the M1 builds on that.

01:07:11.000 --> 01:07:13.000
But, you could do that.

01:07:13.000 --> 01:07:15.480
And I have a Mac mini, and the lead developer of cibuildwheel

01:07:15.560 --> 01:07:18.400
also has a Mac mini, or an M1.

01:07:18.400 --> 01:07:20.020
He has an M1 of something.

01:07:20.020 --> 01:07:20.400
I don't know.

01:07:20.400 --> 01:07:21.260
I have a Mac mini.

01:07:21.260 --> 01:07:22.020
Mine is Mac.

01:07:22.020 --> 01:07:24.160
That's what I'm, talking to you right now on.

01:07:24.160 --> 01:07:25.400
It's a fantastic little machine.

01:07:25.400 --> 01:07:25.900
Yeah.

01:07:25.900 --> 01:07:27.320
It's, it's very impressive.

01:07:27.320 --> 01:07:28.620
I love the way the boost histogram.

01:07:28.620 --> 01:07:29.200
It was fast.

01:07:29.200 --> 01:07:34.620
I have a 16-inch, almost maxed-out MacBook, and the Mac mini M1,

01:07:34.720 --> 01:07:36.440
it was faster on boost-histogram than this thing.

01:07:36.440 --> 01:07:36.940
Wow.

01:07:36.940 --> 01:07:37.440
Yeah.

01:07:37.440 --> 01:07:42.020
I have a maxed-out 15-inch, a little bit older, a couple of years, but I just don't

01:07:42.020 --> 01:07:45.780
touch that thing unless I literally need it as a laptop because I want to be somewhere else.

01:07:45.780 --> 01:07:47.600
But yeah, I'm definitely not drawn to it.

01:07:47.600 --> 01:07:52.120
These, so you could probably set up one of these minis for 700 bucks and then tie it up.

01:07:52.180 --> 01:07:56.820
But that's again, not as easy as, you know, just clicking the public free option that works,

01:07:56.820 --> 01:07:59.600
but still it's, it's within the realm of possibility.

01:07:59.600 --> 01:08:00.160
Yeah.

01:08:00.160 --> 01:08:04.860
And Apple has actually helped out several. Like, I know Homebrew and a few others they've

01:08:04.860 --> 01:08:10.080
helped out by giving them either Mac minis or something that they

01:08:10.080 --> 01:08:10.700
could build with.

01:08:10.700 --> 01:08:17.080
So I believe Homebrew actually builds on real

01:08:17.080 --> 01:08:17.540
M1s.

01:08:17.540 --> 01:08:20.060
I know it does because the builds are super fast.

01:08:20.060 --> 01:08:20.640
I remember that.

01:08:20.860 --> 01:08:24.600
Like, it builds ROOT in like 20 minutes, the ROOT recipe, because I maintain that.

01:08:24.600 --> 01:08:29.240
And the normal one takes about an hour, running on multiple cores, but it's like

01:08:29.240 --> 01:08:30.020
three times faster.

01:08:30.020 --> 01:08:31.560
It's done in 20 minutes.

01:08:31.560 --> 01:08:33.780
I just thought something was wrong when I first saw that.

01:08:33.780 --> 01:08:34.460
That's it.

01:08:34.460 --> 01:08:35.140
How could it be done?

01:08:35.140 --> 01:08:36.160
Something broke.

01:08:36.160 --> 01:08:36.600
What broke?

01:08:36.600 --> 01:08:37.240
Interesting.

01:08:37.240 --> 01:08:40.520
All right, Henry, we're getting really short on time, a little bit over, but it's been a

01:08:40.520 --> 01:08:41.140
fun conversation.

01:08:41.140 --> 01:08:42.660
How about you give us a look at the future?

01:08:42.660 --> 01:08:43.420
Where are things going?

01:08:43.420 --> 01:08:45.840
with all the stuff.

01:08:45.840 --> 01:08:50.780
The next thing I'm interested in being involved with is scikit-build, which

01:08:50.840 --> 01:08:57.220
is a package that currently sort of augments setuptools, but hopefully will eventually

01:08:57.220 --> 01:09:01.380
sort of replace setuptools as the thing that you build with.

01:09:01.380 --> 01:09:02.900
And it will call out to CMake.

01:09:02.900 --> 01:09:06.360
So you basically just write a CMake file.

01:09:06.620 --> 01:09:11.160
And this could wrap an existing package, or maybe you need some of the other things that

01:09:11.160 --> 01:09:11.700
CMake has.

01:09:11.700 --> 01:09:16.380
And this will then let you build that as a regular Python package.

01:09:16.380 --> 01:09:19.820
In fact, recently somebody sort of put together cibuildwheel,

01:09:19.820 --> 01:09:20.360
Yeah.

01:09:20.360 --> 01:09:25.520
scikit-build, and the CMake example, and built LLVM and pulled out just the

01:09:25.520 --> 01:09:28.140
clang-format tool and made wheels out of that.

01:09:28.500 --> 01:09:30.580
And now you can just do pip install clang-format.

01:09:30.580 --> 01:09:32.000
It's one to two megabytes.

01:09:32.000 --> 01:09:34.260
It works on all systems, including Apple Silicon and things.

01:09:34.260 --> 01:09:37.540
I just tried it on Apple Silicon yesterday and it's a pip install.

01:09:37.540 --> 01:09:39.380
Now you can clang-format C++ code.

01:09:39.380 --> 01:09:41.320
And that's just, you know, mind blowing.

01:09:41.320 --> 01:09:42.320
You can add it to pre-commit.

01:09:42.320 --> 01:09:44.460
On pre-commit.ci, it runs too.

01:09:44.460 --> 01:09:48.980
I mean, I'd been fighting for about a week to reduce the size of the clang-format

01:09:48.980 --> 01:09:52.540
recipe from 600 megabytes to just under the 250.

01:09:52.540 --> 01:09:54.500
That was the maximum for pre-commit.ci.

01:09:54.500 --> 01:09:59.800
And then you can now pip install it at under about a megabyte for Linux, that sort of thing.

01:09:59.800 --> 01:10:04.780
And I think that would be a really great thing to work on.

01:10:04.780 --> 01:10:08.940
It's been around since 2014, but it needs some serious work.

01:10:08.940 --> 01:10:13.280
And so I'm currently actually working on writing a grant to try to get funded to just work

01:10:13.280 --> 01:10:17.820
on basically the scikit-build system, and looking for interesting science

01:10:17.820 --> 01:10:23.340
use cases that would be interested in adopting it, or switching an existing build

01:10:23.340 --> 01:10:24.940
system over or adapting to it,

01:10:24.940 --> 01:10:30.400
or taking something that has never been available from Python and making it available.

01:10:30.400 --> 01:10:32.080
And yes, ROOT might be one.

01:10:32.080 --> 01:10:33.480
Scikit-build package.

01:10:33.480 --> 01:10:35.280
I'm looking for wide variety.

01:10:35.280 --> 01:10:35.840
Yeah.

01:10:35.840 --> 01:10:36.300
How neat.

01:10:36.300 --> 01:10:40.500
The scikit-build package is fundamentally just the glue between the setuptools Python module and

01:10:40.500 --> 01:10:40.840
CMake.

01:10:40.840 --> 01:10:41.360
Yeah.

01:10:41.360 --> 01:10:45.800
So it's a real way to take some of these things that were based on CMake and sort of expose

01:10:45.800 --> 01:10:46.440
them to Python.

01:10:46.800 --> 01:10:46.960
Yeah.

01:10:46.960 --> 01:10:51.380
So you can just have a CMake package that does all the CMake things well, you know,

01:10:51.380 --> 01:10:55.720
like finding different libraries and that. I'm a big CMake person.

01:10:55.720 --> 01:10:57.520
But high energy physics uses it very heavily.

01:10:57.520 --> 01:10:58.960
Most C++ does.

01:10:58.960 --> 01:10:59.960
It's about 60%,

01:10:59.960 --> 01:11:03.160
I think, of all build systems that are CMake-based now,

01:11:03.160 --> 01:11:06.300
going from Kitware's numbers, but they make CMake.

01:11:06.300 --> 01:11:10.060
But I think it's very powerful.

01:11:10.180 --> 01:11:11.300
It can be used for things like that.

01:11:11.300 --> 01:11:18.200
And it will really open up much easier, more natural builds in C++ and C and

01:11:18.200 --> 01:11:19.020
Fortran and things like that,

01:11:19.020 --> 01:11:20.660
and CUDA, than is currently available.

01:11:20.660 --> 01:11:23.980
Distutils is going away in Python 3.12.

01:11:23.980 --> 01:11:29.580
Setuptools is not really designed to build C++ packages.

01:11:29.580 --> 01:11:34.820
It was really just a hack on top of distutils, which happened to be built just to build Python itself.

01:11:34.820 --> 01:11:40.440
So, well, scikit-build sounds like the perfect tool to apply to the science space because

01:11:40.440 --> 01:11:45.500
there's so many of these weird compiled things that are challenging to, you know, install and

01:11:45.500 --> 01:11:47.140
deploy and share and so on.

01:11:47.140 --> 01:11:48.300
So making that easier.

01:11:48.300 --> 01:11:48.840
Sounds good.

01:11:48.840 --> 01:11:49.480
All right.

01:11:49.480 --> 01:11:53.700
Well, I think we're probably going to need to leave it there just for the sake of time,

01:11:53.700 --> 01:11:59.020
but it's been awesome to talk about all the internals of supporting Scikit-HEP,

01:11:59.020 --> 01:12:02.380
and people should check out cibuildwheel.

01:12:02.380 --> 01:12:06.720
It looks like, you know, if you're maintaining a package either publicly or just internally

01:12:06.720 --> 01:12:08.640
for your organization, it looks like it'd be a big help.

01:12:08.640 --> 01:12:09.040
Yeah.

01:12:09.040 --> 01:12:11.080
If it's got any sort of binary build in it.

01:12:11.080 --> 01:12:11.300
Yes.

01:12:11.300 --> 01:12:12.160
Yeah, absolutely.

01:12:12.160 --> 01:12:13.560
If not, build is fine.

01:12:13.560 --> 01:12:14.080
Yeah.

01:12:14.080 --> 01:12:14.520
Right.

01:12:14.520 --> 01:12:16.880
And I learned about build, which is good to know.

01:12:16.880 --> 01:12:17.680
All right.

01:12:17.680 --> 01:12:20.880
So before you get out of here, Henry, let me ask you the two final questions.

01:12:20.880 --> 01:12:25.800
If you're going to write some code, I mean, like Python code, what editor would you use?

01:12:25.800 --> 01:12:27.680
Depends on how much. It'll either be vi,

01:12:27.680 --> 01:12:29.260
if it's a very small amount,

01:12:29.260 --> 01:12:34.920
or if it's a really large project that, let's say, takes several days, then I'll use

01:12:34.920 --> 01:12:35.660
PyCharm.

01:12:35.660 --> 01:12:38.600
And then I've really started using VS Code quite a bit.

01:12:38.600 --> 01:12:42.200
And that's sort of expanding to fill in all the middle ground and kind of eating in on both

01:12:42.200 --> 01:12:43.920
of the others, both of the edges.

01:12:43.920 --> 01:12:44.620
Yeah.

01:12:44.620 --> 01:12:45.020
Yeah.

01:12:45.020 --> 01:12:46.220
There's some interesting stuff going there.

01:12:46.220 --> 01:12:46.600
Good choice.

01:12:46.600 --> 01:12:50.640
But all with the vi mode or plugins added, of course.

01:12:51.060 --> 01:12:53.180
And then, notable PyPI package.

01:12:53.180 --> 01:12:55.160
I mean, we probably talked about 20 already.

01:12:55.160 --> 01:12:57.580
If you want to just give a shout out to one of those, that's fine.

01:12:57.580 --> 01:12:58.760
Or if you got a new idea.

01:12:58.760 --> 01:13:03.480
I'm going to go with one that's unlikely to get mentioned, but I'm really

01:13:03.480 --> 01:13:04.260
excited by it.

01:13:04.260 --> 01:13:09.680
I think the developer is quite new, but what he's actually

01:13:09.680 --> 01:13:12.460
done as far as the actual package has been nice.

01:13:12.460 --> 01:13:14.140
It needs some nice touches.

01:13:14.360 --> 01:13:19.300
But that is plotext, P-L-O-T-E-X-T.

01:13:19.300 --> 01:13:23.680
And I'm really excited about that because the actual plots it makes are

01:13:23.680 --> 01:13:24.900
really, really nice.

01:13:24.900 --> 01:13:28.520
And they're plotted to the terminal, and it can integrate with Rich.

01:13:28.520 --> 01:13:32.760
And of course, I'm interested in it because

01:13:32.760 --> 01:13:38.720
I want to see it integrated with Textual. I think a Textual app that combines this with

01:13:38.720 --> 01:13:41.440
file browsers and things like that

01:13:41.440 --> 01:13:42.220
would be incredible.

01:13:42.220 --> 01:13:42.820
Yeah.

01:13:42.820 --> 01:13:45.220
So you can do things in the terminal, for example.

01:13:45.220 --> 01:13:45.740
Yeah.

01:13:45.740 --> 01:13:51.160
So you could, like, cruise around your files, use your ROOT I/O integration, pull

01:13:51.160 --> 01:13:53.560
these things up here and, you know, put the plot right on the screen.

01:13:53.560 --> 01:13:53.780
Right.

01:13:53.780 --> 01:13:54.620
But in the terminal.

01:13:54.620 --> 01:13:55.160
Okay.

01:13:55.160 --> 01:13:55.500
Yeah.

01:13:55.500 --> 01:13:56.080
This is really cool.

01:13:56.080 --> 01:13:56.720
I had no idea.

01:13:56.720 --> 01:13:57.960
And this is based on Rich,

01:13:57.960 --> 01:13:58.400
you say?

01:13:58.400 --> 01:13:59.920
It can integrate with Rich.

01:13:59.920 --> 01:14:00.680
It integrates with Rich.

01:14:00.680 --> 01:14:00.940
Okay.

01:14:00.940 --> 01:14:01.420
Got it.

01:14:01.420 --> 01:14:01.620
Yeah.

01:14:01.620 --> 01:14:04.800
So as soon as I saw it, I started trying to make sure the two people were talking to each

01:14:04.800 --> 01:14:05.040
other.

01:14:05.040 --> 01:14:08.000
Will and the person who is developing this.

01:14:08.000 --> 01:14:08.800
Yeah, exactly.

01:14:08.800 --> 01:14:09.160
All right.

01:14:09.160 --> 01:14:10.020
These things work together.

01:14:10.020 --> 01:14:10.900
That's very cool.

01:14:11.260 --> 01:14:12.380
They seem like they should, right?

01:14:12.380 --> 01:14:14.100
They're in the same general zone.

01:14:14.100 --> 01:14:14.680
Yeah.

01:14:14.680 --> 01:14:15.420
And they do now.

01:14:15.420 --> 01:14:19.500
There had to be some communication back and forth as far as what size the plots

01:14:19.500 --> 01:14:19.680
were.

01:14:19.680 --> 01:14:20.300
Right.

01:14:20.300 --> 01:14:21.800
This should, this should work in it.

01:14:21.800 --> 01:14:22.620
A good recommendation.

01:14:22.620 --> 01:14:24.600
Definitely one I had not learned about.

01:14:24.600 --> 01:14:25.860
So I'm sure people will enjoy that.

01:14:25.860 --> 01:14:27.280
All right, Henry, final call to action.

01:14:27.280 --> 01:14:31.520
People want to do more with wheels, cibuildwheel, or maybe some of the other stuff we talked

01:14:31.520 --> 01:14:31.760
about.

01:14:31.760 --> 01:14:32.340
What do you tell them?

01:14:32.340 --> 01:14:36.980
I think one of the best places to go is the Scikit-HEP developer pages,

01:14:36.980 --> 01:14:40.380
even if you have no interest in Scikit-HEP tools or HEP at all.

01:14:40.740 --> 01:14:43.480
That sort of shows you how all these things integrate together really well.

01:14:43.480 --> 01:14:46.440
And it has nice documentation.

01:14:46.440 --> 01:14:48.300
Of course, cibuildwheel itself is nice.

01:14:48.300 --> 01:14:54.080
And the PyPA, a lot of the PyPA projects have gotten good documentation, as well as

01:14:54.080 --> 01:14:55.020
packaging.python.org.

01:14:55.140 --> 01:14:56.440
We've updated that quite a bit

01:14:56.440 --> 01:15:01.540
to reflect some of these things, but I really like the Scikit-HEP

01:15:01.540 --> 01:15:03.020
developer pages.

01:15:03.020 --> 01:15:04.460
I mean, I'm biased because I wrote most of them.

01:15:04.460 --> 01:15:06.340
Nice.

01:15:06.340 --> 01:15:06.780
Yeah.

01:15:06.780 --> 01:15:07.660
I'll link to those.

01:15:07.660 --> 01:15:11.120
And I'll try to link to pretty much everything else that we spoke about as well.

01:15:11.120 --> 01:15:14.420
So people can check out the podcast player show notes to find all that stuff.

01:15:14.480 --> 01:15:18.540
I guess one final thing that we didn't call out that I think is worth pointing out is cibuildwheel

01:15:18.540 --> 01:15:21.200
is under the PyPA, the Python Packaging Authority.

01:15:21.200 --> 01:15:24.120
So it gives it some officialness, I guess you should say.

01:15:24.120 --> 01:15:24.820
Yes.

01:15:24.820 --> 01:15:28.740
That happened after I joined. One of the first things I wanted to do was, I thought this

01:15:28.740 --> 01:15:29.760
should really be in the PyPA.

01:15:30.220 --> 01:15:34.820
And I was sort of pushing for that and the other developers were fine with that.

01:15:34.820 --> 01:15:39.720
And so we brought it up, and I actually joined the PyPA just before that by becoming

01:15:39.720 --> 01:15:40.420
a member of build.

01:15:40.420 --> 01:15:44.700
So I got to vote on cibuildwheel coming in, but it was a very enthusiastic vote, even

01:15:44.700 --> 01:15:45.300
without my vote.

01:15:45.300 --> 01:15:48.060
And pipx joined right at the same time too.

01:15:48.060 --> 01:15:50.080
So it was an exciting time.

01:15:50.080 --> 01:15:51.440
pipx is a great library.

01:15:51.440 --> 01:15:53.800
I really like the way pipx works.

01:15:53.800 --> 01:15:54.440
It's a great tool.

01:15:54.440 --> 01:15:56.380
All right, Henry, thank you for being here.

01:15:56.380 --> 01:15:57.160
It's been great.

01:15:57.380 --> 01:16:01.660
Thanks for all the insight on all these internals around building and installing Python packages.

01:16:01.660 --> 01:16:03.220
There's also a lot more in my blog.

01:16:03.220 --> 01:16:05.740
So, iscinumpy.gitlab.io.

01:16:05.740 --> 01:16:09.360
So that's also a link that links to all those other things, obviously, too.

01:16:09.360 --> 01:16:10.680
Thanks again for being here.

01:16:10.680 --> 01:16:11.100
Yeah.

01:16:11.100 --> 01:16:11.420
See ya.

01:16:11.420 --> 01:16:11.980
Thanks for having me.

01:16:11.980 --> 01:16:12.380
Yeah.

01:16:12.380 --> 01:16:12.680
You bet.

01:16:12.680 --> 01:16:16.080
This has been another episode of Talk Python To Me.

01:16:16.080 --> 01:16:20.760
Our guest on this episode was Henry Schreiner, and it's brought to you by us over at Talk

01:16:20.760 --> 01:16:24.080
Python Training, and the transcripts were brought to you by AssemblyAI.

01:16:24.940 --> 01:16:27.540
Do you need a great automatic speech-to-text API?

01:16:27.540 --> 01:16:30.060
Get human-level accuracy in just a few lines of code.

01:16:30.060 --> 01:16:32.920
Visit talkpython.fm/assemblyai.

01:16:32.920 --> 01:16:34.680
Want to level up your Python?

01:16:34.680 --> 01:16:38.740
We have one of the largest catalogs of Python video courses over at Talk Python.

01:16:38.740 --> 01:16:43.900
Our content ranges from true beginners to deeply advanced topics like memory and async.

01:16:43.900 --> 01:16:46.580
And best of all, there's not a subscription in sight.

01:16:46.580 --> 01:16:49.480
Check it out for yourself at training.talkpython.fm.

01:16:49.640 --> 01:16:51.380
Be sure to subscribe to the show.

01:16:51.380 --> 01:16:54.160
Open your favorite podcast app and search for Python.

01:16:54.160 --> 01:16:55.480
We should be right at the top.

01:16:55.480 --> 01:17:01.280
You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the

01:17:01.280 --> 01:17:04.820
direct RSS feed at /rss on talkpython.fm.

01:17:05.700 --> 01:17:08.260
We're live streaming most of our recordings these days.

01:17:08.260 --> 01:17:12.380
If you want to be part of the show and have your comments featured on the air, be sure to

01:17:12.380 --> 01:17:16.100
subscribe to our YouTube channel at talkpython.fm/youtube.

01:17:16.100 --> 01:17:17.940
This is your host, Michael Kennedy.

01:17:17.940 --> 01:17:19.240
Thanks so much for listening.

01:17:19.240 --> 01:17:20.400
I really appreciate it.

01:17:20.660 --> 01:17:22.320
Now get out there and write some Python code.

01:17:22.320 --> 01:17:43.080
I'll see you next time.