WEBVTT

00:00:00.001 --> 00:00:05.700
We've spoken previously about security and software supply chains, and we're back at it on this episode.

00:00:05.700 --> 00:00:10.560
We're diving in again with Charlie Coggins. Charlie works at a software supply chain company

00:00:10.560 --> 00:00:16.900
and is on the episode to give us an insider's look and a defender's perspective on how to keep

00:00:16.900 --> 00:00:25.620
our Python apps and infrastructure safe. This is Talk Python To Me, episode 457, recorded January 24th, 2024.

00:00:25.620 --> 00:00:45.000
Welcome to Talk Python To Me, a weekly podcast on Python. This is your host, Michael Kennedy.

00:00:45.000 --> 00:00:50.060
Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython,

00:00:50.060 --> 00:00:55.600
both on fosstodon.org. Keep up with the show and listen to over seven years of past

00:00:55.600 --> 00:01:02.260
episodes at talkpython.fm. We've started streaming most of our episodes live on YouTube. Subscribe to

00:01:02.260 --> 00:01:07.800
our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and

00:01:07.800 --> 00:01:13.820
be part of that episode. This episode is brought to you by Sentry. Don't let those errors go unnoticed.

00:01:13.820 --> 00:01:20.880
Use Sentry like we do here at Talk Python. Sign up at talkpython.fm/sentry. And it's brought to

00:01:20.880 --> 00:01:27.480
you by Mailtrap, an email delivery platform that developers love. Use their email sandbox to inspect

00:01:27.540 --> 00:01:32.220
and debug emails in staging, dev, and QA environments before sending them to recipients in

00:01:32.220 --> 00:01:40.040
production. Try Mailtrap for free at talkpython.fm/mailtrap. Hey, Charlie, welcome to Talk Python To Me.

00:01:40.040 --> 00:01:40.860
Hi, Michael.

00:01:40.860 --> 00:01:47.400
Great to have you here. We have corresponded back and forth about security things. And now,

00:01:48.200 --> 00:01:54.720
you're scared. It's going to seem that way. There are threats everywhere, especially when you start

00:01:54.720 --> 00:02:00.440
looking. And that's the problem. You look, you'll find them. And if you're not looking, you might get

00:02:00.440 --> 00:02:03.140
affected without even knowing it.

00:02:03.140 --> 00:02:07.920
Yeah, that's true. But we're also going to come with some tools and techniques and tips on

00:02:07.920 --> 00:02:11.300
how to avoid security problems with your Python code.

00:02:11.300 --> 00:02:12.540
Yes, absolutely.

00:02:12.540 --> 00:02:21.020
Yeah, I think it's especially concerning. That certainly catches my attention that if you mess

00:02:21.020 --> 00:02:27.740
with somebody's software, like the software builders, the developers, it gets shipped to however many

00:02:27.740 --> 00:02:31.960
users are on the other side of that equation, right? It's not like I just took over some

00:02:31.960 --> 00:02:39.560
teenager's gaming PC, and now what can I do? It's like, I took over, you know, name your big web app,

00:02:39.560 --> 00:02:44.660
and now we're going to start shipping some stuff around. All right, that's where the sort of

00:02:44.660 --> 00:02:50.120
multiplicative aspect of this gets more concerning than just standard personal computer safety, right?

00:02:50.120 --> 00:02:58.080
Oh, absolutely. You know, a single developer can have very broad impacts. You know, maybe they

00:02:58.080 --> 00:03:04.240
publish one package, but that one package could be included in hundreds, thousands of other packages

00:03:04.240 --> 00:03:10.200
as a dependency. And then everyone using those packages could be affected, whether the code is

00:03:10.200 --> 00:03:15.220
good and works as intended or poorly written and has bugs and vulnerabilities.
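
NOTE
A minimal sketch of the multiplier effect described above, assuming a made-up dependency graph. All package names here are hypothetical and the graph is far smaller than anything on a real package index. It walks the graph in reverse to list everything that directly or transitively depends on one package.
```python
from collections import deque
# Hypothetical dependency graph: each package maps to the packages it depends on.
DEPENDS_ON = {
    "web-app": ["api-client", "forms"],
    "api-client": ["validation-lib"],
    "forms": ["validation-lib"],
    "cli-tool": ["api-client"],
    "validation-lib": [],
}
def affected_by(package, depends_on):
    """Return every package that directly or transitively depends on `package`."""
    # Invert the graph so we can walk from a dependency to its dependents.
    dependents = {name: set() for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)
    # Breadth-first walk over the inverted edges.
    seen = set()
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
print(sorted(affected_by("validation-lib", DEPENDS_ON)))
```
In this toy graph a problem in the single leaf package reaches every other package, which is the same shape as one popular library being pulled in by hundreds of thousands of projects.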

00:03:15.220 --> 00:03:21.860
Yeah, this is not to say there's any chance of there being a problem with Pydantic. But just to make

00:03:21.860 --> 00:03:27.520
your point, if you go to like Pydantic or Requests or something like that, a lot of these list "used by"

00:03:27.520 --> 00:03:35.460
projects, right? And this Pydantic is used by 315,000 people, not people, software projects that

00:03:35.460 --> 00:03:40.760
themselves have users, right? And so that's the kind of stuff that I'm thinking about when I said that

00:03:40.760 --> 00:03:44.400
multiplicative effect, right? It's a big multiplier, not just a couple.

00:03:44.400 --> 00:03:45.980
Oh, yeah. Yeah, for sure.

00:03:45.980 --> 00:03:51.240
Yeah. Now, before we dive into our main topic, of course, you know, tell people a bit about yourself.

00:03:51.240 --> 00:03:59.080
Hi, well, my name is Charles Coggins. I usually go by Charlie. And I'm a Python developer. I'm a

00:03:59.080 --> 00:04:03.240
software developer, but not in the traditional sense. I don't have a computer science degree. I

00:04:03.240 --> 00:04:11.560
didn't come to this, you know, straight out of school. I got my first taste of programming long

00:04:11.560 --> 00:04:18.620
ago, back in the 80s, in 1987. My dad got a computer for us. And, you know, I was messing

00:04:18.620 --> 00:04:25.100
around on there with some games, always with games, right? You know, at the time, it was BASIC.

00:04:25.100 --> 00:04:30.160
You know, it was this bowling game that my brother and I would play. And I saw that I could look at the

00:04:30.160 --> 00:04:34.700
code. I could look at the source. And I went in there and modified it a bit to make it so that I would

00:04:34.700 --> 00:04:36.740
always win whenever I played him.

00:04:38.340 --> 00:04:39.960
How long did it take him to catch on?

00:04:39.960 --> 00:04:44.640
Oh, he figured it out pretty quickly. And he was in there changing, you know, ball speed and,

00:04:44.640 --> 00:04:52.880
you know, how often he could get a strike or get a gutter. But yeah, I, you know, took a class or two in high

00:04:52.880 --> 00:04:59.620
school and college. But I was an electrical engineering major and then went to work for the government

00:04:59.620 --> 00:05:07.460
doing something that wasn't even really that. So I spent 10 years working for the government before

00:05:07.460 --> 00:05:17.580
they stood up the U.S. Cyber Command and decided or figured out that they needed to hire 6,000 new

00:05:17.580 --> 00:05:24.080
developers to fill the positions. And there weren't that many available in the industry, let alone those

00:05:24.080 --> 00:05:30.040
who could, you know, pass the clearances and work in that environment. So they looked to people already

00:05:30.040 --> 00:05:35.480
working in the government. And I raised my hand. I said, yes, yes, I want to cross train. I'll be a

00:05:35.480 --> 00:05:38.040
developer. And so they trained me.

00:05:38.040 --> 00:05:41.200
Excellent. What language did they teach you in that program?

00:05:41.200 --> 00:05:49.540
We started with C, C++, and then there was some Python. So I went through a couple of boot camps and

00:05:49.540 --> 00:05:55.560
a lot of self-learning, self-teaching. Python's the one that really clicked for me. It just made sense

00:05:55.560 --> 00:05:56.580
in my head.

00:05:56.580 --> 00:06:02.720
Yeah, of course. If you're learning to do cybersecurity stuff, you know, a lot of times I'd be happy to tell

00:06:02.720 --> 00:06:09.240
people like, ah, you don't really need to learn C or Rust or Java. If you just know Python, you're

00:06:09.240 --> 00:06:15.560
probably 90% of the time golden. But if you're trying to do cybersecurity, a lot of times it's about

00:06:15.560 --> 00:06:21.720
like the machine level stuff, right? Understanding things like C and pointers and buffer overflows and

00:06:21.720 --> 00:06:24.980
all of that kind of stuff is where you actually kind of need to be.

00:06:24.980 --> 00:06:30.860
And they taught us all that as well. In fact, we learned assembly language as well. And that one

00:06:30.860 --> 00:06:33.040
really didn't fit in my brain.

00:06:33.040 --> 00:06:38.100
You weren't like, I want to become an assembly language programmer?

00:06:38.580 --> 00:06:41.380
I mean, yeah, that's a whole different breed.

00:06:41.380 --> 00:06:47.680
Yeah, it sure is. And, you know, it used to be, I remember when I first got into programming,

00:06:47.680 --> 00:06:53.800
I was doing some C, C++ and inline assembly was something people would do a lot to optimize. A

00:06:53.800 --> 00:06:59.880
lot like people might do Cython or Numba or something like that to make Python fast. Like we'll find this

00:06:59.880 --> 00:07:03.560
little part and we'll rewrite it in this way and be like, we're just going to do inline assembly. I'm like,

00:07:03.560 --> 00:07:09.660
that just doesn't seem worthwhile. I don't need that much performance. We're going to not do that.

00:07:09.660 --> 00:07:10.600
Yeah. Yeah.

00:07:10.600 --> 00:07:20.580
Fun. So now you're working at Phylum. Is it Python focused or just software security?

00:07:20.740 --> 00:07:28.920
It's not Python focused. In fact, the company primarily develops with Rust, as you were mentioning.

00:07:28.920 --> 00:07:29.840
Okay. Yeah.

00:07:29.840 --> 00:07:36.560
Yeah. Yeah. We've got some excellent Rust developers at our company. And I think that's what attracted

00:07:36.560 --> 00:07:43.420
a lot of them is that that is the primary language we use. But we also have some elements in Python. And

00:07:43.420 --> 00:07:51.020
when I came on board, I got assigned to work on our integrations. So like GitHub,

00:07:51.020 --> 00:07:59.080
integrations, GitLab, pre-commit hooks, things like that. And so I was able to kind of architect it the

00:07:59.080 --> 00:08:06.920
way I thought best. And because I love Python, I made it all in Python and exposed it through Docker

00:08:06.920 --> 00:08:07.320
containers.

00:08:07.320 --> 00:08:17.980
Are you doing direct integration with Rust, like PyO3? Or is it more issuing commands out?

00:08:17.980 --> 00:08:25.000
The Rust elements that our company works on, like our API, the command line interface, a lot of the

00:08:25.000 --> 00:08:31.120
backend, it's just written straight Rust. And then the Python is just plain Python. There's no

00:08:31.120 --> 00:08:33.900
interface between the two, really.

00:08:33.900 --> 00:08:37.920
Yeah. Okay. Consuming APIs and Docker containers and stuff like that.

00:08:37.920 --> 00:08:45.020
Right. Right. Right. Although I am interested in PyO3. And I think there's room to

00:08:45.020 --> 00:08:47.660
bridge the two languages at our company.

00:08:47.660 --> 00:08:56.780
I mean, for sure, people are adopting Rust for the performance foundations of Python. It's pretty

00:08:56.780 --> 00:08:57.120
interesting.

00:08:57.120 --> 00:09:03.880
Yeah. Yeah. I've been at the company almost two years now. I keep saying it's what I'm going to

00:09:03.880 --> 00:09:09.060
learn next is Rust. And I felt like I would just kind of absorb it by going through code reviews and

00:09:09.060 --> 00:09:14.480
the people on my team. It hasn't happened yet. I can kind of understand what's going on by reading it,

00:09:14.600 --> 00:09:17.000
but I just, yeah, I need to jump in.

00:09:17.000 --> 00:09:19.540
It's in depth. Okay. Got it. Those are the same. Okay. Got it.

00:09:19.540 --> 00:09:20.120
Yeah. Yeah.

00:09:20.120 --> 00:09:26.300
No, it's interesting. Okay. Well, we're not here to talk about Rust, although I do think

00:09:26.300 --> 00:09:32.220
it's becoming one of those things that is sort of, I don't know, if you need to go one level

00:09:32.220 --> 00:09:38.200
deeper in the Python space that used to be C and now it's, I think it's pretty solidly moving to be

00:09:38.200 --> 00:09:43.240
Rust, right? There's a lot of popular things. Pydantic, for example, I pulled up earlier where

00:09:43.240 --> 00:09:46.600
that's the foundation, but that also seems to be where the momentum is.

00:09:46.600 --> 00:09:53.040
Yeah. The oxidation of Python libraries is a real thing. I mean, you know, look at Ruff, right?

00:09:53.040 --> 00:10:01.260
Yeah. Ruff. I just heard about Granian, I think it was, which is new,

00:10:01.440 --> 00:10:08.800
similar to Gunicorn and uWSGI, and is a Rust-based async server. You know, it goes on and on.

00:10:08.800 --> 00:10:15.700
This portion of Talk Python is brought to you by OpenTelemetry support at Sentry.

00:10:16.580 --> 00:10:22.200
In the previous two episodes, you heard how we use Sentry's error monitoring at Talk Python and how

00:10:22.200 --> 00:10:27.880
distributed tracing connects errors, performance, and slowdowns, and more across services and tiers.

00:10:27.880 --> 00:10:34.200
But you may be thinking, our company uses OpenTelemetry, so it doesn't make sense for us to switch to

00:10:34.200 --> 00:10:40.620
Sentry. After all, OpenTelemetry is a standard and you've already adopted it, right? Well, did you know,

00:10:40.940 --> 00:10:45.520
with just a couple of lines of code, you can connect OpenTelemetry's monitoring and reporting

00:10:45.520 --> 00:10:51.680
to Sentry's backend. OpenTelemetry does not come with a backend to store your data, analytics on top

00:10:51.680 --> 00:10:58.060
of that data, a UI, or error monitoring. And that's exactly what you get when you integrate Sentry with

00:10:58.060 --> 00:11:05.060
your OpenTelemetry setup. Don't fly blind. Fix and monitor code faster with Sentry. Integrate your

00:11:05.060 --> 00:11:09.760
OpenTelemetry systems with Sentry and see what you've been missing. Create your Sentry account at

00:11:09.760 --> 00:11:16.180
talkpython.fm/sentry-telemetry. And when you sign up, use the code TALKPYTHON,

00:11:16.180 --> 00:11:20.980
all caps, no spaces. It's good for two free months of Sentry's business plan, which will give you

00:11:20.980 --> 00:11:27.240
20 times as many monthly events as well as other features. My thanks to Sentry for supporting Talk

00:11:27.240 --> 00:11:34.420
Python To Me. All right, well, let's talk about software security though. You know, like we touched

00:11:34.420 --> 00:11:39.280
on it a little bit with the multiplicative aspect of like why software developers should care.

00:11:39.580 --> 00:11:46.260
But maybe let's start with some ways in which viruses might get on a computer from a software

00:11:46.260 --> 00:11:51.560
perspective. Not from like, oh, you know, I found this cool app on BitTorrent and normally it's paid,

00:11:51.560 --> 00:11:56.200
but this one's free. It's like, maybe don't install that. But, you know, not that kind of advice,

00:11:56.200 --> 00:11:58.540
right? But, you know, specifically for software developers.

00:11:59.060 --> 00:12:07.020
Right, right. So for software developers, I think the primary vector, you know, for malicious code

00:12:07.020 --> 00:12:12.940
running in your environment or really any developer environment along the way, it doesn't just have to

00:12:12.940 --> 00:12:19.320
be your system. It could be your CI/CD servers and your runners. It's going to be software

00:12:19.320 --> 00:12:26.160
dependencies, third-party code, right? Code from strangers on the internet, right? That's really what it boils down to.

00:12:26.160 --> 00:12:32.760
They just, Charlie, they're just here to help out. They're just giving you the code to help out.

00:12:32.880 --> 00:12:35.100
They have no bad intentions. Right, right.

00:12:35.100 --> 00:12:38.200
Except for that one. That one over there, don't take that one.

00:12:39.100 --> 00:12:49.840
Yeah. And it's hard to tell, you know, what's good, what's bad. And I think we all rely on third-party code.

00:12:49.840 --> 00:12:57.980
I mean, I think it's a rare company, rare project that writes everything from scratch on their own without any dependencies.

00:12:57.980 --> 00:12:59.000
Yeah.

00:12:59.320 --> 00:13:07.900
So that's a vector for sure, is allowing code from strangers on the internet to run. I think, like, the name of the game, right,

00:13:07.900 --> 00:13:16.200
for attackers and threat actors is arbitrary code execution. Like, that's the key phrase.

00:13:16.200 --> 00:13:21.980
Arbitrary code execution. If I can get arbitrary code execution with this vulnerability, then I've won.

00:13:21.980 --> 00:13:22.620
I can attack your stuff.

00:13:22.620 --> 00:13:26.180
You're going to get a CVSS score of nine or above. It's right there.

00:13:26.440 --> 00:13:32.180
Yeah, exactly. And that's for vulnerabilities. That's just, you know, poorly written code or code with bugs.

00:13:32.180 --> 00:13:41.940
But forget about vulnerabilities. I mean, if you're an attacker, you're a threat actor, you've already got the perfect means to run arbitrary code,

00:13:41.940 --> 00:13:46.760
to gain arbitrary code execution on a developer system. That's with third-party dependencies.

00:13:46.760 --> 00:13:55.920
Open source software is just the perfect target for writing malware or slipping malware into packages.

00:13:55.920 --> 00:14:02.340
Now, when people hear this, we've talked about it enough. It actually came as quite a surprise a few years ago.

00:14:02.340 --> 00:14:12.200
People theoretically knew that it could happen, but it turned out it actually was happening: packages on package indexes like PyPI and npm and so on

00:14:12.200 --> 00:14:16.720
were being published with malware that people could then install and make part of their own software.

00:14:16.840 --> 00:14:19.260
But there's a whole software supply chain, right?

00:14:19.260 --> 00:14:21.980
Maybe talk us through some of the different elements that make that up.

00:14:21.980 --> 00:14:24.400
Only one of which is these libraries, right?

00:14:24.400 --> 00:14:25.480
That's right. That's right.

00:14:25.480 --> 00:14:35.380
So software supply chain security is really about using third-party code securely, as well as securing the end-to-end development process.

00:14:35.840 --> 00:14:39.960
So that process is very broadly broken into three phases.

00:14:39.960 --> 00:14:43.400
You've got the source phase.

00:14:43.400 --> 00:14:49.020
That's source control management systems and then actually coding.

00:14:49.020 --> 00:14:54.400
Developers coding on their systems, committing to repositories.

00:14:55.400 --> 00:14:55.960
Yeah.

00:14:55.960 --> 00:15:01.160
And you mentioned the dependencies like pip install this or that.

00:15:01.160 --> 00:15:13.120
There's also, for many of the really popular IDEs and editors, there's a whole massive array of varying levels of trusted plugins or extensions, right?

00:15:13.120 --> 00:15:14.020
As well.

00:15:14.020 --> 00:15:14.900
That's right.

00:15:14.900 --> 00:15:15.220
Yeah.

00:15:15.220 --> 00:15:18.840
Like Visual Studio Code, that's what I use for my IDE.

00:15:19.340 --> 00:15:22.960
It's got an extensive extension ecosystem.

00:15:22.960 --> 00:15:26.060
Just about anything you want to do.

00:15:26.060 --> 00:15:30.920
I get a little pop-up when I open a new project and it says, oh, I recognize you're using a YAML file.

00:15:30.920 --> 00:15:35.080
Do you want to download this extension that will lint YAML files, right?

00:15:35.080 --> 00:15:35.940
Yeah.

00:15:35.940 --> 00:15:37.200
I got one for CSVs.

00:15:37.200 --> 00:15:42.180
It was like rainbow CSV syntax highlighter or something.

00:15:42.180 --> 00:15:43.380
I'm like, you know what?

00:15:43.380 --> 00:15:46.680
That's not really made by a trusted company.

00:15:46.680 --> 00:15:47.760
It's probably fine.

00:15:47.760 --> 00:15:48.800
It's probably fine.

00:15:49.080 --> 00:15:56.760
But I don't need my CSV files highlighted so much so that I'm willing to just run arbitrary code from a stranger on the internet.

00:15:56.760 --> 00:15:57.700
That's right.

00:15:57.700 --> 00:15:58.300
Right?

00:15:58.300 --> 00:15:58.560
Yep.

00:15:58.560 --> 00:16:07.840
And, you know, I use both PyCharm and VS Code and they both, especially PyCharm, has sort of a warning that says, this is untrusted.

00:16:07.840 --> 00:16:08.960
It's a third-party thing.

00:16:08.960 --> 00:16:10.060
Are you sure you want it?

00:16:10.060 --> 00:16:10.360
Right.

00:16:10.360 --> 00:16:11.420
Just saying no.

00:16:11.420 --> 00:16:12.520
That's a pretty light warning.

00:16:12.520 --> 00:16:13.280
Yeah.

00:16:13.280 --> 00:16:15.800
And also, they're not the same, right?

00:16:15.800 --> 00:16:18.720
Is it installed by a million people and used every day?

00:16:18.820 --> 00:16:21.300
Or are you the fourth person to use it?

00:16:21.300 --> 00:16:27.540
And it hasn't, you know, hasn't had the experience of people going, why is it opening a network socket?

00:16:27.540 --> 00:16:28.180
What's it doing?

00:16:28.180 --> 00:16:29.740
You know, something like that.

00:16:31.320 --> 00:16:31.720
Yeah.

00:16:31.720 --> 00:16:32.020
Yeah.

00:16:32.020 --> 00:16:36.840
That's another entry point you got to be careful about.

00:16:36.840 --> 00:16:37.560
All right.

00:16:37.560 --> 00:16:38.540
Well, I cut you off.

00:16:38.540 --> 00:16:40.920
We're only in like square one of maybe nine.

00:16:40.920 --> 00:16:41.020
Yeah.

00:16:41.020 --> 00:16:42.560
Square one, source code.

00:16:42.840 --> 00:16:45.720
And then there's the build phase.

00:16:45.720 --> 00:16:48.160
That's where you take the code.

00:16:48.160 --> 00:16:50.780
You take the commits that have gone into source control.

00:16:50.780 --> 00:16:54.100
And you build something with it, right?

00:16:54.100 --> 00:17:00.580
This usually happens in, you know, your CI/CD systems, the GitHubs and GitLabs of the world.

00:17:02.440 --> 00:17:12.200
And it's at that point where, you know, your third party dependencies get included and wrapped up into your artifacts, right?

00:17:13.160 --> 00:17:20.960
Which brings us to the third stage of the software supply chain, which is the package and deploy phase.

00:17:20.960 --> 00:17:27.440
That's where you're creating your artifacts and making them available to the world to use.

00:17:27.880 --> 00:17:28.600
Could be anything.

00:17:28.600 --> 00:17:34.320
Could be a wheel for a library that other parts of your company use to build software.

00:17:34.320 --> 00:17:34.900
Yep.

00:17:34.900 --> 00:17:37.140
Could be some app you ship.

00:17:37.140 --> 00:17:39.740
It could actually be a website, an API.

00:17:39.740 --> 00:17:40.560
Who knows, right?

00:17:40.560 --> 00:17:41.060
Yeah.

00:17:41.060 --> 00:17:42.840
Docker container.

00:17:42.840 --> 00:17:43.300
Docker container.

00:17:43.300 --> 00:17:43.620
Yeah.

00:17:43.620 --> 00:17:44.180
Yeah.

00:17:44.180 --> 00:17:44.840
Yeah, exactly.

00:17:44.840 --> 00:17:57.440
And then by the time you get to that, you know, the end of the supply chain and, you know, the products or the packaged product that people are going to see and use and work with, you know,

00:17:57.480 --> 00:18:11.660
you've baked in so many elements at that point, you know, from your third party dependencies to, you know, any other external resources that are getting called.

00:18:11.660 --> 00:18:17.100
So there's lots of points along the way that it's possible to.

00:18:17.100 --> 00:18:18.140
Yeah.

00:18:18.140 --> 00:18:26.000
One of the things that can be sneaky is, you know, it doesn't happen that often in Python, but you're shipping like a Windows or a Mac app.

00:18:26.360 --> 00:18:32.160
There's a digital signature proof of we're going to sign this with our trusted certificate.

00:18:32.160 --> 00:18:34.340
So it doesn't even give you any warnings.

00:18:34.340 --> 00:18:36.700
Like, look, this is signed by the company.

00:18:36.700 --> 00:18:37.560
It is trusted.

00:18:37.560 --> 00:18:38.660
Here you go.

00:18:38.660 --> 00:18:39.440
Pick it.

00:18:39.440 --> 00:18:39.920
Right.

00:18:39.920 --> 00:18:45.900
And somewhere upstream from that, there's an issue like with packages or other things.

00:18:45.900 --> 00:18:50.240
Well, now that problem is signed and verified as well.

00:18:50.640 --> 00:18:50.780
Yeah.

00:18:50.780 --> 00:18:51.280
Yeah.

00:18:51.280 --> 00:18:51.320
Yeah.

00:18:51.320 --> 00:18:57.180
You know, so you mentioned code signing. The research team at our company,

00:18:57.180 --> 00:19:00.400
I mean, they're amazing, amazing group there.

00:19:00.400 --> 00:19:03.580
They're always finding new and novel attacks.

00:19:04.080 --> 00:19:15.700
And one they found just this past week involves something kind of cool where the attacker had bundled up a valid Microsoft binary.

00:19:15.700 --> 00:19:17.160
It had been signed by Microsoft.

00:19:17.620 --> 00:19:22.740
But they bundled it with a DLL that was malicious.

00:19:22.740 --> 00:19:25.700
It was named something to be expected.

00:19:25.700 --> 00:19:26.060
Right.

00:19:26.140 --> 00:19:38.180
So when you run the executable, the binary, you know, you see that there's this Microsoft-signed application looking for permissions, looking to continue.

00:19:38.180 --> 00:19:39.100
And you're like, oh, yeah, great.

00:19:39.100 --> 00:19:39.680
Signed by Microsoft.

00:19:39.680 --> 00:19:40.080
No problem.

00:19:40.080 --> 00:19:45.680
But then it uses this technique called like DLL search order hijacking.

00:19:45.680 --> 00:19:47.060
Okay.

00:19:47.060 --> 00:19:47.520
That technique.

00:19:47.520 --> 00:19:47.760
Right.

00:19:47.760 --> 00:19:56.180
So if a DLL that the application calls is sitting more locally, closer to the application, that's the one that gets loaded.

00:19:56.180 --> 00:19:57.300
So it's looking for something in like.

00:19:57.300 --> 00:19:58.400
Yeah.

00:19:58.400 --> 00:20:04.420
It'll look for the name of the DLL in the same directory first, basically, is what's happening.

00:20:04.420 --> 00:20:04.860
Right.

00:20:04.860 --> 00:20:05.320
Right.

00:20:05.320 --> 00:20:09.040
They had shipped their bad DLL with a good binary.
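
NOTE
A simplified sketch of the search-order idea described above, not the real Windows loader, which applies more rules than this. The function name, directory names, and file name are all hypothetical. It models a loader that checks an ordered list of directories and returns the first match, so a file planted next to the executable shadows the system copy.
```python
import tempfile
from pathlib import Path
def resolve_library(name, search_dirs):
    """Return the first existing path for `name`, checking directories in order."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
with tempfile.TemporaryDirectory() as root:
    app_dir = Path(root) / "app"          # directory holding the signed executable
    system_dir = Path(root) / "system"    # stand-in for the OS library directory
    app_dir.mkdir()
    system_dir.mkdir()
    (system_dir / "helper.dll").write_text("legitimate")
    (app_dir / "helper.dll").write_text("planted by attacker")
    # The application directory is searched first, so the planted copy wins.
    found = resolve_library("helper.dll", [app_dir, system_dir])
    print(found.parent.name, found.read_text())
```
The signature on the executable says nothing about which copy of the library ends up loaded, which is why the signed binary in the story could still run attacker code.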

00:20:09.040 --> 00:20:20.640
So you pick something in System32 that's got like a real common name, like vcruntime-whatever.dll or, you know, some of the standard ones.

00:20:20.640 --> 00:20:22.820
But then you completely reprogram it.

00:20:22.820 --> 00:20:23.220
Yeah.

00:20:23.220 --> 00:20:24.400
And stick it in there with that app.

00:20:24.400 --> 00:20:26.780
Or maybe not completely because you need the app to not crash.

00:20:26.780 --> 00:20:30.260
But you give it some extra boost when it does something.

00:20:30.260 --> 00:20:30.560
Right.

00:20:30.560 --> 00:20:31.020
Yeah.

00:20:31.020 --> 00:20:31.620
Yeah.

00:20:31.620 --> 00:20:40.740
In this case, they had just copied all the files needed for execution into a new directory, including the known good binary, the known bad DLL.

00:20:40.740 --> 00:20:44.120
And then, you know, it had everything it needed in that directory to run.

00:20:44.120 --> 00:20:45.600
And it looked like it was legitimate.

00:20:45.600 --> 00:20:46.220
Right.

00:20:46.220 --> 00:20:53.740
Because a lot of these OS checks are on the executable, not the libraries that it uses.

00:20:53.740 --> 00:20:53.960
Right?

00:20:53.960 --> 00:20:54.480
Right.

00:20:54.480 --> 00:20:54.960
Right.

00:20:54.960 --> 00:20:55.200
Right.

00:20:55.200 --> 00:20:57.960
You'll see like this, this executable is downloaded from the internet.

00:20:57.960 --> 00:20:59.000
Are you sure you want to run it?

00:20:59.000 --> 00:21:05.340
Like, that doesn't say this executable, which you trust, is maybe possibly using a library that you downloaded.

00:21:05.340 --> 00:21:06.460
Like it doesn't say that.

00:21:06.460 --> 00:21:06.640
Right.

00:21:06.640 --> 00:21:07.200
Yeah.

00:21:07.200 --> 00:21:11.780
Cause we could never get work done if there was that level of checking all over the place.

00:21:11.780 --> 00:21:12.960
This is what updated somewhere.

00:21:12.960 --> 00:21:18.580
This portion of Talk Python To Me is brought to you by Mailtrap.

00:21:18.580 --> 00:21:20.540
We're going to keep this super short.

00:21:20.540 --> 00:21:22.460
So please pay attention or you'll miss it.

00:21:22.460 --> 00:21:25.900
Mailtrap is an email delivery platform that developers love.

00:21:26.320 --> 00:21:36.040
An email sending solution with industry-best analytics, SMTP, and email APIs and SDKs for major programming languages with 24/7 human support.

00:21:36.040 --> 00:21:38.980
What makes them unique is their email sandbox.

00:21:38.980 --> 00:21:46.780
Use email sandbox to inspect and debug emails in staging, dev, and QA environments before sending them to recipients in production.

00:21:46.780 --> 00:21:53.120
Try Mailtrap for free at talkpython.fm/mailtrap.

00:21:53.120 --> 00:21:56.000
That's kind of the space that we're talking about, right?

00:21:56.000 --> 00:21:57.300
We've got editors.

00:21:57.300 --> 00:22:00.120
We've got libraries that you use.

00:22:00.120 --> 00:22:01.700
CI/CD pipelines.

00:22:01.700 --> 00:22:04.700
Containers are super interesting as well.

00:22:04.700 --> 00:22:07.100
And all the tools to go with those.

00:22:07.340 --> 00:22:14.180
So let's talk through some of the posts that you've written and also just selected about some of these things.

00:22:14.180 --> 00:22:17.480
And maybe starting at the front of that list there with lock files.

00:22:17.480 --> 00:22:18.180
Yeah.

00:22:18.180 --> 00:22:18.600
Okay.

00:22:18.600 --> 00:22:22.340
So, yes, I wrote a blog post.

00:22:22.340 --> 00:22:24.600
I guess it's looking at the date on your screen.

00:22:24.600 --> 00:22:26.920
It looks like it was over a year ago now.

00:22:26.920 --> 00:22:28.880
And probably seems like yesterday, but no.

00:22:28.880 --> 00:22:29.780
Yeah, that's right.

00:22:29.780 --> 00:22:30.140
That's right.

00:22:31.140 --> 00:22:34.000
So I'm sure the landscape has changed since then a bit.

00:22:34.000 --> 00:22:36.240
And maybe there's some new players out there.

00:22:36.240 --> 00:22:37.700
But, yeah.

00:22:37.700 --> 00:22:38.040
Yeah.

00:22:38.040 --> 00:22:47.980
I think one thing you can do as a developer, a big one I would recommend, is use lock files for your dependencies, right?

00:22:47.980 --> 00:22:51.460
And, you know, what's a lock file?

00:22:51.660 --> 00:23:01.300
Well, it's the fully resolved set of dependencies that are used by your application, your package.

00:23:01.300 --> 00:23:09.800
And, you know, if nothing else, like, you should know what's going into your code, right?

00:23:09.800 --> 00:23:10.320
Like, what?

00:23:10.320 --> 00:23:10.640
Right.

00:23:10.640 --> 00:23:12.480
Well, one of the ways this helps.

00:23:12.480 --> 00:23:13.440
Yeah, exactly.

00:23:13.440 --> 00:23:15.580
That's a really, that's a bit of a challenge, right?

00:23:15.580 --> 00:23:20.740
And I think I'll admit when I first got into Python, I didn't do this that well.

00:23:20.740 --> 00:23:27.720
And, you know, to me, it felt like probably the biggest issue I might run into is instability in my app, right?

00:23:27.720 --> 00:23:33.360
Like, for example, if I don't pin a dependency, some new thing comes out, I reinstall it on a new computer.

00:23:33.360 --> 00:23:35.140
Maybe it gets an upgraded version.

00:23:35.140 --> 00:23:37.300
And there's some library that doesn't work, right?

00:23:37.300 --> 00:23:44.940
I mean, there's been certainly popular libraries that just said we're having a major version change and we're fixing the mistakes we made 10 years ago.

00:23:44.940 --> 00:23:47.420
And these three functions are changing or whatever, right?

00:23:47.420 --> 00:23:48.300
That would break it.

00:23:48.300 --> 00:23:55.060
But it could also be there's now a malicious version of library X and that's version two.

00:23:55.060 --> 00:24:01.260
But if you pinned it on version one, even though version two is bad, you're still not getting the bad one, at least for a while, right?
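
NOTE
A rough sketch of the pinning idea discussed above: flag requirement lines that do not pin an exact version with ==. The regular expression is a simplification and does not cover the full requirements syntax (markers, URLs, and so on), and the sample text is made up for illustration.
```python
import re
# Matches "name==version", optionally with extras, e.g. "requests[socks]==2.31.0".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")
def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems
# Made-up sample requirements text.
sample = """
requests==2.31.0
pydantic>=2.0
rich
"""
print(unpinned_requirements(sample))
```
A check like this only covers direct dependencies listed in one file. A lock file goes further by recording the fully resolved set, including transitive dependencies.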

00:24:01.260 --> 00:24:02.160
Absolutely.

00:24:02.160 --> 00:24:02.600
Yes.

00:24:02.600 --> 00:24:05.680
So I think I got to look it up.

00:24:05.680 --> 00:24:06.400
I always forget.

00:24:06.400 --> 00:24:08.860
PEP 665.

00:24:08.860 --> 00:24:09.840
Okay.

00:24:09.840 --> 00:24:10.800
Yeah.

00:24:10.800 --> 00:24:11.640
PEP 665.

00:24:11.640 --> 00:24:12.260
665.

00:24:12.260 --> 00:24:14.620
It's a rejected PEP.

00:24:14.620 --> 00:24:18.160
Unfortunately, but it was written by Brett Cannon, some others.

00:24:18.160 --> 00:24:20.360
I know you've had Brett on the show a number of times.

00:24:20.360 --> 00:24:22.520
I love the stuff he does.

00:24:22.520 --> 00:24:24.320
Yeah, he does excellent work.

00:24:24.320 --> 00:24:33.120
And it's kind of a shame this was rejected, but this PEP tried to create a standard lock file format for Python.

00:24:35.020 --> 00:24:43.180
And, you know, if you look into the PEP a little bit, you know, there's some motivation about like why you'd want to do this and, you know, four big reasons.

00:24:43.180 --> 00:24:48.480
And the third one is one I really key on, which is that, you know, lock files allow for reproducibility.

00:24:48.480 --> 00:24:51.340
And reproducibility is just more secure.

00:24:51.340 --> 00:25:00.960
Because, you know, I'm quoting here from the PEP, it says, when you control exactly what files are installed, you can make sure no malicious actor is attempting to slip nefarious code into your application.

00:25:01.720 --> 00:25:03.520
I.e., some supply chain attacks.

00:25:03.520 --> 00:25:09.120
By using a lock file, which always leads to reproducible installs, we can avoid certain risks entirely.

00:25:09.800 --> 00:25:12.380
And, I mean, that's the name of the game.

00:25:12.380 --> 00:25:27.620
That's like, that's what our company focuses on, which is avoiding those risks by ensuring you know which dependencies you're using and you're knowing that those dependencies are benign or good, you know, doing no harm.

00:25:27.620 --> 00:25:40.760
Even if there's something that happens, usually it's going to happen to a popular library because you're using it, hence probably other people are using it, other than typosquatting, which we can talk about.

00:25:40.760 --> 00:25:47.720
But, you know, if you pin your dependencies, chances are it's, these things only stick around for a little while.

00:25:47.720 --> 00:25:50.160
It's not like, oh, they discovered it had been there for eight months.

00:25:50.160 --> 00:25:52.540
It's like, oh my gosh, we heard about it.

00:25:52.540 --> 00:25:55.040
A few people got it and then we got rid of it, right?

00:25:55.040 --> 00:25:55.680
Yes.

00:25:55.680 --> 00:25:57.260
The folks at PyPI are pretty excellent.

00:25:57.440 --> 00:26:00.260
So it's, to some degree, a timing issue as well.

00:26:00.260 --> 00:26:00.780
Yes.

00:26:00.780 --> 00:26:01.820
Yeah.

00:26:01.820 --> 00:26:03.700
Vulnerabilities are different, right?

00:26:03.700 --> 00:26:06.280
Where that's what a lot of people focus on.

00:26:06.280 --> 00:26:13.060
A lot of the tooling exists to, you know, discover vulnerabilities in your dependencies, which is good to know about those.

00:26:13.060 --> 00:26:16.220
But those exist for a long time, right?

00:26:16.220 --> 00:26:21.600
You have CVEs for known vulnerabilities and they end up in these databases and they're there for years.

00:26:21.600 --> 00:26:31.120
And if you're using old dependencies, or maybe transitive dependencies are using old ones and you're stuck on it, then you're going to be exposed to those vulnerabilities.

00:26:31.120 --> 00:26:33.380
But what's different about that?

00:26:33.380 --> 00:26:38.820
Examples of those include the WebP library not too long ago, right?

00:26:38.860 --> 00:26:43.680
That was baked into Python, and then also OpenSSL, right?

00:26:43.680 --> 00:26:45.420
So people discovered issues in those.

00:26:45.420 --> 00:26:48.800
Those are baked into different aspects of Python or some of the libraries.

00:26:48.800 --> 00:26:51.680
And it's like, well, all of a sudden there's this fire drill.

00:26:51.680 --> 00:26:52.420
Yes.

00:26:52.420 --> 00:26:55.660
Which is different than somebody going, I'm going to sneak a thing into the library.

00:26:55.660 --> 00:26:56.100
Right.

00:26:56.100 --> 00:26:58.580
And then it is a timing matter.

00:26:58.580 --> 00:27:01.000
So malicious dependencies, that's a whole other story.

00:27:01.000 --> 00:27:06.900
Because if a malicious package is discovered, there's not a CVE created for it.

00:27:06.900 --> 00:27:09.340
The package is just taken off of the registry.

00:27:09.340 --> 00:27:17.100
You know, you report it to the good people at PyPI and, you know, they'll review the submission and take it down.

00:27:17.100 --> 00:27:21.140
I've done a few of those myself and they're really fast.

00:27:21.140 --> 00:27:28.200
But there's still a window of time where that malicious package, that malicious dependency is up and available.

00:27:28.200 --> 00:27:32.380
And that's, you know, often all that's needed.

00:27:32.920 --> 00:27:33.360
Yeah, exactly.

00:27:33.360 --> 00:27:36.060
I do think having a pinned dependency there is worthwhile.

00:27:36.060 --> 00:27:39.900
Because if you make a commit, your CI runs, et cetera, et cetera, right?

00:27:39.900 --> 00:27:44.880
Like the chances that you just bump the version to this malicious thing is pretty low.

00:27:44.880 --> 00:27:45.900
Yeah, exactly.

00:27:45.900 --> 00:27:46.640
So, yeah.

00:27:46.640 --> 00:27:47.600
Yeah.

00:27:47.600 --> 00:27:50.100
And having version ranges is not enough.

00:27:50.100 --> 00:27:53.340
You know, you need to have explicit versions, you know.
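
NOTE
A minimal sketch of the "explicit versions, not ranges" point, assuming a plain pip-style requirements file. The helper name unpinned_requirements and the sample entries are hypothetical, and the parsing is deliberately simplified: it skips pip options such as -r and drops comments and environment markers before checking each entry.
```python
import re
# Matches an exact pin such as "name==1.2.3", optionally with extras like "name[extra]==1.2.3".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?==[^=*\s]+$")
def unpinned_requirements(lines):
    """Return requirement entries that are not pinned to one exact version."""
    loose = []
    for raw in lines:
        # Drop trailing comments and environment markers before checking the entry.
        entry = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not entry or entry.startswith("-"):  # skip blank lines and pip options
            continue
        if not PINNED.match(entry.replace(" ", "")):
            loose.append(entry)
    return loose
sample = ["httpx==0.26.0", "pydantic>=2.0,<3", "requests", "# a comment", "certifi==2023.11.17"]
print(unpinned_requirements(sample))
```
A check like this could run in CI so that a version range or a bare package name never slips into a file that is meant to be a lock file.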

00:27:53.340 --> 00:27:55.960
Let's talk more about these lock files then, right?

00:27:55.960 --> 00:27:59.600
So there's actually a bunch of choices these days.

00:27:59.840 --> 00:28:03.340
You know, Brett's PEP tried to make it less of a choice.

00:28:03.340 --> 00:28:07.660
Say, well, it doesn't matter if you use hatch or pip or poetry or whatever.

00:28:07.660 --> 00:28:09.340
The outcome is the same.

00:28:09.340 --> 00:28:14.320
And for reasons that I haven't learned enough about, I don't know why that didn't work.

00:28:14.320 --> 00:28:16.840
But let's talk about what's out there now.

00:28:16.840 --> 00:28:18.420
Because there's a couple options at this point.

00:28:18.420 --> 00:28:19.360
Sure.

00:28:19.360 --> 00:28:21.000
I think the, yeah.

00:28:21.000 --> 00:28:25.740
So most Python developers are going to be most familiar with pip, right?

00:28:25.740 --> 00:28:27.800
That's the standard.

00:28:29.020 --> 00:28:31.060
And pip has requirements files.

00:28:31.060 --> 00:28:40.180
And, you know, they're unique in the lock file world because they can be named anything, right?

00:28:40.180 --> 00:28:43.160
Most other lock files have a defined name.

00:28:43.160 --> 00:28:44.180
We were talking about Rust earlier.

00:28:44.180 --> 00:28:47.160
You know, they're the gold standard for a lot of this stuff.

00:28:47.160 --> 00:28:49.100
And, you know, they're very clear.

00:28:49.100 --> 00:28:50.540
They have Cargo.lock.

00:28:50.540 --> 00:28:51.520
That's their lock file.

00:28:51.520 --> 00:28:53.200
You can't name it anything else.

00:28:53.200 --> 00:28:55.400
Its contents are well defined.

00:28:55.400 --> 00:28:57.040
It is what it is.

00:28:57.040 --> 00:29:01.540
But in Python with pip, I mean, you could name it whatever you want.

00:29:01.540 --> 00:29:03.760
You know, dev-requirements.txt.

00:29:03.760 --> 00:29:05.380
You could name it Cargo.lock.

00:29:05.380 --> 00:29:08.820
But it can contain Python dependencies in it.

00:29:08.820 --> 00:29:09.340
Surprise.

00:29:09.340 --> 00:29:10.440
I'm not Rust.

00:29:11.900 --> 00:29:18.120
Basically, you can just put more or less arbitrary commands that are sent to pip in a text file, right?

00:29:18.120 --> 00:29:18.320
Yes.

00:29:18.320 --> 00:29:19.500
Which is more or less what it is.

00:29:19.500 --> 00:29:19.940
Yeah.

00:29:19.940 --> 00:29:20.900
Yep.

00:29:20.900 --> 00:29:21.320
Yep.

00:29:21.320 --> 00:29:25.520
Any command line option you can feed to pip, you can put in a requirements file.

00:29:25.520 --> 00:29:29.920
It's cool because you can have an import by saying -r some other file.

00:29:29.920 --> 00:29:30.760
Yes.

00:29:30.760 --> 00:29:31.260
Yes.

00:29:31.260 --> 00:29:33.280
But it's also not super structured.

00:29:33.280 --> 00:29:33.700
You can get a hierarchy that way.
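
NOTE
A rough sketch of how the -r include option builds a hierarchy of requirements files. The file names and pinned entries are made up for illustration, and the files live in a dictionary rather than on disk so the example stays self-contained. This is not pip's own implementation.
```python
# Hypothetical requirements files, keyed by name, each holding its lines.
FILES = {
    "requirements-dev.txt": ["-r requirements.txt", "pytest==8.0.0"],
    "requirements.txt": ["httpx==0.26.0", "pydantic==2.5.3"],
}
def flatten(name, files, seen=None):
    """Expand '-r other-file' lines into one flat list of requirement entries."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against include cycles
        return []
    seen.add(name)
    entries = []
    for line in files[name]:
        line = line.strip()
        if line.startswith("-r "):
            entries.extend(flatten(line[3:].strip(), files, seen))
        elif line and not line.startswith("#"):
            entries.append(line)
    return entries
print(flatten("requirements-dev.txt", FILES))
```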

00:29:33.700 --> 00:29:34.120
Mm-hmm.

00:29:34.120 --> 00:29:34.420
Yeah.

00:29:34.420 --> 00:29:34.860
Yeah.

00:29:34.860 --> 00:29:47.320
So there are some tools available to turn those, like, loose requirements files, the pip requirements files, into strict lock files, right?

00:29:47.320 --> 00:29:51.320
Where every entry is pinned to a specific version.

00:29:51.320 --> 00:29:54.720
And pip itself can do it with the pip freeze command.

00:29:54.720 --> 00:29:57.600
So that's the one most people know about.

00:29:57.600 --> 00:30:05.680
But that one's kind of not so great because it only freezes the packages for the environment that you ran pip freeze in.
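
NOTE
To illustrate why a freeze is tied to one environment, here is a small sketch that lists whatever is installed for the interpreter running it, in name==version form. It uses the standard library's importlib.metadata and only approximates what pip freeze reports. The function name freeze_like is hypothetical.
```python
from importlib.metadata import distributions
def freeze_like():
    """List 'name==version' for every distribution visible to this interpreter."""
    pins = set()
    for dist in distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta else None
        version = meta.get("Version") if meta else None
        if name and version:  # skip entries with missing or broken metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)
for line in freeze_like():
    print(line)
```
Run it in two different virtual environments, or on two platforms, and the output will generally differ, which is the limitation being described here.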

00:30:05.680 --> 00:30:06.080
You know?

00:30:06.080 --> 00:30:13.360
And maybe you're trying to publish your lock file for users of a different platform or system.

00:30:13.360 --> 00:30:19.180
The other thing that I don't like about it is you want to put just the things you actually use into your requirements file.

00:30:19.180 --> 00:30:21.940
Like, I'm using HTTPX and Pydantic.

00:30:21.940 --> 00:30:22.820
That's it.

00:30:22.820 --> 00:30:27.960
But what it really installs when you run that is the transitive closure of all those things.

00:30:27.960 --> 00:30:28.100
Yes.

00:30:28.100 --> 00:30:28.720
Which is fine.

00:30:28.720 --> 00:30:35.480
But you're not necessarily expressing that with just your requirements.txt, right?

00:30:35.480 --> 00:30:36.160
Right.

00:30:36.160 --> 00:30:37.000
Yeah.

00:30:37.000 --> 00:30:37.120
Yeah.

00:30:37.120 --> 00:30:41.060
Your two packages could balloon to, you know, 100 dependencies.

00:30:41.060 --> 00:30:43.120
And that's not uncommon.

00:30:43.280 --> 00:30:44.140
It's not even that bad.

00:30:44.140 --> 00:30:54.060
Like, in the JavaScript ecosystem, you know, the same handful of top-level dependencies could have two orders of magnitude explosion where you end up with thousands.

00:30:54.060 --> 00:30:56.060
There's a really...

00:30:56.060 --> 00:30:56.500
Oh, gosh.

00:30:56.500 --> 00:30:57.320
I can't find out.

00:30:57.320 --> 00:30:57.960
You know what?

00:30:57.960 --> 00:30:59.100
I think it's on...

00:30:59.100 --> 00:31:00.360
I think I put it on Python Bytes.

00:31:00.360 --> 00:31:01.560
But there's a really funny...

00:31:01.560 --> 00:31:03.900
I want to be able to pull this up for people so they can find it.

00:31:03.900 --> 00:31:07.980
There's a funny, funny thing that somebody did.

00:31:07.980 --> 00:31:10.520
Well, for some definition of funny.

00:31:11.900 --> 00:31:12.680
They put...

00:31:12.680 --> 00:31:16.540
Somebody created an npm package called Everything.

00:31:16.540 --> 00:31:17.580
Yes.

00:31:17.580 --> 00:31:23.000
And there's an article called Everything Becomes Too Much, the npm Package Chaos of 2024.

00:31:23.000 --> 00:31:23.880
Yeah.

00:31:23.880 --> 00:31:31.560
An npm user named PatrickJS launched a troll campaign with a package called Everything, which depends on every package in npm.

00:31:31.560 --> 00:31:32.420
Yeah.

00:31:32.420 --> 00:31:32.800
Yeah.

00:31:32.800 --> 00:31:33.340
And that's...

00:31:33.340 --> 00:31:34.400
I think it's the...

00:31:34.400 --> 00:31:37.100
npm is the largest package registry out there.

00:31:37.100 --> 00:31:38.060
So it's...

00:31:38.060 --> 00:31:40.440
I mean, it's already massive.

00:31:40.560 --> 00:31:46.180
I remember your early episodes, you would recount how many packages were on PyPI.

00:31:46.180 --> 00:31:47.340
And then we got to that...

00:31:47.340 --> 00:31:47.920
I don't even know.

00:31:47.920 --> 00:31:49.000
Are we past half a million?

00:31:49.000 --> 00:31:49.100
...6-figure number?

00:31:49.100 --> 00:31:50.220
Well, yeah.

00:31:50.220 --> 00:31:51.260
I remember it was a big deal.

00:31:51.260 --> 00:31:52.440
It got up to 100,000.

00:31:52.440 --> 00:31:53.980
And now it's probably, what?

00:31:53.980 --> 00:31:55.080
400,000?

00:31:55.080 --> 00:31:55.900
500,000?

00:31:55.900 --> 00:31:55.900
Over 500,000.

00:31:55.900 --> 00:31:57.140
Over 500,000.

00:31:57.140 --> 00:31:58.800
509 by rounding.

00:31:58.800 --> 00:31:59.400
Yeah.

00:31:59.400 --> 00:32:00.420
Half a million.

00:32:00.420 --> 00:32:01.700
Congratulations, world.

00:32:01.700 --> 00:32:03.580
Amazing.

00:32:03.580 --> 00:32:04.240
Yeah.

00:32:04.240 --> 00:32:04.560
Yeah.

00:32:04.560 --> 00:32:06.340
I just added two new ones last week.

00:32:06.340 --> 00:32:08.280
So I guess I've made a huge difference in that number.

00:32:08.280 --> 00:32:10.580
Nice.

00:32:10.580 --> 00:32:11.020
Yeah.

00:32:11.020 --> 00:32:14.440
So basically, pip is awesome and it does a bunch of great stuff.

00:32:14.440 --> 00:32:20.680
And one of the things I really like about working with pip is I don't need to teach people anything if they want to work with my project.

00:32:20.680 --> 00:32:21.020
Right.

00:32:21.020 --> 00:32:27.600
I don't need to teach them like, oh, I know you love Poetry, but I'm using a combination of the Hatch build backend with PDM.

00:32:27.600 --> 00:32:28.420
You're like, what?

00:32:28.420 --> 00:32:29.780
I don't even know what those are.

00:32:29.780 --> 00:32:29.980
Right?

00:32:29.980 --> 00:32:36.780
Like, there's a lot of, like, ways in which you work that are brought in with a lot of these tools here.

00:32:36.780 --> 00:32:39.500
So pip is kind of like, you know, it just kind of works, right?

00:32:39.500 --> 00:32:40.020
Yes.

00:32:40.020 --> 00:32:50.940
But having this transitive closure managed is not part of what it does, but it's super important because if I need to upgrade something, I can't just change my version number in my requirements.

00:32:50.940 --> 00:32:53.960
Because that doesn't affect its dependency possibly, right?

00:32:53.960 --> 00:32:55.480
Like, it depends what it's said.

00:32:55.480 --> 00:32:57.360
So I'm a huge fan of pip-tools.

00:32:57.360 --> 00:32:59.200
This is actually what I do most of the time.

00:32:59.200 --> 00:32:59.740
Yes.

00:32:59.740 --> 00:33:01.340
pip-tools is another one.

00:33:01.340 --> 00:33:04.920
You can, it's great.

00:33:04.920 --> 00:33:15.100
I think it has this pip-compile command that will take as an input, I think, just about any Python manifest type that's out there.

00:33:15.100 --> 00:33:20.640
So you can do setup.py, requirements.txt.

00:33:20.640 --> 00:33:24.020
I'm forgetting the other ones.

00:33:24.020 --> 00:33:27.820
The Pipfile.lock, maybe.

00:33:27.820 --> 00:33:31.740
setup.cfg, pyproject.toml.

00:33:31.740 --> 00:33:40.380
It just recognizes all the different ways people could express their loose requirements, you know, the manifest files.

00:33:41.480 --> 00:33:41.600
Yeah.

00:33:41.600 --> 00:33:42.140
So, yeah.

00:33:42.140 --> 00:33:42.480
Yeah.

00:33:42.480 --> 00:33:43.400
I really like it.

00:33:43.400 --> 00:33:50.200
And you can say pip-compile --upgrade and it'll look at all the dependencies and upgrade them all as high as they can go.

00:33:50.200 --> 00:33:52.800
But what's nice about that is you'll be working for a while.

00:33:52.800 --> 00:33:58.740
Then you choose, like, well, let me just do a refresh on the dependencies right now and repin them and see how that works.

00:33:58.740 --> 00:34:01.320
And then just carry on with your business for a while, right?

00:34:01.320 --> 00:34:09.760
And it'll manage that transitive closure as well with, like, actually a really nice lock file where it described, like, these are all the things in the lock file.

00:34:09.760 --> 00:34:13.600
And the reason that, you know, for example, in your blog post, you say there's certifi at this version.

00:34:13.600 --> 00:34:16.680
And it's there because you asked for it and because requests needs it.

00:34:16.680 --> 00:34:19.840
You know, if you're like, why is this in my virtual environment?

00:34:19.840 --> 00:34:21.520
Why do I have this weird thing that I don't know?

00:34:21.560 --> 00:34:23.600
Like, it'll tell you, here's why it's there.
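
NOTE
A small sketch of reading the "via" annotations that pip-compile writes under each pinned entry, to answer "why is this in my environment?". The sample lock text below is illustrative rather than real tool output, and the helper name why_installed is hypothetical.
```python
SAMPLE = """\
certifi==2023.11.17
    # via requests
idna==3.6
    # via
    #   anyio
    #   requests
requests==2.31.0
    # via -r requirements.in
"""
def why_installed(lock_text):
    """Map each pinned package to the names listed in its '# via' annotation."""
    reasons = {}
    current = None
    for line in lock_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.startswith("#"):
            current = stripped.split("==", 1)[0]  # a new "name==version" entry
            reasons[current] = []
        elif current is not None:
            note = stripped.lstrip("#").strip()
            if note.startswith("via"):
                note = note[len("via"):].strip()
            if note:
                reasons[current].append(note)
    return reasons
print(why_installed(SAMPLE)["certifi"])
```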

00:34:23.600 --> 00:34:24.200
Yeah.

00:34:24.200 --> 00:34:25.320
Yeah.

00:34:25.320 --> 00:34:29.480
One of the downsides, though, I think pip-tools has this issue.

00:34:29.480 --> 00:34:39.200
I know pip does, is that in determining that transitive dependency resolution, it is very possible.

00:34:39.200 --> 00:34:43.400
In fact, it usually happens that you have arbitrary code execution on your system, right?

00:34:43.400 --> 00:34:51.480
Like, if you start with the two top-level dependencies, like you mentioned, and it lists dependencies, well, then it'll pull those in and it acquires the metadata.

00:34:51.480 --> 00:34:54.120
From the wheel, if that exists.

00:34:54.120 --> 00:35:00.620
But if it doesn't, it'll build the package just to get the metadata file, just to figure out which dependencies that needs.

00:35:00.620 --> 00:35:05.700
Are you saying I should set up a Docker container to execute this?

00:35:05.700 --> 00:35:07.860
I mean, that's, yeah, that's kind of what's happening.

00:35:07.860 --> 00:35:08.700
Maybe I should, yeah.

00:35:08.700 --> 00:35:09.360
Maybe I should.

00:35:09.360 --> 00:35:15.080
And, you know, yeah, running in a sandbox is another option, right?

00:35:15.080 --> 00:35:19.760
Where that's what my company, Phylum, that's one of the solutions we offer.

00:35:19.760 --> 00:35:27.280
You know, we have extensions for our CLI where you can wrap pip by just calling Phylum pip.

00:35:27.280 --> 00:35:29.360
And then everything runs in a sandbox.

00:35:29.360 --> 00:35:31.200
So that's another solution.

00:35:31.200 --> 00:35:32.060
Yeah.

00:35:32.060 --> 00:35:32.420
Yeah.

00:35:32.420 --> 00:35:32.700
Yeah.

00:35:32.700 --> 00:35:33.000
So good.

00:35:33.000 --> 00:35:43.060
Because, I mean, pip is a funny one because they even have a command line option called dry run, --dry-run, which you would think, oh, nothing's going to happen on my system.

00:35:43.060 --> 00:35:43.980
It's just separate.

00:35:43.980 --> 00:35:45.820
Running code from strangers on the internet.

00:35:45.820 --> 00:35:46.660
But it does.

00:35:46.660 --> 00:35:47.020
Yes.

00:35:47.020 --> 00:35:58.340
Dry run, even using dry run for pip install and pip download commands will or has the possibility of downloading and running arbitrary code from strangers on the internet.

00:35:59.140 --> 00:36:03.400
If we had, oh, wheels came along far after pip, right?

00:36:03.400 --> 00:36:07.480
And we've got the source distributions and setup.py and all that kind of stuff.

00:36:07.480 --> 00:36:13.800
And so if wheels existed from day one, it very well would be the case that this is not a problem, right?

00:36:13.800 --> 00:36:15.140
But, you know, what is pip supposed to do?

00:36:15.140 --> 00:36:18.940
Like, it has to evaluate this dynamic thing to figure out what it wants.

00:36:18.940 --> 00:36:19.300
Yes.

00:36:19.300 --> 00:36:20.100
Yes.

00:36:20.100 --> 00:36:20.340
Yeah.

00:36:20.340 --> 00:36:20.580
Yeah.

00:36:20.580 --> 00:36:29.020
Wheels are great because, you know, they have a metadata file in there that clearly lays out what the dependencies are.

00:36:29.020 --> 00:36:32.220
And there's no arbitrary code running when you install a wheel.

00:36:32.220 --> 00:36:35.520
It's just extracting and copying, you know?

00:36:35.520 --> 00:36:35.820
Yeah.

00:36:35.820 --> 00:36:37.680
A wheel is just a zip file.

00:36:37.680 --> 00:36:41.580
You extract that zip file and then copy the contents to various locations.
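
NOTE
A sketch of why wheels are the safer case: the dependency list sits in a plain-text METADATA file inside the zip archive, so it can be read without executing anything from the package. The in-memory "demo" wheel below is a made-up stand-in, not a complete, installable wheel, and the helper names are hypothetical.
```python
import io
import zipfile
from email.parser import Parser
def make_demo_wheel():
    """Build a tiny in-memory archive laid out like a wheel (demo data only)."""
    buf = io.BytesIO()
    metadata = "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\nRequires-Dist: httpx\nRequires-Dist: pydantic>=2\n"
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("demo/__init__.py", "")
        zf.writestr("demo-1.0.dist-info/METADATA", metadata)
    buf.seek(0)
    return buf
def wheel_requirements(fileobj):
    """Read Requires-Dist entries from a wheel's METADATA without running any code."""
    with zipfile.ZipFile(fileobj) as zf:
        name = next(n for n in zf.namelist() if n.endswith(".dist-info/METADATA"))
        headers = Parser().parsestr(zf.read(name).decode("utf-8"))
    return headers.get_all("Requires-Dist") or []
print(wheel_requirements(make_demo_wheel()))
```
A source distribution has no such guarantee, which is why resolving one may mean building it first.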

00:36:41.580 --> 00:36:49.340
But, yes, as you said, because we've had source distributions, tarballs, and then even eggs before that,

00:36:49.860 --> 00:36:53.300
and we're probably never going to fully get rid of those.

00:36:53.300 --> 00:36:55.420
It just takes one.

00:36:55.420 --> 00:37:00.880
One dependency anywhere in your chain that is only distributed as a source distribution.

00:37:00.880 --> 00:37:07.780
And now, you're downloading and building a package just to get metadata to continue.

00:37:07.780 --> 00:37:07.800
Yeah.

00:37:07.800 --> 00:37:09.700
And maybe you didn't actually choose it, right?

00:37:09.700 --> 00:37:12.200
It's the dependency of a dependency of a dependency.

00:37:12.200 --> 00:37:13.260
Absolutely.

00:37:13.740 --> 00:37:13.920
Yeah.

00:37:13.920 --> 00:37:14.240
Yeah.

00:37:14.240 --> 00:37:14.300
Yeah.

00:37:14.300 --> 00:37:15.180
That's, yeah.

00:37:15.180 --> 00:37:15.640
Yeah.

00:37:15.640 --> 00:37:23.220
You know, people often respond to some of the findings our company has where we'll, you know,

00:37:23.220 --> 00:37:26.160
we'll post these malicious packages with all sorts of crazy names.

00:37:26.160 --> 00:37:30.420
And people will respond to say, like, you know, why would I install that?

00:37:30.420 --> 00:37:36.240
Like, why would I ever install this, you know, random package that no one's heard of?

00:37:36.240 --> 00:37:37.600
It's like, well, you wouldn't.

00:37:37.600 --> 00:37:43.920
It's, it could be, but it could be included in, you know, the transitive dependencies, right?

00:37:43.920 --> 00:37:54.820
If it gets, if it gets added to a slightly more legitimate package or, you know, worked up the chain that way, then, then yes, eventually, you know, you'll be running it unknowingly.

00:37:54.820 --> 00:37:55.340
Yeah.

00:37:55.340 --> 00:38:02.760
I think there's two important things we should talk about with this before we move on, because there are some interesting ways in which you might not know it.

00:38:02.760 --> 00:38:07.020
You might even try to do the right thing and you might actually shoot yourself in the foot by doing so.

00:38:07.020 --> 00:38:14.100
So number one, these like super strict lock files are awesome when you're building an application.

00:38:14.100 --> 00:38:16.620
I want to ship Talk Python Training out.

00:38:16.620 --> 00:38:17.960
It's got its strict APIs.

00:38:17.960 --> 00:38:19.220
It runs on this version.

00:38:19.220 --> 00:38:22.340
It uses that version of Pydantic, that version of Beanie and whatever.

00:38:22.340 --> 00:38:22.880
Yeah.

00:38:22.880 --> 00:38:29.360
I want that to be fixed, fixed, zero flexibility until I decide through maybe a pip-compile --upgrade or whatever.

00:38:29.360 --> 00:38:30.340
I want a new one.

00:38:30.620 --> 00:38:41.980
However, if I was building a library that someone else was using, I would do them many headaches and a disservice to say, I depend on Pydantic 2.7.0.

00:38:41.980 --> 00:38:45.940
You're like, well, my other library needs Pydantic 8, 2.8.

00:38:45.940 --> 00:38:46.360
Right.

00:38:46.360 --> 00:38:48.820
And I can't use it and your library together.

00:38:48.820 --> 00:38:49.160
Right.

00:38:49.160 --> 00:38:56.300
So you need the, it's, it's a different story when you're building a library that others are going to consume than it is when you're building an application.

00:38:56.300 --> 00:39:01.440
And there was some, some disagreement, I guess, about the recommendation of pipenv for a while.

00:39:01.440 --> 00:39:05.500
And it's because I believe that pipenv is really focused on the application side.

00:39:05.500 --> 00:39:10.060
And it, I don't think it was made super clear that maybe it doesn't make as much sense for libraries.

00:39:10.060 --> 00:39:10.400
Right.

00:39:10.400 --> 00:39:11.720
So you want to speak to that a little?

00:39:11.720 --> 00:39:12.380
Yeah.

00:39:12.380 --> 00:39:12.720
Yeah.

00:39:12.720 --> 00:39:16.060
I'm, I'm an advocate for lock files for everyone.

00:39:16.060 --> 00:39:16.580
Right.

00:39:16.580 --> 00:39:20.940
Applications for sure, but also libraries and their developers.

00:39:20.940 --> 00:39:21.200
Right.

00:39:21.200 --> 00:39:27.180
Because, you know, when you distribute a library, sure.

00:39:27.180 --> 00:39:31.400
You know, loose dependencies are probably the way to go there.

00:39:31.400 --> 00:39:38.240
But library developers, people who want to contribute to your projects, the developers themselves, maybe you work on a team.

00:39:38.240 --> 00:39:45.640
Having a lock file alongside your library is still going to be useful.

00:39:45.640 --> 00:39:46.040
Right.

00:39:46.040 --> 00:39:46.300
Like.

00:39:46.300 --> 00:39:46.640
Yeah.

00:39:46.640 --> 00:39:50.540
Cause that way you can say everyone, if somebody makes a change or they report a bug or whatever.

00:39:50.540 --> 00:39:50.900
Yeah.

00:39:50.900 --> 00:39:55.440
They're not bringing in a change from a different version of a dependency or like maybe something changed.

00:39:55.440 --> 00:39:55.680
Right.

00:39:55.680 --> 00:39:56.420
Yes.

00:39:56.420 --> 00:39:56.820
Yes.

00:39:56.820 --> 00:39:57.380
Yeah.

00:39:57.380 --> 00:40:03.860
And, plus, it still allows you to start from a known good spot.

00:40:03.860 --> 00:40:20.860
And then, maybe, if you know you want to get the latest, then you can do it in a controlled environment, you know, like a sandbox or maybe on CI, you know, in a throwaway runner that has no access to any secrets.

00:40:20.860 --> 00:40:22.860
Or anything sensitive.

00:40:22.860 --> 00:40:23.420
Sensitive.

00:40:23.420 --> 00:40:24.040
That's interesting.

00:40:24.040 --> 00:40:34.220
I hadn't really thought about having a specific requirements lock file type of thing for the libraries that I've been working on for the developers, right?

00:40:34.220 --> 00:40:35.320
For people who want to contribute.

00:40:35.320 --> 00:40:42.180
Because it's just been like a loose requirement so that people that built against it aren't pinned into some very specific thing.

00:40:42.180 --> 00:40:43.060
But yeah, that makes a lot of sense.

00:40:43.060 --> 00:40:43.360
I think.

00:40:43.360 --> 00:40:43.840
Yeah.

00:40:43.840 --> 00:40:46.280
There's a link in that blog post.

00:40:46.280 --> 00:40:51.740
It's kind of dated now, but it's from the folks who built Yarn, you know, JavaScript ecosystem.

00:40:51.740 --> 00:40:55.860
But they say it a lot more eloquently than I can.

00:40:55.860 --> 00:41:00.700
Yeah, that's the one: lock files should be committed on all projects.

00:41:00.700 --> 00:41:00.940
Yeah.

00:41:00.940 --> 00:41:06.420
I mean, it's a bit old now, but they go down the list and spell it out a lot more clearly than me.

00:41:06.420 --> 00:41:12.260
And that's why libraries even can benefit from publishing a lock file.

00:41:12.260 --> 00:41:12.760
Yeah.

00:41:12.760 --> 00:41:13.760
People can check that out.

00:41:13.760 --> 00:41:14.140
That's cool.

00:41:14.140 --> 00:41:14.660
Yeah.

00:41:14.660 --> 00:41:14.760
Yeah.

00:41:14.760 --> 00:41:16.860
And Java, that's the JavaScript package manager.

00:41:16.860 --> 00:41:20.060
So in JavaScript years, like a hundred years or something that's been a couple of years.

00:41:20.060 --> 00:41:20.600
That's right.

00:41:20.600 --> 00:41:21.100
Yeah.

00:41:21.100 --> 00:41:22.140
You got dog years.

00:41:22.140 --> 00:41:23.160
You got JavaScript years.

00:41:23.160 --> 00:41:26.280
JavaScript years just tick by like second, the second hand.

00:41:26.280 --> 00:41:26.700
Yeah.

00:41:26.700 --> 00:41:27.100
Yeah.

00:41:27.100 --> 00:41:27.780
All right.

00:41:27.780 --> 00:41:28.920
Cool.

00:41:28.920 --> 00:41:32.420
So I see we're making great progress through our list of things to talk about here.

00:41:32.420 --> 00:41:35.640
I've gone through three and I have like 15 left.

00:41:35.640 --> 00:41:36.300
We'll have plenty of time.

00:41:36.300 --> 00:41:41.180
So, yeah, let's see.

00:41:41.180 --> 00:41:48.420
So another one, another PEP I think we're talking about here, is 517, a build system independent

00:41:48.420 --> 00:41:49.920
format for source trees.

00:41:49.920 --> 00:41:51.040
I have no idea what this is.

00:41:51.040 --> 00:41:51.540
What is this?

00:41:51.540 --> 00:41:51.920
Yeah.

00:41:51.920 --> 00:41:54.620
PEP 517 and 518 kind of go together.

00:41:54.620 --> 00:42:00.680
This was like the transition away from setup.py towards pyproject.toml.

00:42:00.680 --> 00:42:04.940
518 is the one that specifies pyproject.toml.

00:42:04.940 --> 00:42:07.140
The kind of things that go in it.

00:42:07.140 --> 00:42:12.920
And then 517 is all about build systems and build backends.

00:42:12.920 --> 00:42:19.020
So, like, in your pyproject.toml, in your build-system

00:42:19.020 --> 00:42:26.200
key, you know, you'll often see things like poetry-core or flit or hatchling or

00:42:26.200 --> 00:42:26.940
these kinds of things.

00:42:26.940 --> 00:42:33.020
And so PEP 517 is specifying what it means to be one of those build backends.

00:42:33.020 --> 00:42:37.340
It's really just defining two mandatory hooks.

00:42:37.340 --> 00:42:40.360
What does it mean to build_wheel and build_sdist?

00:42:40.360 --> 00:42:43.420
There's three optional hooks as well.

00:42:43.500 --> 00:42:47.820
And I think there's even another PEP that followed on from this that talks about building

00:42:47.820 --> 00:42:51.300
editable packages or, right.

00:42:51.300 --> 00:42:54.260
The dash, the -e equivalents.

00:42:54.260 --> 00:42:54.680
Yeah.

00:42:54.680 --> 00:42:55.420
Yeah, exactly.

00:42:56.000 --> 00:43:01.840
But really it just boils down to defining a way to build a wheel and build

00:43:01.840 --> 00:43:02.700
a source distribution.

00:43:02.700 --> 00:43:03.280
Yeah.

00:43:03.280 --> 00:43:09.620
And this is part of what opened up all the different choices we now have for package management and

00:43:09.620 --> 00:43:10.440
things like that.

00:43:10.440 --> 00:43:10.700
Right.

00:43:10.700 --> 00:43:14.900
Cause now there's a common way they can all work together a little bit like WSGI.

00:43:14.900 --> 00:43:15.560
Yes.

00:43:15.560 --> 00:43:15.980
Yeah.

00:43:15.980 --> 00:43:16.440
Yeah.

00:43:16.440 --> 00:43:20.140
I've been using hatchling for my build backend recently and it's been working real nicely.

00:43:20.140 --> 00:43:20.820
Okay.

00:43:20.820 --> 00:43:21.380
Yeah.

00:43:21.540 --> 00:43:27.680
I was just looking at hatchling the other day and they've got, yeah.

00:43:27.680 --> 00:43:31.740
They're one of the build backends that offers build hooks,

00:43:31.740 --> 00:43:41.560
which, you know, so prior to pyproject.toml and wheels

00:43:41.560 --> 00:43:46.220
and bdist wheels, if you go back to the source distributions and your setup.py files, where

00:43:46.220 --> 00:43:47.480
it's just Python code.

00:43:47.480 --> 00:43:53.040
You can be doing anything in your setup.py file, which runs when you

00:43:53.040 --> 00:43:53.820
install the package.

00:43:53.820 --> 00:43:59.420
Well, now we're starting to see, you know, methods to do the same thing in these

00:43:59.420 --> 00:44:01.540
more modern packaging or build backends.

00:44:01.540 --> 00:44:07.880
So like hatch has their build hooks, build system hooks, where you can

00:44:07.880 --> 00:44:13.600
point it to, yeah, just Python code and have it run as

00:44:13.600 --> 00:44:14.620
part of the build.

00:44:14.620 --> 00:44:15.020
Yeah.

00:44:15.020 --> 00:44:17.520
At least it only runs at build time, not install time.

00:44:17.520 --> 00:44:19.200
right.

00:44:19.200 --> 00:44:20.820
Yeah.

00:44:20.820 --> 00:44:22.660
I'm looking at the documentation now.

00:44:22.660 --> 00:44:29.000
Yeah, this is still new to me, but there might be hooks for install as well.

00:44:29.000 --> 00:44:29.860
Okay.

00:44:29.860 --> 00:44:34.760
While you're thinking about it, I've got a couple of questions I want to highlight

00:44:34.760 --> 00:44:41.480
from the audience here, but also one of the things that I think maybe was considered,

00:44:41.480 --> 00:44:46.960
I have no awareness of this, but if it wasn't, it would be excellent: what if the people at

00:44:46.960 --> 00:44:53.780
PyPI just pre-computed all that metadata, at least for the common platforms, so that

00:44:53.780 --> 00:44:58.900
pip doesn't need to download, run setup.py, and then throw it away just to get that

00:44:58.900 --> 00:44:59.180
data.

00:44:59.380 --> 00:45:03.760
Like for Mac, Windows, and Linux, you know, if it would just go, okay, we're just going

00:45:03.760 --> 00:45:08.040
to like, as you upload it, it would just kick off a job that does that on those three platforms

00:45:08.040 --> 00:45:09.680
and put it in a JSON blob.

00:45:09.680 --> 00:45:10.160
Yeah.

00:45:10.160 --> 00:45:11.580
It seems like that would be worthwhile.

00:45:11.860 --> 00:45:17.760
I'm fairly certain there's discussions already around that type of a solution and

00:45:17.760 --> 00:45:23.420
maybe even a PEP proposal for it. But yeah, getting away from having to build

00:45:23.420 --> 00:45:24.760
a package just to get metadata.

00:45:24.760 --> 00:45:25.880
yeah.

00:45:25.880 --> 00:45:31.860
You've got packages that are downloaded billions of times, with a B. It's insane.

00:45:32.400 --> 00:45:37.320
And if somebody could do that three times instead of a billion times, it would make

00:45:37.320 --> 00:45:39.180
it work faster and it would also make it safe.

00:45:39.180 --> 00:45:39.400
Right.

00:45:39.400 --> 00:45:40.360
I think it'd be great.

00:45:40.360 --> 00:45:40.800
Yeah.

00:45:40.800 --> 00:45:41.320
All right.

00:45:41.320 --> 00:45:42.920
A couple of questions here.

00:45:42.920 --> 00:45:45.180
this one.

00:45:45.180 --> 00:45:50.760
So Tony in the audience says pip-compile is great for finding your transitive dependencies.

00:45:50.760 --> 00:45:56.660
One interesting thing that they've done is package up code with Pants build, which supports

00:45:56.660 --> 00:45:59.760
lock files, just to look through what code gets packaged up.

00:45:59.760 --> 00:46:01.060
Is this anything you've explored?

00:46:02.060 --> 00:46:03.620
I've heard of Pants.

00:46:03.620 --> 00:46:05.900
I haven't looked into it myself yet.

00:46:05.900 --> 00:46:06.380
Mm-hmm.

00:46:06.380 --> 00:46:07.180
okay.

00:46:07.180 --> 00:46:07.420
Yeah.

00:46:07.420 --> 00:46:07.880
Yeah.

00:46:07.880 --> 00:46:11.920
So just use it like, go, okay, you're going to have to build this thing and give me a little

00:46:11.920 --> 00:46:13.020
manifest and whatnot.

00:46:13.020 --> 00:46:14.060
And then we can just look at that.

00:46:14.060 --> 00:46:14.480
That's cool.

00:46:14.480 --> 00:46:14.900
Yeah.

00:46:14.900 --> 00:46:19.720
And then Tamir says, do you have a solution for taking already locked dependencies with you

00:46:19.720 --> 00:46:21.100
when you start a new app?

00:46:21.100 --> 00:46:24.480
I'm guessing, you know, maybe, yeah, I don't know.

00:46:24.480 --> 00:46:26.920
I guess maybe you've already got a project you're working on.

00:46:26.920 --> 00:46:28.740
You want to say like, I want this project to use that.

00:46:28.740 --> 00:46:30.840
Probably you could just copy the lock file.

00:46:30.840 --> 00:46:31.080
Right.

00:46:31.080 --> 00:46:31.800
Yeah.

00:46:31.800 --> 00:46:32.160
Yeah.

00:46:32.160 --> 00:46:32.200
Yeah.

00:46:32.200 --> 00:46:36.820
I mean, really, if you start a new project

00:46:36.820 --> 00:46:42.340
or new application, you're going to have a new manifest file, you know,

00:46:42.340 --> 00:46:46.440
pyproject.toml. Maybe you have the same dependencies, the top-level dependencies, or not,

00:46:46.440 --> 00:46:51.720
but the fully resolved set of dependencies that makes up your lock file,

00:46:51.720 --> 00:46:52.920
that can very easily be different.

00:46:52.920 --> 00:46:58.220
So I'm not exactly sure how you'd just port one over to another.

00:46:58.360 --> 00:46:59.740
One more bit from Tony.

00:46:59.740 --> 00:47:05.200
This is something that I now remember from Pants: it just looks through

00:47:05.200 --> 00:47:09.440
your code and if you use the import statement, regardless of whether you've put it in your

00:47:09.440 --> 00:47:13.540
requirements files, it'll figure out what your requirements file should have been.

00:47:13.640 --> 00:47:16.000
If you were a bad developer, basically.

00:47:16.000 --> 00:47:18.140
That's kind of cool.

00:47:18.140 --> 00:47:19.100
Just to see what it uses.
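
NOTE
Editor's sketch, not how Pants itself is implemented: the idea of working out requirements from import statements can be illustrated with the standard-library ast module. The function name is hypothetical, and mapping an import name to the distribution name on PyPI (the two often differ) is deliberately left out.
```python
import ast
# Collect the top-level module names a piece of source code imports.
def imported_top_level_modules(source: str) -> set:
    """Return the set of top-level module names imported by `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import (from . import x), which is local code
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules
SAMPLE = "import requests\nfrom cryptography.fernet import Fernet\nfrom . import local\nimport os.path\n"
print(sorted(imported_top_level_modules(SAMPLE)))
```
Comparing that set against what a requirements file declares would show imports that were never listed; standard-library modules such as os would need to be filtered out first.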

00:47:19.100 --> 00:47:19.540
Yeah.

00:47:19.540 --> 00:47:20.060
Nice.

00:47:20.060 --> 00:47:20.360
All right.

00:47:20.360 --> 00:47:22.380
On to the next thing.

00:47:22.380 --> 00:47:28.360
PEP 518, specifying minimum build system requirements for Python projects.

00:47:28.800 --> 00:47:28.980
Yeah.

00:47:28.980 --> 00:47:29.880
I'm guessing related.

00:47:29.880 --> 00:47:31.320
This is pyproject.toml.

00:47:31.320 --> 00:47:33.880
This is the, this is the PEP for that.

00:47:33.880 --> 00:47:34.900
Okay.

00:47:34.900 --> 00:47:39.500
There's not much to it other than to say that they've settled on that name, rejecting a bunch

00:47:39.500 --> 00:47:40.340
of other possibilities.

00:47:40.340 --> 00:47:45.080
And then they've got, you know, the few entries that are required, like for

00:47:45.080 --> 00:47:46.380
defining your build system.

00:47:46.380 --> 00:47:46.500
Excellent.

00:47:46.500 --> 00:47:47.060
Yeah.

00:47:47.060 --> 00:47:47.540
Yeah.

00:47:47.540 --> 00:47:50.560
You don't have to have a pyproject.toml for Python, but.

00:47:50.560 --> 00:47:51.040
No.

00:47:51.040 --> 00:47:51.420
Yeah.

00:47:51.420 --> 00:47:56.480
If you're building a Python library and you don't want to use setup.py, then you're much

00:47:56.480 --> 00:47:58.880
better off having a pyproject.toml, right?

00:47:58.880 --> 00:47:59.440
Yes.

00:47:59.440 --> 00:47:59.940
Yeah.

00:47:59.940 --> 00:48:00.380
Yeah.

00:48:00.380 --> 00:48:00.800
Yeah.

00:48:00.800 --> 00:48:00.900
Yeah.

00:48:00.900 --> 00:48:04.680
It's more on the library side. I mean, it's not that you can't use it on an application,

00:48:04.680 --> 00:48:07.260
but it's more required on the library side.

00:48:07.260 --> 00:48:07.700
Yeah.

00:48:07.700 --> 00:48:09.320
That's the thing.

00:48:09.320 --> 00:48:09.600
All right.

00:48:09.600 --> 00:48:14.020
So let's talk about some of the ways in which your packages might go wrong.

00:48:14.020 --> 00:48:18.080
We've already talked about typosquatting and we also talked about everything that's different.

00:48:18.080 --> 00:48:19.640
Yeah.

00:48:19.980 --> 00:48:23.180
But yeah, typosquatting is tricky.

00:48:23.180 --> 00:48:27.380
I think it's pretty well understood at this point, but maybe just tell people real

00:48:27.380 --> 00:48:29.440
quick to cover that base, you know?

00:48:29.440 --> 00:48:29.920
Sure.

00:48:29.920 --> 00:48:36.280
Typosquatting is, you know, publishing a package with a name that's similar to, but not

00:48:36.280 --> 00:48:39.800
the same as, an existing known-good package.

00:48:39.800 --> 00:48:40.540
Right.

00:48:40.540 --> 00:48:48.800
So like, instead of requests, maybe you get request without the S. Or, you know, one

00:48:48.800 --> 00:48:52.820
that gets me, because I make the typo all the time, is the cryptography package.

00:48:52.820 --> 00:48:57.060
Like, if I put you on the spot, would you know how to spell cryptography?

00:48:57.060 --> 00:49:02.780
I always get the first couple of letters, you know, jumbled up a bit, and there have

00:49:02.780 --> 00:49:10.160
been malicious packages published and then taken down that were, you know, spelled C-R-P-Y

00:49:10.160 --> 00:49:12.660
instead of C-R-Y-P cryptography.
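
NOTE
Editor's sketch: one way to catch near-miss names like the ones just mentioned is to compare each requested name against a list of packages you actually intend to use. This uses difflib from the standard library; the allowlist, the function name, and the 0.8 cutoff are illustrative assumptions, not a vetted detection rule.
```python
import difflib
# Hypothetical allowlist of names you actually depend on.
KNOWN_GOOD = ["requests", "cryptography", "pytest", "numpy"]
def possible_typosquat(name: str, known=KNOWN_GOOD, cutoff: float = 0.8):
    """Return the known-good name that `name` closely resembles, or None."""
    name = name.lower()
    if name in known:
        return None  # exact match: this is the real package
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None
for candidate in ["request", "crpytography", "pitest", "requests"]:
    print(candidate, possible_typosquat(candidate))
```
A similarity check like this only flags names for a human to look at; a low cutoff produces false alarms and a high one misses cleverer variations.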

00:49:12.660 --> 00:49:13.080
Right.

00:49:13.080 --> 00:49:13.600
Yeah.

00:49:13.600 --> 00:49:20.840
But the idea is that, you know, you can overlook a package because it looks like

00:49:20.840 --> 00:49:21.980
a good one.

00:49:21.980 --> 00:49:25.740
It's not necessarily that you're going to install it because you type

00:49:25.740 --> 00:49:26.100
it wrong,

00:49:26.580 --> 00:49:30.240
although that is, you know, one technique, right?

00:49:30.240 --> 00:49:34.620
The drive-by installs where someone just fat-fingers the package name.

00:49:34.620 --> 00:49:43.220
But really, having a typosquatted package is going to allow these threat actors to

00:49:43.220 --> 00:49:48.960
be a little more stealthy in their inclusion of that package in legitimate code

00:49:48.960 --> 00:49:52.420
reviews and commits and dependencies of dependencies.

00:49:52.420 --> 00:49:52.860
Right.

00:49:52.860 --> 00:49:57.140
And so the other thing that goes with typosquatting, I don't know if I had a link

00:49:57.140 --> 00:50:00.480
for you there yet, is starjacking.

00:50:00.480 --> 00:50:05.980
So a lot of times, if you're going to typosquat on a known-good package... okay, there

00:50:05.980 --> 00:50:06.480
it is.

00:50:06.480 --> 00:50:12.900
You know, these threat actors, they just straight up copy the

00:50:12.900 --> 00:50:14.800
known-good project, right?

00:50:14.800 --> 00:50:19.120
They just clone the repository and then change the package name.

00:50:19.120 --> 00:50:26.700
And then when they post the package to PyPI, for instance,

00:50:26.700 --> 00:50:31.180
the metadata that goes with the package still exists, right?

00:50:31.180 --> 00:50:36.600
So on PyPI, for a given package, you can see on the left-hand side it shows, like, some

00:50:36.600 --> 00:50:37.460
statistics.

00:50:37.460 --> 00:50:42.840
If the URL was given to, like, a GitHub-hosted

00:50:42.840 --> 00:50:48.800
project, for instance, it'll go in there and tell you how many stars.

00:50:48.800 --> 00:50:50.220
Right, right, right.

00:50:50.220 --> 00:50:51.140
You know, how many downloads.

00:50:51.140 --> 00:50:53.520
That's actually a signal that it seems like it should be good, right?

00:50:53.520 --> 00:50:54.140
It'll have.

00:50:54.140 --> 00:50:54.700
Yeah.

00:50:54.700 --> 00:50:55.160
A lot.

00:50:55.160 --> 00:51:01.400
And that's what star jacking is doing is just copying the metadata of a known good package.

00:51:01.400 --> 00:51:04.060
So that on first look.

00:51:04.060 --> 00:51:05.100
Yeah, there you go.

00:51:05.100 --> 00:51:05.820
You can see.

00:51:05.820 --> 00:51:11.920
Like I did pull up pytest and it says statistics, GitHub statistics, 11,000 stars, 2,000 forks.

00:51:12.000 --> 00:51:12.880
Okay, this is legit.

00:51:12.880 --> 00:51:13.660
Let's install it.

00:51:13.660 --> 00:51:14.060
Right.

00:51:14.060 --> 00:51:20.540
So I could go clone the pytest repository right now, change the name to pytest spelled P-I-T-E-S-T.

00:51:20.540 --> 00:51:21.240
Mm-hmm.

00:51:21.240 --> 00:51:23.200
And then push that to PyPI.

00:51:23.200 --> 00:51:24.320
The math version of testing, yeah.

00:51:24.320 --> 00:51:28.420
And you're going to get these same statistics and you're going to get the same maintainers

00:51:28.420 --> 00:51:33.460
that you see if you scroll down a little bit in the metadata.

00:51:34.040 --> 00:51:34.260
Yeah.

00:51:34.260 --> 00:51:35.400
So you get the maintainers list.

00:51:35.400 --> 00:51:42.680
All of that metadata that you enter in your pyproject.toml or setup.py file

00:51:42.680 --> 00:51:45.920
gets read here on PyPI and just published.

00:51:45.920 --> 00:51:48.060
So you can fake people out.

00:51:48.060 --> 00:51:48.260
Yeah.

00:51:48.260 --> 00:51:48.580
Yeah.

00:51:48.580 --> 00:51:48.980
Yeah.

00:51:48.980 --> 00:51:51.200
That's actually really, okay.

00:51:51.200 --> 00:51:53.020
Well, there's a new terrifying thing that I hadn't thought about.

00:51:53.020 --> 00:51:53.140
Yeah.

00:51:53.140 --> 00:51:53.460
Yeah.

00:51:53.460 --> 00:51:58.400
So, starjacking and typosquatting, where you just take a known-good package, clone it,

00:51:58.440 --> 00:52:04.040
and then maybe you make a change to, you know, an existing function. You know, the

00:52:04.040 --> 00:52:07.720
function does what it's supposed to do, but it also does some other stuff like ship off

00:52:07.720 --> 00:52:14.840
secrets from your CI server. Or, you know, it could lie dormant and wait for

00:52:14.840 --> 00:52:19.120
some sort of production environment and grab some SSH keys or something terrible.

00:52:19.120 --> 00:52:20.040
Yeah.

00:52:20.040 --> 00:52:20.360
Yeah.

00:52:20.360 --> 00:52:20.580
Yeah.

00:52:20.580 --> 00:52:20.620
Yeah.

00:52:20.620 --> 00:52:24.380
That's the other one, dependency confusion.

00:52:24.380 --> 00:52:24.640
Okay.

00:52:24.640 --> 00:52:26.220
That's the next one you've got up.

00:52:26.220 --> 00:52:26.340
Yeah.

00:52:26.340 --> 00:52:30.120
This is the one that's similar to what we talked about before

00:52:30.120 --> 00:52:34.140
with, I can't remember, but I said we're going to come back to this.

00:52:34.140 --> 00:52:39.480
So here it is again. This is dependency confusion, where if you get the wrong version

00:52:39.480 --> 00:52:45.240
or the wrong name, it could actually... you try to be safe by having a whitelisted list or

00:52:45.240 --> 00:52:51.640
say... well, so this is one where it's the same package name, different source

00:52:51.640 --> 00:52:53.060
of where you acquire that package.

00:52:53.060 --> 00:52:53.580
Yes.

00:52:53.580 --> 00:52:59.980
So these attacks are mostly against, like, companies, enterprises.

00:52:59.980 --> 00:53:01.780
This is the enterprise attack.

00:53:01.780 --> 00:53:02.220
Yeah.

00:53:02.220 --> 00:53:02.680
Yeah.

00:53:02.680 --> 00:53:08.200
So it's an Artifactory, and we only put our stuff there, and we're going to call

00:53:08.200 --> 00:53:11.960
it like, you know, international company underscore data access.

00:53:11.960 --> 00:53:12.660
That's right.

00:53:12.900 --> 00:53:17.980
And it's tricky, because if you don't

00:53:17.980 --> 00:53:22.800
have your build system and your CI server set up in a way to

00:53:22.800 --> 00:53:28.520
install your dependencies in the proper order, like excluding public registries first and only

00:53:28.520 --> 00:53:34.140
looking for packages in your private registry, then it's very easy, especially with pip, which

00:53:34.140 --> 00:53:40.440
defaults to looking on PyPI, the public registry, first, and then only falling back to your

00:53:40.440 --> 00:53:42.900
extra-index-url specifications second,

00:53:42.900 --> 00:53:50.380
that if someone had the knowledge or just guessed at the package

00:53:50.380 --> 00:53:54.660
name that you had published on your internal registry, and then they made their own package

00:53:54.660 --> 00:53:58.500
with the same name, but put it on PyPI, that's the one that's going to get installed.
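
NOTE
Editor's sketch of why the public copy can win. As far as pip's documentation describes it, pip gives no index priority: candidates from --index-url and every --extra-index-url are pooled, and the best match (normally the highest version) is chosen. The toy resolver below models only that one rule; the registry contents, version numbers, and function names are hypothetical.
```python
def parse_version(text: str) -> tuple:
    """Turn '1.2.0' into (1, 2, 0); real version parsing is more involved."""
    return tuple(int(part) for part in text.split("."))
def pick_candidate(indexes: dict, package: str):
    """Pick the highest version of `package` across all indexes, whichever index holds it."""
    candidates = [
        (parse_version(version), index_name)
        for index_name, packages in indexes.items()
        for version in packages.get(package, [])
    ]
    if not candidates:
        return None
    version, index_name = max(candidates)
    return index_name, ".".join(str(part) for part in version)
# Hypothetical registries: the internal one holds the real package, the public one an impostor.
INDEXES = {
    "internal": {"international_company_data_access": ["1.2.0", "1.3.0"]},
    "public": {"international_company_data_access": ["99.0.0"]},
}
print(pick_candidate(INDEXES, "international_company_data_access"))
```
With both registries in play the impostor's 99.0.0 is selected; with only the internal registry configured, the internal 1.3.0 would be.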

00:53:58.500 --> 00:54:05.740
And there was, like, a whole series of, you know, bug bounties that were claimed over this

00:54:05.740 --> 00:54:10.880
back a few years ago, because people just went around, you know, guessing at internal package

00:54:10.880 --> 00:54:13.220
names, or maybe they used to work there or knew people.

00:54:13.220 --> 00:54:14.260
Yeah.

00:54:14.260 --> 00:54:14.380
Yeah.

00:54:14.380 --> 00:54:14.580
Yeah.

00:54:14.580 --> 00:54:18.340
I'll pay a hundred bucks just to share your requirements.txt with me.

00:54:18.340 --> 00:54:18.900
Right.

00:54:18.900 --> 00:54:19.300
Right.

00:54:19.300 --> 00:54:19.580
Right.

00:54:19.580 --> 00:54:19.860
Right.

00:54:20.280 --> 00:54:27.220
You know, it's kind of extra sneaky because it only affects people.

00:54:27.220 --> 00:54:32.220
It only affects people who are going out of their way to be more secure, right?

00:54:32.220 --> 00:54:35.860
They're going out of their way to say, we're only going to, we're going to actually set up

00:54:35.860 --> 00:54:39.120
a whole server and we're going to whitelist a bunch of stuff.

00:54:39.120 --> 00:54:41.900
You can only ask for the names of the things on this server.

00:54:41.900 --> 00:54:43.140
And, ah, yes.

00:54:43.140 --> 00:54:43.600
Yes.

00:54:43.600 --> 00:54:49.900
And that might still work if you limit it to your internal registry only or a mirror

00:54:49.900 --> 00:54:53.860
perhaps of the public registries.

00:54:53.860 --> 00:55:00.960
But it's pretty easy to create your own internal copy, download a bunch of external

00:55:00.960 --> 00:55:04.260
ones and mirror them locally and say like, these are the ones that are pre-approved at

00:55:04.260 --> 00:55:04.760
our company.

00:55:04.760 --> 00:55:05.760
Nothing else.

00:55:05.760 --> 00:55:06.300
Yeah.

00:55:06.300 --> 00:55:06.620
Yeah.

00:55:06.620 --> 00:55:10.620
I've worked in an environment where that's exactly what we did.

00:55:10.620 --> 00:55:13.420
And I think there is merit to that.

00:55:13.420 --> 00:55:19.420
You just have to know that anything you're mirroring to the trusted internal network is in

00:55:19.420 --> 00:55:20.500
fact secure.

00:55:20.500 --> 00:55:21.320
You know?

00:55:21.320 --> 00:55:21.680
Yeah.

00:55:21.680 --> 00:55:22.000
Yeah.

00:55:22.000 --> 00:55:22.420
For sure.

00:55:22.420 --> 00:55:29.440
I think it doesn't really make sense except for a few very rare cases to say you cannot

00:55:29.440 --> 00:55:30.640
use external dependencies.

00:55:30.640 --> 00:55:31.280
Right.

00:55:31.280 --> 00:55:31.520
Right.

00:55:31.520 --> 00:55:36.440
You're just saying what we want is to not build software while the rest of the world

00:55:36.440 --> 00:55:37.700
does, you know?

00:55:38.060 --> 00:55:38.360
Yeah.

00:55:38.360 --> 00:55:39.480
Because that's part of the magic.

00:55:39.480 --> 00:55:42.900
We just saw there's over half a million libraries you can choose from.

00:55:42.900 --> 00:55:48.920
When you say we, we have zero of those, you're really, really constraining the type of software

00:55:48.920 --> 00:55:51.200
and the velocity at which you can build.

00:55:51.200 --> 00:55:52.100
Yeah.

00:55:52.100 --> 00:55:52.580
Yeah.

00:55:53.340 --> 00:55:53.740
Yeah.

00:55:53.740 --> 00:55:58.640
It reminds me of, there's that line, you know, like, why do you rob banks?

00:55:58.640 --> 00:56:00.080
Cause they have the money.

00:56:00.080 --> 00:56:01.200
Cause that's where the money is.

00:56:01.200 --> 00:56:01.420
Right.

00:56:01.420 --> 00:56:05.800
It's like, well, why are attackers going after open source software now?

00:56:05.800 --> 00:56:10.560
Like, well, that's where it's easiest to get arbitrary code to run.

00:56:11.620 --> 00:56:12.820
That's where developers are.

00:56:12.820 --> 00:56:14.300
To be fair, though,

00:56:14.300 --> 00:56:16.260
it's not only that, right?

00:56:16.260 --> 00:56:20.420
There's SolarWinds, which really had almost nothing to do with open source, but it had

00:56:20.420 --> 00:56:23.360
to do with CI/CD systems and other sneakiness.

00:56:23.360 --> 00:56:23.720
Right.

00:56:23.720 --> 00:56:23.960
Yeah.

00:56:23.960 --> 00:56:24.340
Yeah.

00:56:24.340 --> 00:56:24.880
Yeah.

00:56:24.880 --> 00:56:29.640
And got into places that, you know, instead of getting into libraries, you get into the

00:56:29.640 --> 00:56:33.820
build system and you just give it a little extra, a little extra include tag there.

00:56:33.820 --> 00:56:37.320
Bringing that deal out, like you said, right.

00:56:37.320 --> 00:56:42.020
So dependency confusion is sneaky because you're asking for a local version off a local

00:56:42.020 --> 00:56:42.400
server.

00:56:42.400 --> 00:56:47.580
It doesn't exist on PyPI, but if it could be made to exist on PyPI, all of a sudden that

00:56:47.580 --> 00:56:48.220
gets installed.

00:56:48.220 --> 00:56:50.220
That's potentially, that's not good.

00:56:50.220 --> 00:56:50.600
Potentially.

00:56:50.600 --> 00:56:50.840
Yeah.

00:56:50.840 --> 00:56:51.060
Yeah.

00:56:51.060 --> 00:56:54.340
That's how it works in all the default cases.

00:56:54.340 --> 00:56:59.900
And it's pretty tricky, actually, to do it in the correct order and

00:56:59.900 --> 00:57:01.340
exclude those public registries.

00:57:01.340 --> 00:57:01.980
Yeah.

00:57:01.980 --> 00:57:08.180
What I do to help this is I just run the UUID command to get one of those

00:57:08.180 --> 00:57:10.760
16-digit arbitrary hex things.

00:57:10.760 --> 00:57:12.680
And I just name all my libraries that.

00:57:12.680 --> 00:57:16.280
And so it's like, oh, you have the F3DC.

00:57:16.280 --> 00:57:16.560
Yeah.

00:57:16.560 --> 00:57:18.560
That's the, that's the API one.

00:57:18.560 --> 00:57:18.780
That's right.

00:57:18.780 --> 00:57:19.600
That's important.

00:57:19.600 --> 00:57:20.020
That, right.

00:57:20.020 --> 00:57:21.280
No one is going to do this.

00:57:21.280 --> 00:57:23.060
It's such a safe space.

00:57:23.060 --> 00:57:23.520
I tell you.

00:57:23.520 --> 00:57:25.680
All right.

00:57:25.680 --> 00:57:26.380
On to the next one.

00:57:26.380 --> 00:57:27.220
That, that would work.

00:57:28.180 --> 00:57:30.780
Expired author domains.

00:57:30.780 --> 00:57:32.200
This is super sneaky.

00:57:32.200 --> 00:57:33.180
Yeah.

00:57:33.180 --> 00:57:33.640
Yeah.

00:57:33.640 --> 00:57:41.300
So this is one that, you know, might be less of a factor now.

00:57:41.300 --> 00:57:46.320
I think it was just earlier this month that PyPI enforced two-factor authentication

00:57:46.320 --> 00:57:47.840
for all their users.

00:57:48.920 --> 00:57:59.140
But a lot of sites and, you know, even PyPI, I think before this month have, you know,

00:57:59.140 --> 00:58:04.020
password reset features where if, if you lose access to your account or you forget your password,

00:58:04.020 --> 00:58:06.400
just, you know, send me an email and reset your password.

00:58:07.400 --> 00:58:13.460
But it's very possible that people, you know, years ago submitted a package.

00:58:13.460 --> 00:58:15.620
They don't maintain it anymore.

00:58:15.620 --> 00:58:20.260
They submitted it under an old email account that has expired.

00:58:20.260 --> 00:58:20.560
Right.

00:58:20.560 --> 00:58:22.300
Maybe they had some domain.

00:58:22.300 --> 00:58:23.300
Yeah.

00:58:23.300 --> 00:58:23.820
Special.

00:58:23.820 --> 00:58:26.900
It doesn't work that well for Gmail or Outlook.

00:58:26.900 --> 00:58:27.280
Right.

00:58:27.280 --> 00:58:27.440
Right.

00:58:27.440 --> 00:58:28.080
If you had.

00:58:28.080 --> 00:58:28.820
Custom domain.

00:58:28.820 --> 00:58:34.280
If you had a custom domain and, as would be awesome, have your own, you know, Michael at

00:58:34.280 --> 00:58:36.060
talkpython.fm, that kind of thing.

00:58:36.320 --> 00:58:36.680
Yeah.

00:58:36.680 --> 00:58:37.080
Yeah.

00:58:37.080 --> 00:58:42.180
Say you win the lottery and, you know, decide to quit your day job.

00:58:42.180 --> 00:58:42.520
Yeah.

00:58:42.520 --> 00:58:48.180
And then you let your domain expire and, well, maybe there's still a linkage for the talkpython

00:58:48.180 --> 00:58:50.220
domain to PyPI.

00:58:50.220 --> 00:58:54.380
And then I go and buy that domain and, you know, request a password.

00:58:54.380 --> 00:58:54.760
Set up the server.

00:58:54.760 --> 00:58:55.160
Yeah.

00:58:55.160 --> 00:58:56.200
Account reset.

00:58:56.200 --> 00:58:56.380
Set up email.

00:58:56.380 --> 00:58:56.960
Yeah.

00:58:56.960 --> 00:59:02.260
And then now I can publish new versions of the packages there.

00:59:02.260 --> 00:59:02.720
Yeah.

00:59:02.720 --> 00:59:03.220
Yeah.

00:59:03.220 --> 00:59:03.780
It's not good.

00:59:03.780 --> 00:59:04.200
Yeah.

00:59:04.200 --> 00:59:04.620
Yeah.

00:59:04.620 --> 00:59:09.020
So I don't really know what to do about that one, but there's an amazing

00:59:09.020 --> 00:59:10.740
joke that I found on Mastodon.

00:59:11.100 --> 00:59:13.940
Somebody posted, sit here.

00:59:13.940 --> 00:59:17.200
It's two big red buttons.

00:59:17.200 --> 00:59:18.800
Think Ren and Stimpy or whatever.

00:59:18.800 --> 00:59:22.480
And one of the red buttons says, admit to yourself that your dream is dead.

00:59:22.480 --> 00:59:26.620
The other one says pay $12 for domain renewal.

00:59:26.620 --> 00:59:26.900
Right.

00:59:27.260 --> 00:59:33.000
I mean, it's funny, but there's plenty of people who will get a domain and I totally go in and

00:59:33.000 --> 00:59:33.820
then it's like, you know what?

00:59:33.820 --> 00:59:35.520
I haven't done anything with it in five years.

00:59:35.520 --> 00:59:36.880
I'm not paying another 12 bucks.

00:59:36.880 --> 00:59:39.620
But if they had set up an account under that, right?

00:59:39.620 --> 00:59:40.700
This is what you're talking about.

00:59:40.700 --> 00:59:41.100
Yeah.

00:59:41.100 --> 00:59:41.360
Yeah.

00:59:41.360 --> 00:59:42.300
Yeah, exactly.

00:59:42.300 --> 00:59:43.380
Yeah.

00:59:43.640 --> 00:59:47.720
That's why you got to buy your domains for that hundred year renewal period.

00:59:47.720 --> 00:59:48.560
Exactly.

00:59:48.560 --> 00:59:50.700
Take out that loan.

00:59:50.700 --> 00:59:51.460
You get to the loan.

00:59:51.460 --> 00:59:54.520
All right.

00:59:54.520 --> 00:59:55.460
We're getting short on time here.

00:59:55.460 --> 00:59:59.040
Let's just go through, I'll just list off a few real quick.

00:59:59.040 --> 01:00:00.020
Maybe we do lightning round.

01:00:00.020 --> 01:00:00.260
Okay.

01:00:00.260 --> 01:00:00.580
Okay.

01:00:00.580 --> 01:00:01.860
Unverifiable dependency.

01:00:01.860 --> 01:00:02.580
Okay.

01:00:02.580 --> 01:00:11.240
These are for specifying dependencies that are not necessarily published to PyPI, right?

01:00:11.240 --> 01:00:15.880
So that maybe you're pointing to a GitHub repository.

01:00:15.880 --> 01:00:22.240
You know, pip calls these VCS project URLs, you know, if you look in their help output.

01:00:22.240 --> 01:00:22.720
Yeah.

01:00:22.720 --> 01:00:27.500
It's like pip install git+https:// to a thing that has a pyproject.toml.

01:00:27.500 --> 01:00:27.980
Yeah.

01:00:27.980 --> 01:00:30.860
And that thing, it can point to a repository.

01:00:31.100 --> 01:00:32.360
Maybe it points to a tag.

01:00:32.360 --> 01:00:34.360
Maybe it points to a branch.

01:00:34.360 --> 01:00:37.480
None of that is stable, right?

01:00:37.480 --> 01:00:40.400
Like, the tag could change out from under you.

01:00:40.400 --> 01:00:44.480
The code that's related to that tag could change out from under you.

01:00:44.480 --> 01:00:49.120
The code at the branch you're pointing to could change while the name remains the same.

01:00:49.120 --> 01:00:52.340
So, you know, those are risky for that reason, right?

01:00:52.340 --> 01:00:56.800
If you're not pinning to a very specific version or a very specific hash, right?

01:00:56.800 --> 01:00:59.980
If you're going to point to a repository or a git URL.
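
NOTE
Editor's sketch: a quick check for whether a VCS requirement URL is pinned to a full commit id rather than a branch or tag. The URL shape (git+scheme://host/path@ref) follows pip's VCS support; the function name and the example repository are hypothetical, and a ref that merely looks like 40 hex characters is taken at face value.
```python
import re
# A full 40-character hexadecimal SHA-1 commit id.
FULL_COMMIT = re.compile(r"[0-9a-f]{40}")
def pinned_commit(url: str):
    """Return the commit id a VCS URL is pinned to, or None for a branch, tag, or no ref."""
    location = url.split("#", 1)[0]      # drop any #egg=... or #subdirectory=... fragment
    path = location.split("://", 1)[-1]  # drop the scheme, e.g. git+https
    path = path.partition("/")[2]        # drop host (and user@host) so only the repo path remains
    if "@" not in path:
        return None
    ref = path.rsplit("@", 1)[1]
    return ref if FULL_COMMIT.fullmatch(ref) else None
COMMIT = "0123456789abcdef0123456789abcdef01234567"
print(pinned_commit("git+https://github.com/example/project.git@" + COMMIT))
print(pinned_commit("git+https://github.com/example/project.git@main"))
```
A branch or tag name yields None here because, as discussed, the code behind that name can change while the name stays the same.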

01:00:59.980 --> 01:01:00.260
Interesting.

01:01:00.700 --> 01:01:00.920
Yeah.

01:01:00.920 --> 01:01:00.960
Yeah.

01:01:00.960 --> 01:01:01.960
Make sure it's true.

01:01:01.960 --> 01:01:06.960
I've gotten to feel a lot of times like the hash is maybe a little bit redundant given the immutability of PyPI.

01:01:06.960 --> 01:01:11.080
But if you're pointing at something like this, then maybe all of a sudden you really do want that, right?

01:01:11.080 --> 01:01:11.420
Yes.

01:01:11.420 --> 01:01:11.800
For sure.

01:01:11.800 --> 01:01:12.280
Yeah.

01:01:12.280 --> 01:01:12.500
Okay.

01:01:12.500 --> 01:01:14.880
Repo jacking?

01:01:15.400 --> 01:01:15.780
Yeah.

01:01:15.780 --> 01:01:20.820
This is similar to the expired author domain, right?

01:01:20.820 --> 01:01:39.340
So if someone was pointing to one of those git dependencies, a VCS project URL, as pip calls it, and that account went dormant or expired, relinquished, whatever, and someone else took it over, then yeah, they can now dictate.

01:01:39.340 --> 01:01:40.600
What's there, yeah.

01:01:40.600 --> 01:01:41.100
Yeah, exactly.

01:01:41.100 --> 01:01:42.640
A lot of people are acquiring.

01:01:42.640 --> 01:01:43.940
All right.

01:01:44.120 --> 01:01:50.300
And then maybe last bit, get a chance to talk a bit about your Phylum CI project.

01:01:50.300 --> 01:01:54.460
I do want to point out really quick, though, that Phylum was a sponsor of the show.

01:01:54.460 --> 01:01:55.020
Yes.

01:01:55.020 --> 01:01:55.620
A while ago.

01:01:55.760 --> 01:01:57.820
But this is not a sponsored episode.

01:01:57.820 --> 01:02:03.020
This is just you and I had been talking prior to that, actually, and decided to put this show together.

01:02:03.020 --> 01:02:06.700
So just to be clear, but let's talk about this project you guys got anyway.

01:02:06.700 --> 01:02:07.900
Yeah.

01:02:07.900 --> 01:02:08.240
Yeah.

01:02:08.240 --> 01:02:14.520
So you can pip install phylum right now, or like I prefer, pipx install phylum.

01:02:14.520 --> 01:02:14.800
Yeah.

01:02:14.800 --> 01:02:15.900
I love pipx.

01:02:15.900 --> 01:02:16.460
It's awesome.

01:02:16.460 --> 01:02:17.340
Yeah, me too.

01:02:17.340 --> 01:02:17.740
Yeah.

01:02:17.740 --> 01:02:19.680
I think I heard about it from you, actually.

01:02:19.680 --> 01:02:23.040
So the circle goes.

01:02:23.240 --> 01:02:23.940
Yes, yes.

01:02:23.940 --> 01:02:26.020
So this package, it does two main things.

01:02:26.020 --> 01:02:30.260
It'll expose two entry points.

01:02:30.260 --> 01:02:40.140
One of them is called phylum-init, and that'll get you the Phylum command line interface, written in Rust but installed with Python.

01:02:40.140 --> 01:02:45.760
It'll get you the Phylum CLI locally.

01:02:45.760 --> 01:02:48.320
And then the other one is called phylum-ci.

01:02:48.320 --> 01:02:53.060
That's just a catch-all entry point, the thing that gets exposed through our Docker container

01:02:53.060 --> 01:02:56.260
to handle almost all of our integrations.

01:02:56.260 --> 01:03:03.900
So if you want to monitor your PRs on GitHub, for instance, we've got an integration for that.

01:03:03.900 --> 01:03:04.320
Nice.

01:03:04.320 --> 01:03:07.400
So the idea is basically that I could set this up in GitHub.

01:03:07.400 --> 01:03:09.520
A PR comes in, I could set up an action.

01:03:09.520 --> 01:03:13.440
Phylum will scan it for known mischievousness.

01:03:13.440 --> 01:03:14.100
That's right.

01:03:14.360 --> 01:03:17.380
And make that part of the PR, maybe even block it out, right?

01:03:17.380 --> 01:03:18.320
Yeah, exactly.

01:03:18.320 --> 01:03:28.160
It'll fail your build if you don't pass your default policy or established policy on any of your given lock files or manifests.

01:03:28.160 --> 01:03:29.820
We deal with manifests as well.

01:03:30.200 --> 01:03:31.040
And you mentioned GitHub.

01:03:31.040 --> 01:03:33.460
So even with GitHub, we went a step further.

01:03:33.460 --> 01:03:34.840
We have an app as well.

01:03:34.840 --> 01:03:36.980
So you don't even have to modify a workflow.

01:03:36.980 --> 01:03:42.680
You could just install a GitHub app and automatically monitor your repositories.

01:03:42.680 --> 01:03:46.860
But a lot of the other ecosystems don't have that.

01:03:46.860 --> 01:03:49.400
So we just provide Docker containers.

01:03:49.400 --> 01:03:51.180
I love the Docker container.

01:03:51.180 --> 01:03:54.600
So use Docker run against your code or whatever.

01:03:54.600 --> 01:03:56.320
So yeah.

01:03:56.320 --> 01:04:02.500
And then there's even a pre-commit hook we expose as well.

01:04:02.500 --> 01:04:03.100
Nice.

01:04:03.100 --> 01:04:06.880
I genuinely don't know the answer to this question.

01:04:06.880 --> 01:04:07.780
Does this cost money?

01:04:07.780 --> 01:04:08.640
No.

01:04:08.640 --> 01:04:10.760
We have anyone.

01:04:10.760 --> 01:04:12.700
Anyone can sign up for free.

01:04:12.700 --> 01:04:17.640
There's a community edition where you can have up to five projects.

01:04:17.640 --> 01:04:18.760
Okay, cool.

01:04:18.780 --> 01:04:19.760
You guys have to eat.

01:04:19.760 --> 01:04:21.360
There must be some way you charge for something.

01:04:21.360 --> 01:04:21.700
Oh, exactly.

01:04:21.700 --> 01:04:22.200
Yeah, yeah.

01:04:22.200 --> 01:04:24.680
So there's the paid version, right?

01:04:24.680 --> 01:04:26.680
Which, you know, unlimited projects.

01:04:26.680 --> 01:04:29.640
You get access to group-based management.

01:04:29.640 --> 01:04:30.860
You know, there's a few extra features.

01:04:30.860 --> 01:04:32.040
It's a freemium model.

01:04:32.040 --> 01:04:34.840
More of a Teams enterprise-y angle.

01:04:34.840 --> 01:04:35.640
Yeah, yeah.

01:04:35.640 --> 01:04:46.280
But for this audience, I mean, I would love if everyone just went that little extra step of securing their open source software and, you know, go with the free option.

01:04:46.280 --> 01:04:47.560
I'm not trying to sell you anything here.

01:04:47.680 --> 01:04:52.920
You know, monitor your manifests, your lock files.

01:04:52.920 --> 01:04:55.320
Make sure that you remain secure.

01:04:55.320 --> 01:05:01.760
You're not exposing your secrets because that's what we're finding now is that developers are the new high-value targets.

01:05:01.760 --> 01:05:02.380
Yeah.

01:05:03.020 --> 01:05:07.300
That's what attackers want to go after because we know that developers, they have the secrets.

01:05:07.300 --> 01:05:09.020
They've got the keys, you know.

01:05:09.480 --> 01:05:15.200
We write the code that then gets run on the production server inside the firewalls.

01:05:15.200 --> 01:05:15.880
Yeah.

01:05:15.880 --> 01:05:16.160
Yeah.

01:05:16.160 --> 01:05:20.060
We have all the access, all the secrets, all the keys.

01:05:20.060 --> 01:05:28.940
So, you know, if you can find a way to get arbitrary code from strangers to run on developer systems, you're going to have a much better chance.

01:05:28.940 --> 01:05:29.400
We have a good time.

01:05:29.400 --> 01:05:29.800
Yeah.

01:05:29.800 --> 01:05:30.900
We have a good time.

01:05:30.900 --> 01:05:32.520
By that, I mean having a bad time.

01:05:32.520 --> 01:05:33.040
Right.

01:05:33.600 --> 01:05:33.800
Yeah.

01:05:33.800 --> 01:05:35.420
Doing bad things.

01:05:35.420 --> 01:05:35.840
Okay.

01:05:35.840 --> 01:05:36.900
Let's not do that.

01:05:36.900 --> 01:05:37.380
Awesome.

01:05:37.380 --> 01:05:38.500
Well, excellent work.

01:05:38.500 --> 01:05:40.460
I think probably we'll kind of just leave it there.

01:05:40.460 --> 01:05:42.760
We're pretty much out of time for the rest of the stuff.

01:05:42.920 --> 01:05:44.240
But close it out for us, Charlie.

01:05:44.240 --> 01:05:51.340
People maybe have a few new tools to work with, and also techniques, but maybe they're also a little freaked out.

01:05:51.340 --> 01:05:51.840
What do you tell them?

01:05:51.840 --> 01:06:04.480
I recommend that everyone restrict their use of dependencies to lock files and then carefully gate or guard the inclusion of new lock files or updates of existing ones.

01:06:04.480 --> 01:06:07.900
Or sorry, dependencies in those lock files with careful analysis.

01:06:08.860 --> 01:06:13.480
Don't allow arbitrary code to run anywhere in your development process and give Phylum a try.

01:06:13.480 --> 01:06:15.340
We've got the free community edition.

01:06:15.340 --> 01:06:22.620
We will provide that analysis and ensure that you don't have malware running on your system through bad dependencies.
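
NOTE
A minimal sketch of the lock-file advice above, assuming a pip-style requirements file where "locked" means an exact == pin plus at least one --hash entry.
The helper name unlocked_requirements, the sample packages, and the placeholder hashes are hypothetical, and this is not how Phylum performs its analysis.
```python
import re
# Assumption: a "locked" requirement has an exact == pin and at least one --hash.
PIN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._\[\],-]*==[^\s;\\]+")
def unlocked_requirements(text: str) -> list[str]:
    """Return the requirement specifiers that lack an exact pin or a hash."""
    # Join backslash continuations so each requirement is one logical line.
    logical = text.replace("\\\n", " ")
    problems = []
    for raw in logical.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw.split(" #")[0].strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue  # skip blanks, comments, and option lines such as -r
        if not PIN.match(line) or "--hash=" not in line:
            problems.append(line.split()[0])
    return problems
# Hypothetical sample; the hash value is a placeholder, not a real digest.
SAMPLE = """\
requests==2.31.0 \\
    --hash=sha256:aaaa
flask>=2.0
rich==13.7.0
"""
print(unlocked_requirements(SAMPLE))
```
A check like this could run in CI or a pre-commit hook to reject unpinned or unhashed entries; it only gates the shape of the file and says nothing about whether a pinned package is malicious.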

01:06:22.620 --> 01:06:23.280
Awesome.

01:06:23.280 --> 01:06:23.860
All right.

01:06:23.860 --> 01:06:27.320
Well, it's been very interesting and a lot of new things to think about.

01:06:27.320 --> 01:06:28.240
So thanks for being here.

01:06:28.240 --> 01:06:28.980
Thank you, Michael.

01:06:28.980 --> 01:06:29.520
Yep.

01:06:29.520 --> 01:06:29.840
See you later.

01:06:29.840 --> 01:06:33.220
This has been another episode of Talk Python To Me.

01:06:33.220 --> 01:06:35.040
Thank you to our sponsors.

01:06:35.040 --> 01:06:36.640
Be sure to check out what they're offering.

01:06:36.640 --> 01:06:38.060
It really helps support the show.

01:06:38.400 --> 01:06:40.100
Take some stress out of your life.

01:06:40.100 --> 01:06:45.880
Get notified immediately about errors and performance issues in your web or mobile applications with Sentry.

01:06:45.880 --> 01:06:50.880
Just visit talkpython.fm/sentry and get started for free.

01:06:50.880 --> 01:06:54.480
And be sure to use the promo code talkpython, all one word.

01:06:54.480 --> 01:06:58.120
Mailtrap, an email delivery platform that developers love.

01:06:58.120 --> 01:07:05.900
Use their email sandbox to inspect and debug emails in staging, dev, and QA environments before sending them to recipients in production.

01:07:06.540 --> 01:07:10.280
Try Mailtrap for free at talkpython.fm/Mailtrap.

01:07:10.280 --> 01:07:11.680
Want to level up your Python?

01:07:11.680 --> 01:07:15.720
We have one of the largest catalogs of Python video courses over at Talk Python.

01:07:15.720 --> 01:07:20.900
Our content ranges from true beginners to deeply advanced topics like memory and async.

01:07:20.900 --> 01:07:23.560
And best of all, there's not a subscription in sight.

01:07:23.560 --> 01:07:26.480
Check it out for yourself at training.talkpython.fm.

01:07:26.880 --> 01:07:28.580
Be sure to subscribe to the show.

01:07:28.580 --> 01:07:31.360
Open your favorite podcast app and search for Python.

01:07:31.360 --> 01:07:32.660
We should be right at the top.

01:07:32.660 --> 01:07:42.020
You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

01:07:42.420 --> 01:07:45.000
We're live streaming most of our recordings these days.

01:07:45.000 --> 01:07:52.780
If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

01:07:52.780 --> 01:07:54.880
This is your host, Michael Kennedy.

01:07:54.880 --> 01:07:56.180
Thanks so much for listening.

01:07:56.180 --> 01:07:57.320
I really appreciate it.

01:07:57.320 --> 01:07:59.240
Now get out there and write some Python code.

01:07:59.240 --> 01:08:20.240
I'll see you next time.