Python Packaging and PyPI in 2022


PyPI has been in the news for a bunch of reasons lately. Many of them good. But also, some with a bit of drama or mixed reactions. On this episode, we have Dustin Ingram, one of the PyPI maintainers and one of the directors of the PSF, here to discuss the whole 2FA story, securing the supply chain, and plenty more related topics. This is another important episode that people deeply committed to the Python space will want to hear.

Background noise warning: Just wanted to apologize for a bit of background noise on my end (Dustin had amazing audio). We had construction at our place, which would have been fine. But work started on the ceiling right under my desk making much more noise than expected. I think we generally have it cleaned up, but there may be a few sounds sneaking through. Thanks for the understanding. :)


Episode Deep Dive

Guest background

Dustin Ingram is one of the maintainers of the Python Package Index (PyPI) and serves on the board of directors at the Python Software Foundation. He also works at Google on open-source security initiatives, focusing on projects that improve the security of Python’s packaging ecosystem. Prior to his current work, Dustin contributed to Python packaging tools and has been involved in a variety of efforts to modernize and secure PyPI. His unique dual perspective, as both a volunteer maintainer and a professional in open-source security, allows him to offer valuable insights into the growth, challenges, and solutions facing the Python community.

What to Know If You're New to Python

Here are a few basics that will help you understand this episode’s focus on packaging and security:

PyPI (Python Package Index): The main repository of Python libraries and frameworks that you install via pip install ....
pip: Python’s primary package installer, which fetches code from PyPI.
Virtual environments: A standard way to isolate your installed packages for each project (venv, conda, etc.).
2FA (Two-Factor Authentication): An extra layer of account security often recommended for critical software services.

Key points and takeaways

2FA Requirement for Critical Packages
PyPI announced that it will require two-factor authentication for maintainers of “critical projects,” defined as the top 1% of projects by download count. At the time of recording, no enforcement date had been set. This mandate aims to protect users from account takeovers and malicious package uploads, mitigating large-scale supply-chain attacks. While the community reaction was mixed, the PyPI maintainers underscored that 2FA dramatically reduces entire classes of attacks, such as phishing or domain-based account hijacking. Free hardware security keys (two per maintainer, funded by Google’s open source security team) were offered to help maintainers transition.
- Links / Tools:
  - PyPI
  - PyPI Security Key Giveaway
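
As a rough illustration of the "top 1% by download count" rule described above, the following Python sketch ranks projects and flags the top slice. The project names, download numbers, and the `critical_projects` helper are all hypothetical; the episode does not show how PyPI actually computes the designation.

```python
def critical_projects(downloads: dict[str, int], top_percent: int = 1) -> list[str]:
    """Return the top `top_percent` percent of projects, ranked by download count."""
    if not downloads:
        return []
    # Integer ceiling division; always flag at least one project.
    cutoff = max(1, (len(downloads) * top_percent + 99) // 100)
    ranked = sorted(downloads, key=downloads.get, reverse=True)
    return ranked[:cutoff]


# Hypothetical data: 200 made-up projects with made-up download counts.
sample = {f"project-{i}": i * 1000 for i in range(1, 201)}
print(critical_projects(sample))
```

With 200 sample projects, a 1% cutoff flags the two most-downloaded ones. Download count is used here only because the episode describes it as the best available proxy for impact, not a perfect one.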
Scale and Sponsorship of PyPI
PyPI serves over 2 billion requests a day and transfers roughly a thousand terabytes of package files daily, an infrastructure burden that would cost millions of dollars at retail bandwidth prices if not for generous corporate sponsorships. Companies like Fastly donate their CDN services to keep package hosting free. The Python Software Foundation relies heavily on event revenue (notably PyCon) and sponsorships, underscoring that community support remains critical to the Python ecosystem’s sustainability.
- Links / Tools:
  - Fastly
  - psf.org/sponsor
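
To put those daily figures in perspective, here is a small back-of-the-envelope calculation in Python. The request and bandwidth numbers come from the episode; the price per gigabyte is a placeholder assumption chosen only to show the arithmetic, not a real CDN rate.

```python
SECONDS_PER_DAY = 24 * 60 * 60

requests_per_day = 2_000_000_000   # "over 2 billion requests a day" (from the episode)
file_terabytes_per_day = 1_000     # "almost a thousand terabytes a day" (from the episode)

requests_per_second = requests_per_day / SECONDS_PER_DAY
gigabits_per_second = file_terabytes_per_day * 1e12 * 8 / SECONDS_PER_DAY / 1e9

# ASSUMPTION: placeholder price per gigabyte, not an actual CDN price.
assumed_usd_per_gb = 0.01
yearly_cost_usd = file_terabytes_per_day * 1_000 * assumed_usd_per_gb * 365

print(f"{requests_per_second:,.0f} requests/second")
print(f"{gigabits_per_second:,.1f} Gbit/s sustained")
print(f"${yearly_cost_usd:,.0f} per year at the assumed rate")
```

That works out to roughly 23,000 requests per second and about 93 Gbit/s of sustained file traffic. Even at the assumed one cent per gigabyte, the yearly bill lands in the millions, which is consistent with the estimate given in the conversation.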
Volunteer-Driven Maintenance
Maintaining PyPI involves a handful of volunteers and a small PSF staff, meaning new security requirements (like 2FA) come with significant overhead. Every support ticket for account recovery or package takedown adds up. The push toward making 2FA mandatory for high-impact packages is as much about reducing volunteer fatigue as it is about improving security.
- Links / Tools:
  - Python Software Foundation
Atomic Writes Fiasco
A maintainer of the atomicwrites package attempted to bypass PyPI’s “critical project” label by deleting and re-uploading the project. This caused a break for anyone using older versions. PyPI ultimately restored those deleted versions, but the incident highlighted the tension between maintainers’ autonomy and the broader ecosystem’s needs. It also spotlighted how package deletion can break countless downstream dependencies if not done thoughtfully.
- Links / Tools:
  - atomicwrites project
Lessons from NPM and Left-Pad
The conversation touched on parallels with the “left-pad” incident in the JavaScript world, illustrating how one maintainer’s decision can send shockwaves across thousands of projects. PyPI’s stance has generally been to discourage package deletion, offering alternatives like yanking or archiving old releases. This approach aims to ensure continued stability for the broader Python community.
- Links / Tools:
  - NPM Left-Pad Story (General Info)
James Bennett’s 2FA Response
James Bennett wrote a widely shared article defending the PyPI team’s decision to require 2FA for popular packages. His commentary underscored that PyPI’s volunteer maintainers have every right to ask for minimal security from package owners. By cutting down on security emergencies, these volunteers can spend more time improving PyPI rather than constantly firefighting malicious uploads.
New Upload API and Draft Releases (PEP 691 and PEP 694)
PyPI historically used an older, monolithic POST-based upload flow. PEP 691 adds a JSON-based version of the Simple API, and PEP 694 proposes a new upload API that introduces the concept of draft releases. Draft releases will let maintainers test and verify artifacts within PyPI before final publication, preventing accidental or rushed uploads from impacting the public index.
- Links / Tools:
  - PEP 691 (JSON Simple API)
  - PEP 621 (Metadata in pyproject.toml)
Reducing Arbitrary Code Execution with PEP 621
The conversation emphasized the shift away from setup.py scripts, which allow arbitrary code execution at install time. With PEP 621, Python packages can define metadata purely in pyproject.toml, reducing security risks. Modern tools like Flit, Poetry, and updated setuptools adhere to these standards, offering safer package installations.
- Links / Tools:
  - Flit
  - Poetry
Supply Chain Security and OIDC
Another improvement is adopting short-lived credentials via OpenID Connect (OIDC). By letting CI systems (like GitHub Actions) authenticate securely without storing long-lived API tokens, maintainers can reduce accidental credential leaks. This approach strengthens the integrity of each package release by cryptographically verifying which workflow performed the upload.
- Links / Tools:
  - GitHub Actions OIDC Docs
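
To make "short-lived credentials" more concrete, the sketch below builds a fake, unsigned token and reads its claims the way one might when inspecting an OIDC identity token. It is not PyPI's or GitHub's implementation: the claim values, the 15-minute lifetime, and the helper names are assumptions, and a real consumer must verify the token's signature against the issuer's published keys before trusting any claim.

```python
import base64
import json
import time
from typing import Optional


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_claims(token: str) -> dict:
    """Read a JWT's claims WITHOUT verifying the signature (inspection only)."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def is_expired(claims: dict, now: Optional[float] = None) -> bool:
    """A short-lived token is useless once the current time reaches `exp`."""
    return (time.time() if now is None else now) >= claims["exp"]


# Hypothetical claims; the repository and workflow values are made up.
issued_at = 1_660_000_000
example_claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "repository": "example-org/example-package",
    "workflow": "release",
    "iat": issued_at,
    "exp": issued_at + 900,  # assumed 15-minute lifetime
}
header = {"alg": "none", "typ": "JWT"}
fake_token = ".".join(
    [
        b64url_encode(json.dumps(header).encode()),
        b64url_encode(json.dumps(example_claims).encode()),
        "signature-placeholder",
    ]
)

claims = decode_claims(fake_token)
print(claims["repository"], claims["workflow"])
print(is_expired(claims, now=issued_at + 60))
```

Because the token names the repository and workflow and expires within minutes, a leaked copy is far less useful to an attacker than a long-lived API token stored in CI settings.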
pip-audit and Sigstore
The episode highlighted a couple of new security-oriented tools. pip-audit audits local environments (or requirements files) for known vulnerabilities based on databases like OSV (Open Source Vulnerabilities). Meanwhile, Sigstore helps sign packages with ephemeral keys, making verification easier and bypassing the complexities of GPG’s “web of trust.” These are part of broader efforts to secure the Python supply chain.
- Links / Tools:
  - pip-audit
  - Sigstore
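
For a sense of what a vulnerability lookup involves, here is a hedged sketch that queries the public OSV API for one PyPI package version. It is not pip-audit's own code; the helper names are made up, and the package and version in the usage note are only examples.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name: str, version: str) -> dict:
    """Build the JSON body for an OSV query about one PyPI release."""
    return {"version": version, "package": {"name": name, "ecosystem": "PyPI"}}


def query_osv(name: str, version: str) -> list:
    """Send the query to OSV and return the list of matching vulnerabilities."""
    body = json.dumps(build_osv_query(name, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        # OSV returns an empty object when nothing matches.
        return json.load(response).get("vulns", [])


print(build_osv_query("jinja2", "2.4.1"))
```

Calling `query_osv("jinja2", "2.4.1")` performs the actual network request. In practice, running pip-audit against an environment or requirements file is the simpler route, since it handles dependency resolution and reporting for you.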

Interesting quotes and stories

On community-driven infrastructure: “PyPI is as susceptible as a bank for phishing or domain-based attacks, but we have zero full-time support staff—it’s all volunteers.”
On 2FA backlash: “We never imagined there would be so much pushback for asking maintainers to enable two-factor. We’re just trying to protect them—and everyone else!”
Atomic Writes takedown: “Yes, you can delete your package from PyPI, but expect some serious fallout across the community.”

Key definitions and terms

PyPI (Python Package Index): Central repository for Python packages installable via pip. Pronounced "Pie - P - I."
Two-Factor Authentication (2FA): An added security measure requiring a second proof (e.g., app code, hardware key) beyond just a password.
Dependency Confusion: A supply-chain attack where internal packages are overshadowed by public packages of the same name.
Domain Resurrection Attack: Taking control of an expired domain to reset account passwords and gain unauthorized access.
PEP (Python Enhancement Proposal): A design document that describes new features or processes for Python.
OIDC (OpenID Connect): An identity layer built on OAuth 2.0 enabling short-lived, secure tokens for CI deployments.
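
The "app code" form of 2FA mentioned above is usually TOTP, which PyPI supports alongside security keys. The sketch below implements the RFC 6238 algorithm with the standard library to show how such codes are derived from a shared secret and the current time. It uses the RFC's published test secret, and it takes raw bytes, whereas authenticator apps typically exchange the secret in base32 form.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    # The moving factor is the number of `step`-second intervals since the epoch.
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Test secret from RFC 6238 Appendix B; never use it for a real account.
RFC_SECRET = b"12345678901234567890"
print(totp(RFC_SECRET, at=59))  # → 287082
```

Both the server and the authenticator app run the same computation, so a phished password alone is not enough to log in. Hardware security keys go further because they are bound to the site's origin, which is why the episode describes them as somewhat more secure.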

Learning resources

If you want to deepen your Python skills—especially if you’re just getting started—check out these courses on Talk Python Training.

Python for Absolute Beginners: A comprehensive step-by-step introduction to Python, perfect for those new to coding.
Up and Running with Git: Ideal if you need a foundation in version control for your Python projects.
Getting started with pytest: Ensure your code is reliable, especially if you plan to publish or maintain packages.

Overall takeaway

PyPI has grown from a simple code repository to a critical backbone for the entire Python community. Because of this, security and reliability measures—like mandatory 2FA for critical packages, safer metadata standards, and advanced tools for signing and vulnerability detection—are not merely nice to have. They’re essential for maintaining trust in Python’s ecosystem. By cooperating with the Python Software Foundation, sponsors, and volunteers, the community can balance user freedom with the security and stability required at Python’s scale.

Links from the show

Dustin on Twitter: @di_codes

Hardware key giveaway: pypi.org
OpenSSF funds PyPI: openssf.org
James Bennett's take: b-list.org
Atomicwrites (left-pad on PyPI): reddit.com
2FA PyPI Dashboard: datadoghq.com
GitHub 2FA - all users that contribute code by end of 2023: github.blog
GPG - not the holy grail: caremad.io
Sigstore for Python: pypi.org
pip-audit: pypi.org
PEP 691: peps.python.org
PEP 694: peps.python.org

Watch this episode on YouTube: youtube.com
Episode #377 deep-dive: talkpython.fm/377
Episode transcripts: talkpython.fm


Episode Transcript


00:00 PyPI has been in the news for a bunch of reasons lately, many of them great, but also some with a

00:05 bit of drama or mixed reactions. On this episode, we have Dustin Ingram, one of the PyPI maintainers

00:11 and one of the directors of the PSF, here to discuss the whole 2FA story,

00:16 securing the supply chain, and plenty more related topics. This is another important episode that

00:21 people deeply committed to the Python space will want to hear. This is Talk Python To Me,

00:25 episode 377, recorded August 11th, 2022.

00:29 Welcome to Talk Python To Me, a weekly podcast on Python. This is your host, Michael Kennedy. Follow

00:48 me on Twitter where I'm @mkennedy and keep up with the show and listen to past episodes at

00:53 talkpython.fm and follow the show on Twitter via @talkpython. We've started streaming most of our

00:59 episodes live on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube to get

01:05 notified about upcoming shows and be part of that episode. This episode of Talk Python To Me is brought

01:11 to you by Compiler from Red Hat. Listen to an episode of their podcast as they demystify the tech industry

01:17 over at talkpython.fm/compiler. It's also brought to you by the IRL podcast, an original

01:23 podcast from Mozilla. This season, they're looking at AI in real life. Listen to an episode at talkpython.fm/irl.

01:30 Transcripts for this and all of our episodes are brought to you by AssemblyAI. Do you need a great

01:35 automatic speech-to-text API? Get human-level accuracy in just a few lines of code. Visit talkpython.fm

01:41 Dustin, welcome back to Talk Python To Me. Yeah, it's great to be back. Good to see you. Yeah, good to

01:49 see you as well. It's lovely to have you back. You know, we talked about the Python Packaging Authority.

01:56 We talked about PyPI and all these things previously, and we're back to talk about them some more with a

02:02 particular focus on security. Yeah, which is like kind of my new focus on my day-to-day, my like job

02:09 hat, my PSF hat, all that stuff. It's fantastic when the job that you're paid to do like lines up with

02:16 these other things, right? You can kind of learn on the job and then it really applies quickly. So, you know,

02:23 maybe let's just start there. You're at Google working on security there. Maybe tell us about what

02:29 you're up to and how it ties together. Last time we talked, I was working for Google Cloud as a dev

02:33 advocate. And so that was a lot of like, you know, I think people mostly know me from that, a lot of

02:37 conference talks and things like that. But since then, I've switched to like a brand new team at Google

02:42 that I'm really excited about and I think is just really exciting in general. We're an open source

02:46 security team. So, and we don't work just like on Google's open source, you know, libraries or

02:52 whatever, but we just generally broadly work on open source security across the entire open source

02:56 ecosystem and not just the Python ecosystem, but like every open source ecosystem. So we have our

03:01 hand in like a lot of pots and there's like, you know, I think you're probably aware there's like a,

03:05 this incredible wave of like focus on software security, but also open source software security.

03:10 And so we're kind of riding that wave a little bit, but yeah, it's like, it's a dream team.

03:15 Everyone I work with is like super talented and we're working on some really interesting new security

03:19 stuff and yeah, really love it.

03:21 I bet it's very exciting and it, you also have a chance to make a big impact, right?

03:24 I've been working like kind of tangentially on software security ever since I started working

03:27 on PyPI and I like cared about it for a long time, but it's like really, I think it's validating to

03:32 sort of see that like, oh, now everyone kind of gets it. Like everyone's like, oh, this is,

03:36 this is the thing we need to focus and make better. So it's cool to be there and be like ready to do it,

03:41 you know, and, and have the tools to like make it happen.

03:43 Yeah, absolutely. Have a lot of resources behind you through Google and the team and so on.

03:49 Absolutely. Like an incredible amount of resources. Yeah. So nice.

03:52 Most people probably don't fully appreciate it, right?

03:54 Yeah.

03:54 Yeah. So that is fantastic. The other thing that you're doing is working as the director of the PSF,

04:01 right?

04:01 Well, not the director. So the PSF has a board of directors. So I would, I call myself one of the

04:07 directors of the PSF.

04:08 So yeah. Tell us about your role at PSF.

04:09 Yeah. Yeah. So I joined the board, I think it was about two years ago, because we just had an election.

04:13 And so we sit for a three year term. So I've got another year left before I have to run again. But

04:17 you know, it's been really nice to sort of like work on the PSF from the inside and do some community

04:21 stuff. It's been a really weird time to join the board as well. Like there was like, right,

04:25 the start of the pandemic and like the PSF derives most of its income from events like PyCon,

04:31 like a lot of its income. And that was always sort of like identified as like kind of an existential

04:35 threat to the PSF, but it like very much became a reality very quickly. And so there was a lot of

04:40 work done before I joined and like sort of like after I joined as well to adapt to that. And I think

04:45 PSF did an amazing job. Like we actually did really well, partly in thanks to like all of our

04:49 sponsors and donors that like still continue to step it up, even though we weren't doing an in-person

04:54 PyCon. We did a bunch of virtual PyCons. They went pretty well. You know, not as quite,

04:58 quite as fun for me. You know, I like to see folks in person, but yeah, we, I think made it through the

05:03 other side amazingly. It's been great. We got a really great board now. We just brought on a couple

05:08 new folks as well. And I'm really excited to see what we're going to do for the next, you know,

05:13 couple of years. I don't know that people fully appreciate how important PyCon is to the existence and

05:18 financial wellbeing of the PSF. Maybe elaborate a bit on that.

05:23 Yeah. I think the statistic, maybe at its peak, PyCon's revenue was about 85% of the operating

05:31 budget of the PSF. So like almost all of the money that the PSF needs to like run and operate, which

05:36 means pay staff, pay for infrastructure, all that kind of stuff came from ticket sales for PyCon.

05:41 And sponsorship money and things like that.

05:43 Yeah. Yeah. It's a little gray because like there are sponsors and they, they like both sponsor PyCon

05:48 and sponsor the PSF. And it's like that money sort of just gets used by the PSF. But yeah, I mean,

05:53 a lot of that sponsorship was really tied to the in-person event. So one thing we've done recently

05:57 is like, if you're not a PSF sponsor, you should go to psf.org/sponsor. And there's, you know,

06:04 kind of like a new menu for sponsorship. And we sort of adapted it in a way that's like,

06:08 not exactly focused. Like you don't have to show up to PyCon to be a PSF sponsor and still you'll

06:12 get a lot of benefit from it, including supporting things like PyPI and other infrastructure projects.

06:17 There's a bit of a, it seems like a bit of a wave of large companies coming in and properly sponsoring

06:23 the PSF. And I don't know if this is in reaction to what happened with PyCon and COVID, or it just

06:29 happens to be the timing and the growth and especially the growth of Python in a more business corporate

06:36 sense.

06:36 I think it's a couple of things. One is that like PSF very much needs the support,

06:41 right? And I think that's made obvious to the organizations that use Python and our infrastructure

06:45 like PyPI. The other thing is, I think a lot of organizations are taking open source as like a

06:50 dependency a lot more seriously. So making sure that they're like in some way contributing or,

06:55 you know, providing support for the infrastructure tools, software that they use. The other thing that

07:00 I want to call out here is like the PSF staff is incredible. They've done an amazing job about

07:05 making it really an attractive thing to be a sponsor of the PSF and also like following through on our,

07:10 you know, commitments and to our organization's commitments to us, our commitments to them,

07:15 that kind of thing. And like finding new and interesting ways to like get funding as well.

07:19 Right. So like we started doing an interesting thing a couple of years ago where we started

07:23 applying for grants for like work on PyPI. And I think that's actually our first podcast was about

07:29 like some funded work that I got hired to do as a contractor. And then we kind of like repeated that.

07:34 And we like brought in a ton, a ton, a ton of money to fund like really big stuff, big stuff that like

07:38 a volunteer would never get, you know, get done in a year of weekends. Right. Like it's just never

07:43 going to happen that a volunteer is going to sit down and have the time to do this. So it's been

07:46 really successful in terms of like shipping stuff that users need. That's like big, like large scale

07:52 stuff. It seems a little bit like in the past, but when you go to PyPI.org, this is still kind of shiny

07:58 new. Right. It got rewritten a couple of years ago and polished up and made a lot more modern. Right.

08:04 Yeah. Yeah. I think 2018 we launched this and so it hasn't changed really, you know, visually much since

08:10 then. A lot of new features and development and like the whole point behind the rewrite was to make it a lot easier

08:14 to build on top of. Yeah.

08:16 PyPI was like ancient, essentially. It predated everything on PyPI. So like it was, it was kind of wacky, but yeah,

08:22 it's super modern. Yeah. That's fantastic. It's like more sustainable now as well. Right. Like we have, we have

08:27 better like commitments from, you know, our in-kind donors for infrastructure. Shout out to like Fastly that pays our

08:32 entire infrastructure bill and is like an amazing sponsor of PyPI. But also just like we have, we had just hired an

08:38 infrastructure engineer to work on PyPI, which is super exciting and PyPI and other PSF infrastructure

08:44 as well. And yeah, it's just like, it was a little more sustainable than it used to be. Like we have

08:49 better core volunteers, moderators, all that.

08:51 We talked back in 2018, I think maybe you and Donald Stufft. I don't remember if you all were on together.

08:57 Those were two separate shows, but you were both involved. And one of the challenges was PyPI,

09:02 the web app was so bespoke, sort of its own tangled mess, that people would want to contribute.

09:11 And they'd be like, you know what, now that I see this, maybe not so much. It sounds like it's in a

09:15 better place. We have some PEPs that we're going to talk about, about extending some of its functionality

09:19 and those sorts of things, which is probably a spinoff of just making it easier to work with.

09:25 We did a full stack rewrite for those reasons, because it's easier to maintain for us. It's easier to

09:29 contribute to, for other users, easier to propose new changes. And I think maybe the undertone for

09:35 this entire interview is there needs to be progress, right? We can't just get to a point where it's

09:40 just, that's it, that's good. There's just going to be, it's a constantly shifting landscape. So if we,

09:46 the PSF, we, PyPI, want to continue to be successful and popular, and Python is doing amazing right now,

09:51 we have to adapt to that to some extent.

09:53 Yeah. It's not the same world it was built for when it first came out. And also this 393,000 packages,

10:00 it's probably not something that was expected when this whole idea was put together.

10:04 It's scaled like impressively well, I think over the years, like almost 4 million individual releases,

10:10 like 6 million individual artifacts. Like that's a lot. That's a lot of stuff.

10:15 Let's talk just a bit about the whole infrastructure side, not the tech or anything. We've covered that

10:21 before and it was really interesting, but just how much data and expense there is to run this thing.

10:26 I wrote a blog post sometime last year, and it was going to be essentially a five-year update from a

10:32 previous post that Donald, who's one of the other PyPI maintainers, had written about just like what it

10:37 takes to power PyPI. And it has some statistics in it that are at this point out of date. But like,

10:42 yeah, we serve like almost, I think at this point, over 2 billion requests a day. We transfer like more than

10:49 60 terabytes from PyPI.org. And that doesn't include files. So when we serve the actual files,

10:55 actual distributions, that's like almost a thousand terabytes a day, like per day. That's a lot.

11:00 A thousand terabytes?

11:01 Yeah. If we had to pay like retail costs for our bandwidth from our CDN. So like almost like 99% of PyPI

11:08 is served from CDN. It would be in the millions of dollars. Like it's a substantial infrastructure cost

11:14 just to like serve the files, serve the requests. So it's not going down. It's not plateauing either.

11:21 It's definitely going up, which is good in the sense that like, yeah, we want it to be popular,

11:26 but like there's sustainability questions that come with that as well as we grow, just sort of like

11:30 unfettered, you know? Yeah. Figuring that out.

11:32 Honestly, that kind of blows my mind. I'm just wondering, what would you possibly do if you didn't

11:37 have companies like Fastly really supporting?

11:40 Honestly, it would be very hard to keep PyPI running if we didn't have the support of all our sponsors.

11:45 And like, I think it's really important to make this distinction between PyPI and other indices,

11:50 like, like NPM, for example, which is owned by a massive corporation and like run, has a whole

11:55 support staff, has a whole like engineering staff. PyPI is like a couple folks and like a bunch of

12:00 donated stuff, you know? And it's like on the same scale, you know, it's like as useful as something.

12:05 When I think about how PyPI and npm and RubyGems, and you know, this is not to focus on,

12:12 like to call out Python, but just all of these. It reminds me of the early internet back when we're,

12:19 not maybe when we didn't have passwords, but when it was kind of like, oh, we'll just,

12:22 we don't really need encryption here. And it was just, it was from a time when things were simpler

12:27 and it feels like it's getting a little more complicated security wise and so on.

12:32 Oh yeah, definitely. I mean, there was a point when me as a PyPI maintainer administrator,

12:37 like we never had to respond to takedowns for like malicious stuff. Like it just never happened.

12:42 And now it's like my inbox is on fire because I get multiple reports a day. Yeah. Yeah. I mean,

12:47 it's, it's like, I think part of it is like people are trying to hit security bounties and do like

12:51 research with PyPI, which is not the intended use case for PyPI, but it's a lot. Like it's definitely,

12:57 there's an uptick.

12:57 Yeah. There's been a lot of talk on the internet about things that might fix it,

13:01 like signing packages and whatnot, but we'll, we'll talk about whether that actually has anything to

13:08 offer there. Yeah. Yeah. One thing I did want to give a quick shout out to is the

13:13 OpenSSF. They just gave some big donation to make PyPI a little bit better. Right. So they,

13:23 committed $400,000 to, in order to create a new role. Tell us a bit about this. What is this?

13:28 Yeah. They're, they're very excited to announce that they're planning to support us with a new role.

13:33 So it hasn't been, it hasn't been finalized. Okay. Contract hasn't been signed yet. No, no,

13:37 like OpenSSF, a fairly new organization, bunch of member organizations, including Google, Microsoft,

13:43 whatever to essentially support software security. Right. And so they're just kind of getting started

13:49 pretty recently. And I think their marketing team kind of outpaced the like legal team here. So we

13:55 haven't signed the contract yet, but it's like, you know, I feel confident saying that like, this is

13:59 probably almost definitely going to happen. So yeah, they committed, I think $400,000 to doing a

14:05 developer in residence that's security focused. And so this is sort of like piggybacking on something

14:09 that I helped start two years ago at this point, which is create the CPython developer in residence.

14:15 So that was started with funding from Google, and Łukasz Langa became the CPython developer

14:20 in residence. And like, I love to see this because like, I'm very happy to say Google is not the

14:26 sponsor of CPython developer in residence this year. It's Facebook. And then like, that's great because

14:30 like, I think this is something that can be shared by all the PSF sponsors that's, you know, funding it

14:35 each year, that kind of thing. So in a similar way, we're going to ideally hire someone that will focus

14:39 on just security for Python. That might be security for CPython. It might be security for PyPI. They

14:45 also want, I think, to fund a security audit of some like critical tooling for, for Python ecosystem.

14:49 That might be PyPI. So, but yeah, this is super cool. And, they've also like announced funding

14:55 for some other organizations like the Eclipse Foundation.

14:58 This is fantastic news and it's too bad that it's not signed yet, but I'm, it sounds like it's

15:02 definitely going to happen when it becomes official. I'll give it a, another shout out just to say

15:06 things because this is going to probably make a big difference. That's a big chunk of money to

15:10 contribute to it.

15:11 Similar to the CPython developer in residence role. Like we're going to do interviews and audit,

15:15 and hire someone for that role. So there'll be a job posting if this happens. And then,

15:20 I'll be definitely tweeting, sharing that, trying to get people to get interested and apply because,

15:25 this is a super cool role.

15:26 It sure is.

15:27 This portion of Talk Python To Me is brought to you by the Compiler podcast from Red Hat.

15:34 Just like you, I'm a big fan of podcasts and I'm happy to share a new one from a highly respected

15:40 and open source company: Compiler, an original podcast from Red Hat. With more and more of us

15:47 working from home, it's important to keep our human connection with technology. With Compiler,

15:51 you'll do just that. The Compiler podcast unravels industry topics, trends, and things you've

15:56 always wanted to know about tech through interviews with people

15:59 who know it best. These conversations include answering big questions like what is technical debt?

16:04 What are hiring managers actually looking for? And do you have to know how to code to get started in open source?

16:10 I was a guest on Red Hat's previous podcast, Command Line Heroes, and Compiler follows along in that excellent and polished style

16:18 we came to expect from that show. I just listened to episode 12 of Compiler, "How Should We Handle Failure?"

16:24 I really valued their conversation about making space for developers to fail so that they can learn and grow without fear of making mistakes or taking down the production website.

16:34 It's a conversation we can all relate to, I'm sure. Listen to an episode of Compiler by visiting talkpython.fm/compiler.

16:42 The link is in your podcast player show notes. You can listen to Compiler on Apple Podcasts, Overcast, Spotify, Pocket Casts, or anywhere you listen to your podcasts.

16:50 And yes, of course, you could subscribe by just searching for it in your podcast player, but do so by following talkpython.fm/compiler so that they know that you came from Talk Python To Me.

17:00 My thanks to the Compiler podcast for keeping this podcast going strong.

17:06 Let's talk about 2FA. That's been a bit of a flashpoint in it. I don't feel like it should have, but it has been.

17:16 What's the story of 2FA and critical packages and PyPI?

17:20 Yeah. So yeah, flashpoint, like kind of almost unexpected for me, you know, like I think I'm just so close to security space and PyPI and all that stuff that like, I think the reaction was a little stronger than I think everyone expected.

17:34 It feels to me like the reaction was if you had set up a rule that said, hey, you can't have the letter A as your password.

17:41 And everyone who has the letter A, you have to change it. It's like, it's almost like that level of requirement change to me, it feels like. And yet it just, it just blew up. Right.

17:51 Yeah. Let me give some background and then like we can talk about like realistically what it means.

17:55 So yeah, we, we made an announcement and basically that we were going to designate some projects on PyPI as critical. And essentially we determined this based on download count because that's kind of like, it's not a great metric, but it's kind of the best metric we have for determining like if this project was compromised.

18:13 And I'll talk about like how that might happen, you know, how many people would be affected. And it's like, if we measure the amount of times that this is getting downloaded a day, that's a pretty good proxy for like impact in terms of something being compromised.

18:24 Right.

18:24 So yeah, we, we made this designation and we sort of announced that like at some point in the future, did not announce a date, did not like enforce a requirement at this point.

18:31 We're going to ask those maintainers to require that 2FA is enabled for their account.

18:36 And so we did that and then we sort of paired this with an incentive.

18:39 We, my team at Google actually funded the purchase of a bunch of Titan security keys.

18:44 These are like hardware keys for two factor authentication that Google manufactures, but we just, you know, essentially we give away discount codes to these maintainers of projects that have been designated as critical and they can get not one, but two for free.

18:57 So if they're one of these maintainers.

18:59 So yeah, we did that and the designation was 1%.

19:03 We decided the top 1% of projects would be, at this point, designated as critical.

19:07 Right. I feel like there was a bit of a confusion when people saw this announcement.

19:12 They saw, wait a minute, you're making me adopt hardware based 2FA because I have a PyPI package.

19:19 The requirement is not that you have to use the hardware keys if you have a critical package, is it?

19:23 And like, I would love if everyone used hardware keys because I think they're generally considered to be a little bit more secure, but no, the idea is that everyone should turn on 2FA and that's, PyPI supports, you know, TOTP, which is like what you're used to.

19:37 Yeah. Like the free applications on the phone or other device and security keys.

19:43 So like, and security keys is pretty broad now that, that doesn't just include the like USB devices, but also like you can do WebAuthn via like phones and like other physical hardware.

19:53 Like it's pretty, you know, the integration with browsers is pretty good now.

19:56 So there's a lot of support.
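For listeners who have not seen how the TOTP codes in a phone app are produced, here is a minimal sketch of the algorithm as described in RFC 4226 (HOTP) and RFC 6238 (TOTP), using only the standard library. The function names `hotp` and `totp` are illustrative, and this is not PyPI's implementation; it only shows that the six digits are an HMAC of a shared secret and the current 30-second time window.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over a counter, truncated to N decimal digits."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low nibble of the last byte picks the window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None, step: int = 30,
         digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP where the counter is the current time window."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


if __name__ == "__main__":
    # Shared secret taken from the RFC test vectors, not a real credential.
    print(totp(b"12345678901234567890"))
```

Because the server and the authenticator app both hold the same secret, each side can compute the code independently and compare.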

19:58 Yeah. And just like the audience out there, Michael is asking, do they need to be hardware keys or just regular auth?

20:03 It's just regular auth, right?

20:04 And, and that's why I said, I don't feel like it's that big of a deal.

20:07 It's like, well, you have to have a secure password or you have to have 2FA or whatever.

20:10 Like kind of immediate reaction from some folks with like really big megaphones essentially was that this is a slippery slope.

20:16 Like PyPI is asking something of its users.

20:18 We don't do that very often.

20:20 Like we sort of like let users do whatever they want.

20:22 And we have some sort of like baseline requirements for like how to use PyPI, but we don't like often ask people to do extra stuff.

20:29 There's a good reason why we're interested in asking people to do 2FA.

20:33 And it's not because like Google has secretly conspired to like do it so that it can own open source security.

20:38 You know, like it, there's, there's like a very good reason.

20:40 There's a whole undercurrent, a whole thread of, well, it's these big corporate companies that are adopting Python that are making us do different security to support them.

20:50 And that wasn't it at all, was it?

20:52 Here's the main reason, right?

20:53 And like that is valid, right?

20:55 Like there are big corporations that consume stuff from PyPI.

20:57 They would love to have more assurances about like that their projects haven't been compromised.

21:02 I don't think 2FA is like exactly the right way to do that.

21:05 At the end of the day, 2FA, like it protects against like two kind of critical attacks that could happen on a Python package.

21:12 One is just like phishing, right?

21:14 Like 2FA is essentially it completely eliminates the potential to get phished.

21:17 I've never seen someone get phished on PyPI.

21:19 I've never heard about a phishing attack.

21:20 But like PyPI is as susceptible as like a bank or anything else for phishing.

21:25 Like it could happen to anybody.

21:26 So that's one thing.

21:27 The other thing is maybe like more specific to PyPI itself, which is what we call like domain resurrection attacks.

21:33 So developers like really love their vanity domains, their personal domains, their personal email addresses.

21:38 And so unlike maybe your bank, the users on PyPI are more likely to have these like one-off domains.

21:44 And those domains like expire.

21:46 People forget about them, they lose access to them.

21:48 They get, you know, registered by someone else.

21:50 And when that email address has the ability to reset a password on a PyPI account, an attacker can like keep an eye on your domain, watch when it expires, go and register it, do a password reset, and then take over your account and publish whatever they want.

22:03 And so 2FA, in a similar way to phishing, protects against that attack as well.

22:07 I had never really thought about that.

22:08 That's almost like the SIM card equivalent.

22:10 Yeah, a little bit.

22:11 But for email.

22:13 So, you know, the SIM card problem is I could call up, you know, my, I could call up, I could call up someone else's phone provider and say, I lost my SIM card.

22:22 Please issue me a new one.

22:24 And then you start getting their SMS for like SMS authentication and stuff.

22:27 This is, you've taken over their domain, not maliciously.

22:31 They just decided to, you know, a credit card expired or something.

22:35 And then you snatch it up, set up some MX records and off you go.

22:38 Okay.
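The domain resurrection scenario described above can be sketched as a simple check: an account is exposed when the domain of its recovery email has lapsed and no second factor stands in the way of a password reset. Everything here is hypothetical (the `accounts` structure, the `has_2fa` field, and the expiry dates are made up for illustration), and a real check would need registrar or WHOIS data.

```python
from datetime import date


def accounts_at_risk(accounts, domain_expiry, today):
    """Return usernames whose email domain has expired and who lack 2FA.

    accounts: {username: {"email": str, "has_2fa": bool}}  (hypothetical shape)
    domain_expiry: {domain: date the registration ends}    (hypothetical data)
    """
    at_risk = []
    for username, info in accounts.items():
        domain = info["email"].rsplit("@", 1)[1].lower()
        expires = domain_expiry.get(domain)
        lapsed = expires is not None and expires < today
        # A lapsed domain can be re-registered by anyone, who then receives
        # the password-reset email. 2FA blocks the takeover at that point.
        if lapsed and not info["has_2fa"]:
            at_risk.append(username)
    return sorted(at_risk)


if __name__ == "__main__":
    accounts = {
        "alice": {"email": "alice@vanity.example", "has_2fa": False},
        "bob": {"email": "bob@vanity.example", "has_2fa": True},
        "carol": {"email": "carol@stable.example", "has_2fa": False},
    }
    expiry = {
        "vanity.example": date(2022, 1, 1),
        "stable.example": date(2030, 1, 1),
    }
    print(accounts_at_risk(accounts, expiry, today=date(2022, 6, 1)))
```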

22:39 Our ultimate goal, like, as PyPI's administrators, I'd love to protect all users from attacks that could be prevented by 2FA, but it's a little bit more like, like it's actually for our own benefit, right?

22:50 Those kind of attacks.

22:51 So one has happened recently, the ctx package had a domain takeover and a malicious release published.

22:56 And we wrote a very long incident report about it.

22:58 It took a lot of our time.

22:59 And we essentially like, it's not sustainable for these to happen.

23:02 We can't, we don't have a support team.

23:04 We can't do, you know, manually remove these packages and monitor things for, like, we just can't, we just can't handle it.

23:10 So 2FA is like the folks that maintain PyPI asking users like, hey, help us out a little bit.

23:15 Just do this thing for us to like kind of cut down on the potential for this and make it easier for us to do things that like we actually want to do to PyPI and not just like respond to security incidents.

23:25 Right. Because there's only a couple of you.

23:26 And if you're spending all your time putting out these fires, you're not adding JSON endpoints and other beneficial things.

23:33 Yeah. All sorts of stuff.

23:34 Like, yeah, the more time we spend putting out fires, the less we can do like useful and interesting things to PyPI.

23:39 Yeah. So why 1% of the top packages? Why is that critical? And also what's the designation over time?

23:46 The designation is if at any point it was in the top 1%. And I think we recompute this every day. So, you know, projects have since we announced this, they've moved into the 1% because it's constantly shifting.

23:57 But yeah, why 1%? So that's a question that like was coming up a lot in the discussion after we made this announcement. And the like secret to the 1% is that in reality, if you were to go and figure out like, okay, how much traffic, how many downloads does this 1% of packages actually represent for PyPI?

24:15 It's like over 95%. It's close to 99%. It's like most of what people are using from PyPI is in this 1%. So by saying 1%, we also essentially said like for the long tail of PyPI that people aren't using, we care a little bit less about that.

24:32 We're going to cover like the majority of these, like I said, the potential for impact if something was compromised, we sort of maximize that. And we also kind of had to minimize that 1% too, because I think the thing, another thing that folks didn't really realize about

24:45 what it takes to support 2FA is that there's an incredible maintenance burden for 2FA. Like we have to handle account recovery requests because people like they lose their phones, they lose their hardware keys, people are humans, right? And so this happens all the time. And it's expensive for us to handle this, right? And we can't just say, all right, great, you like you lost 2FA, I turned it off for your account, go wild, because that's essentially like a perfect way to circumvent 2FA. Instead, we have to do this like very manual process where we like verify other identities,

25:14 emails, like if you have a GitHub associated, we ask you to do something on GitHub, just like prove that you own that account. And even then it's like, it's really not perfect. Like there is potential for someone to be compromised who did have 2FA enabled by someone who, you know, could take over this account or that account and like pretend like they need an account recovery. But yeah, this is a huge maintenance burden. So like, we actually like can barely handle account recovery requests right now. And I'm a little wary of how many we're going to get now that folks have started really turning on 2FA, but we think it's worthwhile.

25:43 And maybe that's probably why 1% and not 100%, right?

25:46 Oh, yeah. Like there's just zero chance we could handle 100% of like everyone on PyPI with 2FA enabled. Like we just couldn't handle it. I would love that. Like that would be great. But yeah, unfortunately, like the amount of people losing their stuff and having to come to us for resets, it's just, it's the burden is really high.
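The reasoning behind "1% of projects, most of the downloads" can be sketched with a small calculation. The download numbers below are invented to show the shape of a long-tailed distribution; they are not PyPI statistics, and `top_percent` is a hypothetical helper, not the code PyPI runs.

```python
import math


def top_percent(downloads, percent=1.0):
    """Return (names in the top `percent` by downloads, their share of all downloads)."""
    ranked = sorted(downloads, key=downloads.get, reverse=True)
    cutoff = max(1, math.ceil(len(ranked) * percent / 100))
    top = ranked[:cutoff]
    total = sum(downloads.values())
    share = sum(downloads[name] for name in top) / total if total else 0.0
    return top, share


if __name__ == "__main__":
    # Made-up data: two heavily used projects and a long tail of 198 small ones.
    downloads = {f"tail-{i}": 100 for i in range(198)}
    downloads.update({"big-a": 1_000_000, "big-b": 900_000})
    top, share = top_percent(downloads)
    print(top, f"{share:.1%}")
```

With a distribution like this, requiring 2FA for a small number of maintainers covers most of the potential impact while keeping the account-recovery workload bounded.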

26:02 Sure. For me, I used Authy for my 2FA, which syncs across devices. So at least if I lose one, I get it back.

26:09 Yeah. And like Google Authenticator works really well for TOTP as well. And I think you can like download the codes or store them externally as well. So if you lose your phone, you can regain access to those TOTP codes as well. And there's a bunch, like, there's also like emulated TOTP stuff where you can like run it on your laptop. And it's maybe not technically two-factor, but like a lot of people use that because it's more convenient.

26:30 It's way better than nothing, right?

26:31 All better than nothing. Exactly.

26:32 Let's talk about James Bennett and opinions. You called out this article and I also read this. I think this is really good. What are some of your takeaways here?

26:40 Yeah. James absolutely nailed the response here. And actually, like, you know, when we got a lot of feedback, I'm not going to say that it was bad feedback. You know, it was maybe somewhat uninformed feedback or it was somewhat sensational feedback, but we got a lot of feedback after this. And some of it was totally valid. Like, you know, at the end of the day, we are asking users to take a little

27:00 more effort and some people, you know, they don't want to do that. And, you know, I like none of the PyPI administrators actually like explicitly responded to a lot of this. I think we were all like a little bit depressed about how upset some people were about the 2FA requirement that didn't even exist yet.

27:14 But yeah, James, like really shout out to James, because I read this and I was like, I could really could not have written it better than he did. He really called out.

27:23 There's a lot there. And I think it's very well thought out. Yeah.

27:27 Yeah. Shout out to James. You know, there was kind of like two arguments that he was making, which is that, like, you know, a lot of people were concerned this would be a slippery slope. And I think I don't really foresee PyPI making too many more mandates about stuff like this. Not because of the feedback, but because, you know, like we're never going to.

27:42 I don't think we're ever going to mandate signing, for example, like that's always going to be the option of the maintainer. But, you know, things like 2FA for certain high profile stuff like, yeah, it really helps out helps PyPI continue to exist. Right. Like that's actually the motivation here.

27:56 I definitely want to echo the message that you said about the overhead. Yeah. Like you have to deal the people who would otherwise be constructively working on this have to deal with these problems.

28:05 Yeah. Every day. I mean, it's like I don't get paid to do it. I do it out of love, but it becomes larger and larger every day. And yeah.

28:12 We're keeping our head above water right now. But yeah, there's plans also to make that better. But yeah.

28:16 How much do you think the reaction, I'll put it that way, I think the overreaction, how much of that was because it was perceived as it's got to be a hardware key versus it's just straight 2FA?

28:27 Do you think people really rejected it being 2FA or did it seem like a bigger burden than just adding it to your Google Authenticator?

28:34 If I were to say that whether we made some sort of failure here when we announced it, I would say like we didn't message this super well. Right.

28:40 And that's because I'm a software engineer. I'm not a marketer or, you know, I'm an OK communicator.

28:45 And the same is true for the rest of us. We don't have copywriters, anything like that.

28:49 We don't have a PR team. So, you know, there was some stuff that people kind of missed.

28:53 And I think one of the things that was missed was like the mandate doesn't exist right now.

28:56 We're just talking about enforcing it in the future.

28:58 The other was like, what is actually being required of you today?

29:02 Which for most folks, it was it was nothing.

29:04 It was like, if you want to get a pair of free security keys, you have to do this today.

29:09 And by the way, those are still available.

29:11 I'm sure you all saw this as a positive, like, hey, we got this cool thing for people that they can get if they want or they just do 2FA.

29:16 But like people are like, what is this?

29:18 Yeah. And you're saying there's still some available for folks who want to get it. Right.

29:21 Yeah. So through October 1st. So, yeah, this might be my call at the end as well.

29:26 But yeah, if you go to pypi.org/security-key-giveaway, you can check if you're a critical maintainer and you can get a key, get a pair of keys, actually.

29:35 Yeah. So the pair of keys thing also, people weren't really sure why we were doing that.

29:39 But the main reason is to help you not lose both of them, like lose all access.

29:43 So if you have two keys and you've used both of them, you have some redundancy.

29:47 You can stick one in the garage or stick it somewhere else.

29:51 You know, hand it to a friend.

29:52 You bury it in your backyard.

29:53 Yeah, exactly. Exactly.

29:58 This episode of Talk Python To Me is brought to you by the IRL podcast, an original podcast from Mozilla.

30:04 If you're like me, you care about the ideas behind technology, not just the tech itself.

30:10 We know that tech has an enormous influence on society.

30:14 Many of these effects are hugely beneficial.

30:16 Just think about how much information we carry with us every day through our cell phones.

30:22 Other tech influences can be more negative.

30:24 I really appreciate that Mozilla is always on the lookout for and working to mitigate negative influences of tech for all of us.

30:33 If those kinds of ideas resonate with you, you should definitely check out the IRL podcast.

30:37 It's hosted by Bridget Todd, and this season of IRL looks at AI in real life.

30:43 Who can AI help?

30:44 Who can it harm?

30:45 The show features fascinating conversations with people who are working to build a more trustworthy AI.

30:51 For example, there's an episode on how the world is mapped with AI.

30:56 But it's the data that's missing from those maps that tells as much of the story as the data that's there.

31:01 Another episode is about gig workers who depend on apps for their livelihood.

31:05 It looks at how they're pushing back against algorithms that control how much they get paid,

31:10 and how they're seeking new ways to gain power over data and create better working conditions for all of them.

31:16 And for you political junkies, there's even an episode about the role that AI plays when it comes to the spread of disinformation around elections.

31:24 Obviously, a huge concern for democracies around the world.

31:27 I just listened to The Tech That We Won't Build, which explores when developers and data scientists should consider saying no to projects

31:35 that can be harmful to society, even though we do have the tech to build them.

31:39 Does this sound like an interesting show?

31:41 Please use the link talkpython.fm/IRL to subscribe.

31:46 Yes, you could search for it in your podcast player, but use the link talkpython.fm/IRL

31:52 to let them know that you came from us.

31:54 The link is in your podcast player show notes.

31:56 Thank you to IRL and Mozilla for supporting Talk Python To Me.

32:04 I guess part of the reason this is so much in the public awareness is because of this project called atomicwrites.

32:12 Yes.

32:13 Want to give us all the rundown of why we're talking about this package?

32:19 Let me just give people a really quick background.

32:21 atomicwrites is a package that lets you, within a with block, like you would do open file,

32:27 but instead you say atomic_write and it will write to a temporary file and only commit those

32:33 changes to the real file, like at the very end, all in one shot.

32:36 Pretty useful.

32:37 Not super hard to do your own version of with a couple of built-in things in Python, like tempfile and whatnot, but still.

32:43 Kind of no longer necessary for modern Python is my understanding.

32:46 Like this is a couple of lines of like modern Python.

32:49 You don't have to worry about it.

32:50 But it used to be, you know, something that you would use.
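The "couple of lines of modern Python" idea can be sketched with `tempfile` and `os.replace`. This is a minimal stand-in written for illustration, not the atomicwrites library's code: the name `atomic_write` and its parameters are assumptions, and it skips details a production version would handle (preserving file permissions, syncing the parent directory).

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_write(path, mode="w", **kwargs):
    """Write to a temp file next to `path`, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, mode, **kwargs) as f:
            yield f
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # readers see the old file or the new one
    except BaseException:
        # The body failed: discard the partial temp file, leave `path` untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "settings.txt")
        with atomic_write(target) as f:
            f.write("debug = false\n")
        with open(target) as f:
            print(f.read())
```

If the body of the `with` block raises, the target file keeps its previous contents, which is the property the package was used for.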

32:52 Right.

32:52 Exactly.

32:52 How does this relate to 2FA?

32:54 That has nothing to do with 2FA, does it?

32:56 There's this thing that happens all the time, right?

32:58 Like so PyPI has this policy that everything on PyPI is essentially immutable.

33:02 And that means that like individual files, file names, which can include a project name, a version and like a distribution type.

33:09 Those are immutable.

33:10 So if you upload something to PyPI that is like source distribution for some version or whatever, you publish that.

33:15 It's there.

33:16 You can't overwrite it.

33:17 Right.

33:17 So you can't surreptitiously like change what that points to.

33:20 So like anyone installing it is always going to get the same thing, same SHA, everything.

33:24 But that also means like if you want to delete something, you delete it and it's gone forever.

33:28 You can't come back and overwrite it with something else.
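The immutability rule described above (a filename can be uploaded once, and deleting it does not free the name) can be modeled in a few lines. This is a toy in-memory model written for illustration; `FileRegistry` and its methods are hypothetical and say nothing about how PyPI's codebase implements the rule.

```python
import hashlib


class FileRegistry:
    """Toy model: a filename, once used, can never be uploaded again."""

    def __init__(self):
        self._live = {}          # filename -> sha256 hex digest of current files
        self._ever_used = set()  # every filename ever uploaded, even if deleted

    def upload(self, filename, content):
        if filename in self._ever_used:
            raise ValueError(f"{filename} has been used before and cannot be reused")
        digest = hashlib.sha256(content).hexdigest()
        self._live[filename] = digest
        self._ever_used.add(filename)
        return digest

    def delete(self, filename):
        # The file stops being served, but the name stays reserved forever.
        del self._live[filename]

    def digest(self, filename):
        return self._live[filename]


if __name__ == "__main__":
    registry = FileRegistry()
    registry.upload("example-1.0.tar.gz", b"original contents")
    registry.delete("example-1.0.tar.gz")
    try:
        registry.upload("example-1.0.tar.gz", b"different contents")
    except ValueError as exc:
        print(exc)
```

Because a name always maps to at most one set of bytes, anyone who installs a given file gets the same hash every time.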

33:30 And so, you know, when and I don't encourage people to delete stuff from PyPI generally because, you know, you're almost definitely going to break somebody.

33:36 There's better methods for, you know, kind of marking something as not useful and telling pip to not install it.

33:41 That's yanking, which is a whole topic unto itself.

33:43 But yeah, so this thing happens all the time, though.

33:45 Like we have a huge warning banner, big red button, like everything telling you if you're going to delete this thing.

33:50 You're not going to be able to get it back.

33:51 And so what happened here is like this maintainer didn't want to comply with 2FA.

33:56 Their project was marked as critical because a lot of people were using it like a lot of people were using it still.

34:00 And they thought that it would be a cool.

34:03 They thought they'd discovered a cool hack where if they deleted it and then recreated it later that the mandate would no longer apply.

34:09 And like that was kind of true because, like I said, our computation for critical projects runs once a day.

34:14 So when they brought it back, like it didn't have that flag within 24 hours, that flag was added back to the project, essentially.

34:20 But for a brief period of time, yeah, it was not marked as critical.

34:23 But what happened was, you know, all these versions went away.

34:26 And like a lot of people, I think, were depending on them, you know, like actual users of this project.

34:31 And so, yeah, like there's a long discussion happening now about whether like it should even be possible to delete stuff from PyPI.

34:37 And there's good arguments on both sides of the coin, right?

34:40 Yeah, well, that was one of my first thoughts is like, wait, you can delete the releases?

34:44 I knew they were immutable.

34:45 You can't update them, but deleting.

34:47 So what's the tradeoff there?

34:50 Why can you delete them now?

34:52 And maybe why wouldn't you in the future?

34:53 This is like npm's left-pad incident, essentially.

34:55 Like we right now there's potential for a high profile enough.

34:59 And this package wasn't super high profile, but like it was in the critical list.

35:02 It was in the top 1%.

35:04 There's potential for some maintainer to decide, you know, you know, and it's their prerogative right now, right?

35:09 Like there's no guarantees that these things continue to exist on PyPI.

35:12 No one's necessarily paying for this.

35:14 So like, yeah, maintainer absolutely has the ability now to just wipe something super popular and necessary off the face of PyPI.

35:21 And that's the current status quo.

35:23 It's not the same in a lot of other ecosystems.

35:25 Some of them don't have that policy.

35:27 Some of them do.

35:28 But yeah, so there's a bit of debate about whether that should be necessary, especially when we have stuff like yanking, which actually is a more meaningful way to remove something.
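To show why yanking is gentler than deleting, here is a rough sketch of the installer behavior that PEP 592 describes: a yanked release is skipped during normal resolution, but it is still installable when a user pins that exact version. The `pick_release` function and the tuple-based versions are simplifications made up for this example; pip's real resolver is far more involved.

```python
def pick_release(releases, pinned=None):
    """Choose a version to install.

    releases: list of (version_tuple, yanked) pairs, e.g. ((1, 4, 0), False)
    pinned:   exact version requested with ==, or None for "latest"
    """
    if pinned is not None:
        # An exact pin still resolves, even if that release was yanked,
        # so existing lock files keep working.
        for version, _yanked in releases:
            if version == pinned:
                return version
        return None
    # Unpinned installs never select a yanked release.
    candidates = [version for version, yanked in releases if not yanked]
    return max(candidates, default=None)


if __name__ == "__main__":
    releases = [((1, 0, 0), False), ((1, 1, 0), True)]
    print(pick_release(releases))                    # newest non-yanked release
    print(pick_release(releases, pinned=(1, 1, 0)))  # exact pin still works
```

A deleted release breaks both cases; a yanked one only steers new installs away.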

35:36 So let's suppose somebody, a pallets org or whatever, erases flask tomorrow.

35:41 David, don't do it.

35:42 David, please keep going, man.

35:44 Is there a way to get a hold of the actual wheels and stuff as a community and put it back up under potentially a different name?

35:53 Or is it just gone?

35:55 How seriously gone is it when it's gone?

35:57 Yeah, nothing published to PyPI is actually gone.

35:59 So we don't actually, unless we're like legally required to, we don't delete any actual files off of our data store.

36:06 So like the bucket that everything goes into, everything that's ever been published to PyPI is still there.

36:11 So this actually played out.

36:12 It's good that we have this in a couple instances because this is exactly what we used in this case of atomicwrites.

36:18 Because the maintainer was like, oh, I made a mistake.

36:21 And they were like kind of humble.

36:22 They were like, yeah, okay, this is a mistake.

36:24 I shouldn't have done this.

36:25 And then asked us to essentially restore the project from scratch.

36:29 And like, we don't really have mechanisms to do that, right?

36:31 That's not, that's not something that we do often.

36:34 I think I can only remember maybe once when we've done that before.

36:36 Maybe not even once.

36:37 We generally just don't do this.

36:39 Like if you delete something, we say it's gone.

36:41 Like you need to publish a new version.

36:42 But in this case, like we did decide to take the time.

36:45 I think it took Donald like almost an hour to do this because it's a super manual process.

36:49 But yeah, the files are still there.

36:50 Something you don't do very often, right?

36:52 No, no, like almost never.

36:53 How can I even do this?

36:55 The files are still there and they're still like externally addressable too.

36:58 So they're, you know, they're always going to be available.

37:00 And like, if, if something like that happened, David, don't do it.

37:04 But like, if something like that happened, I think folks would probably be okay.

37:07 We'd find ways around it.

37:08 But yeah, I mean, it's a strong argument for not allowing it to happen.

37:12 And, you know, when, when people publish stuff to PyPI, like, our TOS is essentially,

37:16 you give us the right to distribute this as we see fit forever.

37:19 So, you know, PyPI is within its right.

37:22 But there's, there's arguments for, you know, giving maintainers the ability to do it for various reasons.

37:26 Has there been any thoughts to putting like levels of what pip will install?

37:31 For example, I'm thinking like, I want to set up my pip so

37:35 it will only accept things that have 2FA set up, or it will only accept things with a certain number of downloads.

37:42 Like I can only pip install something with 10,000 or more downloads.

37:45 So, because maybe I'm trying to avoid typosquatting for very edge case things.

37:50 In that general realm, have you all thought about this?

37:52 Probably you have.

37:53 Yeah, definitely.

37:54 What are some of the thoughts?

37:55 I think we have it on our list to talk about later.

37:57 But yeah, I mean, there's definitely potential, right?

37:59 There's all sorts of signals that you could potentially take into account here.

38:03 TBD, like how meaningful some of them actually will be or how much that will actually protect you.

38:07 But yeah, people have talked about like essentially defining a policy for what they'll consume and either having that be part of pip or something external.

38:14 So yeah, it's definitely been discussed.
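The idea of "defining a policy for what you'll consume" can be sketched as a small check run before installing. All of this is hypothetical: pip has no such feature in this discussion, and the `PackageSignals` fields (`monthly_downloads`, `maintainers_have_2fa`) are assumed inputs that some external tool would have to supply.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PackageSignals:
    name: str
    monthly_downloads: int      # assumed to come from some download-stats source
    maintainers_have_2fa: bool  # assumed signal, not something pip knows today


@dataclass(frozen=True)
class ConsumptionPolicy:
    min_monthly_downloads: int = 0
    require_2fa: bool = False


def policy_violations(pkg: PackageSignals, policy: ConsumptionPolicy) -> List[str]:
    """Return human-readable reasons the package fails the policy (empty if it passes)."""
    problems = []
    if pkg.monthly_downloads < policy.min_monthly_downloads:
        problems.append(
            f"{pkg.name}: {pkg.monthly_downloads} downloads is below "
            f"the minimum of {policy.min_monthly_downloads}"
        )
    if policy.require_2fa and not pkg.maintainers_have_2fa:
        problems.append(f"{pkg.name}: maintainers do not all have 2FA enabled")
    return problems


if __name__ == "__main__":
    policy = ConsumptionPolicy(min_monthly_downloads=10_000, require_2fa=True)
    suspicious = PackageSignals("reqeusts", 12, False)  # typo-style name, made up
    for problem in policy_violations(suspicious, policy):
        print(problem)
```

As the conversation notes, a strict policy like this would likely reject some old but legitimate dependency, so in practice it would need an allowlist or a warn-only mode.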

38:16 Okay.

38:16 You know, for example, like the web browsers, you can have no blocking.

38:22 You can have blocked third-party cookies.

38:24 You can block third-party cookies and trackers.

38:26 And you can decide like how broken do I want my web to be versus how safe do I want my web to be?

38:31 I feel like there might be something like that in the pip world.

38:35 I think the reality, at least right now, is that any kind of policy like that would not be enforceable.

38:41 Because there's going to be some edge case, some dependency that's super old or whatever.

38:45 Like, you know, Python is not nearly as bad as an ecosystem like npm in terms of like breadth of dependencies for a given thing.

38:51 Right.

38:52 Yeah.

38:52 Usually the dependencies are like thicker.

38:54 Right.

38:55 You don't have like three lines of code you're depending on.

38:58 You just put that in your code.

38:59 It does exist, but yeah, generally no.

39:01 But so that's sort of like, but even still, like, I think you'd have a hard time saying like, I'm going to only consume packages that have 2FA enabled because there's so few of them right now.

39:11 If all that's, you know, the tooling and stuff for that existed.

39:14 Yeah.

39:14 Sure.

39:14 Okay.

39:15 Interesting.

39:16 So this atomicwrites story is, everything was put back, but it just, it shows unintended consequences.

39:22 And kind of ironic too, actually, because like PyPI is an open source project as well.

39:26 And like, you know, people were upset.

39:28 We were making demands of users to do a certain thing.

39:31 But like at the end of the day, someone's making demands of us to use our time in ways that we don't necessarily want to.

39:36 Again, see James Bennett's article, right?

39:39 A lot of those ideas were really well spelled out there.

39:42 Absolutely.

39:43 People forget that it's a volunteer open source project and not like run by some corporation.

39:48 I thought it was this conglomerate of corporate overlords.

39:52 So this is what I got from Reddit.

39:54 Shadowy Cabal.

39:55 Yeah.

39:55 What's this PyPI 2FA dashboard here?

40:00 This looks pretty cool.

40:00 Tell us about this project.

40:02 I'll link to it in the show notes, of course.

40:03 Switch to like in the top right, switch to like past three months or something like that.

40:07 Yeah.

40:07 Okay.

40:08 You can really see the bump there.

40:09 So yeah, this is the dashboard we put together essentially for us to monitor how the rollout for 2FA and security key giveaway was going.

40:16 But we made it public.

40:18 So like anyone can check this out.

40:19 And I think we'll put a link in the show notes.

40:21 But yeah, the numbers are great.

40:22 The one thing this isn't actually showing is how many security keys we've given away.

40:26 So we've at this point, like I just checked earlier, we've given away more than 500 keys, which is awesome.

40:31 And it's only a fraction of what we have to give away.

40:33 So I would really like anyone listening who wants a key and has a critical project to go and get the keys.

40:38 And then, you know, we'll find something to do with those keys

40:41 if we don't give them all away by the time they expire in October.

40:44 But yeah.

40:45 So a bunch of keys given away.

40:46 We also I didn't mention this, but like as part of this, we also turned on a feature that allowed any project to manually require 2FA for all their maintainers.

40:54 So like anyone that just wasn't critical and wanted to opt into this, they could do that, too.

40:58 So I think 300, almost 300 projects have done that.

41:01 And then we're almost like we're so close to hitting 30,000 users on PyPI with two-factor enabled, which is huge.

41:08 And that's up from like 27,000 before we did the giveaway.

41:13 Cool.

41:13 Well, I was one of the 27,000 before because my packages on PyPI are not super significant, but they are there.

41:22 And so I definitely put 2FA on there and just have it running through my phone, basically.

41:27 Appreciate that.

41:27 Yeah.

41:27 So people can go and see the progress here of how it's coming along.

41:32 Yeah.

41:32 And, you know, how many projects we've classified as critical, right?

41:35 Like how many is 1%?

41:36 Well, right now it's like almost 4,000 projects, which, you know, it's not a ton, but there's a lot of maintainers of those projects.

41:43 Well, two things that are interesting.

41:44 One is I think I can go to pypi.org and see 393,000 and say, well, 3,930 are probably critical, right?

41:55 As that designation.

41:56 But as you said, it's computed over time.

41:59 So maybe there's something that was critical, but is no longer or something becomes critical, right?

42:04 So this number could sort of outpace the actual number of just 1% of the total projects.

42:09 Yeah.

42:10 Yeah.

42:10 It will grow above 1% over time because, yeah, if something's been designated as critical, it just retains that designation indefinitely.

42:17 And the other thing that we kind of snuck in here is that like anything that's a dependency of PyPI itself is also critical.

42:23 So we just figured that would be a good idea for us.

42:26 So, yeah, there's a couple of projects that maybe wouldn't normally be included, but we include them because like we personally care.
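The "once critical, always critical" rule, plus the extra inclusion of PyPI's own dependencies, amounts to a set that only ever grows. A minimal sketch, with made-up daily snapshots and a hypothetical `update_critical` helper:

```python
def update_critical(already_critical, todays_top, own_dependencies):
    """Union only, never removal: a project keeps the designation indefinitely."""
    return set(already_critical) | set(todays_top) | set(own_dependencies)


if __name__ == "__main__":
    # Illustrative names and snapshots, not real PyPI data.
    own_dependencies = {"pyramid"}
    daily_top_one_percent = [{"requests"}, {"numpy"}, {"requests"}]

    critical = set()
    for todays_top in daily_top_one_percent:
        critical = update_critical(critical, todays_top, own_dependencies)

    # "numpy" dropped out of the top set on day three but stays critical.
    print(sorted(critical))
```

This is why the count of critical projects can drift above a strict 1% of all projects over time.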

42:32 Is that like maybe pyramid or stuff like that?

42:34 Yeah.

42:35 I don't know what the difference between those two sets is necessarily, but it would be interesting to figure out.

42:39 If there's anything that potentially wasn't, then it is because of that.

42:42 Okay.

42:42 Yeah.

42:42 The other thing that's interesting is there's 8,400 users identified as critical, even though there's 3,900 packages.

42:49 So I guess because multiple people can be designated as a maintainer, huh?

42:54 Yeah, exactly.

42:54 So like it looks like the average is about two maintainers per critical project, which is a little scary.

42:59 I think in reality, there's a lot that just have one and a lot that have a lot more.

43:03 Yeah.

43:04 Yeah.

43:04 It's a bimodal sort of distribution, right?

43:06 There's a whole bunch with one maintainer, and then these groups of people.

43:10 Yeah.

43:10 Yes.

43:11 Interesting.

43:11 Okay.

43:12 This is cool.

43:12 So people can check this out and see how it's going.

43:14 This whole idea of critical and requiring 2FA, this npm is also doing something like this,

43:21 right?

43:21 So it's not completely out of the blue.

43:22 Yeah.

43:23 I mean, a lot of organizations are doing it.

43:25 I think RubyGems said that they were working on a mandate or had proposed one as well.

43:29 And yeah, NPM, they started with a pretty small cohort.

43:32 It's only the top 100 projects, but they're going to expand that.

43:36 And then the big one is, I think a lot of people aren't aware of this, but like GitHub

43:40 announced that they are going to require 2FA for anyone that contributes code on GitHub,

43:45 which is like, I guess everyone that uses GitHub.

43:47 I don't know.

43:47 I don't know what the user is that doesn't contribute code on GitHub, but like, yeah, everyone's

43:52 going to have to have 2FA enabled, which is huge by the end of 2023.

43:55 So they have some time, but like, I can't comprehend the size of the support team they

43:58 must be hiring right now to satisfy that.

44:01 Because that's like, it's crazy.

44:02 Well, probably a lot of support team and a lot of automation that they're trying to get

44:05 in place to, okay, here's what you got to do.

44:08 But yeah.

44:08 Yeah.

44:09 The kind of, the secret here is that like, we have a, so via the OpenSSF, it has a bunch

44:13 of work groups and we have a new one that's pretty fun for me because it's about securing

44:17 software repositories.

44:18 But essentially, you know, like everyone that maintains a software repository, including folks from

44:22 NPM, crates, RubyGems, like Maven Central, all that stuff.

44:26 They all, we all come and talk twice a week or once every two weeks and, you know, talk

44:30 about this kind of stuff.

44:31 So like, we've been all talking about like 2FA mandates for a while now and like kind of

44:36 working on our plans together and sharing notes and that kind of thing.

44:38 Yeah.

44:38 Fantastic.

44:39 I only think that there's probably some users who just clone repos and post issues, but

44:46 don't really, don't do any check-ins.

44:48 They're just there to.

44:49 Yeah.

44:49 Just to save a copy.

44:50 Yeah.

44:51 Yeah.

44:51 That's actually probably not the main use, right?

44:56 Probably the main use is people contributing to their private repos or to maybe even less

45:01 sort of public repos, probably mostly private repos, I would guess.

45:04 Okay.

45:05 But yeah, that's going to be a big, big deal.

45:07 And it also, I guess it leads us into the whole supply integrity side of things, right?

45:14 Because it's one thing to say, your account on PyPI has to be secured with 2FA and better

45:21 security.

45:22 But if somebody can just put in bad code through a very complex PR that happens to sneak, you

45:29 know, a less than where there used to be a greater than, you know, some weird, weird little

45:33 edge case into a PR or take over somebody's GitHub account, change the code and they don't

45:39 realize it, right?

45:40 These, that's probably more likely because there's no notification that you did your own

45:44 commit in a significant way, right?

45:46 So locking down GitHub will have really important knock-on effects for Python and all the other

45:52 open source package locations, right?

45:55 And there's a lot of work being done here just in terms of like making your source repository

45:59 more secure and making your builds, you know, if you're building artifacts on GitHub, making

46:03 those more secure, you know, making your publishing method more secure, like all of it is getting

46:07 a ton of improvement right now.

46:09 Yeah.

46:09 I mean, you don't want something sneaking into mixing some code in a CI step or any of those

46:16 types of things.

46:16 Yeah.

46:16 So there's so many insertion points, right?

46:19 Yes, exactly.

46:20 There's a lot of spots.

46:20 Exactly.

46:21 So I guess, what are some of the thoughts on what can be done?

46:24 You know, like this is kind of what I was going back to, like, oh, there should be this world

46:27 with no encryption and like lax or no passwords.

46:30 And now all of a sudden we need all this junk.

46:31 And I feel like there's a little bit of that with people going, well, let's, let's just try

46:35 to abuse PyPI.

46:36 Let's try to abuse npm and sneak in what we can.

46:40 And you maybe talk about some of the steps you all have taken to mitigate that and, you know,

46:45 what you think can be done.

46:46 There's a lot of talk about package signing, which I'm not so sure.

46:50 Straight up signing is how useful that is.

46:52 But yeah, maybe let's start at like whatever you thought about.

46:54 The thing I want to start with is just like, none of these is a panacea, right?

46:57 Like, so I think one of the arguments that was raised with the two factor stuff was just

47:01 like, well, this isn't going to protect like us from a vulnerability or like a maintainer

47:06 going rogue.

47:07 It's like, yeah, no, obviously not.

47:09 Like there's not one thing that's going to protect you from that.

47:11 It's like we, there's a combination of features that are going to protect you from a combination

47:14 of threat actors or vectors.

47:16 And we have to use all of them if you want to like feel fully protected.

47:20 So yeah, I think like we just spent a lot of time talking about 2FA, but like, like I

47:24 said, it eliminates entire classes of attacks.

47:26 So like, please turn on 2FA.

47:27 Yeah.

47:28 Another one, like since we're talking about GitHub, an interesting one is the security hardening

47:32 with OIDC.

47:33 If you've heard of OIDC, it's OpenID Connect.

47:35 It's kind of built on OAuth protocol.

47:37 But essentially like it allows you to give things identity.

47:40 So like each individual GitHub Action workflow, like each run of the workflow actually gets

47:45 its own identity.

47:46 And that identity is like cryptographically verifiable.

47:49 So PyPI, we're working on implementing support right now and it will exist very soon for what's

47:55 essentially going to be credential-less publication from a GitHub Actions workflow.

47:59 Okay.

47:59 So that means like no password, no API token, nothing.

48:02 You essentially say, okay, I trust this workflow and it has the ability to publish directly

48:08 in a secure way.

48:09 It's super cool.

48:10 And then like a lot of other CI providers will hopefully add support for OIDC as well.

48:14 So yeah, we'll probably see this in a bunch of places.

48:17 All right.

48:17 So short-lived tokens directly from your cloud provider.

48:22 Yeah, essentially.

48:22 It works for like Google Cloud right now as well and a couple other things.

48:26 But yeah, it's essentially like a way to verify the identity of like a really tightly

48:31 scoped thing, like an actions run.

48:33 And then, you know, verify it and authenticate it and give it the permission to like do something

48:38 like publish to PyPI.
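
As a rough sketch of the idea, the receiving side reads the identity claims in the workflow's OIDC token and compares them with the workflow the project owner said they trust. The claim names `iss`, `repository`, and `job_workflow_ref` are the ones GitHub documents for Actions tokens; everything else here (the `is_trusted` helper, the `trusted` dict layout, the example repository) is hypothetical, and the sketch deliberately skips signature verification, which a real implementation must do first. It is not PyPI's implementation.

```python
import base64
import json

GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (illustration only)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


def is_trusted(claims: dict, trusted: dict) -> bool:
    """True when the claims match the repository and workflow the owner trusts."""
    expected_prefix = f'{trusted["repository"]}/.github/workflows/{trusted["workflow"]}@'
    return (
        claims.get("iss") == GITHUB_ISSUER
        and claims.get("repository") == trusted["repository"]
        and claims.get("job_workflow_ref", "").startswith(expected_prefix)
    )


# Build an unsigned stand-in token so the sketch runs offline.
claims = {
    "iss": GITHUB_ISSUER,
    "repository": "example-org/example-pkg",
    "job_workflow_ref": "example-org/example-pkg/.github/workflows/release.yml@refs/tags/v1.0",
}
token = ".".join([_b64url(b'{"alg":"none"}'), _b64url(json.dumps(claims).encode()), ""])

trusted = {"repository": "example-org/example-pkg", "workflow": "release.yml"}
print(is_trusted(decode_claims(token), trusted))
```

The point of the design is that nothing long-lived is stored in CI: the token identifies one workflow run and expires shortly after.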

48:39 This looks useful.

48:40 I mean, handing out API keys or embedding username passwords into CI/CD doesn't sound like a

48:47 great idea.

48:48 Yeah, no.

48:48 Especially with stuff like the Codecov attack, or Travis CI had a similar attack where like all

48:52 environment variables were exposed.

48:54 Like everyone's got to just go and roll everything at that point.

48:57 It's like such a mess.

48:58 So yeah.

48:58 Well, and it's one thing to say, okay, we've got to go reset your password.

49:01 Fine.

49:02 There is a race.

49:04 I mean, as soon as that happens, there's a race from people looking to use those credentials

49:08 and you looking to not have them used by other than you.

49:11 The real problem, I think, is not the people who are paying attention.

49:14 It's probably the maintainer who set up some project and hasn't touched it in a year and

49:19 hasn't checked their email.

49:20 They're just kind of not super engaged.

49:22 That could stay open for a super long time in a bad way.

49:25 Exactly.

49:26 All right.

49:26 Package signing.

49:27 Donald Stufft talked about package signing and said, why package signing is not the holy

49:32 grail.

49:32 People just say it's a little bit like the 2FA stuff.

49:36 Like if you just sign your packages to prove they come from you, everything is going to be

49:40 fine unless the person just goes rogue.

49:43 Like you can't protect from crazy.

49:45 There was one of the packages that got messed up.

49:49 I think it was on PyPI.

49:50 That same person was arrested for like bomb making materials in New York.

49:56 I don't know about this.

49:56 Let's see if I can find the article.

49:59 It's like, clearly that's not a well-functioning sort of person.

50:04 And it doesn't matter if they sign the package, if they just go bonkers, you know?

50:08 Yeah.

50:09 It's not the attack that we're trying to protect against with signing.

50:11 But yeah.

50:12 What can signing help with and what is it not going to help us with?

50:14 Yeah.

50:14 Donald wrote this post that's kind of like canon at this point for why package signing is not

50:20 the holy grail.

50:20 But he's really talking about GPG.

50:21 Well, why GPG signing is not the holy grail.

50:24 There's good points in here, right?

50:25 Like so, like you said, package signing doesn't protect against an actual account compromise.

50:29 If someone compromised your GPG key, like they can sign whatever they want.

50:33 So there's no protections there.

50:34 And then there's other problems with GPG as well.

50:37 Like there's UX and usability issues.

50:39 There's issues with web of trust, like actually establishing, okay, great.

50:43 Like you sign this thing, but how do I establish that the person that signed it is actually a

50:48 person that I trust and not someone's provided me with a malicious public key, right?

50:52 Like how do you actually do that?

50:53 Sorry.

50:53 It's a little equivalent to people saying, well, just check that it's HTTPS for the URL.

50:57 Like, well, that's not identity.

50:59 That's just like encryption.

51:00 Yeah.

51:00 You can still serve crap over HTTPS.

51:03 Yeah.

51:03 I think a lot of people also, they don't realize that like PyPI has supported uploading

51:08 GPG signatures for like a very long time, still does, hasn't gone away.

51:12 And nobody does it.

51:14 It's just like, it's just not used.

51:15 Nobody does it.

51:16 It's too hard.

51:17 Tooling doesn't work right.

51:18 Or it's just like not worth doing.

51:20 So I think Donald's post is right in the context of a world.

51:23 And this is written in 2013.

51:24 So a world where like only GPG is the only signing feature or signing tooling that you

51:30 have available.

51:31 But that's not true anymore.

51:32 And I'm really excited about this new tech called SigStore, partly because I like, I work

51:37 with it.

51:37 People on my team work on it.

51:39 It's like really interesting technology.

51:40 But SigStore is essentially like a new way to sign things.

51:44 And it's not necessarily based on long-lived maintained keys.

51:49 It actually uses ephemeral keys.

51:50 Like when you sign something with SigStore, you generate a public private key pair, you

51:54 sign it very quickly, and then you throw those away.

51:56 Like you don't actually maintain them.

51:57 They can't get leaked.

51:58 They don't ever get written to disk.

51:59 Like nothing.

52:00 They don't exist.

52:01 That's interesting.

52:01 This is also based on OIDC.

52:03 The chain of ownership of those keys, the keys like provided by some other trusted key

52:09 or something that really is tied to you, something like that?

52:12 No, no, no.

52:12 It's just literally just a key that you generate out of thin air, you sign it, and then you throw

52:16 it away.

52:17 So the way that SigStore works, which is it's also built on top of OIDC.

52:21 So you have these identities, right?

52:23 You have like your email, your Gmail account, your GitHub account, that kind of thing.

52:27 And they all offer like essentially an online identity.

52:30 You essentially sign with these identities instead.

52:32 So like you sign into something like Gmail, sign in something like GitHub.

52:38 You share this identity with a certificate authority that SigStore runs.

52:42 And this binds the identity to this like one time ephemeral private public key pair that

52:48 you generated.

52:49 Then that certificate is published on a transparency log.

52:51 So it's there forever.

52:52 There's a record of everything that gets signed.

52:54 But then the thing that you have to trust is not like some however many digit long alphanumeric

53:01 like public key ID.

53:02 It's like an email address.

53:03 It's like di at python dot org.

53:05 You can like be pretty sure that someone hasn't, you know, and someone can still lose access

53:09 to that identity via like compromise of that account.

53:13 But it's like a little, it's much, much easier to use, sign, and maintain.

53:16 And it's a little bit less likely that that actual identity is going to get compromised.
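
The transparency-log part of this can be illustrated with a toy append-only log in which every entry commits to the previous one, so rewriting history is detectable. This is a simplification under stated assumptions: the class `ToyTransparencyLog` and its fields are hypothetical, it uses a plain hash chain rather than whatever structure Sigstore's log actually uses, and it leaves out keys, certificates, and signatures entirely.

```python
import hashlib
import json


class ToyTransparencyLog:
    """Append-only log where each entry commits to the one before it."""

    FIELDS = ("identity", "artifact_sha256", "prev")

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, identity: str, artifact_sha256: str) -> dict:
        prev = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = {"identity": identity, "artifact_sha256": artifact_sha256, "prev": prev}
        entry = {**body, "entry_hash": self._digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that each entry links to the previous one."""
        prev = "0" * 64
        for entry in self.entries:
            body = {key: entry[key] for key in self.FIELDS}
            if entry["prev"] != prev or entry["entry_hash"] != self._digest(body):
                return False
            prev = entry["entry_hash"]
        return True


log = ToyTransparencyLog()
log.append("maintainer@example.com", hashlib.sha256(b"example-1.0 contents").hexdigest())
log.append("maintainer@example.com", hashlib.sha256(b"example-1.1 contents").hexdigest())
print(log.verify())  # the untouched chain checks out

log.entries[0]["identity"] = "attacker@example.com"
print(log.verify())  # editing an old entry breaks the chain
```

What a verifier ends up trusting is the recorded identity (an email address or account) plus the public record, rather than a long-lived key ID.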

53:21 Okay.

53:21 Here's the one that I was thinking of.

53:23 Sorry.

53:23 It was NPM, not PyPI.

53:24 Oh.

53:25 npm libraries colors and faker sabotaged in protest by their maintainer, a person

53:30 named Marak Squires.

53:32 And follow it up quickly from that.

53:36 We have the resident of Queens home suspected in bomb making materials, arrested Marak Squires

53:42 and so on.

53:43 So like, this is what I was talking about when I'm saying like the no amount of signing

53:48 or, you know, 2FA is going to help against this.

53:52 And it's, it's just something, I mean, it's just, I think it's part of the deal.

53:56 If you accept code from people, you got to vet that, you know?

54:00 I've essentially like always said, you know, PyPI makes no guarantees.

54:03 You can't trust anything that's on it.

54:04 You need to like take your own steps to learn and build trust in things that are on there.

54:09 We can give you some tools to help you do that.

54:11 But like, yeah, essentially you're giving someone commit access to your project, your

54:15 application, whatever.

54:16 Like you're allowing them to introduce code into your project and run alongside your application

54:22 or.

54:22 What are your thoughts on private hosted package systems like devpi server or whatever

54:29 those, those, those where you can create the local ones and maybe even mirror stuff from

54:33 PyPI.

54:33 Yeah.

54:34 I mean, if you're going to do like a really robust, like auditing pipeline where you pull

54:38 stuff from public PyPI, you spend some time looking at it.

54:41 You maybe run it through some tests and then you introduce it to your private server.

54:44 That's a really good way to like insulate yourself from a couple of different types of attacks,

54:48 like dependency confusion attacks, typosquatting, that kind of thing.

54:50 Like if you're pointing your install at private server that just doesn't have any of that stuff

54:55 on it because you have manually curated it, then yeah, I mean, that's, that's a pretty good

54:59 practice.

55:00 And so like a lot of people like Artifactory, Google Cloud, Artifact Registry, all those things,

55:04 like they're all sort of similar in that regard.

55:06 I guess related to that is if you automatically install the latest continuously, that maybe

55:11 puts you at a higher level of risk than if you choose to upgrade at some point to a package,

55:16 like pinning versus not pinning.

55:18 Yeah, exactly.
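
To make the pinning point concrete, here is a small check that flags requirement lines without an exact `==` version. It is a deliberately simple sketch: the helper name `unpinned_requirements` is hypothetical, and it does not handle every requirement syntax (wildcards such as `==2.*`, URLs, or editable installs).

```python
def unpinned_requirements(lines):
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        spec = line.split(";", 1)[0]          # ignore environment markers
        if "==" not in spec:
            unpinned.append(line)
    return unpinned


requirements = [
    "requests==2.28.1",
    "flask>=2.0  # floats to whatever is newest",
    "urllib3",
    "-r base.txt",
    "",
]
print(unpinned_requirements(requirements))  # ['flask>=2.0', 'urllib3']
```

Anything this reports will pick up whatever version is newest at install time, which is the higher-risk case described above.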

55:19 Yeah.

55:19 We're getting pretty short on time here, Dustin.

55:21 Yeah.

55:22 What else can we cover?

55:23 I know maybe pip-audit or Scorecard or what do you want to focus on for a couple of minutes

55:28 here?

55:29 Well, yeah, let me just quickly say that like Sigstore, we built a Python client, so you can

55:33 pip install sigstore and then sign, verify, do whatever from there.

55:37 And I'm super excited to say that like the upcoming Python 3.11 release is, the releases

55:42 are usually signed with GPG, but we're going to start signing it with Sigstore as well.

55:46 The release maintainer.

55:46 Oh, nice.

55:47 Pablo is going to sign it.

55:48 Can it be signed with two things at once?

55:50 I guess it can, yeah.

55:51 Yeah, multiple people can sign it.

55:52 So we're just going to have their release manager sign it.

55:54 But yeah, that's super exciting.

55:56 Yeah.

55:56 So another like area that we've been working on a lot lately, I've been working on is vulnerability

56:01 auditing or remediation.

56:02 So we have a new tool called pip-audit.

56:04 It's not part of pip right now.

56:05 And there's some discussion about whether it should be or not.

56:07 But essentially, this is a tool that allows you to audit your local environment, your Docker

56:12 container requirements file, whatever for known vulnerabilities, not like unknown vulnerabilities,

56:16 but stuff that's been known, reported and either fixed or upgraded.

56:21 So it'll tell you like essentially if it finds a CVE or something like that.

56:24 But this also uses like a Python specific advisory database that we built that pairs with the open

56:30 source vulnerability service and works pretty well.

56:33 I'm pretty pleased with it.

56:34 I would like encourage everyone to just like run it on their machine and see what vulnerabilities

56:38 you have like lurking about right now.

56:40 But also like integrate it into your CI pipeline, like run an audit, just make sure that, you

56:44 know, your application is not going to like have a vulnerability introduced.
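
One way to wire this into a pipeline is a small wrapper that runs the `pip-audit` command and fails the job on a non-zero exit. This is a sketch under assumptions: it expects `pip-audit` to be installed and on the PATH, the helper names are hypothetical, and only the `-r` requirements-file option is used.

```python
import subprocess


def build_audit_command(requirements_file=None):
    """Build the pip-audit command line; with no file it audits the current environment."""
    command = ["pip-audit"]
    if requirements_file is not None:
        command += ["-r", requirements_file]
    return command


def audit(requirements_file=None):
    """Run pip-audit and return True only when it exits cleanly (exit code 0)."""
    result = subprocess.run(build_audit_command(requirements_file))
    return result.returncode == 0


print(build_audit_command("requirements.txt"))
# In a CI step one might call: raise SystemExit(0 if audit("requirements.txt") else 1)
```

Treating any non-zero exit as a failure is the conservative choice: the job stops both when vulnerabilities are reported and when the audit itself could not run.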

56:47 It pairs really nicely with like Dependabot and things as well.

56:50 Yeah, there's a bunch of other stuff like working on SLSA.

56:53 If you're not familiar with SLSA, it's essentially like a framework for thinking about how secure

56:58 your build pipeline is when you're producing publishing artifacts.

57:01 So if you're a maintainer, you might think about whether, you know, tampering is possible,

57:04 that kind of thing.

57:05 It's a sort of a way just to sort of think about how good of a job the build pipeline is

57:09 doing in that regard.

57:10 Yeah.

57:10 Build pipelines are a little scary.

57:12 I mean, they have a huge value, but they're also, you could sneak stuff in without even

57:16 actually changing the code of the original repo and all sorts of stuff.

57:20 Yeah.

57:20 I saw this is managed by, or like in collaboration with Trail of Bits and also you're a maintainer.

57:26 What's the, what's the origin story of pip-audit?

57:28 Yeah.

57:29 So Trail of Bits is a security consultancy.

57:31 They've done a lot of work in the Python space and software security space for a long

57:35 time now.

57:36 Folks like William Woodruff were actually involved in like way back when implementing two

57:41 factor and some other stuff on PyPI.

57:43 So my team, open source security team at Google, we've hired them as contractors to

57:47 do some of this work, do some maintenance, build these open source projects, that kind

57:51 of thing.

57:51 So you'll see William and Alex and some other folks all over these projects because they've

57:56 been working really hard to make them really useful and work really well and be really secure.

58:00 Yeah.

58:00 Fantastic.

58:01 Let's maybe just round things out with the stuff on the PEP, the various PEPs.

58:06 Oh yeah.

58:06 Maybe the API.

58:07 Yeah.

58:08 Let's go through the PEPs real quick.

58:09 Then we'll probably have covered enough.

58:10 Yeah.

58:10 I stuck my name on a couple of PEPs recently.

58:12 Or as much we have time for.

58:13 Yeah.

58:13 I think most of these are, you know, I've like provided some like minimal input into

58:18 them.

58:18 I don't, can't say I claim that I authored them myself, but what I'm super excited about

58:22 is PEP 621.

58:23 So this is a way to do essentially static metadata for Python packages.

58:27 So this includes source distributions, which means that like you don't have to use setup.py

58:31 anymore.

58:32 If you don't know why you don't want to use setup.py anymore:

58:34 It's essentially arbitrary code execution at install time, which is super scary and should not

58:39 happen.
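
The practical difference is that static metadata can be read as plain data, with no project code executed. A minimal sketch, assuming Python 3.11+ for `tomllib` (or the `tomli` backport on older versions); the project name, version, and dependency below are invented for illustration.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API

# A hypothetical pyproject.toml using the PEP 621 [project] table.
PYPROJECT = """
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "example-pkg"
version = "1.0.0"
dependencies = ["requests>=2.28"]
"""

# Parsing is pure data loading: nothing from the project is imported or run.
metadata = tomllib.loads(PYPROJECT)["project"]
print(metadata["name"], metadata["version"], metadata["dependencies"])
```

With setup.py, getting the same three values means running the file, which is where the arbitrary code execution comes from.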

58:40 So yeah.

58:41 621.

58:42 I compare that thought with this resident of Queens for bomb making.

58:46 Like, do you want to have that person running arbitrary code on your machine?

58:50 Probably no, is the answer.

58:51 At install time too.

58:52 In production.

58:52 Not at run time.

58:53 Yeah, yeah.

58:54 Yeah, exactly.

58:55 So I actually just saw, like right before I joined this, I saw a tweet from Pradyun that

58:58 like setuptools has full support for this.

59:00 There's a bunch of folks been working really hard on it.

59:03 So yeah, it's essentially like you don't need to use setup.py anymore, which is really nice.

59:07 And like a lot of tools have sort of converged on pyproject.toml as the sort of best standard

59:12 for metadata and configuration.

59:14 So like, it's nice to see that convergence.

59:16 Yeah, it's great.

59:17 Shout out to like Brett.

59:18 I think mostly led this PEP.

59:20 He did an amazing job.

59:21 Yeah.

59:22 Yeah.

59:22 Very cool.

59:22 And then PEP 691 is exciting as well.

59:25 So PyPI has a couple of different APIs.

59:27 Most of them are not standardized.

59:29 One of them that is standardized is the Simple API, which is essentially just an HTML page

59:34 and tools like pip.

59:35 It's kind of insane.

59:36 But if you go to /simple.

59:38 Are you really going to do it?

59:39 Yeah, that's going to blow up your browser for sure.

59:41 Because like essentially, yeah.

59:43 So tools don't actually use this page.

59:45 They use the individual pages for these projects.

59:46 But tools like pip essentially have to parse HTML to like interact with PyPI.

59:50 And that's not great.

59:51 Like it used to work.

59:52 Okay, it doesn't scale well now.

59:53 So we're in the process of like standardizing a lot of our JSON APIs.

59:57 And one of them that we sort of did, and Donald led this PEP with some input from Pradyun

01:00:02 and Cooper and myself, essentially like the same data, the same API, same files, everything

01:00:06 that pip needs.

01:00:07 That's not HTML.

01:00:08 It's just JSON.

01:00:09 So they can use standard library JSON parser to request and get this and do the stuff that

01:00:14 pip needs to do to be pip.
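
A sketch of what consuming the JSON form can look like with only the standard library. The `Accept` media type and the `meta`, `name`, `files`, `filename`, `url`, and `hashes` keys come from PEP 691; the helper names, the sample project, and the sample response body are made up so the example runs without network access.

```python
import json
import urllib.request

JSON_V1 = "application/vnd.pypi.simple.v1+json"


def build_request(project, index_url="https://pypi.org/simple"):
    """Build a request for one project's page, asking for JSON instead of HTML."""
    return urllib.request.Request(f"{index_url}/{project}/", headers={"Accept": JSON_V1})


def list_files(response_body):
    """Return (filename, sha256) pairs from a PEP 691 JSON project page."""
    data = json.loads(response_body)
    return [(f["filename"], f["hashes"].get("sha256")) for f in data["files"]]


# Hypothetical response body, shaped like the PEP 691 project detail document.
SAMPLE = json.dumps({
    "meta": {"api-version": "1.0"},
    "name": "example-pkg",
    "files": [
        {
            "filename": "example_pkg-1.0.0.tar.gz",
            "url": "https://files.example.invalid/example_pkg-1.0.0.tar.gz",
            "hashes": {"sha256": "0" * 64},
        },
    ],
})

request = build_request("example-pkg")
print(request.full_url, request.get_header("Accept"))
print(list_files(SAMPLE))
# With network access, urllib.request.urlopen(request).read() would supply a real body.
```

Content negotiation through the `Accept` header is what lets the same URL keep serving HTML to older clients.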

01:00:15 Yeah, probably make it a little more efficient, easier for other people to consume, right?

01:00:20 Yeah.

01:00:20 Is that something that you encourage other applications to go mess with PyPI APIs?

01:00:26 Or is it, it's public, but you know, we'd rather you don't mess with it.

01:00:30 What are your thoughts there?

01:00:31 Like the stuff that we have standardized, definitely you can depend on it, right?

01:00:34 Like it will continue to exist.

01:00:36 And by standardizing it, we've said like, this is what you should expect.

01:00:39 And this, you know, unless we change the standard, it's what it's going to continue to do.

01:00:43 We have other APIs that existed before we were standardizing stuff.

01:00:46 Like there's a legacy JSON API.

01:00:48 There's this XML RPC API that's like such a nightmare to maintain.

01:00:51 We've kept them running just because a lot of people use them.

01:00:54 Like for example, Poetry uses this like kind of our unofficial JSON API.

01:00:58 And yeah, I mean, there's times when we need to do stuff to that because it like isn't scaling,

01:01:03 right?

01:01:03 Or there's trade-offs that we need to make.

01:01:05 And it's like, well, like we're probably going to have to break someone in order to like

01:01:09 keep this afloat.

01:01:10 But like with things like these standard APIs, like we've spent a lot of time designing

01:01:14 them, planning them, standardizing them.

01:01:15 Like those are definitely 100% cool to integrate against.

01:01:18 Okay.

01:01:18 Fantastic.

01:01:19 Traditionally on these PEPs, you'll see it's accepted and planned for this version of, this

01:01:24 one doesn't have that, right?

01:01:25 Because it just goes against the web app, right?

01:01:28 This is a packaging PEP.

01:01:29 So we use the same process as CPython for their Python enhancements.

01:01:33 But yeah, this is about packaging.

01:01:35 So it doesn't necessarily tie to an individual Python release.

01:01:38 You don't ship it to a binary that people get, right?

01:01:41 Well, in a way we do.

01:01:42 Like pip, you know, ships support for it in a binary.

01:01:44 So, and it's implemented.

01:01:45 Like this exists on PyPI now.

01:01:47 pip uses this now.

01:01:48 So if I have the latest pip on my machine and I pip install something or do pip actions,

01:01:53 which does it hit the old simple or does it hit this new one now?

01:01:56 If I'm remembering correctly, the support's been added to pip.

01:01:59 It might not.

01:02:00 I think it's been released.

01:02:01 Yeah.

01:02:01 I'm pretty sure.

01:02:02 I could be wrong.

01:02:02 I'm not a pip maintainer, but yeah.

01:02:04 Yeah, sure.

01:02:05 Okay, cool.

01:02:05 And we have one more, one more to touch on.

01:02:08 Oh yeah.

01:02:09 I mean, this is, you only really care about this if you're like really working with PyPI,

01:02:12 but we're going to have a new, like I said, we're standardizing all our APIs.

01:02:15 So there's going to be a new upload API.

01:02:17 The existing upload API has a lot of problems.

01:02:20 It's fairly old.

01:02:21 It's essentially just a big post request with like a metadata in it.

01:02:24 So this should be a little bit better.

01:02:26 And also enable things like draft releases where you can like publish something to PyPI

01:02:30 that's in draft state.

01:02:32 You can review it, practice installing it before you actually like publish it.

01:02:35 And when it's in draft, it'll allow you to overwrite it, you know, like do things that

01:02:39 we don't.

01:02:39 To fix the problem.

01:02:40 Yeah.

01:02:41 Okay.

01:02:42 Exactly.

01:02:42 Is this an alternative replacement or just another safety net compared to say like the

01:02:47 test PyPI versus production PyPI?

01:02:50 You mean like the draft stuff?

01:02:52 Yeah.

01:02:52 I think it would be the preferred.

01:02:53 Like would stuff automatically upload and draft and you got to flip it so that you don't

01:02:58 accidentally publish something that's not ready or first because you forgot to use the test

01:03:02 PyPI.

01:03:02 Yeah.

01:03:03 Test PyPI is kind of weird because like it actually existed to be our test environment for

01:03:07 PyPI.

01:03:08 Like because we didn't have a great test suite with the old PyPI.

01:03:11 And now it sort of hangs around as like a playground sandbox.

01:03:14 Yeah.

01:03:14 We don't care.

01:03:15 But yeah, some people do like have it in their production, in their like release flow, like

01:03:19 upload here first, make sure that it works and then like go to.

01:03:22 So I think this will be a better use case and sort of consolidate folks to use PyPI for

01:03:27 everything.

01:03:28 And we can eventually shut down test PyPI because it's not as useful as it could be.

01:03:32 All right.

01:03:32 Well, I think that's probably all the time we got to talk about this.

01:03:36 We could go on and on.

01:03:37 We only touched on some of the stuff that we're thinking about talking about.

01:03:41 But before we're done, maybe answer the final two questions.

01:03:44 If you're going to write some code, yeah, what editor do you use?

01:03:46 I do almost everything in vi and have since I sort of started doing any kind of development

01:03:51 work.

01:03:51 That said, I like I kind of try to do as much as I can in the browser in the like GitHub

01:03:56 UI.

01:03:56 Okay.

01:03:57 I hit, you know, speed bumps there.

01:03:58 Sometimes it can't be quite as fast, but like for little stuff I like, yeah, I kind of

01:04:02 like, I kind of like seeing what I can do there.

01:04:04 Yeah.

01:04:04 Right on.

01:04:05 Do you ever press the dot in GitHub?

01:04:07 Yeah.

01:04:08 All the time.

01:04:08 Love the dot.

01:04:09 Yeah.

01:04:09 The dot converts it to like hosted VS Code basically.

01:04:12 Yeah.

01:04:13 And then a package you want to give a shout out to?

01:04:15 So my biased answer to this is check out the sigstore package on PyPI and the pip-audit

01:04:21 package.

01:04:22 Those are both stuff that I've been working on.

01:04:24 My team has been working on.

01:04:25 I'm really proud of the way that they work.

01:04:27 And we're going to be working on integrating those into like more use cases, more patterns.

01:04:31 Check them out.

01:04:32 Like try them out.

01:04:33 I'd love to get feedback on them.

01:04:34 Yeah.

01:04:35 Fantastic.

01:04:35 They seem to fill like a super big, a super important hole that sort of backfill some

01:04:40 of the security and supply chain stability and so on.

01:04:42 And then I think my unbiased answer, I did want to give a shout out to the pip-tools

01:04:46 project on PyPI.

01:04:48 That's maintained by the Jazzband team.

01:04:50 So that's just like a roving revolving door of maintainers.

01:04:53 But they do a good job keeping it up and running.

01:04:55 It satisfies this really, I think, important use case that a lot of people don't do with

01:04:59 their Python dependencies, which is like essentially allows you to compile your dependencies into

01:05:04 a requirements file that has all the versions pinned, all the sub dependencies there.

01:05:09 Hashes, which is like, please like use pip's hash-checking mode, hash all your dependencies.

01:05:14 It definitely protects you against another whole class of attacks.

01:05:17 But yeah, pip-tools is great for that.
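
What a hash pin buys you can be shown in a few lines: the installer hashes the bytes it downloaded and refuses anything that does not match the recorded digest. A minimal sketch, assuming pins written as `sha256:<hex digest>` as in a requirements file's `--hash` option; the helper name and the sample bytes are hypothetical.

```python
import hashlib


def matches_pinned_hash(data: bytes, pinned: str) -> bool:
    """Check downloaded bytes against a pin of the form 'sha256:<hex digest>'."""
    algorithm, _, expected = pinned.partition(":")
    return hashlib.new(algorithm, data).hexdigest() == expected


original = b"contents of example_pkg-1.0.0.tar.gz"
pin = "sha256:" + hashlib.sha256(original).hexdigest()

print(matches_pinned_hash(original, pin))
print(matches_pinned_hash(original + b" plus something extra", pin))
```

If a file on the index is swapped after you compiled your requirements, the digest no longer matches and the install stops.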

01:05:19 I would love a future where that was part of pip.

01:05:20 I don't know if that's going to exist or not up to the maintainer.

01:05:23 But yeah, it's super cool.

01:05:24 I have switched to using pip-tools for all of my packages and I love it.

01:05:28 It's fantastic.

01:05:29 Yeah, me too.

01:05:30 We use it for PyPI.

01:05:30 We use it for a bunch of other stuff.

01:05:32 Yeah.

01:05:32 Indeed.

01:05:32 All right.

01:05:33 Final call to action.

01:05:33 People want to get more involved, maybe make their things more secure.

01:05:37 I'd say call to action is go to pypi.org/security-key-giveaway.

01:05:42 See if you're eligible as a critical maintainer for security keys.

01:05:45 If not, please just turn on 2FA anyway while you're there.

01:05:48 As a maintainer, I'd really appreciate that.

01:05:50 Keep your eye on the security space.

01:05:52 I think like there's a lot of interesting stuff happening, a lot of focus, a lot of resources

01:05:56 going into it right now.

01:05:57 It's a good time to like try and adopt some additional security, both to protect yourself,

01:06:01 your users, everything.

01:06:02 So yeah.

01:06:03 Yeah, I definitely second that.

01:06:04 Dustin, thanks so much for coming and sharing all this.

01:06:07 This has been great to talk about, get an insight into some of the ideas and thoughts

01:06:11 behind all these decisions.

01:06:12 It's great.

01:06:13 For sure.

01:06:13 Michael, it's always really great to talk to you.

01:06:15 So glad to be here.

01:06:16 You too.

01:06:16 Thanks again.

01:06:17 See ya.

01:06:19 This has been another episode of Talk Python To Me.

01:06:22 Thank you to our sponsors.

01:06:23 Be sure to check out what they're offering.

01:06:25 It really helps support the show.

01:06:26 Listen to an episode of Compiler, an original podcast from Red Hat.

01:06:31 Compiler unravels industry topics, trends, and things you've always wanted to know about

01:06:36 tech through interviews with the people who know it best.

01:06:38 Subscribe today by following talkpython.fm/compiler.

01:06:43 You care about the ideas behind technology, not just the tech itself.

01:06:46 And you know that tech has an enormous influence on society.

01:06:50 So check out the IRL podcast.

01:06:52 It's hosted by Bridget Todd.

01:06:53 And this season of IRL looks at AI in real life.

01:06:56 Listen to an episode at talkpython.fm/IRL.

01:07:00 Want to level up your Python?

01:07:02 We have one of the largest catalogs of Python video courses over at Talk Python.

01:07:06 Our content ranges from true beginners to deeply advanced topics like memory and async.

01:07:11 And best of all, there's not a subscription in sight.

01:07:13 Check it out for yourself at training.talkpython.fm.

01:07:16 Be sure to subscribe to the show, open your favorite podcast app, and search for Python.

01:07:21 We should be right at the top.

01:07:22 You can also find the iTunes feed at /itunes, the Google Play feed at /play,

01:07:28 and the direct RSS feed at /rss on talkpython.fm.

01:07:32 We're live streaming most of our recordings these days.

01:07:35 If you want to be part of the show and have your comments featured on the air,

01:07:39 be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

01:07:43 This is your host, Michael Kennedy.

01:07:45 Thanks so much for listening.

01:07:46 I really appreciate it.

01:07:47 Now get out there and write some Python code.

01:07:49 I'll see you next time.