Inside the new PyPI launch

Episode #159, published Fri, Apr 27, 2018, recorded Wed, Apr 18, 2018


Python is often described as a "batteries included" language and ecosystem. In fact, that's been taken so far that there is even a delightful Easter egg in the Python REPL. Just type "import antigravity" to see what I mean.

Where do these powerful packages come from? Well, the Python Package Index or PyPI.

On this episode, you will meet Nicole Harris, Ernest Durbin III, and Dustin Ingram. They were part of the team that has just launched the new version of PyPI over at pypi.org.

Not only have they given us a great new website around packaging in Python, they have also laid the foundation for innovation in this space for years to come.

Episode Deep Dive

Guests introduction and background

Nicole Harris is a web designer specializing in HTML, CSS, and user experience. She joined the PyPI redesign effort in 2015 after responding to a call from Donald Stufft for help modernizing the front-end. Nicole focused on creating an attractive, responsive design that meets accessibility standards, reflecting the Python community’s values.

Dustin Ingram is a Python developer who began contributing to Warehouse (the code name for the new PyPI) as a volunteer roughly two years before its launch. Initially drawn by the improved testing and modern code base, Dustin worked on search improvements (such as Elasticsearch tuning) and on new features like Markdown-based package descriptions.

Ernest Durbin III started working on PSF infrastructure in 2013. His goal was to make PyPI more robust and scalable. With the Warehouse project, Ernest helped move PyPI to Kubernetes, implement continuous deployment, and streamline the overall infrastructure so PyPI could serve billions of monthly requests with zero downtime deployments.

What to Know If You're New to Python

If you’re new to Python and want to understand this episode’s discussion on packaging and PyPI, here are a few core concepts:

PyPI (the Python Package Index) is the primary repository for third-party Python libraries and frameworks, typically installed with tools like pip.
Virtual Environments help you isolate dependencies for individual projects, so you don’t mix different versions of libraries.
pip is the default command-line utility to install Python packages from PyPI, and it’s central to most Python workflows.
Warehouse is the modern codebase powering the “new PyPI.” It replaces legacy PyPI with a more maintainable and secure system.
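
To make the virtual environment idea concrete, here is a minimal sketch using the standard library's `venv` module. The directory name `demo-env` is a hypothetical choice for illustration, and the environment is created without pip so the example stays fast and works offline.

```python
import tempfile
import venv
from pathlib import Path

# Hypothetical location; a real project would usually use ./.venv or ./venv.
env_dir = Path(tempfile.mkdtemp()) / "demo-env"

# with_pip=False keeps the sketch fast and offline; pass with_pip=True
# to bootstrap pip into the new environment as well.
venv.EnvBuilder(with_pip=False).create(env_dir)

# Every virtual environment has a pyvenv.cfg marker file at its root.
print((env_dir / "pyvenv.cfg").read_text())
```

From a shell, the usual equivalent is `python -m venv .venv`, followed by activating the environment and running `pip install <package>` to pull packages from PyPI into it.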

Key points and takeaways

The Motivation for a New PyPI
The old PyPI (“legacy PyPI”) had grown unwieldy after many years of ad-hoc additions. Its code base, originally more like a proof-of-concept, lacked modern web frameworks and made contributions difficult. The new version (often referred to as Warehouse) provides a maintainable, testable foundation that allows the Python community to build and extend critical features more quickly.
- Links and Tools:
  - PyPI.org
  - Warehouse GitHub Repo
Modern Infrastructure and Deployment
The new PyPI service runs on Kubernetes, allowing easy scaling and zero-downtime deployments. Even with billions of monthly requests, the architecture can handle load spikes by adjusting resources seamlessly. Combined with CDN support from Fastly, PyPI continues to respond quickly despite massive traffic.
- Links and Tools:
  - Kubernetes.io
  - Fastly.com
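
The idea behind a zero-downtime deployment can be sketched with a toy model in plain Python. This is not Kubernetes code and not PyPI's actual configuration; the replica list and the `rolling_update` helper are hypothetical, and the model only shows that replacing one replica at a time leaves the rest of the fleet serving traffic.

```python
def rolling_update(replicas, new_version):
    """Replace replicas one at a time; yield (fleet, serving_count) per step."""
    fleet = list(replicas)
    for i in range(len(fleet)):
        # While replica i restarts, every other replica keeps serving traffic.
        serving = len(fleet) - 1
        fleet[i] = new_version
        yield list(fleet), serving


steps = list(rolling_update(["v1", "v1", "v1"], "v2"))
for fleet, serving in steps:
    print(fleet, "replicas still serving during the swap:", serving)
```

In this simplified model the serving count never drops below the fleet size minus one, which is why users see no interruption while the new version rolls out.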
Revamped Design and Accessibility
Nicole Harris redesigned the user interface to reflect Python’s values of friendliness and inclusivity. Her approach uses a well-structured CSS/SCSS architecture with careful attention to accessibility. The design is also fully responsive for use across devices, so the site is more welcoming to everyone.
- Links and Tools:
  - SCSS
  - BEM (Block, Element, Modifier) Methodology
Key New Features: Markdown Descriptions
One of the first user-facing improvements in the new PyPI is support for Markdown-based package descriptions, making it simpler and more reliable to format project pages. Python developers have wanted this for years, and now they can move beyond reStructuredText without losing readability.
- Links and Tools:
  - Twine (commonly used to upload packages)
  - Markdown Syntax
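
As a rough illustration of how a package opts in to a Markdown description, the sketch below builds the keyword arguments a `setup.py` would pass to setuptools. The project name, version, and README text are hypothetical; `long_description_content_type` is the setuptools keyword that declares the description format.

```python
import tempfile
from pathlib import Path

# Hypothetical README written to a temp dir so the sketch is self-contained.
readme = Path(tempfile.mkdtemp()) / "README.md"
readme.write_text(
    "# example-package\n\nA *Markdown* project description.\n",
    encoding="utf-8",
)

SETUP_KWARGS = {
    "name": "example-package",  # hypothetical project name
    "version": "0.1.0",
    "long_description": readme.read_text(encoding="utf-8"),
    # Declares that the description is Markdown rather than reStructuredText.
    "long_description_content_type": "text/markdown",
}

# In a real setup.py: from setuptools import setup; setup(**SETUP_KWARGS)
print(SETUP_KWARGS["long_description_content_type"])
```

The upload tooling also has to be recent enough to pass this metadata field through, so updating setuptools and Twine before publishing is a sensible precaution.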
Phased Rollout and Redirects
Rolling out a service that handles a critical piece of Python’s ecosystem required caution. They gradually shifted traffic from legacy PyPI to the new site, verifying stability and performance before fully switching over. Redirect loops and caching quirks provided a few surprises, but the team quickly addressed them.
- Links and Tools:
  - Status.python.org for incident and status updates
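
The episode does not describe the exact mechanism used to shift traffic, so the following is only a generic sketch of a percentage-based rollout. The `routes_to_new_site` function and the client identifiers are hypothetical; the point is that hashing a stable identifier gives each client a consistent answer while the percentage is raised step by step.

```python
import hashlib


def routes_to_new_site(client_id: str, percent_new: int) -> bool:
    """Deterministically send about `percent_new` percent of clients to the new site."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100  # stable bucket in 0..99
    return bucket < percent_new


clients = [f"client-{n}" for n in range(1000)]
for percent in (0, 10, 50, 100):
    moved = sum(routes_to_new_site(c, percent) for c in clients)
    print(f"{percent:>3}% target: {moved} of {len(clients)} clients on the new site")
```

Because the bucket depends only on the identifier, a client that has been moved to the new site stays there as the percentage grows, which makes problems easier to reproduce during the rollout.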
Legacy PyPI Shutdown
On April 30, 2018, the old site was officially retired after over 15 years. The new PyPI is no longer forced to remain in feature parity with the legacy code, giving Warehouse contributors freedom to enhance and evolve. This milestone paves the way for future improvements around security, APIs, and package management.
- Links and Tools:
  - PyPI.org announcement blog post
Better Search with Elasticsearch
Under the hood, searching for packages is more powerful, indexing names, descriptions, and other metadata. While the team aims to refine the indexing further (e.g., incremental updates), search is already significantly improved over legacy PyPI. This helps maintainers and users discover relevant packages more quickly.
- Links and Tools:
  - Elasticsearch
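
To show what "indexing names and descriptions" means in principle, here is a tiny inverted index in plain Python. It is a stand-in for the concept only, not Elasticsearch and not Warehouse's search code, and the package records are made up for the example.

```python
from collections import defaultdict

# Hypothetical package metadata; not real PyPI records.
PACKAGES = {
    "fastjson": "Fast JSON parsing library",
    "webkit-lite": "Small web framework with JSON helpers",
    "plotter": "Plotting and charts",
}


def build_index(packages):
    """Map each lowercase token in a name or summary to the packages containing it."""
    index = defaultdict(set)
    for name, summary in packages.items():
        for token in f"{name} {summary}".lower().replace("-", " ").split():
            index[token].add(name)
    return index


def search(index, query):
    """Return the packages that match every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    return set.intersection(*(index.get(token, set()) for token in tokens))


index = build_index(PACKAGES)
print(sorted(search(index, "json")))
print(sorted(search(index, "json web")))
```

A real search engine adds relevance scoring, stemming, and incremental index updates on top of this basic token-to-document mapping.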
Security and Future Roadmap
With a stable foundation, the team can focus on advanced security features, including better alerts for vulnerable packages and ways to deprecate insecure releases. They also discussed adding project renaming, more robust APIs, and a staging system that would reduce the need for test PyPI.
- Links and Tools:
  - PSF Packaging Working Group
Community Contributions and Funding
Contributors around the world submitted pull requests, but the Mozilla Open Source Support award was crucial for full-time development. Now that the new PyPI is up, sustaining ongoing development requires continued community donations. Companies relying heavily on Python can support this effort through the Python Packaging Working Group.
- Links and Tools:
  - Mozilla Open Source Support
  - Donate to Python Software Foundation
Developer Experience: Local Setup and Testing
Warehouse is intentionally easier to set up locally than legacy PyPI, which lacked modern testing and was cumbersome to run. With Docker and Docker Compose, developers can quickly spin up a local instance, fix bugs, and open pull requests. The project continues to welcome new contributors, including those making their first open-source commit.
- Links and Tools:
  - Docker
  - Docker Compose

Interesting quotes and stories

"It was a one time click, but there were weeks and weeks of incremental load tests to make sure we wouldn't break too many people." -- Ernest Durbin

"We wanted it to look friendly and reflect the Python community’s values, both in design and accessibility." -- Nicole Harris

"The main reason why we were able to launch now rather than 18 months from now was the Mozilla Open Source Support grant." -- Dustin Ingram

Key definitions and terms

Warehouse: The internal code name for the new PyPI codebase, built using modern frameworks like Pyramid.
Kubernetes: A container orchestration system used for deploying, scaling, and managing containerized applications.
Fastly: A CDN service that caches and serves static files (and sometimes dynamic content) at the network edge to reduce latency.
Markdown: A lightweight markup language with plain-text formatting syntax, now usable for project descriptions on PyPI.
XML-RPC API: A legacy API for programmatic access to PyPI, officially deprecated in favor of future REST-like or JSON-based APIs.
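
As a small example of the JSON-based access mentioned above, the sketch below builds the URL of PyPI's per-project JSON document and shows how the latest version could be read from it. The helper names are hypothetical, and `latest_version` is defined but not called here because it needs network access.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def json_api_url(project: str) -> str:
    """Build the URL of PyPI's per-project JSON document."""
    return f"https://pypi.org/pypi/{quote(project, safe='')}/json"


def latest_version(project: str) -> str:
    """Fetch the latest released version of a project (needs network access)."""
    with urlopen(json_api_url(project), timeout=10) as response:
        return json.load(response)["info"]["version"]


print(json_api_url("requests"))
# Calling latest_version("requests") would perform a live HTTP request.
```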

Learning resources

If you’re just getting started with Python and want to dive deeper, check out these courses from Talk Python Training.

Python for Absolute Beginners: A foundational course for newcomers to programming and Python.
Getting started with Testing in Python (pytest): Learn to test your Python code effectively, which is central to open-source development projects like Warehouse.
Up and Running with Git: Contributing to projects like PyPI is much simpler when you understand Git-based workflows.
Python Jumpstart by Building 10 Apps: Build confidence writing Python code through real-world exercises and mini-projects.

Overall takeaway

The new PyPI launch was not just a cosmetic change; it was a fundamental modernization of one of Python’s most critical pieces of infrastructure. With a fresh design, a solid technical foundation, and strong community support, Warehouse unlocks the potential for many enhancements in security, search, and overall user experience. The project’s success is a testament to open-source collaboration, a forward-thinking infrastructure approach, and the dedication of the Python community. If you rely on Python packages, consider donating or contributing to help shape the future of PyPI.

Links from the show

The new PyPI: pypi.org

Guests
Nicole Harris: @nlhkabu
Dustin Ingram: @di_codes
Ernest W. Durbin III: @EWDurbin

New course
Python 3, an illustrated tour: talkpython.fm/illustrated
Episode #159 deep-dive: talkpython.fm/159
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

Episode Transcript

00:00 Python is often described as a batteries-included language and ecosystem.

00:03 In fact, that's been taken so far, there's even a delightful Easter egg in the Python REPL.

00:08 Just type import antigravity to see what I mean.

00:11 Where do these powerful packages come from?

00:13 Well, the Python Package Index, or PyPI.

00:17 On this episode, you will meet Nicole Harris, Ernest Durbin III, and Dustin Ingram.

00:22 They were part of the team that has just launched the new version of PyPI over at pypi.org.

00:29 Not only have they given us a great new website around packaging and Python,

00:33 they have laid the foundation for innovation in the space for years to come.

00:37 This is Talk Python To Me, episode 159, recorded April 18, 2018.

00:56 Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities.

01:03 This is your host, Michael Kennedy.

01:05 Follow me on Twitter, where I'm @mkennedy.

01:07 Keep up with the show and listen to past episodes at talkpython.fm.

01:11 And follow the show on Twitter via @talkpython.

01:15 This episode is brought to you by ActiveState and Codacy.

01:18 Please check out what they're offering during their segments.

01:20 It really helps support the show.

01:22 Hey, everyone.

01:23 Before we get to the exciting news about the new PyPI launch, I want to tell you about a brand new course we just launched.

01:30 It's called Python 3, an Illustrated Tour.

01:33 And it's a five-hour visual and code-based tour of all the features in Python 3.

01:39 It's written by Matt Harrison, who has authored 15 technical books and is a best-selling Python author.

01:44 Check it out over at talkpython.fm/illustrated.

01:48 And if you get the course this week, we'll throw in Matt's newest Python book for free,

01:53 which is a perfect complement for the course.

01:55 And if you have the Everything Bundle already, then you should definitely check out the course

02:00 because it's included in your bundle, and you can just go take it.

02:03 I hope you love this new course.

02:04 We have many more coming down the pipe, and I'm looking forward to sharing those with you as well.

02:09 Now, let's hear about the new PyPI.

02:11 Nicole, Dustin, Ernest, welcome to Talk Python.

02:14 Hey, thanks.

02:15 It's great to be here.

02:16 Thanks for having us.

02:17 Yeah, you all have done something amazing.

02:20 It's almost like you've caught a unicorn in the mythical sense of, like,

02:25 there's been this talk of a new PyPI website and infrastructure for so long, and then, like, here it is.

02:33 And you all are, you know, really central to doing this.

02:36 So I'm super excited to talk about this, the rollout, the technology behind it, new features we're going to get.

02:42 We have already gotten things like that.

02:44 But before we get to that, let's start with your story just briefly since there are three of you.

02:49 How did you get into programming and Python?

02:50 Nicole, go first.

02:51 I started off with programming generally about 10 years ago.

02:55 My degree is actually in film and photography, and I wanted to make a website to put up my animation works.

03:02 And that kind of led me to HTML and CSS, which is still my specialization.

03:07 And from that, kind of, I became what was back then a sort of generic web designer before we had lots of different specializations.

03:16 And then my husband is actually a Python programmer.

03:19 So that's how I got involved in the Python community.

03:22 And I don't program in Python very much these days, but I do sort of dabble in it every now and again.

03:28 Yeah, yeah, very nice.

03:29 Ernest, how about yourself?

03:30 I graduated from school with a degree in physics and math in the sort of peak of the recession back in 2007-8 era.

03:40 And eventually conned my way into a job as a business analyst.

03:43 And at that point, I started programming in order to stop using Excel.

03:49 And then years later, I've come to this point.

03:52 Very cool.

03:52 I love how you sort of took your career and just kind of laddered it up or leveled it up.

03:57 Right?

03:57 Like, I'm math and physics.

03:58 I'm not going to work at CERN.

04:00 So now what?

04:01 And then you just, you know, work your way up that ladder.

04:03 Like, I also, I've said this several times on the show, of course.

04:05 But I also was working on my PhD in math and then kind of abandoned it for my self-taught developer path many years ago.

04:12 Dustin, how about you?

04:14 Yeah.

04:14 So I went to school for computer science.

04:17 And I'm not really sure when I first was introduced to Python.

04:20 But I do remember at some point, you know, after having done a lot of C and C++ in my studies, coming across this Python thing and being like, oh, this looks so much nicer.

04:29 So I slowly sort of worked that in as much as I could.

04:33 And, yeah, now I could probably call myself a Python developer.

04:37 That's pretty awesome.

04:38 So you're like, this can't work.

04:40 There's only five lines, right?

04:41 Like, in C++, I'd definitely have to, like, write a whole app around this.

04:45 So, but it works.

04:46 It's the beauty of Python, right?

04:47 Nice.

04:48 Okay.

04:49 So first of all, I want to start with a big piece of news, which we've been hinting at, or I've been hinting at, but has a particular date.

04:57 So the new PyPI.org, which, by the way, for a while was PyPI.io.

05:03 I want to ask you about that.

05:04 But PyPI.org has launched and Legacy PyPI is shutting down April 30th, right?

05:12 That's on the blog recently announced.

05:14 Congratulations.

05:15 How do you all feel about that?

05:16 Thanks.

05:17 I think we're super excited.

05:18 Yeah, I don't think there's anything negative to say about it.

05:21 I mean, it's just to see the culmination of the effort come to, like, a moment has been incredible.

05:26 And there'll be another sort of celebratory secondary on the 30th when we sort of say goodbye to something that's been around for so long.

05:35 Yeah.

05:35 We're going to have to get used to less gray, more red, or more blue, right?

05:40 It's blue, isn't it?

05:41 Is that your work, Nicole?

05:42 It sounds like you might have done a fair amount of the redesign HTML bootstrap type of thing.

05:48 I joined the project back in 2015 because Donald, who's our lead developer, I think you've already met and interviewed.

05:57 Yeah, he's been on the show twice.

05:59 He's great.

05:59 Yeah.

06:00 So he put a call out basically to say, I'm rebuilding this thing, but I'm terrible at design.

06:06 So is there anybody out there who can help?

06:09 And I got in touch, and so that's kind of how I ended up in charge of both the user interface, the user experience, and I also took charge of the HTML and the SCSS code base as well.

06:22 So kind of front end minus JavaScript.

06:25 Yeah, that's really cool.

06:26 Anything that looks good is Nicole's doing and not any of the rest of us.

06:31 I got to say congratulations because I do feel like it looks really modern, not overly designed, but it definitely feels like, you know, 2018.

06:40 Somewhere you want to be.

06:41 It doesn't look old, neglected, gray, and just like default, like browser font style, right?

06:46 Like it's really, really good.

06:47 And I think on one hand design, it's how much does it matter, right?

06:51 It's like a package warehouse.

06:52 But on the other, I think it sends a message to the community like this place is special.

06:57 We care about it.

06:57 We put in effort to style it and make it really look good and be usable, right?

07:02 Yeah, and I think a lot of the design focus for me was thinking about how much Python is a teaching language and how for how many programmers it might be their first experience dealing with a package index.

07:16 So it was really important to me that it looked friendly and it reflected the values of the Python community.

07:23 So both in terms of the design, but also in terms of the accessibility features that we've built into the front end code base.

07:31 We're trying to make sure that it's serving as many people as well as possible.

07:35 That's cool.

07:36 And do you mean things like ARIA, like screen reader indicators and stuff like that?

07:40 We've done a reasonable amount of work on that so far and we've actually got an accessibility audit happening this week as well.

07:47 So there'll be more improvements on that side.

07:50 But given there's so many users of the site currently, it's just from a percentage perspective, you know, that there is going to be a portion of those users who are going to be using assistive technology.

08:00 So we need to be looking after them.

08:03 And I think that reflects the Python community and the way that we go about things offline as well.

08:08 Very nice.

08:09 Let's touch on the contributions the other two of you have made.

08:12 So Ernest, what was your major part in this whole project here?

08:15 Sure.

08:16 So since about 2013, 12 or 13-ish, I've been contributing to the Python Software Foundation's infrastructure.

08:23 And so this is the servers and services behind python.org, www.python.org, mail, wiki, etc.

08:31 And so PyPI is one of the largest and most used of the services provided by the PSF.

08:38 And I got involved primarily just keeping things turned on.

08:43 In 2013, there was a large contribution that I did to modernize the infrastructure that hosted the old PyPI.

08:51 And over the past few years, I've continued that work in adding to the reliability and telemetry of PyPI.

08:59 And so with the Warehouse project, Donald Stufft and myself both sort of took a step back and said, if we were going to do it all over again, how can we make sure we have excellent infrastructure for Warehouse?

09:11 So my main contribution in the most recent work has been a mixture primarily of the infrastructure behind PyPI.org and also some code changes, features as well as just stuff to make it more compatible and easier to operate and do so reliably.

09:27 Yeah, very cool.

09:28 Dustin, how about yourself?

09:30 Yeah, I joined the project just as a volunteer contributor about two years ago.

09:34 I think I just had happened to come across it looking at Donald's GitHub and I was like, wow, this is a really usable PyPI, but it's not finished.

09:43 And, you know, as a new contributor, I was pretty just attracted to it because I could actually contribute to it.

09:50 Legacy is a behemoth and has very few tests.

09:55 And even to run it locally, you have to actually go in and comment out a bunch of code.

09:59 So it's really abrasive for new contributors.

10:03 And warehouse is not like that at all.

10:05 So I sort of started making some contributions, doing some like Elasticsearch tuning and that kind of thing, and just adding elements to the UI that weren't there before.

10:13 And so I think I learned this to work necessary and also making a lot of contributions to the just tooling ecosystem.

10:23 So that's other projects like Twine and pip and things like that, just to work with the new PyPI.

10:28 Yeah, very cool.

10:29 Now, the three of you are here, but you all have mentioned Donald Stufft, who's been spearheading this and deserves a lot of credit as well.

10:36 So congratulations to him.

10:37 Who else?

10:39 Is there anyone else who we should sort of give a shout out to while we're talking to you all?

10:43 Yeah, absolutely.

10:44 I want to point out Sumana.

10:46 Sumana, I actually have never tried to pronounce Sumana's last name out loud.

10:50 So Sumana H took an incredible role in the project management and leadership over the past few months and bringing this together.

11:00 Absolutely.

11:01 It was a huge driver in a lot of the work that we did to encourage and welcome and have sticky contributors to the project.

11:10 So I sort of said this a few, I don't remember when exactly, but there was a point where whenever Donald or I would tweet the PyPI team, what we meant was whichever one of us happened to have done something that week or that month with PyPI.

11:25 And I attribute personally a lot of the reason why when I say the PyPI team now, it's a collection of more than like five, I mean, it's probably close to like seven or eight people who are regularly contributing and there's a team.

11:38 And when I say that now, I say it earnestly, pardon the pun.

11:42 But yeah, so Sumana must be, in my opinion, must be sort of encouraged and called out here as well.

11:48 Yeah, I just want to say, I don't think the project would have been as big of a success as it was if Sumana hadn't been sort of organizing and herding us along the way.

11:58 She did an exceptional job.

12:00 So glad to hear.

12:00 So congratulations to you all.

12:02 So I have two sort of burning questions around the new PyPI.org.

12:08 Three.

12:09 Let's start with a simple one.

12:11 We have PyPI.python.org slash PyPI, which is a crazy location on the internet because why the duplication?

12:19 But anyway, we have that.

12:21 And then for a little while, you had PyPI.io.

12:25 And then you switched to PyPI.org for like where the actual new warehouse, the new Python package index lives.

12:33 Why did you change it halfway along the way there?

12:36 Sure.

12:36 This is a good story.

12:37 So it started at Python.org slash PyPI is where it initially lived.

12:42 And then it moved to PyPI.python.org slash PyPI because it was easy to change the domain and not so hard, not so easy to change the URLs.

12:51 Right.

12:52 It was more easy to separate it like the infrastructure to another server rather than behind the load balancer or something like that.

12:57 Right.

12:57 Mm hmm.

12:58 Okay.

12:58 And then eventually PyPI.io, I don't remember when we got it, but we've been using that for sort of the internal domain for PyPI for a long time.

13:07 So for the actual servers behind the scenes.

13:09 And when warehouse started to get to the point where Donald was like, oh, man, this is real.

13:14 We can start deploying this somewhere.

13:17 We went PyPI.io because we had it.

13:20 The frustrating part is that PyPI.org was not owned by the PSF or Python community member for a long time.

13:28 So basically, the reason why it switched midstream was that PyPI.org was successfully obtained by the PSF and by the PyPI maintainers.

13:41 It was the sort of the gold standard of the domain that we desired, but it wasn't ours until I don't remember when it was when that happened.

13:49 But when it became ours.

13:51 Yeah.

13:51 When it became ours, we immediately switched.

13:54 I see.

13:54 So that was what you wanted all along.

13:56 But there was just this like squatter type of situation thing going on.

13:59 It is the Internet, isn't it?

14:01 All right.

14:04 So whoever wants to take this one, feel free to jump in.

14:06 One thing that I'm wondering is what features or benefits do we get other than the underlying system is more polished, easier to contribute to and so on.

14:17 But as just a user, like suppose I don't care about that.

14:20 It could be written in PHP for all I care.

14:22 But when I go to it, what do I get to do that's better or different?

14:27 Honestly, there's not much different.

14:28 Most of the goal of this project was to move to a system that would allow us to more easily add new and exciting features.

14:35 So we have a lot of ideas like new APIs and ability to deprecate packages and things like that that are now going to be, you know, not trivial, but much, much, much easier to implement.

14:45 Much easier than they would have been on legacy.

14:47 So a lot of this is just modernization efforts and taking what was originally just a proof of concept that became PyPI into something that's actually been thought through and designed and robust.

14:58 Yeah, I think you mentioned earlier, and Donald himself had said this previously, that the original PyPI, the gray one, not the blue one, was really like based on almost like custom web programming, not even like Pyramid or Flask or something.

15:17 It was really hard to get to.

15:19 People would come and say, hey, I want to contribute the new feature.

15:21 They would look and go, actually, not that much.

15:24 And then they would go away, right?

15:25 And so now, maybe this is a good place to switch into it.

15:30 You know, we could talk a little bit about what the underlying technology for that is, right?

15:36 So maybe Dustin, Ernest, talk about the back end.

15:40 And Nicole, we could touch on the front end as well, because that also got super modernized, I'm sure.

15:44 Yeah.

15:44 So the thing about legacy is that it was written at a time that predates a lot of the frameworks and tools that we know exist today.

15:51 So, you know, it was doing the best with what it had, I think.

15:54 It's not a real direct criticism of it, but it just it came into existence like really early before much of the other stuff, right?

16:02 Like you pip install Flask, but where are you going to do that from if you don't have it?

16:06 It's old.

16:06 Yeah, the modern PyPI, the framework we chose to use is Pyramid.

16:11 And that was after a little bit of experimentation that sort of Pyramid just allows us to have a little more control over various things that we need to do to be PyPI.

16:19 And I think a big part of this project was the infrastructure work that Ernest did.

16:24 And I think he should talk about that more.

16:26 Yeah, go for it.

16:27 We're now deployed on top of a nice, buzzy framework, piece of infrastructure called Kubernetes.

16:33 And so we sort of looked at that as getting to the point where it as a technology, Kubernetes has come so far.

16:40 And by the time warehouse is going to be really real, Donald and I are both comfortable with sort of targeting that.

16:48 And the biggest drawback that that as a platform has is right now sort of the industry standard of the de facto for deploying to Kubernetes is you write a bunch of YAML or you use something to generate a template for YAML.

17:02 So the goal was basically to have a lot of similar features to other platform as a service and do so without really having to have warehouse maintainers or PyPI maintainers worry too much about what's actually happening.

17:17 And so a project came out of this work called Cabotage, which is a platform within Kubernetes and a web app and worker on top of it that just basically manage continuous deployment.

17:31 So you can set and configure your environment variables and such and then deploy your service and it pops up in a known URL.

17:39 That's really sweet.

17:40 So you basically, as a contributor, I do a check-in to a Git branch, maybe a PR.

17:44 And when that is accepted, that will trigger sort of Kubernetes to pull down a new version and just kick off, you know, sort of reroute the request?

17:53 What happens there?

17:54 Not yet.

17:55 That's the dream?

17:56 That is something that is another long-term benefit that we can sort of foresee out of this.

18:01 Right now, the biggest benefit that we get from this is we have incredible flexibility in the way that we deploy warehouse and how we change how many resources it has effectively.

18:14 So all of the primitives of the platform of Kubernetes effectively are really excellent.

18:21 It's just that you have to bring them together and that's the part that's sort of difficult.

18:25 So one of the biggest benefits we get is, you know, the zero downtime deployment.

18:29 So since PyPI went live on Monday, we've already deployed like 30 times and nobody noticed, which is great.

18:37 And then also just being able to be really flexible.

18:40 We have, I think it's like five different types of things happening behind pypi.org.

18:47 And we're running certain workloads under Gunicorn because they perform very well under Gunicorn.

18:53 And the primary site is deployed using Twisted for that purpose.

19:01 So overall, just, you know, having a little more flexibility and scalability was the main driver.

19:06 And down the line, we're really excited to see about doing things like you mentioned, being able to do branch based deploys, et cetera.

19:12 Yeah, that's really cool.

19:13 Go, Dustin.

19:14 I just wanted to mention, I totally forgot there is one feature that I'm super proud of that pypi.org does that Legacy did not.

19:21 And I can't believe I forgot about this because this is my baby for a long time.

19:25 But you can now write markdown descriptions on PyPI.

19:30 Yeah, that's awesome.

19:31 Which is a feature that people have wanted for a really long time.

19:33 And that's really the one big thing that I'm super excited to say that the new PyPI does.

19:39 That's cool.

19:39 And that's part of that modernization that you're talking about, right?

19:42 Like markdown, I don't know what people would have thought that meant back when it was created.

19:47 But now, obviously, it's like the de facto way of formatted, structured input that doesn't break the site because it's missing a div or something, right?

19:56 So it's really cool.

19:57 Markdown didn't even exist when PyPI was first created.

20:01 Yeah, I'm sure it didn't.

20:02 Nicole, how did the sort of redesign look?

20:06 Did you try to take what was there and like patch it up?

20:09 Or are you like, I'm just going to recreate this from scratch and style it up from scratch?

20:15 What was that process like?

20:17 Before I answer that question, I actually have something to add on the infrastructure question that you asked.

20:22 One of the things that I really appreciate about the project is how easy it is to set up as a contributing developer.

20:31 So I am not the most technical contributor, but I found the project really, really easy to set up with Docker and Docker Compose.

20:40 So the infrastructure that the team has set up in terms of being able to hack on this project is really, really amazing.

20:48 And it really lowers the barrier to entry for a lot of people.

20:52 We've seen people who've made their first open source pull request on this project.

20:59 That's really great.

21:00 Yeah.

21:00 It's really accessible for people to actually come and contribute to the project.

21:04 So I don't want to undersell that aspect.

21:06 I think that's really important.

21:08 I agree that it is.

21:09 And I think that's one of the real powers of this whole Docker thing is, right?

21:14 Like, it kind of comes all together.

21:16 But Docker on its own brings almost equally many difficulties or challenges at the same time.

21:23 And this, like, bringing in Kubernetes kind of, like, to make all the pieces fit, I think, is really, really clever.

21:29 So quite nice.

21:30 This portion of Talk Python To Me is brought to you by ActiveState.

21:35 ActiveState gives you a faster way to build and secure open source runtimes from your first line of code through to production.

21:41 Every second you spend building your Python distro or trying to secure your Python programs is less time spent doing the work you love.

21:47 You've got better things to do than trying to resolve dependencies or making sure that you tick off all security boxes when you ship to production.

21:54 Standardize on your Python builds so you can have less friction in the development cycle and you can deliver apps faster.

22:00 You can also get a unique server-side way to verify your Python applications at runtime.

22:04 Bake security right into your code without impacting performance.

22:08 Go faster, spend more time doing the work you love, and comply with your enterprise security needs.

22:13 Try ActiveState and see why it was chosen by IBM, Microsoft, NSA, Siemens, PepsiCo, and more.

22:19 Join millions of developers who trust ActiveState to build their open source language distros.

22:23 Visit talkpython.fm/ActiveState for a special offer.

22:27 That's talkpython.fm/ActiveState.

22:31 On your other question.

22:32 So in terms of the redesign, basically Donald just gave me free rein to do whatever I needed to do because I hadn't – like to give you an impression of the old code base, I wasn't even – you know, Donald basically said don't even go and touch that.

22:49 Like don't look at anything over there.

22:51 Don't set it up.

22:52 Don't set it up.

22:53 Just avoid at all costs because he knew that that would be a world of pain for me.

22:59 So I didn't really take any of the HTML or the CSS or the design from that.

23:04 It was just like, okay, so we've got this fresh new thing.

23:06 We want to show that it's a fresh new thing.

23:09 And we want to bring it to the standards, modern design standards that people expect.

23:14 We want it to be responsive so it works across all devices and we want it to be accessible.

23:20 So I basically started from a completely clean slate.

23:24 That's not true.

23:25 Donald had put together some templates, but he was basically like throw that in the bin and start again.

23:30 So that's what I did.

23:31 That's really cool.

23:32 So what are some of the technologies in the new one?

23:35 It looks to me like it's probably Bootstrap-based, which I'm a fan of, so that's cool.

23:39 And what else?

23:39 No, it's not Bootstrap.

23:41 No, it's not Bootstrap?

23:42 No.

23:42 Okay.

23:43 What's involved there?

23:44 Okay.

23:44 So we're going to go into a bit of CSS and HTML naming methodology.

23:49 So the HTML uses BEM, which is a naming methodology for controlling the specificity of your CSS.

23:58 And then basically each of the areas of the front end is a separate block or component within our SCSS code base.

24:07 So basically the idea is we've built up a custom reusable CSS code base.
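To make the BEM (Block, Element, Modifier) convention concrete, here is a small sketch that builds class names in the common `block__element--modifier` form. The helper and the example names are hypothetical and are not taken from the Warehouse code base.

```python
def bem(block, element=None, modifier=None):
    """Build a BEM-style class name: block, block__element, block__element--modifier."""
    name = block
    if element:
        name += f"__{element}"
    if modifier:
        name += f"--{modifier}"
    return name


# Hypothetical component names, for illustration only.
card = bem("package-card")
title = bem("package-card", "title")
large_title = bem("package-card", "title", "large")
```

Because every class name carries its block, a selector rarely needs nesting, which keeps specificity flat and makes it easy to find the matching stylesheet for a given piece of HTML.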

24:12 Yeah, that's really nice.

24:13 And you're using Sass, you said, or SCSS, which is like programmable CSS that then compiles or transpiles to CSS, which is really nice.

24:22 So it sounds like if people want to contribute to the UI side of things, it's pretty modern and fresh if they want to drop in.

24:28 It is, and it's documented as well.

24:30 So it's fairly clear how that system works.

24:32 If you want to change variables, if you want to change what are called mixins, which are kind of like functions, reusable functions in SCSS.

24:41 And if you want to modify a certain part of the code base, it's really obvious when you look, when you inspect the HTML, it's really obvious where the corresponding CSS is for that within the code base.

24:54 So it's quite logical in terms of the way that the file structure is being set up.

24:58 And I don't take credit for that.

25:00 So it uses a system from a CSS guru called Nicolas Gallagher, who's, I mean, if anyone's into CSS, that's someone you should be following.

25:08 So it uses the ITCSS system from here.

25:11 That's cool.

25:12 I feel like CSS and a lot of the web design stuff kind of gets the short end of the stick, but it could either be a serious drag to work on or it can be really beautiful depending on how you do it, right?

25:22 The challenge with CSS is kind of achieving something at scale.

25:27 Like I think most people can, you know, write a decent CSS code base for small projects.

25:33 But when you start to scale projects, that's when you kind of have all this complexity with the cascade, things starting to break where you don't expect them to break.

25:42 So that's why from the beginning, I introduced these kinds of systems that I knew would allow us to scale the code base as we add new features.

25:50 I feel, I can't remember who on my show said it before, but somebody said they feel like CSS in large projects becomes write-only.

25:57 Like you don't actually change anything.

25:59 You only go to the bottom and maybe overwrite it or add another file that replaces it.

26:03 You know, like adds to it.

26:04 Pretty interesting.

26:05 So let's talk about the actual rollout because actually before we talk about the rollout, let's talk about the traffic.

26:13 I don't know, maybe Ernest, this is most clear on your mind, but this site and this underlying infrastructure, it handles a little bit of data, right?

26:21 In total, PyPI does.

26:23 The numbers are not directly in front of me.

26:26 Why did I do that?

26:27 But I have a slide deck somewhere that has this information.

26:30 But it's many, it's like 30 or 50 terabytes a month, like something to that size, I think.

26:36 It's a tremendous amount.

26:38 I think it's like 10 billion requests per month is our running average right now.

26:43 Let's go look at numbers.

26:45 So if we go and look at the old service to get it in the last month, so that excludes two days, we did a total of 6.5 billion requests at the edge.

26:58 6.8 billion requests per month and 1.5 petabytes of data at the edge.

27:06 Petabytes.

27:06 Holy moly.

27:08 And so we're also doing that at around 150 milliseconds of latency and with not that many errors, all things considered.

27:18 It's always important when we talk about these huge numbers to take one step back and go, yes, that is what the service in total does.

27:27 But it's all thanks to Fastly, which is the CDN provider.

27:31 Right.

27:31 Because of the CDN.

27:32 Yeah.

27:33 Which is the CDN provider that offered to front PyPI many years ago.

27:38 And so just that one change was the most significant thing that happened to PyPI until, in my opinion, Monday.

27:46 But at the back end, we still do something like 25 to 30 requests per second across a myriad of different routes.
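To put the edge and backend numbers side by side, here is a rough back-of-the-envelope calculation. It assumes a 30-day month and uses the 6.5 billion figure and the upper end of the 25 to 30 requests per second quoted above; it is an estimate, not a measured statistic.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month

edge_requests_per_month = 6.5e9  # figure quoted for the edge (CDN)
backend_rps = 30                 # upper end of the quoted 25-30 requests/second

edge_rps = edge_requests_per_month / SECONDS_PER_MONTH
backend_share = backend_rps / edge_rps

print(f"edge: ~{edge_rps:,.0f} requests/second")
print(f"share reaching the backend: ~{backend_share:.1%}")
```

Under these assumptions only a small fraction of requests ever reach the backend, which is the point Ernest is making about the CDN.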

27:54 Yeah, that's really cool.

27:55 And Pyramid is working out pretty well for you.

27:57 Like my entire site, my course site, my podcast site, and various other pieces of infrastructure are almost all Pyramid.

28:03 There's a little Flask in there.

28:04 And I think it's just been rock solid.

28:06 So I've enjoyed it a lot.

28:07 But how is it working for you?

28:10 Yeah, I've had no complaints.

28:11 I mean, I didn't really use Pyramid before I started contributing to the project.

28:14 And now it's definitely my preferred framework for more intensive web applications in Python.

28:21 So I like it a lot.

28:22 Yeah, it broke my brain.

28:24 I mean, like I got to the point where now I'm like, oh, of course, like this is how this works.

28:28 And I go back and I work on some of the Flask.

28:32 I was like, oh, I can't do that here.

28:36 And so overall, I think I agree with what Dustin sort of alluded to earlier around the control and precision that you can get from Pyramid that other frameworks sort of make you run around to do.

28:49 Yeah, nice.

28:49 So the rollout.

28:51 So I set the stage with how much data you guys do, how much traffic you do.

28:54 When you flip the switch on that, that's got to be a...

28:58 So did you just go, it all goes here?

29:00 Or did you like do some sort of like, let's take 1% of 1% of the traffic and like slowly roll it over?

29:06 Like, what was it like?

29:07 The main traffic sources for PyPI are pip installs and XML-RPC.

29:12 So we have an XML RPC API and that gets a lot of traffic because it's mostly post requests and it's hard to cache that.

29:19 And then, you know, a very small fraction of that is actual web traffic.

29:22 So, you know, pypi.org existed for a long time before the launch and you could go and do everything on, you know, via the web interface that you could do on regular legacy PyPI.

29:32 So that was, you know, didn't require a lot of traffic and worked fine.

29:36 And so what we did was sort of some incremental load testing where we would switch certain either some pip traffic or XML RPC traffic over to pypi.org and see how it stood up.

29:47 Yeah.

29:47 So once again, Fastly was sort of predominant in that effort.

29:51 So because we were doing those redirects at the edge, we were able to set rules there.

29:57 And so like right now, actually, there's still quite a bit of traffic going to the legacy PyPI backend.

30:03 And we can do that because we're not redirecting the traffic over to PyPI.

30:08 So we were able to like tune it at like 5, 10, 15, 20% for the heavy hitting stuff and test ahead of time.

30:18 And so when we switched, basically all we did is we started issuing redirects from the old service.

30:23 And so it was a one time click, but there were like weeks and weeks of like incremental quick load tests for where we would throw a bunch of traffic at it.
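The percentage-based shifting described here was done with rules at the CDN edge. The sketch below is not that configuration; it is a plain Python illustration of the general idea, where a hash of some request key decides deterministically whether a request falls into the first N percent. The function name and the choice of key are assumptions.

```python
import hashlib


def routes_to_new_backend(request_key: str, percent: int) -> bool:
    """Send roughly `percent`% of keys to the new backend, always the same ones."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < percent


# Dial the share up step by step, as in the 5, 10, 15, 20% tuning described above.
sample = [f"client-{i}" for i in range(10_000)]
share_at_20 = sum(routes_to_new_backend(key, 20) for key in sample) / len(sample)
```

Hashing rather than picking at random means a given client keeps landing on the same side while the percentage stays fixed, which makes problems easier to reproduce.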

30:33 There was some replaying we did ahead of time as well.

30:35 Yeah.

30:36 Oh, replaying.

30:37 That's pretty cool.

30:37 That's basically where you capture the exact web traffic and you replay it against the domain and just see how it behaves.

30:44 Right.

30:44 It wasn't the exact traffic.

30:45 We were taking measured, basically percentage stuff and then redispatching a request that looks like it.

30:53 And because the problem is we can't just do every request blindly or people would like dual submit up, you know, dual submit an action or something.

31:01 That's true.

31:01 Right.

31:02 You got to have non-modifying type of stuff or test data or something, I guess.
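As a rough illustration of replaying only non-modifying traffic, the sketch below keeps a sampled fraction of captured requests and drops anything that is not a read. The record format and function name are hypothetical; the team's actual replay tooling is not described in the episode.

```python
import random

SAFE_METHODS = {"GET", "HEAD"}  # read-only methods that are safe to send twice


def select_for_replay(captured, fraction, seed=0):
    """Pick a sampled subset of captured requests, skipping anything that modifies state."""
    rng = random.Random(seed)  # seeded so a test run is repeatable
    return [
        request
        for request in captured
        if request["method"] in SAFE_METHODS and rng.random() < fraction
    ]


# Hypothetical captured traffic.
captured = [
    {"method": "GET", "path": "/simple/requests/"},
    {"method": "POST", "path": "/legacy/"},  # an upload: must never be replayed
    {"method": "GET", "path": "/project/flask/"},
]
replayed = select_for_replay(captured, fraction=1.0)
```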

31:06 Yeah.

31:07 Yeah.

31:07 Pretty cool.

31:08 So how did it go?

31:09 Perfectly.

31:09 Not a thing went wrong.

31:10 It was good for the first 15 minutes.

31:13 I think we were all really excited.

31:15 It's working.

31:16 Wait a minute.

31:17 It's working.

31:17 Then the issues started rolling in.

31:20 What do you guys run into?

31:22 Sure.

31:22 So previously, all files uploaded by users to PyPI.

31:29 So these maintainers uploading their packages were hosted under the same domain.

31:33 So packages were hosted at pypi.python.org slash packages, some stuff.

31:38 During this switch, we decided to make a separate service, a separate domain for hosting user content.

31:45 If you've ever seen the documentation that used to be hosted at or is still hosted, I'm sorry, pythonhosted.org.

31:52 The main reason for that is that serving user-generated content from the same domain that you're actually operating a service from can be dangerous from some security perspectives.

32:01 So the thing that went wrong is that when we switched over, there were redirect loops and all sorts of craziness happening for people trying to download files from files.pythonhosted.org, our new host.

32:16 Ultimately, it was a bewildering and sort of bizarre thing because we had a number of factors at play.

32:22 We had files that were cached were fine.

32:25 Files that weren't cached were going to end up in this redirect loop.
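A redirect loop of the kind described here can be pictured as a cycle in a table of redirects. The sketch below only illustrates that idea with made-up host names; it does not reproduce the actual CDN configuration or the fix from the incident report.

```python
def find_redirect_loop(redirects, start, max_hops=20):
    """Follow redirects from `start`; return the cycle as a list, or None if there is none."""
    seen = []
    url = start
    while url in redirects and len(seen) < max_hops:
        if url in seen:
            return seen[seen.index(url):]  # the URLs that keep pointing at each other
        seen.append(url)
        url = redirects[url]
    return None


# Hypothetical hosts: each side believes the other one serves the file.
looping = {
    "old.example/packages/x.tar.gz": "files.example/packages/x.tar.gz",
    "files.example/packages/x.tar.gz": "old.example/packages/x.tar.gz",
}
healthy = {"old.example/packages/x.tar.gz": "files.example/packages/x.tar.gz"}
```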

32:28 We had some host names involved.

32:32 And overall, it was just, and it would happen, we realized that it's sort of the worst possible time.

32:37 So if you go to status.python.org and scroll down a little bit, you can read an incident report that sort of describes in more detail what went wrong.

32:47 But effectively, we were making this change as part of the rollout.

32:52 And an esoteric thing that occurs, I guess, occasionally when you try to move a host name from one CDN configuration to another CDN configuration.

33:04 We mishandled that.

33:06 And so it was a one line.

33:07 The fix was one line, and it was like 13 characters.

33:12 But it resolved it.

33:14 And so, yeah, not everything can go perfect.

33:17 Well, sometimes the best, most memorable lessons are taught in production.

33:21 What we talked about before we started the official recording that everyone was listening to is your overall, as a group, your overall thought was this was a big success, even if there was like a blip here or there.

33:35 Yes, absolutely.

33:36 Aside from like that, you know, files outage, which is kind of the core use of PyPI.

33:43 So that's kind of a big deal.

33:44 But aside from that, you know, everything else worked great and continues to work great.

33:48 So we're generally pleased.

33:50 Like 99% of things worked perfectly.

33:53 Yeah, that's really great.

33:54 So I think this is one of those things, like I'm sure people were concerned about switching, like what might go wrong.

34:01 Like, will we break like Netflix deployment because they can't get a pip install to work on some like Docker container in like a continuous build because something like, you know, these types of I may be affecting this, but you sort of had to go through that to be on the better side of the world.

34:17 Right.

34:18 So now, you know, Nicole's design is up, the Pyramid app that you all built is up.

34:22 And now it's just there to be polished and built upon.

34:25 Right.

34:25 Yeah.

34:26 I think that we're our hands are.

34:29 Well, once legacy is shut down, our hands are untied and we can make, you know, we can make progress in places that we sort of wouldn't have been able to before.

34:40 Something that I like to point out about PyPI, the historical PyPI is that there was a point where it was pretty much the only non static web host that Python.org had.

34:50 And so it would end up getting a bunch of features thrown into it that weren't necessarily critical to its operation.

34:59 And so as we split into warehouse features were removed from PyPI legacy and sort of while they were both simultaneously in existence, we had to be very strategic about what things we added to warehouse or PyPI.org.

35:14 And so once legacy is shut down, we can start to make much more progress and do so much more quickly and much more safely than we ever have been able to before.

35:26 And so that alone is probably the biggest long term benefit of this is being able to do the things that people need, whether it's design or functionality.

35:36 I think if you have to remain at parity with this older system, you're not designing one thing, you're designing almost like two things, or you're constrained really painfully.

35:49 So you'll be free.

35:51 They share a database.

35:52 So that also is a huge complicating factor.

35:55 Very interesting.

35:56 I guess a couple questions just really quick on that.

35:59 And then I want to kind of talk about where things are going.

36:01 You said they share a database.

36:03 Like what database is that?

36:04 Where is the actual web apps running your Kubernetes containers running these days?

36:10 We use Postgres for a database and we have a very generous donation for in-kind service, basically.

36:17 So AWS said, yeah, you can run PyPI here.

36:20 And so right now we run the entire stack in Ohio.

36:25 I picked where it deployed.

36:26 So I picked Ohio.

36:27 But in the Ohio region for AWS, we've got like I think it's like nine medium-ish sized servers running Kubernetes.

36:34 And we're using RDS and ElastiCache for Postgres and Redis and such.

36:39 That's cool.

36:40 And Dustin, I heard you talk about Elasticsearch, right?

36:45 Is that involved as well?

36:46 They're another sponsor in kind.

36:47 And that's for the search on PyPI.org is far, far better than it was on Legacy, which is basically a super naive search.

36:55 So now we can do full text search across descriptions and summaries and package names and even author maintainer names.

37:04 And it's a little more performant than the previous search and a little more reliable and better results.
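Searching several fields at once, as Dustin describes, maps naturally onto Elasticsearch's `multi_match` query. The sketch below only builds the query body; the field names are assumptions and this is not Warehouse's actual query.

```python
def build_search_query(text):
    """Build an Elasticsearch query body that matches `text` across several fields."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                # Assumed field names, loosely following the fields mentioned above.
                "fields": ["name", "summary", "description", "author", "maintainer"],
            }
        }
    }


query = build_search_query("http client")
```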

37:10 Yeah, perfect, perfect.

37:11 All right.

37:12 So let's talk about where things are going, I guess.

37:16 So you have a roadmap laid out at wiki.python.org slash PSF slash warehouse roadmap.

37:22 I'll put that in the show notes, of course.

37:25 So the very first thing, you have a bunch of stuff, which is pretty awesome.

37:28 It's like, here's a milestone, closed.

37:30 Here's a milestone, closed, completed, right?

37:32 These are great.

37:33 And then the current one that's like coming in progress is shut down Legacy PyPI.

37:38 You all want to talk about that?

37:40 That's coming on the 30th, right?

37:42 Like that is, we're recording right now on April 18th.

37:44 So 12 days.

37:45 Yeah, go Dustin.

37:46 We sort of kept Legacy up for now just because there are a few big users of PyPI that weren't able to sort of make the migration in time.

37:55 So the idea is to keep it up for just a little bit longer.

37:57 And then fully, the domain will continue to exist.

38:01 So pypi.python.org will redirect to pypi.org.

38:04 But the legacy service will cease to exist.

38:07 That's the big change you were talking about, Ernest, where like you'll kind of be free to build this thing as its own creation, right?

38:14 Not mirroring that.

38:15 Yeah, it's interesting.

38:16 I think the first thing that Warehouse ever did that was production impacting was take control of the database schema.

38:23 And that was years ago.

38:26 And so we started tracking database changes there.

38:29 And then uploads came.

38:31 And then the actual web app came up and was usable and such.

38:35 And we added features there to get to parity.

38:38 And so everything that the project has sort of undertaken up to this point, except for, you know, markdown descriptions.

38:45 I think that's it.

38:46 And the design, yeah.

38:47 And, of course, the refreshed design has been just to make sure that we're doing everything we can to keep from breaking too many people.

38:57 It's impossible for us not to.

39:00 I mean, it's impossible for any service to make progress without, at some point, deprecating older APIs and such.

39:07 And so we're really getting to the point where we've pared down a lot of things.

39:10 And we can start looking forward to, you know, value-add features, if you will, where it's like security features, audit features.

39:18 Accessibility is a big thing that, you know, we're looking forward to as well.

39:23 Yeah, very cool.

39:23 So that comes on the 30th.

39:25 And it'll be officially, the chains will be broken and warehouse will be its own thing.

39:30 And that'll be great.

39:31 This portion of Talk Python is brought to you by Codacy.

39:35 If you want to improve code quality, prevent bugs and security issues from making it into production,

39:41 and at the same time speed up your code review process by 20%, then you need to try Codacy.

39:46 That's C-O-D-A-C-Y.

39:49 Codacy makes it easy to track code quality and identify and fix issues by automatically analyzing your commits and pull requests

39:56 with all the most widely used static analysis tools.

40:00 Codacy helps great teams build great software.

40:03 Join companies like Delivery Hero, PayPal, Samsung, and more.

40:07 Try your first code review by visiting talkpython.fm/Codacy and linking your GitHub or Bitbucket account.

40:13 You can also just click on the Codacy link in the show notes.

40:18 So then you have, under your roadmap, you have another section called Post-Legacy shutdown.

40:24 And then kind of beyond that, you have Cool but Not Urgent, which is a nice way to categorize it.

40:29 So maybe we could kind of touch on those.

40:31 And whoever feels most like it's in their space, just grab it.

40:35 So like Dustin, there's something called incremental searching, search indexing rather, coming.

40:40 Tell us about that.

40:43 Yeah, so right now, the way the search index works, you upload a package, our index runs, I think every, now it's every three hours, roughly, when it actually runs.

40:52 So there's a lot of packages to index.

40:55 And we don't have, at the moment, a way to sort of incrementally update the index.

40:59 So that as soon as you publish a package, you know, it shows up in search results.

41:03 I see.

41:03 So you could say, like, this part is super stale because I know it just got updated.

41:07 So rerun the search, but only on this package, for example.
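The difference between a periodic full reindex and an incremental update can be sketched with a toy in-memory index. This is only an illustration of the idea; the real index lives in Elasticsearch, and the function names and data here are hypothetical.

```python
def full_reindex(packages):
    """Rebuild the whole index from scratch, as a periodic batch job would."""
    return {name: summary.lower().split() for name, summary in packages.items()}


def index_one(index, name, summary):
    """Incremental update: touch only the entry for the package that changed."""
    index[name] = summary.lower().split()


def search(index, term):
    """Return the names of packages whose summary contains `term`."""
    return sorted(name for name, tokens in index.items() if term.lower() in tokens)


index = full_reindex({"alpha": "Fast HTTP client", "beta": "Tiny YAML parser"})
index_one(index, "gamma", "Async HTTP server")  # searchable right after upload
```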

41:10 The goal was, you know, we got search up and running on PyPI.

41:13 And it was still a lot better than Legacy.

41:15 So it was good enough for launch.

41:17 But, you know, it can be better.

41:18 So that's one of the things we're focused on adding.

41:21 Sure.

41:21 And while you're on it, like, there's the autocomplete for search, which will be pretty nice.

41:26 There's also a search API.

41:28 That's pretty cool.

41:30 Like, does that exist?

41:31 Is there a way to search now and in the future?

41:33 Like, this is going to be a thing?

41:35 What's the story?

41:35 Technically, we have the XML RPC API that is technically deprecated.

41:39 You probably shouldn't be depending on it or using it or adding new things that depend on it.

41:43 It does have the word XML RPC in it, right?

41:46 That should be an indicator that it's deprecated, but no.

41:49 But you can technically search from this API.

41:51 And this is how, like, if you type pip search, whatever, that's how you get results through there.

41:56 But XML RPC, like I think I said before, is really hard for us to cache.

42:00 It's a big consumer of our bandwidth and backend resources.

42:04 So the idea is to sort of move to something that is a little more cacheable.
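One way to see why XML-RPC is hard to cache: the method name and its arguments travel in the body of an HTTP POST, so every call hits the same URL. The sketch below uses the standard library to build such a body without sending anything. The `search` method name and its argument follow the legacy API discussed here and should be read as an illustration, not a recommendation.

```python
import xmlrpc.client

# Serialize a call the way an XML-RPC client would before POSTing it.
payload = xmlrpc.client.dumps(({"name": "requests"},), methodname="search")

# The request line would be the same for every call (a POST to one endpoint);
# only this body differs, and caches keyed on URL cannot tell calls apart.
print(payload)
```

A GET-based API that puts the query in the URL gives a CDN something to key on, which is the "more cacheable" direction mentioned above.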

42:08 So this would be, we have a lot of ideas about future APIs for PyPI.org.

42:12 And, you know, something that might be included in that is a search API.

42:17 The other one that's interesting to me is the psycopg2 warning.

42:23 So I guess that's just like you guys are using Postgres basically.

42:26 Are you using the asynchronous stuff or just synchronously?

42:30 No, warehouse is all synchronous right now.

42:33 Are you thinking of any way to get something async in there or does it not matter?

42:38 So a number of the services that are behind the entire sort of service, it's like the service umbrella, if you will, of what PyPI is.

42:45 So PyPI, it has been broken up into hunks.

42:49 And so for some things, it truly does matter the way these, you know, the way these requests are handled.

42:54 A lot of the really incredible work that was done initially on warehouse by Donald was just how aggressively cached everything is.

43:06 You know, the goal is basically to make as few requests require a transit to the backend as possible.

43:11 So we don't have a ton of concurrency concerns around that.

43:16 But for some services that do see lots of traffic, like we have a service that just translates old URLs to new URLs.

43:25 And that is effectively proxying information.

43:28 And so that's a knockout use case for async stuff.
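A service that mostly waits on I/O while translating URLs is the sort of workload `asyncio` handles well. Below is a minimal sketch with a simulated wait instead of real network calls. The `/pypi/` to `/project/` mapping is illustrative, and the function names are made up.

```python
import asyncio


async def translate(old_path: str) -> str:
    """Translate one legacy-style path; the sleep stands in for a network round trip."""
    await asyncio.sleep(0.01)
    return old_path.replace("/pypi/", "/project/", 1)


async def translate_all(paths):
    """Handle many translations concurrently rather than one after another."""
    return await asyncio.gather(*(translate(path) for path in paths))


results = asyncio.run(translate_all(["/pypi/requests", "/pypi/flask"]))
```

The waits overlap, so a hundred lookups take about as long as one, without needing a thread per request.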

43:32 Yeah, pretty interesting.

43:32 Let's see, what else is in your post-legacy shutdown here?

43:35 We have stop having a staging environment.

43:37 Is that because of the Kubernetes stuff that makes it not required?

43:41 So that's talking about test PyPI, which would be at test.pypi.org now.

43:46 That existed so that people can, or it currently exists so that people can do stuff and not worry about it being on the real PyPI.

43:54 So you can practice uploading a package, see how it looks on PyPI.

43:58 And I think there's a lot of reasons for it to exist, sort of just as an experimental and educational tool.

44:04 But the main reason I think people use it is to see if their reStructuredText descriptions are going to break or not.

44:10 Because historically, PyPI would just, it's sort of all or nothing.

44:14 You either get a perfect description or it just looks like plain text.

44:18 There's some ideas about doing some new things that might obviate the need for test PyPI,

44:23 like the ability to stage your releases.

44:25 So you're going to make a new release of your package.

44:27 You can upload them all to PyPI, but they're not actually published yet.

44:31 You can go and look at them, but no one can see them.

44:33 And then you hit a button and they're released.

44:35 Yeah, that's very cool.

44:36 And a big reason why that's important is we have immutable releases on PyPI.

44:41 And so right now, there's a lot of frustration that comes from users around they upload a package.

44:48 They don't like what they saw.

44:49 They try to delete it.

44:51 And they get a warning that says, when you delete this, you won't be able to reupload it.

44:55 And then they go to reupload it and they're frustrated.

44:58 So then they try to delete the project.

45:00 And then they go and they reupload it again.

45:02 And it says, no, you still can't reupload that file.

45:05 And so this is around primarily a caching and immutability thing to basically say that files can't be replaced.

45:12 So if you've been installing a file from PyPI for however long, it will still be there.

45:18 And so giving people a way to trial things without committing, basically, if you go to the permanence of the thing, is a big reason for that as well.
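The rule Ernest describes, that a file name can never be reused even after deletion, can be sketched as a registry that remembers every name it has ever accepted. This is a toy model with hypothetical names, not Warehouse's implementation.

```python
class FileRegistry:
    """Toy model of immutable uploads: a file name can be used exactly once, ever."""

    def __init__(self):
        self._live = {}          # files currently downloadable
        self._ever_used = set()  # every file name ever accepted

    def upload(self, filename, content):
        if filename in self._ever_used:
            raise ValueError(f"{filename} was used before and cannot be reuploaded")
        self._ever_used.add(filename)
        self._live[filename] = content

    def delete(self, filename):
        # Deleting removes the file but keeps the name reserved.
        self._live.pop(filename, None)


registry = FileRegistry()
registry.upload("example-1.0.tar.gz", b"first build")
registry.delete("example-1.0.tar.gz")
try:
    registry.upload("example-1.0.tar.gz", b"second build")
    reupload_allowed = True
except ValueError:
    reupload_allowed = False
```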

45:27 And when you get billions of requests, one pip freeze can make it part of the history of the software, right?

45:34 For sure.

45:35 All right.

45:36 So just really quick, some other things.

45:37 You have GitHub sign-on coming along, renaming projects, a few other cool things.

45:42 In the cool but not urgent, the one that stood out most to me was a mobile app.

45:46 Like, what's the story with the mobile app there?

45:47 Nicole, are you going to be designing a new mobile app for PyPI?

45:51 I don't know whether or not.

45:52 I mean, this has been a suggestion from the community.

45:56 And I think we're still working out whether or not that is something that's justifiable in terms of our time and the resources that we have on the project.

46:04 What exactly, like, do you guys know, like, what the goal of the mobile app?

46:08 I mean, you're definitely not going to pip install, like, onto your mobile phone.

46:11 Like, that wouldn't mean anything, right?

46:12 Is it more about management and, like, seeing stats?

46:14 I think it was more just about, like, can we offer this user interface as a mobile app as opposed to a responsive website?

46:22 And for me, I'm not sure how much value that would bring.

46:25 We probably have, like, I mean, I don't have the statistics in front of me, but it's less than 10% of our users are using a mobile or tablet device.

46:34 So, and the site works on mobile now better than the old ones.

46:38 So, I'm not sure whether or not we'll go down that road.

46:41 What I think is most interesting about the mobile apps issue that's being tracked there is a prerequisite for that is effectively the next generation of an API for interacting with PyPI.

46:55 That's one of the biggest things that is intended to be undertaken at the PyCon sprints this year.

47:01 And so, now, putting my PyCon hat on and my warehouse hat on at the same time, I think it'd be an excellent idea for people who are interested in helping to contribute to the discussion and design of the next generation of APIs for PyPI to consider attending the sprints after PyCon this year.

47:22 The sprints are Monday, Tuesday, Wednesday, Thursday after.

47:24 And there will be a number of contributors to the project around.

47:29 And that's one of the main things we plan on discussing.

47:32 It sounds really good.

47:33 Yeah.

47:34 Very, very nice.

47:35 A couple others.

47:37 Let's see.

47:38 The university, a simple one, package update feed.

47:41 So, that's like I can subscribe to real-time changes to the back-end data.

47:46 So, I know if I'm like, if I've pulled that down or something, I could refresh, say, like my local PyPI caching server type thing.

47:53 Yeah.

47:53 So, there's a lot of third-party services that depend on PyPI that kind of want real-time notifications about new package uploads or removals and that kind of thing.

48:03 And so, this is just going to be a new API for, you know, like a tool like PyUp, which lets you like automatically upgrade your dependencies when they're released.

48:10 Yeah.

48:11 I use PyUp on my stuff.

48:12 I love it.

48:12 Yeah.

48:12 It's great.

48:13 So, we want to be able to support them, make it easier for them to do their job and use PyPI.

48:17 So, that's one of the things we're thinking about.

48:19 Yeah.

48:19 You don't want to have to suck all that data down just to get a new batch.

48:22 Kind of like your incremental search.

48:24 This is like the external version of it, sort of.

48:25 Yeah.

48:26 Exactly.
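The "don't pull everything down again" idea behind an update feed can be sketched as a consumer that remembers the last event it processed. The event format and serial numbers below are invented for illustration; the API discussed here was still at the idea stage.

```python
def new_events(events, last_seen_serial):
    """Return only the events a consumer has not processed yet."""
    return [event for event in events if event["serial"] > last_seen_serial]


# Hypothetical feed contents.
feed = [
    {"serial": 101, "project": "alpha", "action": "new release"},
    {"serial": 102, "project": "beta", "action": "remove file"},
    {"serial": 103, "project": "alpha", "action": "new release"},
]
pending = new_events(feed, last_seen_serial=101)
```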

48:26 So, another one that's really closely related to it, like including related to PyUp.io, like you just mentioned, is security notification system for Python packages.

48:36 That sounds really useful.

48:38 We just had this year or is it in the last year, like some sort of test malicious stuff uploaded to PyPI, right?

48:48 A couple of packages that were sort of hitting on typosquatting.

48:51 Didn't really seem to do anything, but still kind of scary.

48:54 So, knowing about security notifications, I guess not necessarily just people uploading malware, but like, hey, we actually forgot to check the password in this login field.

49:03 You probably want to get the newer version that checks the password type of thing, right?

49:07 On legacy, you could do this thing called hiding releases, which just made them not show up, but they basically still existed and it's not going to prevent you from using them.

49:14 One of the things that we're thinking about doing with the new PyPI is either adding the ability to deprecate a release saying like, you should not use this anymore.

49:22 It doesn't work or being able to mark it as insecure in some way.

49:26 So, there's like a known vulnerability in it and you should upgrade to the new version.

49:29 And, you know, this is something that's going to have to change in a lot of different parts of the packaging ecosystem.

49:35 So, like, pip needs to be able to say, hey, you told me to install this version and PyPI says it's insecure and tell the user, give them a warning or whatever.

49:43 But, yeah.

49:44 Yeah.

49:44 I mean, just related to that, I would love to be able to type pip security checkup on like an environment or something.

49:51 And go, these two things have security warnings.

49:53 These have updates, but they're feature only or something to that effect, right?

49:57 That would be cool.
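
[Editor's sketch] The "security checkup" idea discussed above can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: pip had no such command and PyPI exposed no such data at the time of recording, so the advisory table, the project name `exampleweb`, and the function names are all hypothetical. The version parser handles plain dotted integers only.

```python
# Hypothetical advisory data: project name -> first version containing the fix.
# This is NOT a real PyPI or pip data source; it exists only for illustration.
ADVISORIES = {"exampleweb": "2.0.4"}


def parse_version(text):
    """Turn '1.11.3' into (1, 11, 3). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def security_checkup(installed, advisories=ADVISORIES):
    """Return (name, installed_version, fixed_version) for vulnerable installs."""
    warnings = []
    for name, version in sorted(installed.items()):
        fixed = advisories.get(name.lower())
        # Flag the package only if it is older than the first fixed version.
        if fixed is not None and parse_version(version) < parse_version(fixed):
            warnings.append((name, version, fixed))
    return warnings


if __name__ == "__main__":
    environment = {"exampleweb": "2.0.1", "otherlib": "1.4.0"}
    for name, have, fixed in security_checkup(environment):
        print(f"{name} {have} has a known vulnerability; upgrade to {fixed} or later")
```

A real tool would need proper version parsing and an authoritative advisory feed; both are outside the scope of this sketch.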

49:57 Yeah.

49:57 And I mean, to be clear, it doesn't happen very often that there are security vulnerabilities in Python packages.

50:02 But it's something that could happen, might happen.

50:05 We want to be able to support it.

50:06 Yeah.

50:07 For example, Django had one or two minor security issues patched, right?

50:10 And you'd want to know if you were built upon Django, like, hey, you should probably install a new version before people start doing anything with that, right?

50:18 Yeah.

50:18 Very cool.

50:19 So, just super quick.

50:20 I'm about out of time, but just touch on one more thing.

50:23 Like, this week, I think, pip 10 was released, right?

50:26 That's correct.

50:27 I don't know how much any of you all had to do with that, but still pretty good news, right?

50:31 Yeah, it's great.

50:31 It had been a long time.

50:32 Since we had a pip release, so.

50:34 Yeah, it's really exciting.

50:36 I mean, the biggest thing is that it's a pretty foundational refactoring of a lot of the internal stuff.

50:43 And it puts, in my opinion, anyway, one of the things I'm most excited about it is it puts a lot of the internal tooling of pip and makes it more available for more interesting things built around and on top of pip, not necessarily at a CLI basis.

50:56 Because right now you've got to, like, if you want to use pip's stuff, you have to, you just have to jump into, like, super private APIs to do it, which isn't so great.

51:04 That's really cool.

51:05 Probably will make pairing it with this work that you're doing on the server side easier as well.

51:09 All right.

51:10 So I think I have other things I would love to talk to you about, but I think we're sort of running low on time.

51:16 So let's get to just a couple of things here at the end.

51:20 Final two questions, just quick, since there are three of you.

51:23 Nicole, start with you.

51:24 If you're going to do some work on this project, what editor would you use?

51:29 Like, what typical editor do you use?

51:30 I use Atom.

51:31 Okay.

51:32 Very nice.

51:33 I don't think that I know about that one.

51:34 Tell me a little bit about it.

51:36 You're talking about text editor?

51:37 Yeah.

51:38 So it's Atom, which is developed by GitHub.

51:40 Oh, Atom.

51:41 Oh, yeah.

51:41 Sorry, sorry.

51:42 Yeah, I must have misheard you.

51:43 Atom, of course, I know Atom.

51:44 Yeah.

51:44 Sorry, that's my accent.

51:45 No, no.

51:46 Yeah.

51:46 Cool, Dustin.

51:48 I'm a Vim user.

51:49 Vim, right on.

51:51 Ernest?

51:51 I also am a Vim user.

51:52 Nice.

51:53 All right.

51:53 Now, this particular question I ask of everybody, but it's kind of interesting because you're

51:57 both on the inside and the outside.

51:59 So, notable PyPI package.

52:02 Ernest, how about you go first?

52:03 Notable in what way?

52:06 Notable in that, like, it's probably not necessarily the most popular thing.

52:11 Like, people always say requests, which is fine.

52:12 But, like, I learned about this thing.

52:15 You should totally check it out, sort of notable.

52:18 Like, it's not necessarily totally known, but it's actually amazing.

52:21 And it's just a pip install away.

52:23 Recently, with the typosquatting thing we sort of talked about, I was on the hunt for

52:28 something that would just tell me all of the standard lib module names.

52:31 And that exists.

52:33 And go figure, it is called, I think it's called standard lib module names.

52:38 Descriptive names are good.

52:41 Yeah.

52:42 And so, we were able to add that to PyPI and very quickly be able to have a good block

52:48 of that first line of defense.

52:50 Somebody didn't try to pip install regex or something.

52:54 Right, right, right.

52:55 Yeah.

52:55 pip install re.

52:56 No, not doing that.
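
[Editor's sketch] The "first line of defense" described above, refusing project names that collide with standard library modules, might look roughly like the following. This is not Warehouse's actual code: the function name `is_reserved_name` is hypothetical, and instead of the package Ernest mentions, the sketch relies on `sys.stdlib_module_names`, which is only available on Python 3.10 and later (the fallback set is a tiny illustrative sample, not a complete list).

```python
import re
import sys

# sys.stdlib_module_names exists on Python 3.10+. The fallback below is a
# small illustrative sample only and is far from complete.
STDLIB_NAMES = frozenset(
    getattr(sys, "stdlib_module_names", {"re", "os", "sys", "json"})
)


def is_reserved_name(project_name, reserved=STDLIB_NAMES):
    """Return True if a proposed project name shadows a stdlib module name."""
    # Collapse runs of '-', '_' and '.' so 'Re', 're' and similar compare equal.
    normalized = re.sub(r"[-_.]+", "_", project_name.strip()).lower()
    return normalized in reserved


if __name__ == "__main__":
    for candidate in ("re", "json", "requests"):
        verdict = "blocked" if is_reserved_name(candidate) else "allowed"
        print(f"{candidate}: {verdict}")
```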

52:57 Dustin?

52:58 In the course of this project, I had this sort of favorite Python package I'd learned about,

53:02 which is pretend, which we use pretty heavily on Warehouse for sort of mocking things out

53:08 in tests.

53:08 So, the new PyPI has like 100% test coverage.

53:12 So, there's a lot of mocking going on.

53:14 And so, that's, I think, Alex Gaynor's tool.

53:18 And it's been really helpful.

53:19 I think my, as of lately, my favorite package is not actually on PyPI, but I just discovered

53:25 it the other day.

53:25 I'm kind of a sucker for like funny little hacks or jokes.

53:28 And so, this guy, Dominic Medzinski, he made this project called Import PyPI.

53:34 And it's really interesting.

53:36 What it does is it sort of wraps the import command.

53:38 And if you don't have a given package on your system, it will go out to PyPI, get it, and

53:44 install it, and it will just work.

53:46 So, you never actually have to pip install anything again.

53:48 I ran across that as well, and that's pretty interesting.

53:50 It's quite ironic it's not on PyPI.

53:52 But, yeah.

53:54 Does it do that on the fly?

53:56 Yeah, it does.

53:57 I think it does.

53:57 If it doesn't find it.

53:58 I don't think it's really recommended for production grade usage, but it's a fun little hack.

54:03 It is quite interesting for what it's worth.

54:05 Hope it puts a --user on it, at least.
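
[Editor's sketch] The install-on-import trick described above can be approximated in a few lines. This is not the project Dustin mentions, whose implementation is not described in the episode; it is a minimal sketch that assumes the import name matches the distribution name on PyPI, which is often not true. The function names `pip_install` and `import_or_install` are hypothetical. As the guests say, this is a fun hack rather than something for production use.

```python
import importlib
import subprocess
import sys


def pip_install(name):
    """Install a distribution with pip for the current interpreter."""
    # --user installs into the per-user site directory; pip refuses this flag
    # inside most virtual environments.
    subprocess.check_call([sys.executable, "-m", "pip", "install", "--user", name])


def import_or_install(name, installer=pip_install):
    """Import a module, installing a same-named distribution if it is missing."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError:
        installer(name)
        # Make the import system notice files added since the last lookup.
        importlib.invalidate_caches()
        return importlib.import_module(name)
```

Usage would be along the lines of `requests = import_or_install("requests")`. The `installer` parameter is injectable so the behavior can be exercised without touching the network.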

54:08 All right, Nicole, do you have one?

54:10 Oh, yeah, I do.

54:11 As I said, I only dabble in Python.

54:13 But when I was dabbling, I got really into testing, and I really liked Factory Boy, which

54:20 creates factories.

54:24 So, I use that a lot for running Selenium tests running over my Django code base when

54:29 I was developing with Python.

54:31 It was a really cool project.

54:32 I think it's actually based off a Ruby project originally.

54:36 Yeah.

54:36 thoughtbot's factory_bot.

54:38 So, yeah.

54:39 It's a really great project to work with.

54:40 Awesome.

54:41 That sounds like a great one.

54:42 All right.

54:43 Well, thank you all for being on the show.

54:45 I want to give you one final chance for a call to action.

54:48 There's people who have packages they maintain.

54:50 They should probably play with your stuff, right?

54:52 Try the new thing.

54:54 We have people who maybe want to contribute to open source.

54:57 Ernest, you spoke about the sprints.

54:59 What should people do?

55:00 They should come to the...

55:01 If they're going to be at PyCon, they should come to the packaging sprints.

55:03 So, I'll be there.

55:04 Ernest, some part of Ernest will be there after running PyCon.

55:08 We'll see what's left of him.

55:10 And we're going to just sprint on the packaging ecosystem, including PyPI, and see what we can build.

55:16 Awesome.

55:16 You should go verify your primary e-mail.

55:19 Janitorial aspects of that.

55:22 So, go verify your e-mail address.

55:24 That's super helpful for us.

55:25 Yeah, that's awesome.

55:26 Yeah, Nicole.

55:27 The other thing is I'm planning on running a sprint also at EuroPython this year in Edinburgh.

55:32 So, for people who are based in Europe who want to contribute to the project, we'll be running a sprint there as well.

55:39 And the other thing is people should consider donating to the Python Packaging Working Group because we actually were lucky enough to receive an award from Mozilla to be able to fund working on Warehouse for the last few months.

55:54 But that money is about to run out.

55:57 We have used that money to get to our goal, which is to launch the new PyPI and shut down Legacy.

56:03 But in terms of the future development of the project, you know, any funding that we can secure is obviously going to mean that we can move faster and more reliably and be less reliant on our volunteers for our sort of core infrastructure.

56:19 So, yeah, I know the PSF is currently running a fundraising campaign.

56:23 So, certainly consider donating to the working group.

56:26 And there's a handy link, actually, at the top of the new site if you do want to donate.

56:31 So, yeah, any contributions would be most welcome.

56:34 That is a great suggestion.

56:36 And, yeah, I think people definitely should do that.

56:39 I forgot to call out the Mozilla Open Source Foundation and say thank you for that.

56:43 But, like, the reason we're here having this conversation and it got this major boost is largely, like, that was a major factor in it, right?

56:51 Dustin, you wanted to add something.

56:53 Yeah, the Mozilla Award is definitely the reason why this was all possible.

56:57 So, yeah.

56:58 I wanted to have a call to action.

56:59 Anyone that wants to contribute to the project or just contribute to Open Source, come and find us on GitHub.

57:04 We are a pretty friendly group and we have a bunch of issues tagged as "good first issue" that you could take a crack at.

57:11 And we'd like to see more contributors every day.

57:13 Absolutely.

57:13 And it's much easier, as you all have laid out, for various reasons why that's the case.

57:18 Ernest?

57:18 Yeah, I definitely wanted to just, like, I'm shaking here.

57:21 How did we not talk a little bit more about the Mozilla Open Source Support grant program?

57:27 Indeed, it is the sole reason why PyPI.org launched on Monday and not in, like, another year or 18 months.

57:38 Because just the amount of work that went into making this all possible, I think, in retrospect and without being super optimistic looking forward, was incredible.

57:50 And just based on looking back, it probably would have been an indefinite period of time before this occurred without being able to have people committed and thinking and soliciting the community to help as well.

58:04 So Mozilla was instrumental, and we're forever indebted to them for how much they made this happen.

58:12 Yeah, that's really awesome.

58:12 And thank you to them.

58:14 That's great.

58:14 I want to add one final thing.

58:16 People should donate to the Python Packaging Working Group.

58:19 But they should also, if they have a company that massively depends upon Python, they should say, dear company, you're running a $5 billion business on this.

58:28 Could we set up a $1,000 recurring donation monthly to this?

58:32 Because without this, your business goes away, or at least a good chunk of it.

58:37 Yeah, the number of organizations and companies that depend on PyPI are most of them, it seems like.

58:43 So, yeah, it's now possible to make recurring donations.

58:45 So we definitely appreciate the support.

58:48 Right.

58:48 Awesome.

58:49 All right.

58:49 Well, let's leave it there.

58:51 Thank you all for being on the show.

58:52 It's been a great conversation.

58:53 And congratulations on the launch.

58:55 I'm super excited to see it.

58:56 Thanks, Michael.

58:57 Thanks, Michael.

58:57 Thank you.

58:58 This has been another episode of Talk Python To Me.

59:03 Our guests have been Nicole Harris, Ernest Durbin III, and Dustin Ingram.

59:08 And this episode has been brought to you by ActiveState and Codacy.

59:13 ActiveState gives you a faster way to build and secure open source runtimes.

59:17 From your first line of code through to production, check it out at talkpython.fm/activestate.

59:24 Review less, merge faster with Codacy.

59:28 Check code style, security, duplication, complexity, and coverage on every change while tracking code quality throughout your sprints.

59:36 Try them at talkpython.fm/codacy.

59:40 C-O-D-A-C-Y.

59:42 Want to level up your Python?

59:43 If you're just getting started, try my Python Jumpstart by Building 10 Apps or our brand new 100 Days of Code in Python.

59:50 And if you're interested in more than one course, be sure to check out the Everything Bundle.

59:54 It's like a subscription that never expires.

59:57 Be sure to subscribe to the show.

59:59 Open your favorite podcatcher and search for Python.

01:00:01 We should be right at the top.

01:00:02 You can also find the iTunes feed at /itunes, Google Play feed at /play, and direct RSS feed at /rss on talkpython.fm.

01:00:12 This is your host, Michael Kennedy.

01:00:13 Thanks so much for listening.

01:00:15 I really appreciate it.

01:00:16 Now get out there and write some Python code.

01:00:18 I'll see you next time.