Typosquatting and Supply Chain Vulnerabilities
That's the topic of this episode. Bentz Tozer and John Speed Meyers are here to share their research into typosquatting on PyPI and other sneaky deeds. But we also discuss some potential solutions and fixes.
Episode Deep Dive
Guests Introduction and Background
Bentz Tozer and John Speed Meyers join this episode from In-Q-Tel (IQT), a nonprofit that invests in and fosters leading-edge tech with a focus on cybersecurity. Bentz has a strong background as a software developer and systems engineer, spending 20 years in the defense industry before turning his focus to cybersecurity. John has a blend of data science, economics, and programming skills and works at IQT Labs researching open-source security and other high-tech solutions. Together, they share insights into how the Python package ecosystem can become vulnerable to attacks, especially via typosquatting and malicious software supply-chain threats.
What to Know If You're New to Python
If this is your first foray into Python, here are a few essentials from the conversation to help you get more out of it:
- Package management in Python usually happens with pip. When you run `pip install some_package`, it executes code under your user permissions.
- The Python Package Index (PyPI) is the official software repository where you’ll find most libraries and frameworks, but verifying you’re installing the correct package is crucial.
- Creating virtual environments or using Docker is a recommended best practice to isolate and protect your system when exploring new packages.
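As a minimal sketch of the virtual-environment advice above, the standard library's `venv` module can create a throwaway environment from Python itself. The helper name `create_scratch_env` and the directory name `scratch-env` are illustrative assumptions. Keep in mind that a virtual environment isolates installed packages from the rest of your Python setup; it is not a security sandbox, so installed code still runs with your user permissions.

```python
import tempfile
import venv
from pathlib import Path


def create_scratch_env(parent: Path) -> Path:
    """Create a throwaway virtual environment under `parent` and return its path."""
    env_dir = parent / "scratch-env"  # hypothetical directory name
    # with_pip=False keeps creation fast; pass with_pip=True to bootstrap pip
    # inside the environment before trying out an unfamiliar package.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = create_scratch_env(Path(tmp))
        # pyvenv.cfg marks the directory as a virtual environment
        print((env / "pyvenv.cfg").is_file())
```

For stronger isolation, the same experiment can be run inside a container or VM, as the bullet above suggests.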
Key Points and Takeaways
Supply Chain Superpowers and Blind Trust
The Python Package Index offers a huge array of libraries just one `pip install` away, part of Python’s “superpower.” But with hundreds of thousands of packages available, developers typically trust code blindly. As Bentz and John highlight, it’s vital to remember that installing dependencies can execute arbitrary code.
Typosquatting: A Quiet Attack Vector
Attackers exploit simple spelling mistakes by uploading near-identical package names (e.g., “pandar” vs. “pandas”) to trick developers. This is one of the most common forms of malicious package abuse, and such packages occasionally rack up thousands of downloads from unsuspecting users.
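To make "near-identical" concrete, here is a minimal sketch that measures how many single-character edits separate two package names (Levenshtein distance). It is an illustration only, not a method attributed to the guests; the function name `edit_distance` is an assumption.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the fewest single-character inserts,
    deletes, or substitutions needed to turn `a` into `b`."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for typo, real in [("pandar", "pandas"), ("request", "requests")]:
        print(typo, real, edit_distance(typo, real))
```

Both example pairs from the episode differ by a single edit, which is why one slipped keystroke is enough to install the wrong package.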
Malicious Packages on PyPI
Several examples, including requests vs. request and “pandar” vs. “pandas,” show how easy it is to slip in a bad library. Historically, PyPI has even removed over 3,000 malicious packages in one wave. Even if only 40 known malicious examples existed, the potential damage can be large, especially if installed in corporate or government settings.
Importance of Responsible Disclosure
Developers who discover malicious packages or vulnerabilities should use responsible disclosure: quietly notify maintainers or email the PSRT before going public. This helps quickly remove threats and minimizes damage. There are also projects like the Backstabber’s Knife Collection (an open repository of malware samples) that researchers can use to study malicious code.
Scanning Tools: PyPI Scan, Aura, and Aura Borealis
Bentz and John’s research led to creating scanning tools. Simple approaches (like comparing the name and metadata of lesser-known packages to popular ones) already reveal suspicious libraries. Aura Borealis aims to be a front-end for deeper static analysis (e.g., scanning all PyPI packages).
- Links / Tools:
  - PyPI Scan (no direct URL in transcript but cited as an open-source CLI tool)
  - Aura & Aura Borealis (open-source scanning & front-end for analyzing metadata across PyPI)
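The name-comparison idea described above can be sketched in a few lines. This is a hedged illustration and not the actual implementation of PyPI Scan or Aura: the `POPULAR` list, the `flag_lookalikes` helper, and the 0.8 similarity threshold are all assumptions, and it compares names only, not metadata.

```python
from difflib import SequenceMatcher

# Illustrative stand-in for a "most downloaded" list, not a real ranking.
POPULAR = ["requests", "pandas", "numpy", "urllib3"]


def flag_lookalikes(candidates, popular=POPULAR, threshold=0.8):
    """Return (candidate, popular_name, similarity) for names that closely
    resemble, but do not equal, a popular package name."""
    flagged = []
    for name in candidates:
        normalized = name.lower().replace("_", "-")
        for target in popular:
            if normalized == target:
                continue  # the genuine package is not its own lookalike
            ratio = SequenceMatcher(None, normalized, target).ratio()
            if ratio >= threshold:
                flagged.append((name, target, round(ratio, 2)))
    return flagged


if __name__ == "__main__":
    for hit in flag_lookalikes(["pandar", "request", "flask", "numpy"]):
        print(hit)
```

A flagged name is only a lead for manual review; plenty of legitimately similar names exist, so a real scanner would need further checks before calling a package malicious.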
Practical Security Steps
Developers should adopt speed bumps when installing: double-check the spelling of package names or copy/paste from official documentation. Using private PyPI repositories, pinned versions, or Docker-based “quarantine” environments (like local containers or VMs) helps control what code ends up in your production environment.
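One of the steps above, pinned versions, is easy to check automatically. The following is a minimal sketch that flags requirement lines not pinned with `==` to an exact version. The helper name `unpinned_requirements` and the simplified regular expression are assumptions; it deliberately ignores edge cases such as URL or editable requirements.

```python
import re

# A name, optional extras, then "==" and a version with no wildcard.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s;*]+(?=\s|;|$)"
)


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned to one exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as -r or --index-url
        if not PINNED.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    sample = "requests==2.31.0\npandas>=1.5  # too loose\nnumpy\nflask==2.*\n"
    print(unpinned_requirements(sample))
```

A check like this could run in CI so that a loosely specified dependency is caught before it reaches production.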
Hardened pip and Namespace Protection
The episode highlights the need for pip “safeguards,” such as blocking package names that are suspiciously close to those of the most-downloaded libraries. Namespacing (like “ownername/package”) could make it clearer who published a package and reduce confusion with sound-alike libraries.
OpenSSF and Ecosystem-Wide Solutions
The Open Source Security Foundation (OpenSSF) is a Linux Foundation project tackling these challenges across multiple language ecosystems. It encourages standard best practices and might help fund or unify scanning solutions, policy frameworks, and developer education.
Examples of Supply Chain Attacks
While the episode focuses on Python, it references bigger incidents like SolarWinds and XcodeGhost for Apple iOS. These highlight the scale and impact of hijacked developer tools. Even if Python’s pip ecosystem is smaller by comparison, it’s not immune to large-scale exploits.
Responsible AI in Security Tooling?
Though not deeply explored in the episode, they hint that some scanning approaches may eventually incorporate machine learning or AI to detect suspicious packages in real time. However, even “simple” code checks can yield big benefits.
Interesting Quotes and Stories
“There’s a little bit of paranoia that comes with working in cybersecurity, it’s true.” — Bentz Tozer
“We realized you can do a lot by just scanning names and metadata. We found ‘pandar’ that way, which was doing keylogging!” — John Speed Meyers
“You’ll never `pip install` the same way after listening to this episode.” — Michael Kennedy
Key Definitions and Terms
- Typosquatting: Creating package names nearly identical to popular ones so unsuspecting users install the malicious “near-clone.”
- Supply Chain Attack: Targeting a developer tool, library, or repository with the goal of reaching many downstream users.
- PSRT (Python Security Response Team): The team at the Python Software Foundation that handles security vulnerabilities.
- Social Distancing for Packages: A proposal to block or warn about similarly named packages to reduce typosquatting.
- Aura Borealis: A front-end system for analyzing output from “Aura,” which performs static checks on the entire PyPI set of packages.
Learning Resources
If you’re new to Python or want to deepen your understanding of foundational coding practices (including safe coding habits), these courses from Talk Python Training can help:
- Python for Absolute Beginners: Ideal for those just starting out, covering essential Python concepts and coding fundamentals before jumping into advanced topics like security.
Overall Takeaway
Typosquatting and broader supply chain risks present a significant threat to the open, highly collaborative nature of Python’s ecosystem. Being vigilant—double-checking spelling, using private repositories, scanning new dependencies, and reporting malicious packages—can go a long way toward keeping our projects safe. As the community rallies around new tooling and more robust infrastructure, the hope is that these attacks will become both rarer and easier to stop.
Links from the show
SolarWinds: csoonline.com
XCodeGhost: macrumors.com
Python Package Index nukes 3,653 malicious libraries uploaded: theregister.com
Dependency confusion: medium.com
Typosquatting Is About More Than Typos: iqt.org
Approaches to Protecting the Software Supply Chain: iqt.org
A Quant’s View of Software Supply Chain Security: usenix.org
Organizations
Open Source Security Foundation (OpenSSF): openssf.org
Python Security Response Team: python.org
Proposed solutions and tools
pypi-scan: github.com
AuraBorealis App: github.com
Project Aura: aura.sourcecode.ai
Aura source code: github.com
Reduce Typosquatting Harm via Social Distancing for Top PyPI Packages: github.com
Have I Been Pwned: haveibeenpwned.com
Snyk Package Advisor: snyk.io
Backstabbers-Knife-Collection: dasfreak.github.io
NetworkML Package: github.com
Misc
Google as a Visionary Sponsor: pyfound.blogspot.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm