
Software Supply Chain Security with Phylum

Episode #457, published Fri, Apr 19, 2024, recorded Wed, Jan 24, 2024

We've spoken previously about security and software supply chains, and we're back at it this episode. We're diving in again with Charles Coggins. Charles works at a software supply chain security company and is on to give us the insider's and defender's perspective on how to keep our Python apps and infrastructure safe.

Episode Deep Dive

This conversation covers a wide range of topics around Python packaging, supply chain security threats, and best practices to keep your environment safe. Below are the key topics and takeaways.

Guest Background

In this episode, our guest is Charles “Charlie” Coggins, a seasoned Python developer who works at Phylum, a company focused on software supply chain security. Charlie came to programming by a non-traditional path, worked for the US government on cybersecurity initiatives, and later transitioned into Python development. He has spent the past couple of years at Phylum working on Python integrations and helping defend against modern threats in open-source software ecosystems.

1. Software Supply Chain Security Concerns

  • Why it matters: Software developers have significant power to impact many users; a single compromised dependency may affect thousands of downstream projects.
  • Multiplicative effect: A malicious or vulnerable library can propagate through transitive dependencies, making early detection and safe practices crucial.

2. Lock Files and Dependency Management

  • Importance of pinning dependencies: Pinning versions (e.g., with lock files) ensures reproducibility and can prevent malicious updates from being unwittingly pulled in.
  • pip-tools and pip-compile
    • pip-compile, part of pip-tools, is a popular way to generate a “lock file” style requirements output that pins the complete transitive dependency list.
    • GitHub: pip-tools
  • Other tools: The episode also mentions Poetry, Hatch (with its hatchling build backend), and PEP 665, the since-rejected proposal to standardize a Python lock file format.
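
As a rough illustration of what "pinned" means in practice, the sketch below scans requirements-style text and reports any line that is not locked to an exact version with ==. The function name, the regular expression, and the sample file are assumptions made for this example. They are not part of pip-tools or of anything shown in the episode, and the check is deliberately simpler than a full requirement parser.

```python
import re

# Matches "name==version", with optional extras such as "requests[socks]==2.31.0".
# This is a simplified pattern for illustration, not a complete requirement parser.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s;]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return the requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        # Skip blank lines and pip options such as -r, --index-url, or --hash.
        if not line or line.startswith("-"):
            continue
        line = line.rstrip("\\").strip()  # drop line-continuation backslashes
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical requirements file contents.
sample = """\
# hypothetical requirements.txt
requests==2.31.0
urllib3>=1.26
idna==3.6  # via requests
flask
"""

print(unpinned_requirements(sample))
```

A check like this could run in CI to fail a build when a loose specifier slips into a file that is supposed to be a lock file.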

3. PEPs around Python Packaging

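  • PEP 517 & 518: Discussed as the mechanism behind pyproject.toml and build backends.
    • PEP 517 defines a build-system independent format for source trees, that is, the interface between build frontends such as pip and build backends.
    • PEP 518 specifies how a project declares its minimum build system requirements and introduced the pyproject.toml file for that purpose.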
  • pyproject.toml: Modern approach for declaring build systems and dependencies, reducing the need for setup.py.

4. Common Supply Chain Attacks

  • Typosquatting: Malicious packages use a name very close to a well-known package (e.g., missing or swapped letters in requests).
  • Starjacking: Attackers link their PyPI package to a popular repository they do not own, so that borrowed metadata (like GitHub stars) makes the package appear legitimate.
  • Dependency Confusion: A private/internal package name is hijacked when a higher-versioned package of the same name is published to PyPI.
  • Repo Jacking & Expired Domains: Attackers take over abandoned GitHub handles or expired domains tied to a package’s original author in order to push compromised updates.
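
The typosquatting pattern above (missing or swapped letters) can be sketched as a simple name-similarity check. The example below uses an optimal string alignment distance, which counts an adjacent swap as one edit, and compares a candidate name against a short list of well-known packages. The list, the function names, and the threshold of one edit are assumptions for illustration only. This is not how Phylum or PyPI detect typosquats, and real detection involves far more signals.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance: insert, delete, substitute, adjacent swap."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "eu" vs "ue".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Illustrative list only; a real check would use a much larger set of names.
POPULAR = ["requests", "numpy", "pandas", "urllib3"]


def possible_typosquats(name: str, known=POPULAR, max_distance: int = 1) -> list[str]:
    """Return well-known names that `name` is suspiciously close to."""
    name = name.lower()
    return [k for k in known if k != name and osa_distance(name, k) <= max_distance]


print(possible_typosquats("reqeusts"))  # swapped letters
print(possible_typosquats("requsts"))   # missing letter
print(possible_typosquats("requests"))  # the real name itself
```

A near match is only a hint that a name deserves a second look before installing. Plenty of legitimate packages have similar names.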

5. Phylum’s Approach and Tooling

  • Phylum CLI and Integrations
    • Phylum CLI is written in Rust but installable via Python tooling (pipx install phylum or pip install phylum).
    • Phylum can be integrated into CI/CD (e.g., GitHub Actions, GitLab pipelines) to block or warn on malicious dependencies.
    • Free community edition allows up to five projects, while paid tiers accommodate larger teams and organizations.
    • Website: phylum.io

Key Takeaways

  1. Lock Down Your Dependencies
    Use strict version pinning (via tools like pip-tools or poetry) to ensure you’re only installing known, trusted versions.
  2. Monitor and Update Regularly
    Attackers rely on unmaintained or outdated environments. Periodically review and update your pinned dependencies.
  3. Verify Source Integrity
    Be wary of dependencies installed directly from Git URLs or other unverifiable sources, and confirm that each package name points to the repository you expect.
  4. Implement Automated Security Checks
    Tools like Phylum or other CI/CD scanners help detect malicious updates or suspicious package behavior before it hits production.
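
One concrete form of verifying source integrity is comparing a downloaded artifact against a hash recorded when the dependency was pinned. This is the idea behind pip's hash-checking mode (--require-hashes) and pip-compile's --generate-hashes option. The sketch below shows only the comparison step, using made-up bytes in place of a real wheel. The function name and sample data are assumptions for illustration.

```python
import hashlib
import hmac


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of `data` equals the pinned hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, expected_hex.lower())


# Stand-in for the bytes of a downloaded wheel or sdist (hypothetical content).
artifact = b"example wheel contents"

# In practice this value would come from the lock file, recorded at pin time.
pinned_hash = hashlib.sha256(artifact).hexdigest()

print(sha256_matches(artifact, pinned_hash))                 # unchanged artifact
print(sha256_matches(artifact + b" tampered", pinned_hash))  # modified artifact
```

A hash pin only proves the artifact is the same one that was pinned. It says nothing about whether that version was safe in the first place, which is why it complements rather than replaces automated scanning.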

Overall Takeaway

Software supply chain attacks on Python projects are becoming more common, but there are concrete steps you can take to mitigate risk:

  • Pin dependencies with lock files.
  • Regularly audit and monitor your code and its third-party packages.
  • Employ CI/CD security tools to catch issues early.

By investing in these practices, you’ll drastically reduce the chance of inadvertently shipping malicious or compromised Python applications.

Links from the show

Series: How Malicious Python Code Gains Execution: blog.phylum.io

Pick a Python Lockfile and Improve Security: blog.phylum.io
Bad Beat Poetry: blog.phylum.io
PEP 665 – A file format to list Python dependencies for reproducibility of an application: peps.python.org
PEP 517 – A build-system independent format for source trees: peps.python.org
PEP 518 – Specifying Minimum Build System Requirements for Python Projects: peps.python.org
Lockfiles should be committed on all projects: classic.yarnpkg.com
An Overview of Software Supply Chain Security: tldrsec.com
Typosquatting: docs.phylum.io
Common Attack Pattern Enumeration and Classification: capec.mitre.org
Dependency Confusion: docs.phylum.io
Expired Author Domains: docs.phylum.io
Unverifiable Dependency: docs.phylum.io
Repo Jacking: Hidden Danger in Broken Links: blog.phylum.io
Software Libraries Are Terrifying: medium.com
phylum 0.43.0: pypi.org
linguist: github.com
rich-codex ⚡️📖⚡️: ewels.github.io
Phylum Community Discord: discord.gg
The dream is dead?: mastodon.social
When "Everything" Becomes Too Much: The npm Package Chaos of 2024: socket.dev
pip-tools: github.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy
