
Privacy as Code with Fides

Episode #409, published Sat, Apr 1, 2023, recorded Thu, Mar 23, 2023

We all know that privacy regulations are getting stricter, and that many of our users no longer believe that "privacy is dead". But even for medium-sized organizations, actually tracking how we use personal info across our myriad applications and services is tricky and error-prone. On this episode, we have Thomas La Piana from the Fides project to discuss privacy in our applications and how Fides can enforce and track privacy requirements in your Python apps.


Episode Deep Dive

Guest Introduction and Background

Thomas La Piana is a seasoned Python developer and data engineer focused on privacy engineering. After studying politics in college, he transitioned to self-taught programming in Python, landing a data intelligence job that quickly evolved into working on data pipelines and analytics. He eventually joined Ethyca, where he contributes to the open-source Fides project, a privacy-as-code platform helping organizations automate privacy compliance. Thomas is deeply passionate about bridging the gap between engineering teams and legal or compliance stakeholders, aiming to make privacy compliance more automated and less error-prone for organizations large and small.

What to Know If You're New to Python

If you're just starting with Python and want to get more out of this episode’s focus on privacy:

  • Familiarize yourself with basic command-line interfaces (CLI) since tools like Fides often provide CLI-driven workflows.
  • Understand what YAML files are and how Python libraries parse them. YAML is a key part of Fides’ configuration.
  • Get comfortable with concepts like database connections in Python, as privacy tools often scan and modify data in multiple databases.
  • Know how to install and manage Python packages (e.g., with pip) so you can quickly try out tools like Fides.

Key Points and Takeaways

  1. Fides: Privacy-as-Code Platform. Fides is an open-source privacy engineering platform designed to help automate data mapping, privacy checks, and regulatory compliance. By using metadata and configuration (in YAML), Fides manages data subject requests (DSRs) and ensures that personal information is tracked and deleted or returned as required by laws like GDPR and CCPA.
  2. Data Subject Requests (DSRs) Made Easier. One of Fides’ main features is the automation of data subject requests, such as the right to be forgotten or a request to see all data a company holds. Instead of manually combing through multiple databases and APIs, Fides can automatically track and delete relevant user data, respecting foreign key relationships and order of operations.
  3. Automated Data Mapping. Fides helps organizations create a holistic “map” of where personal data lives in their systems. Privacy and compliance teams can then ensure that each data store’s usage aligns with stated policies. This mapping stems from metadata added by developers via YAML or detected by scanning the organization’s infrastructure.
    • Links and Tools:
      • DBT (discussed as a data lineage tool, though not part of Fides itself)
      • Various database connectors (Postgres, MongoDB, Salesforce, etc.) leveraged by Fides
  4. Privacy Checks in CI/CD. Borrowing the “shift left” mindset from security, Fides allows developers to enforce privacy requirements early in the development process. If an engineer introduces a new data use that violates organizational or legal policies, Fides can block the merge in CI, preventing non-compliant code from ever reaching production.
    • Links and Tools:
      • GitHub Actions or other CI/CD platforms that integrate with Fides
      • Nox (mentioned separately for running automated tasks and tests)
  5. Runtime Observability and eBPF. Beyond static analysis, Fides can leverage eBPF (Extended Berkeley Packet Filter) to observe network traffic in Kubernetes clusters. This approach helps detect unexpected outbound or inter-service calls (e.g., to MailChimp) that might involve user data, creating an automated runtime map of data flows.
    • Links and Tools:
      • eBPF.io for more on eBPF
      • Fides “system scanner” (in development or proof-of-concept within Fides)
  6. Challenges for Medium-Sized Companies. Large companies like Google can hire entire teams for privacy, and small businesses can often handle data requests manually, but medium-sized enterprises often struggle. Fides aims to automate away that manual overhead by programmatically identifying where data is stored and how it is used, and by keeping the organization compliant as its infrastructure changes.
  7. Cookie Consent and Privacy Fatigue. The conversation highlighted how regulations like GDPR brought about cookie banners, which often frustrate both users and site owners. Fides doesn’t directly solve cookie banner annoyances, but these friction points illustrate the broader complexity of privacy compliance.
    • Links and Tools:
      • CCPA official site for reference
      • Self-managed solutions for cookies (not specifically integrated with Fides but related to the privacy conversation)
  8. Open Core Model. Thomas and the team at Ethyca believe core privacy compliance features (data mapping, data subject request automation) should remain free and open source. Enterprise features like advanced machine learning classifiers or runtime scanning might live in a paid tier (Fides Plus), but the essential building blocks are accessible to all.
  9. Simplifying Developer Experience. Thomas shared how, in prior roles, he spent countless hours on one-off privacy tasks. Tools like Fides standardize privacy oversight and bridge communication between legal, compliance, and engineering. The CLI is built in Python and uses libraries such as rich-click for more readable terminal output.
  10. Privacy is Here to Stay. Laws, regulations, and user expectations around data privacy are only increasing. Fides exemplifies how code-centric solutions can make compliance a natural part of the software delivery cycle rather than an afterthought. For individuals or teams building applications, adopting “privacy by design” will be indispensable moving forward.
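
To make the "foreign key relationships and order of operations" point from item 2 more concrete, here is a minimal sketch of how a deletion order could be derived from foreign keys. It is not Fides' actual implementation, which the episode does not detail; the table names and the FOREIGN_KEYS mapping are hypothetical, and the ordering uses Python's standard-library graphlib module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the set of tables it references
# through foreign keys. These names are illustrative, not from Fides.
FOREIGN_KEYS = {
    "users": set(),
    "orders": {"users"},
    "order_items": {"orders"},
    "support_tickets": {"users"},
}


def deletion_order(foreign_keys):
    """Order tables so referencing rows are deleted before the rows they reference."""
    # static_order() yields referenced (parent) tables first,
    # so reversing it puts the dependent (child) tables first.
    parents_first = list(TopologicalSorter(foreign_keys).static_order())
    return list(reversed(parents_first))


order = deletion_order(FOREIGN_KEYS)
print(order)
```

With this ordering, rows in order_items are removed before orders, and users is removed last, so an erasure request would not trip a foreign key constraint along the way.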

Interesting Quotes and Stories

  • Cookie banners as the new “mini-game”: Thomas joked it’s like a game of “where did the cookie pop-up move this time?” referring to the frustration of constant consents on sites.
  • “Take a Penny, Leave a Penny” analogy: Thomas used this to describe the casual exchange of user data within companies that often lose track of how data is circulated.
  • Runtime fear: Michael quipped about old systems no one dares to touch, especially when personal data might be locked in them, illustrating the real struggles teams face with legacy software and compliance requests.

Key Definitions and Terms

  • DSR (Data Subject Request): A user-initiated request to view, modify, or delete personal data an organization holds about them. Organizations must honor these requests under regulations like GDPR (Europe) and CCPA (California).
  • Data Mapping: Identifying all systems and databases that collect, store, or process personal information. Essential for compliance and for fulfilling DSRs efficiently.
  • Privacy by Design: The principle of building applications and systems with data protection and user privacy baked in from the beginning rather than fixing issues afterward.
  • eBPF: A Linux kernel technology that runs sandboxed programs inside the kernel to observe network and process behavior, used in some advanced privacy scanning solutions to detect data flow at runtime.
  • Open Core: A business model in which core product features are open source, while advanced or enterprise features are paid.
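
As an illustration of how Data Mapping and Privacy by Design can meet in a CI pipeline, the sketch below checks declared data uses against a policy and produces a failing exit code on a violation. Everything in it is an assumption made for illustration: the Declaration and Rule classes, the category and use labels, and the system names are hypothetical and are not Fides' real schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    """Hypothetical record: a system declares that it uses a data category for a purpose."""
    system: str
    data_category: str
    data_use: str


@dataclass(frozen=True)
class Rule:
    """Hypothetical policy rule: this data category must not be used for this purpose."""
    data_category: str
    forbidden_use: str


def find_violations(declarations, rules):
    """Return every declaration that matches a forbidden category/use pair."""
    forbidden = {(rule.data_category, rule.forbidden_use) for rule in rules}
    return [d for d in declarations if (d.data_category, d.data_use) in forbidden]


# Illustrative labels only; a real project would use its own taxonomy.
DECLARATIONS = [
    Declaration("billing", "contact.email", "provide_service"),
    Declaration("newsletter", "contact.email", "third_party_advertising"),
]
RULES = [Rule("contact.email", "third_party_advertising")]

violations = find_violations(DECLARATIONS, RULES)
for v in violations:
    print(f"Policy violation: {v.system} uses {v.data_category} for {v.data_use}")

# A CI job could pass this value to sys.exit() so a violation blocks the merge.
exit_code = 1 if violations else 0
```

The design choice here is that developers declare data uses alongside the code, so the check runs on metadata at review time rather than on production data.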

Learning Resources

Here are a few resources to further explore privacy, Python tooling, and modern project workflows:

  • Modern Python Projects: Discover best practices for Python project structure, dependency management, and tooling that can complement your use of a privacy-focused tool like Fides.
  • Getting Started with pytest: Use automated testing to ensure new features and data-handling code meet your privacy requirements, hooking in with Fides’ checks.
  • Visual Studio Code for Python Developers: If you’re new to VS Code (like Thomas once was), this course will help you configure a powerful, privacy-code-friendly Python environment.

Overall Takeaway

Privacy regulations are only growing in scope, and manual tracking of personal information is becoming both riskier and more cumbersome. Tools like Fides offer a modern, automated solution that brings privacy “into the code,” combining metadata, scanning, and runtime checks to ensure compliance. Whether you’re a small team or part of a large organization, thinking about privacy as part of the continuous integration and development process (just like security) will empower everyone to handle user data more responsibly—and efficiently.

Links from the show

California Consumer Privacy Act (CCPA): oag.ca.gov
30 Biggest GDPR Fines So Far: tessian.com
Website fined for Google Fonts: theregister.com
Fides on GitHub: github.com
Fides: ethyca.com
Bunny.net Fonts: fonts.bunny.net
DBT: getdbt.com
eBPF Kernel tools: ebpf.io
nox: nox.thea.codes
rich-click: github.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy
