20 Recommended Packages in Review
Our guest, Antonio Andrade, put together a GitHub repository cataloging guests' answers to the show's "notable PyPI package" question over the past couple of years. So I invited him to come share the packages covered there. We touch on over 40 packages during this episode, so I'm sure you'll learn a few new gems to incorporate into your workflow.
Episode Deep Dive
Guest Introduction and Background
Antonio Andrade is a passionate Python developer deeply involved in the Python community. He enjoys tinkering with data and automation workflows, which led him to create a GitHub repository cataloging every notable PyPI package mentioned by past Talk Python to Me guests. Antonio joins the show to share highlights from these community-sourced recommendations and discuss how he uses Python for everything from data science projects to automating personal workflows. His enthusiasm for discovering and celebrating the small-yet-impactful packages in the Python ecosystem shines throughout this episode.
What to Know If You're New to Python
If you’re newer to Python and want to follow all the package and tooling discussions in this episode, here are a few essentials:
- Familiarize yourself with the concept of virtual environments (e.g., venv or conda) so that installing these packages doesn’t conflict across projects.
- Understand the basics of web frameworks (Flask or FastAPI) and how Python interacts with data tools (like pandas or SQLite).
- Recognize that Python’s packaging ecosystem (e.g., PyPI, pip) is central to discovering and installing these libraries.
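As a minimal sketch of the first point, the standard library's `venv` module can create an isolated environment from Python itself. The directory name `demo-env` and the temporary location are placeholders chosen for illustration; in day-to-day use most people run `python -m venv .venv` from a shell instead.

```python
import tempfile
import venv
from pathlib import Path

# Create the environment in a throwaway directory (illustrative location).
env_dir = Path(tempfile.mkdtemp()) / "demo-env"

# with_pip=False keeps the sketch fast and offline; pass with_pip=True
# when you want pip bootstrapped into the new environment.
venv.create(env_dir, with_pip=False)

# Every environment gets a pyvenv.cfg file describing its configuration.
config_file = env_dir / "pyvenv.cfg"
print(config_file.exists())
```

Packages installed into an environment like this stay inside its directory, which is what keeps one project's dependencies from colliding with another's.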
Key Points and Takeaways
- Antonio’s GitHub Repo of Package Recommendations
Antonio built a GitHub repository to aggregate and track the “notable PyPI package” answers from Talk Python guests. Over time, these answers create a snapshot of emerging trends and hidden gems in the Python world. The repo makes it easy to revisit previously mentioned tools and even submit new ones from recent episodes.
- Links / Tools:
- Antonio’s GitHub Packages Repo
- GitHub in general for versioned collaboration
- Tortoise ORM and Beanie (Async Database Libraries)
Tortoise ORM simplifies async database interactions, aiming for a clean active-record style API. Beanie extends that idea to MongoDB, pairing with Pydantic models for an asynchronous NoSQL workflow. Both show how Python's async capabilities let database-driven apps keep handling other work while queries are in flight.
- Links / Tools:
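To see why async matters for database-driven code, here is a small sketch using only the standard library's `asyncio`. It does not use Tortoise ORM or Beanie; `fetch_user` is a hypothetical stand-in whose `asyncio.sleep` call simulates waiting on a database round trip.

```python
import asyncio
import time


async def fetch_user(user_id: int) -> dict:
    # Hypothetical query: the sleep simulates waiting on the database.
    await asyncio.sleep(0.1)
    return {"id": user_id, "name": f"user-{user_id}"}


async def main() -> list:
    # gather() runs the three coroutines concurrently, so the waits overlap
    # instead of adding up. Results come back in the order requested.
    return await asyncio.gather(*(fetch_user(i) for i in range(1, 4)))


start = time.perf_counter()
users = asyncio.run(main())
elapsed = time.perf_counter() - start

print(users)
print(f"{elapsed:.2f}s elapsed")
```

Run one after another, the three simulated queries would take roughly 0.3 seconds; gathered, they finish in about the time of a single one.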
- UMAP for Dimensionality Reduction
UMAP (Uniform Manifold Approximation and Projection) helps data scientists reduce high-dimensional data into more approachable 2D or 3D forms. It’s popular for visualization of clustering or similarity across large datasets, and it’s especially powerful in combination with pandas or scikit-learn.
- Links / Tools:
- Plotext: Terminal-Based Plotting
Plotext allows you to generate textual plots directly in your terminal, making it handy for quick data inspections or logging. It closely mimics matplotlib syntax but outputs ASCII or character-based charts, so you don’t have to leave the command line to visualize data.
- Links / Tools:
- FSSpec and Dynaconf
FSSpec standardizes file I/O across multiple backends like local systems, S3, or other remote file stores with a uniform API. Dynaconf streamlines Python project configuration, enabling environment-specific settings and secrets management in a single, flexible system.
- Links / Tools:
- AWS CDK (Cloud Development Kit) and Automation
The AWS Cloud Development Kit (CDK) lets you define your cloud infrastructure in Python code, helping you script AWS resources rather than configuring them manually. This approach is especially powerful for those doing repeated deployments or advanced DevOps workflows in Python.
- Links / Tools:
- Luigi for Workflow Management
Luigi, developed at Spotify, orchestrates complex pipelines by modeling tasks and their dependencies in Python. It’s widely used in data engineering to chain together multiple steps, ensuring each step completes successfully before triggering the next.
- Links / Tools:
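The core idea, running each task only after its dependencies have finished, can be sketched with the standard library's `graphlib`. This is not Luigi's API (Luigi models tasks as classes with `requires()`, `output()`, and `run()` methods); the task names below are made up for illustration.

```python
from graphlib import TopologicalSorter

# Map each hypothetical task to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

completed = []
for task in TopologicalSorter(pipeline).static_order():
    # A real pipeline would do work here and stop on failure;
    # this sketch only records the order in which tasks become runnable.
    completed.append(task)

print(completed)
```

A workflow manager adds scheduling, retries, and checks for already-produced outputs on top of this ordering step.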
- Pydantic for Data Validation
Pydantic leverages Python type hints for fast data parsing and validation. It’s essential when receiving unstructured or user-generated data, automatically converting types or returning detailed error messages if the data doesn’t fit expected schemas.
- Links / Tools:
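A minimal sketch of that behavior, assuming Pydantic is installed (`pip install pydantic`). The `User` model and its fields are hypothetical, chosen only to show type conversion and the error report.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    # Illustrative schema; the field names are assumptions for this sketch.
    id: int
    name: str


# The numeric string "42" is converted to the int 42 by default.
user = User(id="42", name="Ada")
print(user.id, type(user.id).__name__)

try:
    User(id="not-a-number", name="Ada")
except ValidationError as exc:
    # The exception reports each field that failed and why.
    error_message = str(exc)
    print(error_message)
```

Because the schema is declared with ordinary type hints, the same class documents the expected shape of the data and enforces it.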
- PipX for Python Command-Line Tools
PipX is a tool-focused package manager letting you install and run Python CLI apps in isolated environments system-wide. It’s perfect for commands like black, glances, or pyjokes that you want to always have at hand without conflicting with other dependencies.
- Links / Tools:
- Rich and Black for Code Formatting and Display
Rich produces beautiful CLI elements (tables, syntax highlighting, and more), while Black is the “uncompromising” code formatter that removes style debates from your team. Combined, they make for a more pleasant Python development experience, from code consistency (Black) to visually rich terminal output (Rich).
- Links / Tools:
- Seaborn for Data Visualization
Seaborn builds on matplotlib to produce attractive statistical plots with less boilerplate. It automatically handles many aesthetic decisions, making it especially popular in data science for quick or publication-worthy charts.
- Links / Tools:
- Stevedore for Plugin Management
Stevedore makes building a plugin system in Python straightforward by allowing dynamic loading and management of separately distributed extensions. This is great for applications that want to let third-party packages “plug in” new behaviors at runtime.
- Links / Tools:
Interesting Quotes and Stories
- Antonio on the Inspiration behind the Repo: “I think from my point of view, it’s a way to celebrate the people and celebrate those small packages. They deserve a place where everyone can contribute.”
- On the Value of Python: “What I think is most important is the time you save. If you want to prove value, Python is probably the best way to do it.”
Key Definitions and Terms
- Async/await: Python syntax for writing coroutines that pause while waiting on I/O so other work can run in the meantime, crucial for scaling I/O-driven apps.
- ORM (Object Relational Mapper): A library that maps database rows to Python objects, removing much manual SQL writing.
- CLI (Command-Line Interface): Textual interface to interact with software; many Python tools, like PipX or Rich, enhance CLI workflows.
- Dimensionality Reduction: Techniques like UMAP that compress large, high-dimensional datasets into fewer dimensions for analysis or visualization.
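To make the ORM definition above concrete, here is a hand-rolled sketch of the row-to-object mapping using the standard library's `sqlite3` and `dataclasses`. It is not how any particular ORM is implemented; the `Package` class and table are hypothetical. Libraries such as Tortoise ORM also generate the SQL and manage relations between tables for you.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Package:
    # Hypothetical model for this sketch: one instance per table row.
    id: int
    name: str


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO package (name) VALUES (?)",
    [("rich",), ("pydantic",)],
)

# The "mapping" step: turn each row tuple into a Python object.
rows = conn.execute("SELECT id, name FROM package ORDER BY id")
packages = [Package(*row) for row in rows]
conn.close()

print(packages)
```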
Learning Resources
Here are a few resources to help you delve deeper into Python:
- Python for Absolute Beginners: If you’re starting your programming journey, this course will walk you through Python’s fundamentals step by step.
- MongoDB with Async Python: For those curious about Beanie, Pydantic, and async operations.
- Modern APIs with FastAPI and Python: If you enjoyed hearing about async tools, you can learn how to build modern, fast APIs with Python’s more advanced features.
Overall Takeaway
This episode highlights how the Python community continually discovers and elevates smaller yet innovative tools in the ecosystem. From async database libraries and terminal plotting tools to robust code formatters, each package solves real-world challenges in a lightweight and Pythonic way. By curating these recommendations in a single GitHub repository, Antonio provides a snapshot of the best (and often lesser-known) tools for Python developers of all levels. Ultimately, the conversation reminds us that the Python ecosystem thrives on collaboration, open-source innovation, and a willingness to share knowledge to help each other succeed.
Links from the show
Notable PyPI Package Repo: github.com/xandrade/talkpython.fm-notable-packages
Antonio's recommended packages from this episode:
Sumy: Extract summary from HTML pages or plain texts: github.com
gTTS (Google Text-to-Speech): github.com
Packages discussed during the episode
1. FastAPI - A-W-E-S-O-M-E web framework for building APIs: fastapi.tiangolo.com
2. Pythonic - Graphical automation tool: github.com
3. umap-learn - Uniform Manifold Approximation and Projection: readthedocs.io
4. Tortoise ORM - Easy async ORM for python, built with relations in mind: tortoise.github.io
5. Beanie - Asynchronous Python ODM for MongoDB: github.com
6. Hathi - SQL host scanner and dictionary attack tool: github.com
7. Plotext - Plots data directly on terminal: github.com
8. Dynaconf - Configuration Management for Python: dynaconf.com
9. Objexplore - Interactive Python Object Explorer: github.com
10. AWS Cloud Development Kit (AWS CDK): docs.aws.amazon.com
11. Luigi - Workflow mgmt + task scheduling + dependency resolution: github.com
12. Seaborn - Statistical Data Visualization: pydata.org
13. CuPy - NumPy & SciPy for GPU: cupy.dev
14. Stevedore - Manage dynamic plugins for Python applications: docs.openstack.org
15. Pydantic - Data validation and settings management: github.com
16. pipx - Install and Run Python Applications in Isolated Environments: pypa.github.io
17. openpyxl - A Python library to read/write Excel 2010 xlsx/xlsm files: readthedocs.io
18. HttpPy - More comfortable requests with python: github.com
19. rich - Render rich text, tables, progress bars, syntax highlighting, markdown and more to the terminal: readthedocs.io
20. PyO3 - Using Python from Rust: pyo3.rs
21. fastai - Making neural nets uncool again: fast.ai
22. Numba - Accelerate Python Functions by compiling Python code using LLVM: numba.pydata.org
23. NetworkML - Device Functional Role ID via Machine Learning and Network Traffic Analysis: github.com
24. Flask-SQLAlchemy - Adds SQLAlchemy support to your Flask application: palletsprojects.com
25. AutoInvent - Libraries for generating GraphQL API and UI from data: autoinvent.dev
26. trio - A friendly Python library for async concurrency and I/O: readthedocs.io
27. Flake8-docstrings - Extension for flake8 which uses pydocstyle to check docstrings: github.com
28. Hotwire-django - Integrate Hotwire in your Django app: github.com
29. Starlette - The little ASGI library that shines: github.com
30. tenacity - Retry code until it succeeds: readthedocs.io
31. pySerial - Python Serial Port Extension: github.com
32. Click - Composable command line interface toolkit: palletsprojects.com
33. Pytest - Simple powerful testing with Python: docs.pytest.org
34. testcontainers-python - Test almost anything that can run in a Docker container: github.com
35. cibuildwheel - Build Python wheels on CI with minimal configuration: readthedocs.io
36. async-rediscache - An easy to use asynchronous Redis cache: github.com
37. seinfeld - Query a Seinfeld quote database: github.com
38. notebook - A web-based notebook environment for interactive computing: readthedocs.io
39. dagster - A data orchestrator for machine learning, analytics, and ETL: dagster.io
40. bleach - An easy safelist-based HTML-sanitizing tool: github.com
41. flynt - string formatting converter: github.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm