Learn Python with Talk Python's 270 hours of courses

10 tips to move from Excel to Python

Episode #288, published Sat, Oct 31, 2020, recorded Tue, Sep 15, 2020

Excel is one of the most used and most empowering piece of software out there. But that doesn't make it a good fit for every data processing need. And when you outgrow Excel, a really good option for a next step is Python and the data science tech stack: Pandas, Jupyter, and friends.

Chris Moffitt is back on Talk Python to give us concrete tips and tricks for moving from Excel to Python!

Episode Deep Dive

Guest Introduction and Background

Chris Moffitt is the creator of Practical Business Python (PB Python), a blog focused on using Python to solve day-to-day data and automation challenges in business settings. He works in the medical device industry, leveraging Python for analytics, pricing strategy, data transformation, and reporting. Chris has a passion for teaching people how to break out of spreadsheet-only workflows and enhance their processes with Python’s data science stack. He is a returning guest who previously discussed escaping “Excel Hell,” and here he continues that conversation, sharing tips and tools to empower Excel-focused users to seamlessly transition to Python.

What to Know If You're New to Python

If you’re just starting out and want to get the most from this episode, here are a few points to help you:

  • Local Setup: Installing Python on Windows often involves using Miniconda to manage packages and virtual environments.
  • Interactive Notebooks: Tools like Jupyter Notebook are fantastic for step-by-step data exploration and experimentation.
  • Data Mindset: Shift from single cells and manual formulas to thinking in terms of columns, rows, and transformations over entire datasets at once.
  • Automation Possibility: Python can automate repetitive tasks you’d otherwise do manually in Excel—especially useful if you frequently import data and apply similar operations.

Key Points and Takeaways

  1. Why Move from Excel to Python Excel is powerful for many tasks but struggles with large datasets, repeatability, and complex data transformations. Python, particularly with libraries like pandas, offers clarity, performance, and easier automation—so you can run the same data workflow repeatedly without risk of “breaking” hidden formulas. By defining a sequence of steps in code, you become more efficient and reduce errors caused by manual cut-and-paste operations.
  2. Defining Your Data Problem Clearly Jumping straight into machine learning or large-scale projects can be overwhelming. Chris advises starting with small, well-understood tasks, such as automating a daily or weekly spreadsheet procedure. Focusing on a concrete, repetitive workflow gives you a clear goal and a smaller learning curve, ensuring you build skills steadily in Python.
  3. Organizing Data into Raw, Intermediate, and Output Phases One of Chris’s tips is to keep a clean directory structure so you never alter your original data. You load and transform raw files programmatically, writing cleaned results to a separate folder. This approach makes it far easier to re-run analyses, reproduce results, and troubleshoot mistakes if they arise.
  4. Essential pandas Operations (Selection and Accessors) Mastering .loc, .iloc, and the .str / .dt accessors lays the foundation for most data analysis tasks in Python. You can filter or slice rows and columns by labels, handle text cleaning with .str.replace(), .str.lower(), or manipulate date/time columns using .dt.year(), .dt.month(), and so on. This “vectorized” mindset replaces loops with concise, expressive statements.
  5. Boolean Indexing (Filtering Your Data) In Excel, you might filter rows manually or use nested IF statements. In pandas, you can do powerful queries in one line by combining conditions (e.g., df[(df['Sales'] > 100000) & (df['State'] == 'CA')]). This makes it easy to chain transformations and quickly see results for specific subsets.
  6. Summaries and Grouping with GroupBy If you need to summarize data—like aggregating sales by region or by product—pandas’ groupby() method is a workhorse. You can compute sums, means, counts, or more advanced statistics in seconds. This replaces complicated pivot tables or repeated copy-paste consolidation in Excel.
  7. Pivot Tables in pandas Pivot tables in Excel are beloved for quick, interactive summaries. Pandas offers pivot_table() to achieve similar outputs in code. While not a drag-and-drop UI, it provides equally powerful summarization for multi-dimensional data. Once you learn it, you can automate pivot-like transformations across many files or datasets.
  8. Merging and Concatenating Multiple Data Sources Rather than manually cutting and pasting, pandas can seamlessly merge or append data frames—even if you have millions of rows. This is essentially an automated, more robust version of Excel’s “VLOOKUP” or “XLOOKUP,” allowing for sophisticated joins and simpler code reuse in your projects.
  9. Data Visualization and Plotly Express Excel charts are fine, but for richer interactive plotting, Chris highlighted Plotly Express. It automatically creates hoverable, zoomable plots in Jupyter Notebooks with minimal code. Exporting to PNG or SVG is easy (using libraries like Kaleido), so you get professional-looking visuals to share with others or embed in reports.
  10. Automating Reports and Working Alongside Excel Python doesn’t have to replace Excel entirely. You can still read or write .xlsx files and deliver final outputs so co-workers receive spreadsheets they’re comfortable with. Tools like OpenPyXL or built-in pandas Excel I/O functions let you set formulas, styles, and pivot tables in your output so that teams can pick up exactly where they expect.
  1. Shortcomings of Excel: Errors & Limitations Excel’s hidden formulas, misinterpreted gene names (the famous date auto-format error), and limited rows highlight the dangers of pushing Excel too far. Python’s clarity and reproducibility help you avoid fiascos that come from accidental cell edits or data conversions. This structural reliability is a key selling point for adopting Python in data-centric organizations.
  1. Growing Adoption and Organizational Buy-In Chris points to the importance of small wins. Demonstrating a single automated daily task can show leadership the time savings and accuracy gains from Python. Having a supportive group or a “cohort” within the company helps maintain momentum and confidence for broader adoption.
  1. Side Table Library for Quick Summaries Chris’s own library, Side Table, extends pandas for easy frequency tables and quick overviews. Similar to a simpler pivot table, it can show you how data is distributed across categories in one line of code—handy when exploring new datasets.
  1. WSL (Windows Subsystem for Linux) for Advanced Setup For power users on Windows, Chris described how WSL2 unlocks near-native Linux performance without dual-boot or heavy VMs. You can run Ubuntu, use Linux tooling, then seamlessly open browsers or file explorers in Windows. This is optional but helpful if you want a more Linux-like workflow.
  1. Choosing an Editor (VS Code) Chris switched from Sublime Text to Visual Studio Code. He praises VS Code’s monthly improvements, its integration with Python, Jupyter Notebooks, and WSL, making it easier for Windows users to maintain consistent workflows and handle projects that might span Windows and Linux environments.

Interesting Quotes and Stories

Naming Genes to Satisfy Excel: Chris referenced the real-world case where scientific organizations renamed certain genes because Excel auto-formatted them as dates. This underscores how spreadsheet “helpfulness” can derail entire datasets.

Automation as a Selling Point: “If you can show that running this script takes five seconds where it used to take half an hour by hand, people start to see the power of Python.”

Key Definitions and Terms

  • Vectorized Operations: Performing actions on an entire column or set of values in a single step rather than looping cell-by-cell.
  • Boolean Indexing: Filtering a data frame using expressions like df[df['value'] > 100].
  • Pivot Table / Pivoting: Summarizing data across multiple dimensions in a compact table; in pandas, pivot_table() or groupby() help replicate Excel’s pivot experience.
  • Miniconda: A lightweight distribution of conda for managing Python environments and packages.
  • WSL: Windows Subsystem for Linux, a feature that lets you run a Linux environment natively within Windows.

Learning Resources

Here are some ways to deepen your knowledge and reinforce key ideas from this episode:

  • Move from Excel to Python with Pandas: A Talk Python course featuring Chris Moffitt’s expertise and real-world examples, tailored to Excel users transitioning to Python.
  • Python for Absolute Beginners: If you’re just getting started with programming, this course offers a clear, hands-on approach to core Python concepts.
  • PB Python Blog: Chris’s blog covers practical examples, including Excel-to-Python topics, real business use cases, and more advanced articles.

Overall Takeaway

Switching from Excel to Python is not about abandoning spreadsheets altogether; it’s about leveraging powerful, repeatable code-based workflows for tasks that outgrow Excel’s reliability and scale. With libraries like pandas and Plotly, plus tools for reading, writing, and collaborating via Excel formats, you have the best of both worlds: automated workflows, fewer errors, and the ability to continue sharing results with non-Python teams. Whether you’re a business analyst, data scientist, or casual Excel user, adopting Python’s data stack will accelerate your work, reduce mistakes, and open up new possibilities for data insight and collaboration.

Links from the show

Chris on Twitter: @chris1610
Practical Business Python: pbpython.com
Escaping Excel Hell with Python and Pandas Episode 200: talkpython.fm
SideTable package: pbpython.com

Learn more and go deeper
Move from Excel to Python with Pandas Course: training.talkpython.fm
Excel to Python webcast: crowdcast.io
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

Talk Python's Mastodon Michael Kennedy's Mastodon