
Episode #56: Data Science from Scratch

Published Wed, Apr 27, 2016, recorded Tues, Apr 19, 2016.

You likely know that Python is one of the fastest-growing languages for data science.

This is a discipline that combines the scientific inquiry of hypotheses and tests, the mathematical intuition of probability and statistics, the AI foundations of machine learning, a fluency in big data processing, and the Python language itself. That is a very broad set of skills to master if we want to be good data scientists, and each one is deep and often hard to understand.

That's why I'm excited to speak with Joel Grus, a data scientist from Seattle. He wrote a book to help us all understand what's actually happening when we employ libraries such as scikit-learn or NumPy. It's called Data Science from Scratch, and that's the topic of this week's episode.

Links from the show:

Book: Data Science from Scratch:
Joel on Twitter: @joelgrus
Joel on the web:
Partially Derivative Episode:
Allen Institute for Artificial Intelligence:

Data Science Libraries

NumPy episode: #34: Continuum: Scientific Python and The Business of Open Source

scikit-learn episode: #31: Machine Learning with Python and scikit-learn:

Google's TensorFlow:


Joel Grus
Joel Grus is a research scientist at the Allen Institute for Artificial Intelligence and the author of *Data Science from Scratch: First Principles with Python*. Previously he worked as a software engineer at Google and as a data scientist at a variety of startups. He lives in Seattle, can be found on Twitter at @joelgrus, and blogs sporadically at
