Data Science from Scratch

Episode #56, published Wed, Apr 27, 2016, recorded Tue, Apr 19, 2016

You likely know that Python is one of the fastest growing languages for data science.

Data science is a discipline that combines the scientific inquiry of hypotheses and tests, the mathematical intuition of probability and statistics, the AI foundations of machine learning, a fluency in big data processing, and the Python language itself. That is a very broad set of skills to master, and each one is deep and often hard to understand.

That's why I'm excited to speak with Joel Grus, a data scientist from Seattle. He wrote a book to help us all understand what's actually happening when we employ libraries such as scikit-learn or numpy. It's called Data Science from Scratch and that's the topic of this week's episode.

Links from the show:

Book: Data Science from Scratch: amzn.to/1rhcbdT
Joel on Twitter: @joelgrus
Joel on the web: joelgrus.com
Partially Derivative Episode: partiallyderivative.com
Allen Institute for Artificial Intelligence: allenai.org

Data Science Libraries

numpy: numpy.org
Numpy episode: #34: Continuum: Scientific Python and The Business of Open Source: talkpython.fm/episodes/show/34

pandas: pandas.pydata.org
scikit-learn: scikit-learn.org
scikit-learn episode: #31: Machine Learning with Python and scikit-learn: talkpython.fm/episodes/show/31

matplotlib: matplotlib.org
Google's TensorFlow: tensorflow.org

