
Tips for ML / AI startups

Episode #356, published Mon, Mar 14, 2022, recorded Thu, Feb 17, 2022

Have you been considering launching a product or even a business based on Python's AI / ML stack? We have a great guest on the episode this week, Dylan Fox, who is the cofounder of AssemblyAI and has been building his startup successfully over the past few years. He has interesting stories of 100s of GPUs in the cloud, evolving ML models, and much more that I know you'll enjoy hearing.


Episode Deep Dive

Guests introduction and background

Dylan Fox is the founder of AssemblyAI, a speech-to-text and audio intelligence platform built with a deep focus on privacy, cutting-edge machine learning models, and developer-friendly APIs. Dylan started the company around 2017, leveraging his experience in natural language processing (NLP) and large-scale computing. Over time, he has led AssemblyAI to process millions of audio files daily and continually evolve their ML architecture to deliver more accurate transcription results and advanced audio analysis features. His passion for both fostering high-growth startups and delving into challenging ML / AI problems shines through in this conversation.


What to Know If You're New to Python

If you’re just starting out with Python and want to dive into ML or AI topics, here are some notes from this episode to help prepare you:

  • Having a grasp of Python’s core syntax and package management (like pip) is essential for exploring frameworks such as TensorFlow, PyTorch, or scikit-learn.
  • Building APIs in Python often involves asynchronous processing and frameworks such as Tornado, Flask, or FastAPI.
  • Tools like huggingface.co provide pre-trained models you can fine-tune in Python without deep ML expertise.
  • Even modest command-line skills (e.g., using an ls replacement such as pls) can speed up day-to-day Python work.

Key points and takeaways

  1. Core Challenge of AI Startups: AI-focused startups must balance cutting-edge research with practical, user-facing features. Dylan shares that while new ML architectures can yield breakthroughs, they must align with real user needs and be maintainable at scale.
  2. Choosing between TensorFlow and PyTorch: The episode highlights how PyTorch gained tremendous popularity because research advances, papers, and model examples often appear in PyTorch first. Although TensorFlow still has its place, PyTorch tends to simplify experimentation and the transfer from research to production.
  3. Starting Simple with scikit-learn: Before jumping into deep learning, scikit-learn can efficiently handle many classification or regression tasks. Dylan notes that for something like predictive system monitoring or other simpler use cases, classical ML approaches (e.g., SVMs) can be very effective and less resource-intensive than large neural networks.
  4. Production Infrastructure and ML Ops Complexity: Putting ML models into production is notably different from running standard web apps. The conversation breaks down how new neural architectures require changes to container images, instance types, GPU/CPU usage, auto-scaling, and orchestration.
  5. Cost Tradeoffs, Renting vs. Owning Compute: AssemblyAI's training infrastructure often uses dedicated GPU servers rather than on-demand cloud instances, saving significantly when training across dozens of GPUs for weeks at a time. Bootstrapped startups may rely on AWS, but once usage grows, purchasing or leasing dedicated hardware can dramatically reduce costs.
  6. Privacy and Data Handling: For privacy reasons, AssemblyAI does not permanently store user audio data. Instead, they hold audio only in memory during processing and then store transcripts (optionally, with user consent for model improvement). This approach reduces risk for both customers and the platform itself.
  7. Real-time vs. Batch Processing: Real-time APIs (like streaming voice over WebSocket) add an extra layer of complexity due to ultra-low latency requirements. Many ML startups must decide how to handle streaming data, whether to compress or batch it, and how to manage concurrency effectively.
  8. Scaling Auto-scaling: Dylan explains their team's approach of overprovisioning compute to handle spikes while they refine auto-scaling policies. Although costlier in the short run, it prevents downtime. Once usage patterns stabilize, more precise auto-scaling can balance performance and cost.
  9. Data Moats and Large Language Models (LLMs): While some argue that possessing massive private datasets is a moat, Dylan suggests that public domain data plus advanced architectures (like GPT-3) can often match or surpass private data advantages. This democratizes ML innovation but also increases competition.
  10. Raising Capital vs. Bootstrapping: Balancing heavy R&D costs and the speed needed for ML breakthroughs often leads AI startups to seek venture capital. However, Dylan acknowledges that many successful startups do bootstrap, focusing on a minimal workable model and reinvesting profits to grow over time.
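
The renting vs. owning tradeoff in point 5 comes down to break-even arithmetic. Here is a minimal sketch of that calculation. The function name, its parameters, and every price below are hypothetical placeholders chosen for illustration; none of them are figures from the episode or from any vendor.

```python
def breakeven_hours(purchase_cost: float,
                    owned_hourly_cost: float,
                    cloud_hourly_rate: float) -> float:
    """Return the GPU-hours after which owning is cheaper than renting.

    purchase_cost:      up-front price of the dedicated hardware
    owned_hourly_cost:  running cost per hour once owned (power, hosting)
    cloud_hourly_rate:  on-demand price per hour for comparable hardware
    """
    saving_per_hour = cloud_hourly_rate - owned_hourly_cost
    if saving_per_hour <= 0:
        # Owning never pays off if it costs as much per hour as renting.
        return float("inf")
    return purchase_cost / saving_per_hour


# Hypothetical numbers, for illustration only.
hours = breakeven_hours(purchase_cost=10_000.0,
                        owned_hourly_cost=0.5,
                        cloud_hourly_rate=3.0)
print(f"Break-even after {hours:.0f} GPU-hours "
      f"(about {hours / 24:.0f} days of continuous training)")
```

Plugging in your own quotes shows how quickly weeks-long training runs across many GPUs reach the break-even point, which is the pattern described in the episode.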

Interesting quotes and stories

  • Memorizing Al Gore’s TED Talk: Dylan mentioned testing early speech-recognition models so frequently on an Al Gore talk that parts of the transcript are now ingrained in his memory—a humorous testament to the iterative nature of ML model development.
  • The Real-time vs. Batch Challenge: “For the asynchronous API, it’s basically like an orchestration problem—kicking off background jobs in parallel, collecting them, then returning the results,” illustrating how much custom plumbing is needed to handle large-scale ML data processing behind the scenes.
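
The orchestration pattern in that quote (kick off jobs in parallel, collect them, return the results) can be sketched with Python's asyncio. This is an illustration of the general fan-out and gather idea, not AssemblyAI's actual pipeline. The functions transcribe_chunk and transcribe_file are hypothetical, and the "model call" is simulated with a short sleep.

```python
import asyncio


async def transcribe_chunk(chunk_id: int, audio: bytes) -> str:
    """Hypothetical stand-in for running a model on one chunk of audio."""
    await asyncio.sleep(0.01)  # simulate the time a real model call would take
    return f"chunk-{chunk_id}:{len(audio)}-bytes"


async def transcribe_file(chunks: list[bytes]) -> str:
    """Fan out one job per chunk, wait for all of them, then join the results."""
    jobs = [transcribe_chunk(i, chunk) for i, chunk in enumerate(chunks)]
    # gather runs the jobs concurrently and returns results in input order.
    results = await asyncio.gather(*jobs)
    return " ".join(results)


transcript = asyncio.run(transcribe_file([b"aa", b"bbb", b"c"]))
print(transcript)
```

In a real service the jobs would more likely be dispatched to worker processes or a task queue rather than coroutines in one process, but the shape of the problem is the same: split, run in parallel, collect in order.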

Key definitions and terms

  • ML Ops: The practice of streamlining machine learning model development (R&D) and deployment to production systems, including auto-scaling, continuous integration, monitoring, and more.
  • SVM (Support Vector Machine): A classical supervised learning algorithm used for classification and regression tasks, often effective with smaller datasets.
  • PyTorch: A deep learning framework favored for rapid research and easier model deployment, developed primarily by Facebook’s AI Research lab.
  • Ephemeral Storage: A method of handling data only in memory or temporary storage during processing, often used for heightened privacy.
  • GPU (Graphics Processing Unit): Specialized hardware well-suited for parallel operations in machine learning tasks, often essential for training large neural networks.
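
To make the ephemeral storage idea concrete, here is a minimal sketch of handling an upload entirely in memory and keeping only the transcript. The names fake_transcribe and process_upload are hypothetical, and the "model" is a placeholder. This shows the general technique and is not a description of how AssemblyAI implements it.

```python
import io


def fake_transcribe(stream: io.BytesIO) -> str:
    """Hypothetical stand-in for a speech-to-text model."""
    return f"<transcript of {len(stream.read())} bytes>"


def process_upload(audio_bytes: bytes) -> dict:
    """Process audio from an in-memory buffer and return only the transcript."""
    # The audio lives in a BytesIO buffer; it is never written to disk.
    with io.BytesIO(audio_bytes) as buffer:
        transcript = fake_transcribe(buffer)
    # The buffer is closed here, so only the transcript leaves this function.
    return {"transcript": transcript}


result = process_upload(b"\x00\x01\x02\x03")
print(result)
```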

Learning resources

If you want to gain deeper insights into Python and machine learning, here are a couple of helpful courses from Talk Python Training.

  1. Python for Absolute Beginners: Ideal for those totally new to Python, covering foundational concepts to help you become a confident Python developer.
  2. Build An Audio AI App: Get practical experience building a real-world audio-focused ML application, similar to the conversation around AssemblyAI.

Overall Takeaway

This conversation with Dylan Fox offers a behind-the-scenes look at what it takes to run a successful AI startup—from the day-to-day complexities of building production ML infrastructures to decisions around scaling costs and code frameworks. Whether you’re working on speech-to-text or any domain-specific AI, focusing on real-world customer needs, carefully evaluating your compute resources, and adopting iterative model improvements can set the stage for both cutting-edge innovation and consistent product delivery.

Links from the show

Dylan on Twitter: @YouveGotFox
AssemblyAI: assemblyai.com
TensorFlow: tensorflow.org
PyTorch: pytorch.org
Hugging Face: huggingface.co
scikit-learn: scikit-learn.org
GeForce Card: nvidia.com
pls: twitter.com
This journalist’s Otter.ai scare is a reminder that cloud transcription isn’t completely private: theverge.com
Programming language trends: insights.stackoverflow.com
Can My Water Cooled Raspberry Pi Cluster Beat My MacBook?: the-diy-life.com
PyTorch vs TensorFlow in 2022: assemblyai.com/blog/pytorch-vs-tensorflow
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy
