00:00 Have you been considering launching a product or even a business based on Python's AI and ML stack? We have a great guest on this episode, Dylan Fox, who is the founder of Assembly AI and has been building his startup successfully over the past few years.
00:16 He has interesting stories of hundreds of GPUs in the cloud involving ML models and much more that I know you're going to enjoy hearing. This is Talk Python To Me episode 356, recorded February 17, 2022.
00:33 Welcome to Talk Python to Me, a weekly podcast on Python.
00:45 This is your host, Michael Kennedy. Follow me on Twitter where I'm @mkennedy and keep up with the show and listen to past episodes at talkpython.fm. And follow the show on Twitter via @talkpython. We've started streaming most of our episodes live on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.
01:09 This episode is brought to you by the Stack Overflow Podcast: join them to hear programming stories and learn about how software is made; and by Sentry: find out about errors as soon as they happen. Transcripts for this and all of our episodes are brought to you by Assembly AI. Do you need a great automatic speech-to-text API? Get human-level accuracy in just a few lines of code. Visit
01:32 talkpython.fm/assemblyai.
01:35 Dylan, welcome to Talk Python to Me.
01:37 Yes, thank you. I am a fan, have listened to a lot of episodes, and I'm a podcast fan in general, so I'm happy to be on here. Thanks.
01:45 Yeah, I'm sure you are. Your specialty is in the realm of turning voice into words.
01:51 Yeah, that's right.
01:52 I bet you do a lot of studying of media like podcasts or videos and so on, right?
01:58 Yeah. It's actually funny. So I started the company, I started Assembly, like four years ago, and there was this one audio file that I always used to test our speech recognition models on. It was this Al Gore TED Talk from 2007, I think. And I've almost memorized parts of that TED Talk because I've just tested it so many times. It's actually still part of our end-to-end test suite. It's in there. It's like a legacy kind of founder thing in the company.
02:24 How cool.
02:24 Yeah, it is kind of funny, especially now that we're like 30 people at the company and I'll see some of the newer engineers, like writing tests around like that Al Gore file still. And it makes me laugh because that's like there's no real reason I picked that. It was just something easy that came to me.
02:41 Yes, when you're starting, you just got to grab some audio to hear something, right?
02:44 Yeah, exactly.
02:45 So definitely, I've also listened to a ton of podcasts. And we just started releasing models for different languages, and I was with someone from our team last week and I heard this phone call, and it's like foreign-language people screaming on it. I was like, what are you listening to? And it is funny. As an audio company, you sometimes get data from customers and you have to listen to it.
03:09 Yeah, I bet there's some interesting stories in there. Yeah, for sure.
03:12 Well, we're very privacy conscious, so not too many.
03:16 But yeah, on The Verge there was just an article about a different speech-to-text company. Have you seen this? There was some suspicious stuff going on. Let me see if I can find it. What was it called? It was Otter.
03:34 Otter AI.
03:35 I'm not asking you to speak on them, but here it is: a journalist's Otter.ai scare is a reminder that cloud transcription isn't completely private. Basically, there was a conversation about Uyghurs in China or something like that. And then they, unprompted, reached out to the person who was in the conversation and said, could you tell us the nature of why you had to speak about this?
03:59 No way.
03:59 They're like, what?
04:01 That is crazy.
04:03 We're a little concerned about why it's kind of, like, interested in the content of our conversation.
04:08 Yeah. There's a lot of that suspicion around. There's some conspiracy theories, right, around like, oh, does your phone listen to you, and does it then use that data to remarket to you? And I was talking to someone about this recently. I know nothing about the location-based advertising world, but sometimes I'll be talking about something and then I'll see ads for it on my phone or if I'm on Instagram or something. And someone told me it's probably more based on the location that you're in and the other data that they have about you.
04:40 You were at your friend's house. Your friend just searched for that, then told you about it.
04:44 Yeah, exactly.
04:45 I think the reality of that is that it's actually more terrifying than if they were just listening to you, that they can piece together this shadow reality of you that matches reality so well, yeah.
04:56 Like your friend just bought this thing and you went over and then. So maybe you're interested in this thing because you're probably.
05:02 They probably told you about it or something, right?
05:04 Yeah, it is really crazy. It is really crazy. I haven't paid too much attention to all the changes that are happening around this. Like, I listened to some podcast, I think, on The Wall Street Journal about the big change that Google is making around that tracking, and now a lot of people are up in arms about that. And it was saying something about how they're going to have... and sorry if I'm derailing whatever plan we had for a conversation here.
05:24 You're derailing in a way that I'm, like, super passionate about because it's so crazy. But, yeah, too deep. But, yeah, it's so interesting.
05:31 They said that, and I'm probably butchering this, but something like, for each user they're just going to have like six categories about you, and then one of them is going to be randomly inserted to somehow anonymize your profile. And I just thought, yeah, it's super weird to hear how they're doing it, what the meeting was internally where they came up with that idea, like, well, let's just throw a random category on there.
05:54 I don't know, my thoughts are we're faced with, or we're presented with, a false dichotomy. Either you can have horrible, creepy tracking, advertising, shadow things like we talked about, or you can let the creators and the sites that make the services you love die. There are more choices than two in this world.
06:15 Right.
06:15 For example, you could have some kind of ad that is related to what is on the page rather than who is coming to the page.
06:23 Right.
06:23 You don't necessarily have to retarget me.
06:25 Right.
06:25 For example, right here on this podcast, I'm going to tell people about, I'm not sure without looking at the schedule what the advertisers are for this episode, but I am going to tell people about them. And it is exactly related to the content of the show. It's not that I found out that Sarah from Illinois did this Google search and visited this page, so now we're going to show her this. There are so many sites like this one here on The Verge. You could have ads for Assembly AI, and maybe you don't actually want this one, but, you know, things like this: it would be totally reasonable to put an ad for speech-to-text companies on there, and that requires no targeting and no evil shadow companies. And we could go on and on, but there are many ways like that, right, that don't require this false dichotomy that is being presented to us. So hopefully we don't end up with either of those, because I don't think those are the best options or the only options.
07:21 Yeah. It's weird how that's kind of how things developed to where we are now. But I agree with you. It's probably a lot because everyone's looking for like.
07:30 Okay, well, if we do retargeting, we can get 2% better returns and no one is worried about, well, what happens to society?
07:39 That's actually what I was going to say. It's all about the high-growth society that we have, where you need to maximize growth and maximize returns. And I mean, I understand this acutely. Like, I'm the CEO of a startup, I get it. But yeah, when it's growth over everything, you end up with things like what you said: this improves our return by 2%, so let's do this. But you don't think about what the trade-offs will be.
08:03 Yes, absolutely. All right. Well, thanks for that diversion. That was great.
08:07 Yeah.
08:08 But before we get beyond it, let me just get your background real quick. How did you get into programming? And I'm going to mix in a little machine learning.
08:16 Yeah. Yeah, definitely. Do you want the long story or the short story? How many minutes do we have? Go intermediate, intermediate, intermediate. So the intermediate story is that I started a company when I was in college, just like a college startup thing, and at the time I was very limited in my programming knowledge. I had done some basic, like, HTML when I was a kid. I was really into Counter-Strike and Call of Duty and would sell private servers.
08:45 I don't know how I got into this, but I rented the servers and I would remote Windows desktop into them and set up private Counter-Strike servers and then sell those, and set up a basic website for it with HTML and CSS. And my brother was super into computers, so I was always kind of into computers. And then in college I got into startups, and I think programming and startups are really connected. So through that, I learned how to code, learned how to program, started attending Python meetups in Washington DC, where I went to school. And that's how I met Matt Makai, who's a mutual connection. So I attended a bunch of meetups, learned how to program, and then got really into it. But I think what I found myself more interested in was the more, like, meaty programming problems, more, I guess, algorithm-type problems. And that kind of naturally led me to machine learning and NLP, and then it kind of just took off from there, because I found that I was really interested in machine learning and different NLP problems.
09:50 Those are obviously the really hard ones.
09:54 Yeah.
09:55 Especially probably. When was this that you were doing it?
09:58 This is maybe like 2013, 2014.
10:02 Okay.
10:03 Yeah.
10:04 Kind of the early days of when that was becoming real. Right. I remember feeling like all this AI and text to speech, or speech to text rather, type of stuff was very much like fusion: 30 years out, always 30 years out. It's going to come eventually, but people are doing weird stuff in Lisp and it doesn't seem to be doing much of anything at all.
10:26 Like Perl scripts.
10:28 Yeah. And then all of a sudden we end up with amazing speech to text. We end up with self-driving cars. Like, something clicked and it all came to life.
10:36 Yeah. It's kind of crazy, especially over the last couple of years. I think what's really interesting is that a lot of the advances in self-driving cars and NLP and speech to text, they're all based on similar machine learning algorithms, like the transformer, right, which is a really popular type of neural network that came out and was initially applied towards just NLP, like text and language modeling related tasks. Now that's shown to be super powerful for speech as well. Whereas in classical machine learning, there are still these underlying algorithms, like support vector machines or other types of underlying algorithms, but a lot of the work is around the data, like how can you extract better features for this type of data. I remember when I was getting into speech recognition, I bought this speech recognition textbook, and this was a while ago, and it was around really understanding phonemes and how different things are spoken and how human speech is produced. And now you don't need to know about that. You just get a bunch of audio data and you train these big neural networks. They figure that out.
11:39 Right. You want to understand British accents and American accents? You just give it a bunch more data.
11:44 Yeah, exactly.
11:46 But it is crazy to see where things have gotten over the last couple of years in particular.
11:51 Yeah.
11:52 So when I was starting out, neural networks were there, but they were a lot more basic. And now there are a lot more compute resources and more mature libraries like TensorFlow and PyTorch. I think I went to one of the first TensorFlow meetups that they had, or not meetups, like developer days or whatever, down at the Google conference. It was so new still. Yeah, it's so new.
12:14 It's easy to forget, it is a while ago, that all this stuff didn't even exist, right?
12:19 Yeah, absolutely.
12:20 So you mentioned Assembly AI.
12:22 Yes.
12:23 That's what you're doing these days, right?
12:24 Yeah. So I am the founder of a company called Assembly AI. We create APIs that can automatically transcribe and understand audio data. So we have APIs for automatic speech to text of audio files, live audio streams, and then APIs that can also summarize audio content, do content moderation on top of audio content, detect topics, what we call audio intelligence APIs.
12:52 We have a lot of startups and enterprises using our APIs to build all kinds of applications on top of audio data, whether it's content moderation for a social platform, or speeding up workflows like I'm sure you have, where you take a podcast recording and transcribe it so you can make it more shareable or extract pieces of it to make it more shareable.
13:13 Exactly. Yeah. Basically, for me, it's a CLI command that runs Python against your API, against a remote MP3 file, and then magic.
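Roughly, that kind of CLI script might look like this minimal sketch in Python: it submits a remote MP3 URL to AssemblyAI's public v2 transcript endpoint and polls until the text is ready. The API key is a placeholder and the response handling is simplified.

```python
# Minimal sketch: transcribe a remote MP3 with the AssemblyAI v2 REST API.
# Usage: python transcribe.py https://example.com/episode.mp3
import sys
import time

import requests

API_KEY = "YOUR_ASSEMBLYAI_KEY"  # placeholder; use your own key
BASE = "https://api.assemblyai.com/v2/transcript"
HEADERS = {"authorization": API_KEY}


def transcribe(mp3_url: str) -> str:
    # Kick off a transcription job for a publicly reachable audio URL.
    job = requests.post(BASE, json={"audio_url": mp3_url}, headers=HEADERS).json()
    # Poll until the transcript is ready (or the job errors out).
    while True:
        status = requests.get(f"{BASE}/{job['id']}", headers=HEADERS).json()
        if status["status"] == "completed":
            return status["text"]
        if status["status"] == "error":
            raise RuntimeError(status["error"])
        time.sleep(5)


if __name__ == "__main__":
    print(transcribe(sys.argv[1]))
```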
13:25 That's the great thing about podcast hosts that are also programmers. I've talked to a few, and there's a bunch that are non-programmers and they use these different services, but every podcast host that I've talked to that's a programmer, they have their own, like, CLIs and Python scripts running.
13:42 Yeah.
13:43 There's a whole series of just CLI commands to do the workflow. I do want to just give a quick statement, a disclaimer.
13:51 Yes.
13:52 So if you go over to the transcripts, or possibly, I suspect, if you listen to the beginning of this episode, it will say that it's sponsored by Assembly AI.
13:59 This episode is not part of that sponsorship. This is just, you and I got to know each other, you're doing interesting stuff, you've been on some other shows that I've heard where the conversation was interesting, so I invited you on. Thank you for sponsoring the show. But just to point out, this is not actually part of that, but the transcripts that we do have on the show for the last year or so are basically generated by you guys, which is pretty cool.
14:23 And we don't even need to talk about Assembly that much on this podcast. We can talk about other things. Yeah.
14:28 So one of the things I want to talk about, and maybe what's on the screen here, being TensorFlow, gives a little bit of a hint: why do you think Python is popular for machine learning startups in general?
14:42 I'm not as deep in that space as you, but looking in from the outside, I guess I would say it feels very much like Python is the primary way, which a lot of this machine learning stuff is done.
14:52 Yeah, it's a good point. So why is that? Outside of machine learning, even, I think Python is just such a popular language because it's so easy to build with compared to PHP or C# and even JavaScript. Like, when I learned to code, I started with Python because the syntax was easy to understand and there are a lot of good resources. And then there's kind of this snowball effect where more people know Python, there's more tutorials about Python, there's more libraries for Python, and it's just more popular of a language. Yeah, nice, you're pulling this up.
15:26 Yeah. But if you pull up the Stack Overflow trends for the most popular programming languages, there's only one that is going dramatically up out of ten languages or something.
15:37 It's just so much more popular.
15:38 Yeah, it is. It's so interesting how it's really sort of taken off. And it wasn't, back when you got started, or when I got started back in this general area. The number one language then was, what is that?
15:54 C#.
15:54 C#. But you got to keep in mind, this is a little bit of a historical bias of Stack Overflow. Stack Overflow was started by Jeff Atwood and Joel Spolsky, who came out of the .NET space.
16:05 Okay.
16:05 So when they created it,
16:06 its initial traction was in C# and VB. But over time, clearly, it's become where programmers go, obviously. So take it a bit with a grain of salt. But that was the number one back in the early days.
16:17 Another founder legacy decision.
16:19 Yes, exactly.
16:22 I agree that it's absolutely generally popular, and I think there's some interesting reasons for that. It's just so approachable. But it's not a toy. Right. A lot of approachable languages are toy languages, and a lot of non toy languages are hard to approach.
16:37 This portion of Talk Python to me is brought to you by the Stack Overflow podcast.
16:42 There are few places more significant to software developers than Stack Overflow, but did you know they have a podcast?
16:48 For a dozen years, the Stack Overflow Podcast has been exploring what it means to be a developer and how the art and practice of software programming is changing our world. Are you wondering what skills you need to break into the world of technology or level up as a developer? Curious how the tools and frameworks you use every day were created? The Stack Overflow Podcast is your resource for tough coding questions and your home for candid conversations with guests from leading tech companies about the art and practice of programming. From Rails to React, from Java to Python, the Stack Overflow Podcast will help you understand how technology is made and where it's headed. Hosted by Ben Popper, Cassidy Williams, Matt Kiernander, and Ceora Ford, the Stack Overflow Podcast is your home for all things code. You'll find new episodes twice a week.
17:35 Wherever you get your podcasts, just visit
17:38 talkpython.fm/stackoverflow and click your podcast player icon to subscribe. One more thing. I know you're a podcast veteran and you could just open up your favorite podcast app and search for the Stack Overflow Podcast and subscribe there. But our sponsors continue to support us when they see results, and they'll only know you're interested from Talk Python if you use our link. So if you plan on listening, do use our link 'talkpython.fm/stackoverflow' to get started. Thank you to Stack Overflow for sponsoring the show.
18:07 Yeah. Like, for me, it was very easy to get started with Python.
18:12 I taught myself how to program. I went to college and studied economics, so I did not study programming or computer science in college. And the first language I started to try to learn was PHP. I bought this huge PHP textbook and made it halfway through, and I was like, what is going on? I gave up and then tried again with Python later, and it was so much easier. And then I also wonder how much of this is specific to the machine learning libraries. Like, you have these macro trends where a lot of the data science boot camps have been so popular, and there's scikit-learn, I know we have a tab up there, there's NumPy, and NLTK is one of the popular NLP libraries. So there are a lot of libraries in Python. And early on, when I was getting into NLP, I worked a lot with NLTK and SciPy and scikit-learn and NumPy, and a lot of work was done around there. And so people that were doing data science or doing some type of machine learning were already in Python. And then now you have PyTorch and TensorFlow, and it's just kind of cemented. Like, okay, the machine learning libraries today, the popular ones, you work with them in Python. Yeah.
19:18 You want to give us your thoughts on those? We've got TensorFlow, PyTorch, and probably scikit-learn as well; those are the traditional ones. We've got newer ones, like Hugging Face.
19:27 Yeah. They're a cool company.
19:29 Maybe give us a survey of how you see the different libraries in the ML space, the libraries that people might choose from.
19:35 So when we started the company, everything was in TensorFlow.
19:38 When was that?
19:39 Back in late 2017. Everything was in TensorFlow. And actually, I don't know when PyTorch came out. I don't even know if it was out back then, or maybe it was just internal at Facebook.
19:52 Yeah, it's pretty new. Yeah.
19:53 Yeah. So TensorFlow was definitely, they got started early. I think their docs and the framework just got complicated over the years, and then they sort of rebooted with TensorFlow 2.0, and then there's Keras that was popular, and it kind of got pulled in now, I think. So we switched everything over to PyTorch in the last year or two. A big reason for that was, and we actually put out this article on our blog comparing PyTorch and TensorFlow, we have this chart where we show the percentage of papers that are released where the code for the paper is in PyTorch versus TensorFlow. And it's a huge difference. Like, most of the latest research gets implemented in PyTorch. Yeah, here it is. If you go down, so this is from Hugging Face, can you keep going? Research papers. Yeah, go up, that one.
20:43 Yeah.
20:44 Okay. So it shows the fraction of papers. And what we're showing here, for the people that are listening, is a graph that shows the percentage of papers that are built using PyTorch versus TensorFlow over time.
20:57 Yes. When you started, it was, what, 6 or 7%, maybe 10% PyTorch and the balance being TensorFlow, when you started your company, and now it's 75% PyTorch. That's a huge, very large change.
21:10 Dramatic change. If PyTorch was a company, it would probably be raising a lot of money. I think one of the reasons we picked PyTorch is because a lot of the newer research was being implemented in PyTorch first, there were examples in PyTorch. They have it in their tagline, to quote them, from research to production. Right? It was easier to get more exotic, advanced neural networks into production and actually start training models with those different types of layers or operations or loss functions that were released in these different papers. So we started using PyTorch and we kind of haven't looked back.
21:46 Well, if you're tracking all the research and trying to build a cutting edge start up around ML, you don't want to wait for this to make its way to other frameworks. You want to just grab it and go. That's where the research is being done. That helps a lot, right?
22:00 Exactly. Yeah. You can just get up and running a lot faster with the newer research. And so most companies that I talk to now, they're all using PyTorch. I think PyTorch is definitely the more popular framework. There are some new ones coming out that have people excited. But still, from what I can sense, PyTorch is it. If someone was going to get started today, I would tell them to start with PyTorch.
22:23 I think TensorFlow is also... who runs PyTorch?
22:27 PyTorch, which is released by Facebook, right.
22:28 Yeah.
22:28 And then TensorFlow.
22:29 That's Google.
22:30 Yeah. And I think Google's tried to tie TensorFlow into the cloud ML products, to train your models on Google Cloud and use their TPUs in the cloud. And there's probably some business case behind that. But I feel like it may have made the developer experience worse, because it's trying to tie you back to Google, whereas PyTorch isn't trying to get you to train your models on Facebook cloud or something.
22:52 Yeah. What's the story with Hugging face?
22:55 People probably wouldn't use Facebook Cloud if that existed nowadays.
23:00 I don't know if you'd want to host your data on Meta Cloud. It'd be Meta Cloud now.
23:04 Yeah, Meta Cloud. You can only do it in VR. What's the story with Hugging Face?
23:07 So Hugging Face is cool. So they're a company, actually, and it's kind of hard to even explain. You can basically get access to a bunch of different pretrained models really quickly through Hugging Face. A lot of work around NLP now is, how familiar are you with self-supervised learning or base models for NLP? How familiar are you with that?
23:29 Somewhat.
23:29 So the idea is to have a general model and then apply some sort of transfer learning to build up a more specialized one without training from scratch.
23:39 Exactly. Yeah. And that general model is really just trained to learn representations of the data. It's not even really trained toward a particular NLP task, it's just trained to learn representations of data, and then with those representations that it learns, you can then say, okay, I'm going to train you towards this specific task with some labeled data in a supervised manner. And so there are some really popular open source base models, foundation models; BERT is one, and there's a bunch of others. But you can easily load up BERT, basically, and fine-tune it on your data with Hugging Face. So if you're trying to get a model up and running quickly in the NLP, like the text domain, you can do that pretty easily with Hugging Face.
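A minimal sketch of what that fine-tuning workflow can look like with the Hugging Face transformers library: load a pretrained BERT, attach a classification head, and train it on a small labeled dataset. The two-example dataset here is purely illustrative.

```python
# Minimal sketch: fine-tune pretrained BERT for a classification task with
# the Hugging Face transformers Trainer. The tiny dataset is only illustrative.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

texts = ["great product, works well", "terrible, waste of money"]
labels = [1, 0]  # toy labels: 1 = positive, 0 = negative

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # pretrained body + fresh task head


class ToyDataset(torch.utils.data.Dataset):
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item


trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ToyDataset(texts, labels),
)
trainer.train()  # supervised fine-tuning on top of the learned representations
```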
24:25 Okay. Yeah.
24:26 So it's interesting. If you want to build your own neural network from scratch, like inputs to outputs, implement your own loss function, all that, you do that in PyTorch. If you want to just quickly fine-tune BERT for a specific task that you're trying to solve, you could still go the PyTorch route, but it would just be faster to go Hugging Face. So they've seen a lot of adoption there. And then scikit-learn is kind of like the old-school library that's been around forever, like the OG. Yeah. If you want to do stuff with support vector machines or random forests or k-nearest neighbors, scikit-learn is probably still really popular for those different use cases.
25:05 I do think I hear scikit-learn being used quite a bit still.
25:09 Yeah.
25:10 Maybe in the research, the academic side. If you go take a course on it, probably there's a lot of stuff on this, I would guess.
25:17 Yeah. Like, there's a lot of times where, I mean, you don't really need to build a neural network. There are parts of our stack that are basic machine learning, like statistical models. And if you can get away with it, it's a lot easier to train, you don't need as much data, and it's easier to deploy. So like a lot of recommendation-type models. Sometimes SVMs are just, like, good enough.
25:37 SVMs, support vector machines, are just good enough for a task that you might want, like a lightweight Netflix recommendation or YouTube recommendation. Not like the high-end stuff that I'm sure they're actually doing, but something like that.
25:50 Yeah.
25:51 That kind of recommendation engine.
25:52 Yeah, something basic. Although I'm actually kind of underwhelmed. Like, you'd think the Netflix and YouTube recommendations would be very good, but the Netflix recommendations and Prime recommendations are kind of underwhelming, given everything you'd think they know from what you watch.
26:04 I agree.
26:05 Yeah. It's still so hard to find things to watch sometimes on those platforms.
26:09 It is. And YouTube, Interestingly, seems to have an end. So if you scroll down through YouTube ten pages, it will start showing you like, well, it seems like we're out of options here. We'll show you ten from this one channel, and then we'll just kind of stop.
26:23 I know you've got a lot of videos you could just keep recommending. I'm pretty sure you could keep recommending, there's stuff down here. But yeah, I agree. It's interesting.
26:31 I feel like it's gotten better, too. Like, my YouTube consumption has really picked up over the last year, I would say, the recommendation algorithms. And I don't know if it's just more content being created, or maybe it's just a personal thing for me. And there was something on Hacker News, too, that one of the founders of Stripe posted, about how YouTube comments are generally very positive.
26:52 There are really good comments on YouTube, too. So they definitely also came up with ways to classify comments as being high value or not and then put those on top. And nowadays those models are definitely built with something like big neural networks and transformers, because those neural networks, they're so much better at understanding context than SVMs. For a lot of these classical machine learning approaches, you still have to feed it hand-labeled data. But the neural networks we have now, they're really good for those language tasks.
27:24 Yeah, absolutely. Christopher in the audience has a question that's kind of interesting. Does it make sense to start with scikit-learn if, for example, you're trying to predict when a production machine is not out of tolerance yet but is trending to be?
27:37 Is that, like...
27:38 If you were monitoring, like a data center for maybe VMs.
27:42 Like, you're guessing RAM or memory is going high, or some statistic is predictive that this VM will probably go down.
27:50 Failure is coming.
27:51 Failure is coming. Yeah. And the question was, is an SVM or scikit-learn the thing to start with? Yeah, I would actually probably say that's where you want to go, with something like scikit-learn, because there are probably very clear-cut patterns. I would say if you're unsure of what the pattern is, then a neural network is good, because the neural network can, in theory, you're feeding it raw data and it's learning the pattern. But if you know what the pattern is, like, okay, there are probably these signals that, if a human was just sitting there looking at it all day, they would be able to tell this system is probably going to go down, then you can just train an SVM or some type of classical machine learning model with scikit-learn to be able to do those predictions with pretty high accuracy. And then you've got a super lightweight model. You don't need much training data to train it, because you're not trying to build something that's super generalizable to all systems or all AWS instances; it's probably something unique to your system. But I would say that's kind of where the difference is. And then it's a lot easier, too, because if you're trying to build a neural net, it's like, what type, how many layers, what kind of optimization schedule, learning rate. There are always hyperparameters and things you have to figure out. You still have to do that, too, for classical machine learning to a degree. But if your problem is not that difficult, it's not as fancy nowadays, but it gets the job done.
29:10 Yeah. I suspect you could come up with some predictors and then monitor them in this model, whereas opposed to here's an image that is a breast scan. Does it have cancer or not? Right.
29:20 Exactly.
29:21 We don't even really know what we're looking for, but there probably is a pattern that could be pulled out by a neural network.
29:26 Exactly. Yeah, that's a great point.
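For Christopher's scenario, a minimal sketch of that classical approach with scikit-learn might look like this; the metrics and the synthetic data are made up purely to illustrate the shape of the problem.

```python
# Minimal sketch: classify "likely to fail soon" VMs from hand-picked metrics
# with a small scikit-learn SVM. All data here is synthetic and illustrative.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Fake feature rows: [cpu_pct, ram_pct, disk_io_wait_ms] per observation.
healthy = rng.normal([40, 50, 5], [10, 10, 2], size=(200, 3))
failing = rng.normal([85, 92, 40], [8, 5, 10], size=(40, 3))
X = np.vstack([healthy, failing])
y = np.array([0] * len(healthy) + [1] * len(failing))  # 1 = trending toward failure

model = make_pipeline(StandardScaler(), SVC(probability=True))
model.fit(X, y)

# Score a new observation; anything above a chosen threshold triggers an alert.
print(model.predict_proba([[90, 95, 55]])[0, 1])
```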
29:29 We're trying to build some predictive scaling for our API right now, because one of the challenges of a startup that's doing machine learning in production is we deploy hundreds of GPUs and thousands of CPU cores into production every day at peak load, and you have to be able to scale to demand. And if you overscale at that size, then there are just huge costs that come with that. If you underscale, there's bad performance for end users. And so we've done a ton of work around auto scaling and trying to optimize models in production and things like that. And now we're trying to do some predictive scaling. And for that, for example, we'd probably do something super simple with scikit-learn. We wouldn't do a neural net for that.
30:10 Yes. The scaling sounds like it's solving basically a similar issue as understanding failure, right?
30:15 Yeah, exactly.
30:17 The lack of scaling, sometimes the result of that is kind of failure.
30:21 Yeah. They're somewhat related together. Yeah. You talked about running stuff in production, and there's obviously two aspects for machine learning companies and startups and teams and products that are very different than, say, the kind of stuff I do. I've got APIs that are running, we've got mobile apps, we've got people taking the courses, but all of that stuff there is like one. It's always the same.
30:44 Right.
30:45 We put stuff up and people will use it and consume it and so on. But for you all, you've got the training and almost the R and D side of things that you've got to worry about working on and scaling and you've got the productionising. So maybe tell us a little bit about what do you guys use for both parts of training?
31:04 Maybe start with the training side.
31:05 Yeah, the training side. It's basically impossible to use the big clouds for that because it would just be prohibitively expensive, at least for what we do. So we train like these huge neural nets for speech recognition and different NLP tasks, and we're training them across like 48, 64 GPUs, like really powerful GPUs.
31:24 I've got the GeForce 3090, which is up here. Do you know what kind you're using? Yeah.
31:30 So we use a lot of V100 and A100.
31:36 Basically what we do is we rent dedicated machines from a provider, and for each machine we're able to pick the specs that we want, like how many GPUs, what cards, how much RAM, what kind of CPU we want on there. So we're able to pick the specs that we want. And we found that that's been the best way to do it, because with the big clouds, if you're running dozens of GPUs of the most expensive type for weeks on end, you could do that if you have like one training run you wanted to do. But a lot of times you train a model halfway through and it doesn't work well, you have to restart, or you finish the training and the results are not that good and you learn something, so you have to go back and start over. And now what we're doing is buying a bunch of our own compute. My dream is to have some closet somewhere with just tons of GPUs and our own mini data center for the R&D. Because if things go down when you're training a model, you checkpoint it as you go, so if your program crashes or your server crashes, you can resume training. Whereas for production workloads, we use AWS, because things can't go down, and I don't think we'd want to take on hosting our own production infrastructure as a competency. But for the R&D stuff, we are looking into just buying a ton versus renting, because it'd be a lot more cost efficient. Instead of basically paying each year for the same compute, you just buy it once and then you just pay for the electricity and server hosting costs and maintenance costs that come with that.
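The checkpoint-as-you-go idea Dylan mentions is roughly this pattern in PyTorch (a minimal sketch; the model, optimizer, and training loop are placeholders):

```python
# Minimal sketch: checkpoint each epoch so a crashed training run can resume.
import os

import torch

CKPT_PATH = "ckpt.pt"  # placeholder path


def save_checkpoint(model, optimizer, epoch):
    torch.save({"epoch": epoch,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, CKPT_PATH)


def load_checkpoint(model, optimizer):
    if not os.path.exists(CKPT_PATH):
        return 0  # nothing to resume from, start at epoch 0
    ckpt = torch.load(CKPT_PATH, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["epoch"] + 1


model = torch.nn.Linear(10, 1)               # placeholder model
optimizer = torch.optim.Adam(model.parameters())
start_epoch = load_checkpoint(model, optimizer)

for epoch in range(start_epoch, 100):
    # ... the real training step for this epoch would go here ...
    save_checkpoint(model, optimizer, epoch)
```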
33:03 Yeah. Maybe find a big office building and offer to heat it for free in the winter by just running on the inside.
33:09 There's this, like, you can run nvidia-smi. I don't know if you play around with GPUs at all, but you can see what the temperature of the GPU is. And sometimes, I remember a while ago when I was training some of these models, I would just look at what the temperature is during training, and yeah, it gets so hot. And these data centers have to have all this special cooling infrastructure to keep the machines cool. It's pretty environmentally unfriendly.
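The temperature check Dylan describes is just an nvidia-smi query; wrapped in Python it might look like this small sketch (assumes an Nvidia GPU and nvidia-smi on the PATH):

```python
# Minimal sketch: read GPU temperatures by shelling out to nvidia-smi.
import subprocess

out = subprocess.run(
    ["nvidia-smi", "--query-gpu=temperature.gpu",
     "--format=csv,noheader,nounits"],
    capture_output=True, text=True, check=True)
temps = [int(line) for line in out.stdout.splitlines()]
print("GPU temperatures (C):", temps)
```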
33:32 Yeah, to the extent that people are creating underwater data center nodes and putting them down there and just letting the ocean be the heat sink. It's crazy.
33:45 We could buy some land in Antarctica and put our stuff there. That's like the GitHub...
33:52 The Arctic code thing. I forget what it's called.
33:54 Yeah, the Arctic code vault.
33:58 Yeah. So we could do something like that for our GPUs when we get bigger. That's the dream. That's where I might nerd out.
34:03 There you go.
34:06 I think we have somewhere like maybe 200 GPUs that we use just for R&D and training. And we're getting a lot more, because a lot of times there are scheduling bottlenecks. So two researchers want to run a model and need a bunch of compute to be able to do that, and they're both good ideas. You don't want to have to wait four weeks for someone to run their model because the compute is taken. So we're trying to unblock those scheduling conflicts by just getting more compute. And for the production side, we deploy everything in AWS right now and onto smaller GPUs, because a lot of our models do inference on GPU. Still, some of our models do inference on CPU.
34:47 Interesting. To evaluate the stuff, it still uses GPUs, the current models you've created, correct?
34:53 Yeah, we could run it on CPU, but it's just not as parallelizable as running it on GPUs. There's a lot of work that we could probably do to get it really efficient so that we're running it on as few CPU cores as possible. But one of the problems is, almost every three to four months we're throwing out the current neural network architecture and using a different one that is giving us better results. Sometimes we'll make the model bigger, or there'll be a small tweak in the model architecture that yields better results, but a lot of times it's like, okay, we've kind of iterated within this architecture as much as we can, and now to get the next accuracy bump, we have to go to a new architecture. We're undergoing that right now. We released one of our newer speech recognition models, I think, like three months ago, and the results are really good. But now we have one that is looking a lot better, and it'd be a completely different architecture. And so it's just that trade-off of, do you spend a bunch of time optimizing the current model that you have and trying to prune the neural network and do all these optimizations to get it really small? Or do you just spend that research effort and that energy focused on finding the next accuracy gain? And because we're trying to win customers and grow our revenue, it's just, all right, let's just focus on the next model. And when we have a big enough team, or when we can focus on it, we'll work on making the models smaller and more compute efficient and less costly to run. But right now, yeah, our speech recognition model does inference on a GPU. There are a couple of our NLP-related models, like our content moderation model, that do inference on the GPU. And then there's our automatic punctuation and casing restoration model that runs on a CPU, because that's not as compute intense. So it really varies.
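As a rough illustration of that GPU-versus-CPU inference split, here is a minimal PyTorch sketch with a placeholder model, not Assembly's actual serving code: the same model runs on whichever device its compute profile justifies.

```python
# Minimal sketch: run the same model on GPU if available, otherwise CPU.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(128, 64).to(device).eval()   # placeholder model

with torch.no_grad():
    batch = torch.randn(32, 128, device=device)      # batches parallelize well on GPU
    out = model(batch)
print(out.shape, "on", device)
```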
36:37 Yeah. It's pretty interesting to think about how you're optimizing the software stack and the algorithms and the libraries and whatnot when you're not doing something that's changing so quickly.
36:50 If it's working, you can kind of just leave it alone. Right. I've got some APIs I think they're built either in pyramid or Flask. Sure, it would be nicer to rebuild them in FastAPI, but they're working fine. I have no reason to touch them. Right. So there's not a huge step jump I'm going to take. They're not under extreme load or anything. Right.
37:14 This portion of Talk Python to Me is brought to you by Sentry. How would you like to remove a little stress from your life? Do you worry that users may be encountering errors, slowdowns or crashes with your app right now? Would you even know it until they sent you that support email? How much better would it be to have the error or performance details immediately sent to you, including the call stack and values of local variables and the active user recorded in the report? With Sentry, this is not only possible, it's simple. In fact, we use Sentry on all the Talk Python web properties. We've actually fixed a bug triggered by a user and had the upgrade ready to roll out as we got the support email. That was a great email to write back. Hey, we already saw your error and have already rolled out the fix. Imagine their surprise, surprise and delight your users. Create your Sentry account at 'talkpython.fm/sentry' and if you sign up with the code 'talkpython' all one word.
38:09 It's good for two
38:10 free months of Sentry's business plan, which will give you up to 20 times as many monthly events as well as other features. Create better software, delight your users, and support the podcast.
38:22 Visit talkpython.fm/sentry and use the coupon code talkpython.
38:29 But in your world, there's so much innovation happening around the models that you do have to think about that. So how do you work that trade-off? How do you decide, well, could we get more out of what we've got, or should we abandon it and start over? Right, because it is nice to have a very polished and well-known thing as well.
38:47 Definitely. And every time you throw out an architecture and implement a new architecture, you've now got to figure out how to run that architecture at scale. And you don't want to have any hiccups for your current customers or users of your API, which sometimes happens, because these models are so big that you can't just write this model as one service that sits on a GPU and does everything. You have to break it up into a bunch of component parts so that you can run it efficiently at scale. So there are like eight, nine microservices for a single model, because you break out all these different parts and try to get it running really efficiently in parallel. But it does beg the question of, how do you build good CI/CD workflows and good DevOps workflows to get models into production quickly? And this is something that we're working on right now and trying to solve. A lot of times we have better models and we sit on them for two, three weeks, because to get them into staging, we have to do load testing, see if anything with the scaling has to change because the model profile is different, check if there are any weird edge cases that we didn't see during testing. So it slows down the rate of development, because it's hard to do CI/CD. It's not like, okay, run these tests, the code works, go. There are compute profile changes that happen, and so maybe you need a different instance type, or you need way less CPU.
40:04 But way more RAM. So if you actually deploy it, it's going to crash or something.
40:07 Exactly. And then doing that at scale, you have to profile it and do load testing. And so really, we're trying to figure out how to get these models into production faster. And I think the whole MLOps world is so in its infancy around things like that. And it's a lot of work. Yeah, it's a lot of work. So for us, the trade-off, though, is always, our customers and developers, they just want better results, always more accurate results. And so we're always working on pushing our models, making them more accurate. When we can iterate within a current architecture, great. Sometimes you can just make the model bigger or make a small change and then you get a lot of accuracy improvements, and it's just what we call a drop-in update, where there are no code changes. It's just literally, the model that you're loading is different and then it's just more accurate.
40:53 Right. That's easy.
40:54 That's the dream. It's just a drop-in, but that's maybe like 30% of updates. The other 70% are, okay, you've got a new architecture, or it's got a pretty different compute profile, so it uses a lot more RAM, or it's a lot slower to load in the beginning, so we need to scale earlier because instances come online later and become healthy later. So there are all these things you have to think about. Yeah.
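The "drop-in update" case is roughly this in PyTorch, a minimal sketch with a placeholder model: the architecture and serving code stay the same, only the weights file changes.

```python
# Minimal sketch: a drop-in model update, where only the weights file changes.
import torch

# Pretend this file was produced by the latest, more accurate training run.
model = torch.nn.Linear(16, 4)                  # stand-in for the real architecture
torch.save(model.state_dict(), "model_v2.pt")

# Serving side: identical architecture and code, just a different weights file.
serving_model = torch.nn.Linear(16, 4)
serving_model.load_state_dict(torch.load("model_v2.pt", map_location="cpu"))
serving_model.eval()
```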
41:17 The whole DevOps side of this sounds way more interesting and involved than I realized. Yeah, it's painful too.
41:23 I mean, I can't explain how many graphs we have in Datadog, just monitoring things all day. And luckily I don't have to work on that anymore. That was very stressful when I was owning the infrastructure. Now we have people that are better at it than me. We had two DevOps people start on Monday, but yeah, DevOps is a huge piece of this.
41:42 Yeah, that's quite interesting. I do want to just circle back to one real quick thing. You talked about buying your own GPUs for training, and people might hear that and be like, who would want to go and get their own hardware in the day of AWS and whatever, right? Like, it just seems crazy, but there are certainly circumstances. Here's an example that I recently thought about. So there's a place called Mac Stadium where you get Macs in the cloud. How cool, right? So maybe you want to have something you could do extra things with. And what does it cost? Well, for a Mac Mini, it's $132 a month.
42:14 Is that higher level?
42:15 Well, the whole device, if you were to buy it, costs $700.
42:20 Yeah, that's true.
42:22 And I suspect that even though the GPUs are expensive, there's probably something where, if you really utilize it extensively, it actually makes sense to buy it. It stops making sense in ways that people might not expect.
42:33 Yeah, to buy it, you mean. Right, like it stops making sense to rent. Yeah, that's what we're doing.
42:37 It's not making sense to rent it in the cloud.
42:38 Yeah. I mean, we spent a crazy amount of money renting GPUs in the cloud, and it's like, okay, if we had a bunch of money to make a capex purchase, right, like just shell out a bunch of money to buy a bunch of hardware up front, it would be so much better in the long run. Because it is similar to the example you made. If you don't have a lot of cash, or you're only going to use a Mac for a couple of months, right.
43:01 You need it for two weeks, then it doesn't make sense to buy it.
43:03 Correct.
43:04 You pay the $100 and you get right.
43:06 Or if you don't have like 2K, then you just rent it. If you don't have the money to buy a house, you rent an apartment, right? Things like that. So there are definitely benefits. And I think for most models, you don't need crazy compute, you can get away with less. Like, you could buy a desktop machine with two GPUs, or you could rent a dedicated machine, or still do it on AWS if you need like one or two GPUs, and it wouldn't be insane. So if you're just starting out, all those options are fine. But if you're trying to do big models or train a bunch in parallel, you need more compute, and it definitely doesn't make sense to use the big clouds for that. There are a bunch of dedicated providers that you can rent dedicated machines from, just pay a monthly fee regardless of how much you use it, and it's a lot more efficient for companies to do that.
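The back-of-the-envelope math behind that rent-versus-buy question, using the Mac Stadium numbers quoted a few minutes earlier, is just a break-even calculation:

```python
# Break-even months for buying versus renting, using the numbers quoted above.
purchase_price = 700   # roughly what the Mac Mini costs to buy outright
monthly_rent = 132     # the quoted monthly rental price

break_even_months = purchase_price / monthly_rent
print(f"Renting costs more than buying after ~{break_even_months:.1f} months")  # ~5.3
```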
43:59 Interesting. Give me your thoughts on sort of capex versus Opex for ML startups rather than, I don't know, some other SaaS service that doesn't have such computational stuff.
44:12 You got to buy a whole bunch of machines and GPUs and stuff versus Capex. Like, well, it's going to cost.
44:18 I feel like it's crazy. Things are more possible because you can get the stuff in the cloud, prove an idea, and then get investors, without going, well, let's go to friends and family and get $250,000 for GPUs. And if it doesn't work, we'll just do Bitcoin.
44:32 Yeah, definitely. I mean, we started in the cloud, right? So the first models we trained were on K80s on AWS and took like a month to train.
44:43 Wow.
44:43 It was terrible. So we started in the cloud, and now that we're fortunate to have more investment in the company, we can make these capex purchases. But yeah, the operating expenses of running an ML startup are also crazy. Payroll and AWS are our biggest expenses, because you run so much compute and it's super expensive. And what I talk about, and what we talk about, is there's nothing fundamental about what we're doing that makes that the case. It just goes back to that point of, do you spend a couple of months optimizing your models, bringing compute costs down, or do you just focus on the new architecture and kind of pay your way to get to the future? It's like this growth thing, and we're a venture-backed company, so there are expectations around our growth and all that. So we just focus on, okay, let's just get to the next milestone and not focus too much on bringing those costs down, because there's the opportunity cost of doing that. But eventually we'll have to.
45:42 Yeah. It's a little bit of the ML equivalent of sort of the growth thing: you can lose money to just gain users, but here what you gain is capabilities, right?
45:55 It is, yeah, it is 100%.
45:56 And then you'll figure out how to do it efficiently once you find your way.
46:00 Okay. And I'll give you a tangible example. I mean, we've been adding a lot of customers and developers on the API, and there are always new scaling problems that come up. And sometimes we're just like, look, let's just scale the whole system up. It's going to be inefficient, there's going to be waste, but let's scale it up, and then we'll fine-tune the auto scaling to bring it down over time, versus having to step into a more perfect auto scaling scenario that wouldn't cost as much but where there would be bumps along the way. So we just scaled everything up recently to buy us time to go work on figuring out how to improve some of this auto scaling.
46:37 Interesting.
46:38 Yeah.
46:38 You could spend two weeks trying to figure out the right way to go to production, or you could just spend more money to make it work, because you might not be sure, with the multiple-month life cycle of some of these things, is this actually going to be the way we want to stick with? So let's not spend two weeks optimizing it first.
46:57 Right.
46:57 Very interesting.
46:58 And I mean, like, look, not every company can make that decision. Like, if you are bootstrapped or you're trying to get off the ground, which a lot of companies are, you do have to make those calls. You can't just pay your way to the future.
47:09 Yeah. And I'm a big fan of bootstrapped companies and finding your way. I don't think you necessarily have to just set a ton of money on fire
47:17 Right.
47:18 as the only way forward. But if you have backers already, then they would prefer you to move faster, I suspect.
47:25 Correct. Yeah, correct, correct. Like, I always was self-conscious about our operating costs as an ML company, because they are high compared to other SaaS companies where you don't have heavy compute. But the investors we work with, they get it, like, okay, there's nothing that fundamental about this that requires these costs, you just have to spend time on bringing them down. And there's a clear path. It's not like Uber, where the path to bring costs down is, like, self-driving cars, because it's expensive to employ humans, and that's so far down the road. But for us, it's like, okay, we need to just spend three months making these models more efficient and they'll run a lot cheaper. But it's that trade-off. I love bootstrapped companies, too. I mean, it's just a different way to do it.
48:11 There's something special about, like, you're actually making a profit, you actually have customers, and knowing that, and the freedom, for sure. So you probably saw me mess around with the screen here, pull up this Raspberry Pi thing. There's a question out in the audience that says, could you do this kind of stuff on a Raspberry Pi, a standard Raspberry Pi? I suspect absolutely not. Have you ever seen that there are water-cooled Raspberry Pi clusters?
48:38 I see that. That is crazy.
48:40 Is that insane?
48:41 That's insane.
48:42 What kind of compute are they getting on that?
48:44 It's pretty comparable to a MacBook Pro. That's, what, eight water-cooled Raspberry Pis in a cluster, and it's really an amazing device. But if you sort of consider it like a single PC with a basic Nvidia card, or a MacBook Pro or something like that, that's still pretty far from what you guys need. Like, how many GPUs did you say you were using to train your models? Yeah.
49:10 Yeah, 64 for the bigger ones. Yeah. In parallel. Yeah.
49:14 These are not small GPUs.
49:17 I'm going to maybe throw it out there for you and say probably no, maybe for the scikit-learn type stuff, but not for what you're doing. Not the TensorFlow, PyTorch stuff.
49:25 Yeah, not for training. But you could do inference on a Raspberry Pi. Like, you could squeeze a model down super tiny, like what they do to get some models onto your phone, and run that on a Raspberry Pi. If you get the model small enough, the accuracy might not be great, but you could do it. There's a lot of stuff happening around the edge, I think, yes.
49:45 The Edge compute the sort of ML on device type stuff.
49:48 Like a lot of the speech recognition on your phone now happens on device and not in the cloud.
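One way to "squeeze a model down" for a phone or a Raspberry Pi is quantization; here is a minimal PyTorch dynamic-quantization sketch with a placeholder model, not the technique any particular vendor actually uses:

```python
# Minimal sketch: shrink a model for edge inference with dynamic quantization,
# which stores Linear weights as int8. The tiny model here is a placeholder.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(256, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
)

quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8)

# The quantized model runs on CPU with a smaller footprint and faster int8
# matmuls, usually at a small cost in accuracy.
print(quantized)
```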
49:53 Sort of related to this. Like the new M1 chips and even the chips in the Apple phones before they come with neural engines built in, like multicore neural engines. Interesting for Edge stuff again, but not really going to let you do the training and stuff like that. Right.
50:11 I haven't done much iOS development, but I know there's like SDKs now to kind of get your neural networks on device and make use of the hardware on the phone. And definitely if you're trying to deploy yourself on the Edge, there's a lot more resources available to you.
50:27 It's a really good experience, because if you speak to your assistant or you do something and it says "thinking...", okay, well, I don't want that. Like, I'll just go do it myself if I've got to wait 10 seconds.
50:37 Right. But it happens immediately. And there's the Privacy aspect, too.
50:40 Yeah, absolutely. The Privacy is great.
50:42 Yeah. Like the wake word on the Alexa, I don't know if you noticed, but the wake words on the Alexa device, they happen locally. That runs locally. Although I've heard that when you say Alexa, they verify it in the cloud with a more powerful model. Interesting, because sometimes it'll trigger and then shut off. I don't know if you've ever seen that happen.
51:00 Yes. It'll spin around and go, no, that wasn't right.
51:03 Yeah, exactly. I think what's happening is that they're sending the wake words to the cloud to verify: did you actually say Alexa? Probably if the local model is below some sort of confidence level, it sends it up to the cloud, and then the cloud verifies, like, yes, start processing. But it is much faster from a latency perspective, although with 5G, I don't know, mobile Internet is so much faster now.
51:25 It's getting pretty crazy. Yeah, absolutely.
51:27 Sometimes I'll be somewhere. My WiFi is slow and I'll just tether my phone and it's like faster. Yeah.
51:33 If I'm not at my house, I usually do that. If I go to a coffee shop or an airport, I'm like, there's a very low chance that the WiFi here is better than my 5G, tethered.
51:41 Yeah, exactly.
51:45 Check Woody in the audience has a really interesting question that I think you can speak to because you're in this space right now, living it. What do investors look at when considering an AI startup, or maybe an ML startup? Not just specifically speech to text.
52:01 Yes, it's a good question. I think it really depends on, are you building a vertical application that makes use of AI? So you're building some call center optimization software where there's AI under the hood, but you're using it to power this business use case. Versus, are you building some infrastructure AI company, like we're building APIs for speech to text, or you're building a company that's exposing APIs for NLP or different types of tasks. I think it varies what they look at. I am not an expert in fundraising for AI startups, so I want to make that very clear.
52:37 So maybe don't take my advice too seriously.
52:39 Yeah, but you've done it successfully, which is, I mean, there are people who claim to be experts but are not currently running a successful venture-backed company.
52:49 Sure.
52:49 I wouldn't put too much of a caveat there.
52:51 Yeah. I think we just got lucky with meeting some of the right people that have helped us. But I think it's like, yeah, are you doing something innovative on the model side?
53:01 Do you have some innovation on the architecture side? I actually don't really think the whole data moat is that strong of an argument personally, because there's just so much data on the Internet now.
53:11 Data moat being, like, we run Gmail so we can scan everybody's email, and that gives us a competitive advantage.
53:17 Something like that.
53:18 Exactly.
53:19 I don't know, you might get like a slight advantage, but there's so much data on the Internet and there's so many innovations happening around.
53:26 Look at GPT-3 that OpenAI put out, right? That was just a huge model trained on crazy amounts of public domain data on the Internet, and it works so well across so many different tasks. So even if you had a data moat for a specific task, it's arguable that GPT-3 could beat you at that task. So I think it depends what you're doing. But I don't personally buy into the whole data moat thing that much, because even for us, we're able to build some of the best speech to text models in the world, and we don't have some secret source of data. We just have a lot of innovation on the model side, and there's tons of data in the public domain that you can access now. So I think it's really about, are you building some type of application that is making the life of a customer, a developer, some startup easier, leveraging AI?
54:19 Are you solving a problem people will pay money to solve?
54:22 Yeah, exactly. Because I actually think it's more about the distribution of the tech you're building versus the tech itself. So are you packaging it up in an easy to use API? Or imagine you're selling something to podcast hosts that uses AI. The AI could be amazing, but if, like, the user interface sucks...
54:44 Here's what you do: you're going to make a POST request over to this, and you put this header in, and here's how you do paging. No, here's the library. In your language you call the one function and things happen, right? Like, how presentable or straightforward do you make it? Right, right.
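A minimal sketch of that "here's the library, call one function" idea. The URL, header name, and response shape here are made up for illustration, not a real service.

```python
import requests

API_URL = "https://api.example.com/v1/transcript"  # hypothetical endpoint

def transcribe(audio_url: str, api_key: str) -> str:
    """Hide the POST request, headers, and error handling behind one call."""
    resp = requests.post(
        API_URL,
        headers={"authorization": api_key},
        json={"audio_url": audio_url},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["text"]

# The caller never thinks about headers or paging:
# text = transcribe("https://example.com/episode.mp3", api_key="...")
```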
54:58 Because I actually think that's a huge piece of it. Are you making it easy? Is the distribution around the technology really powerful, and do you have good ideas around that? So I think it's a combination of those things. But to be honest, it really depends on what you're building and what the product is or what you're doing. Because it varies.
55:17 It varies a lot.
55:18 Yeah. There's also the part that we as developers don't love to think about: the marketing and awareness and growth and traction. You could say, look, here's the most amazing model, we just haven't actually got any users yet, and that is a really hard sell for investors unless they absolutely see it has huge potential. Whereas if you're like, look, we've got this many monthly users, and here's the way we're going to start to create a premium offering. That's something we're not particularly skilled at as developers, but it's a non-trivial part of any tech startup, right?
55:57 Oh, yeah. And I think as a developer, too, you kind of shy away from wanting to work on that, because it's so much easier to just write code or build a new feature versus go solve this hard marketing problem, or go do marketing or sales.
56:09 You've got to have them, even if you're bad at them or you don't like it.
56:12 Yeah. We're fortunate that we get to market to developers, so I enjoy it because you get to talk to developers all the time. But, yeah, that's a huge piece of it, too. Definitely.
56:23 It's going to all come together. Yeah.
56:25 Let's wrap this up a little bit. We're getting sort of near the end. But let's talk about this: you've got your idea, you've got your models, you've got your libraries, you've trained them up using your GPUs. Now you want to offer it as an API. Like, how do you go to production with a machine learning model and do something interesting? You want to talk about how that works? I know you talked a little bit about running in the cloud and whatnot, but do you offer it as an API over Flask, or how do you run it?
56:50 What are you doing there?
56:51 Are they Lambda functions?
56:52 Yeah, it's a good question.
56:53 What's your world look like?
56:54 So we have asynchronous APIs where you send in an audio file and then we send you a webhook when it's done processing. And then we have real-time APIs over WebSocket, where you're streaming audio and you're getting stuff back over a WebSocket in real time. The real-time stuff is a lot more challenging to build. But for the async stuff, really what happens is, one of our main APIs was built in Tornado.
57:18 I don't know if you know it. Legacy, early async-enabled Python web framework, before asyncio was officially a thing.
57:26 Yeah. So I built the first version of the API in Tornado, so it's kind of still in Tornado for that reason. A lot of the newer things, the newer microservices, are built in FastAPI or Flask. And so for the asynchronous API, what happens is you're making a POST request. The API is really just like a CRUD app: it's storing a record of the request that you made, with all the parameters that you turned on or turned off, and then that goes into a database. Some worker that's like the orchestrator is constantly looking at that database, and it's like, okay, there's some new work to be done, and then it kicks off all these different jobs, all these different microservices, some over queues, some over HTTP, collects everything back, orchestrates what can be done in parallel and what depends on what to be done first. When all the asynchronous background jobs are done, the orchestrator pushes the final result back into our primary database, and that triggers you getting a webhook with the final results. So that's, in a nutshell, what the architecture looks like for the asynchronous workloads. There are tons of different microservices, all with different instance types and different compute requirements, some GPU, some CPU, all with different scaling policies. That's really where the hard part is. That's kind of the basic overview of how the asynchronous stuff works in production.
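A rough sketch of that asynchronous flow, assuming FastAPI for the CRUD layer. The endpoint path, field names, and the in-memory "database" are all stand-ins; the real system fans the work out to many GPU and CPU microservices instead of the placeholder step shown here.

```python
import time

import requests
from fastapi import FastAPI

app = FastAPI()
JOBS: dict[str, dict] = {}  # stand-in for the primary database

@app.post("/v2/transcript")
def create_transcript(params: dict) -> dict:
    """The API itself is mostly CRUD: store a record of the request and return."""
    job_id = str(len(JOBS) + 1)
    JOBS[job_id] = {"status": "queued", "params": params, "result": None}
    return {"id": job_id, "status": "queued"}

def orchestrator_loop() -> None:
    """Separate worker: poll for queued jobs, run the pipeline, fire the webhook."""
    while True:
        for job_id, job in JOBS.items():
            if job["status"] != "queued":
                continue
            job["status"] = "processing"
            # In the real system this fans out to many microservices (GPU and
            # CPU), some over queues, some over HTTP, honoring dependencies.
            job["result"] = {"text": "..."}  # placeholder final result
            job["status"] = "completed"
            webhook = job["params"].get("webhook_url")
            if webhook:
                # Completion is signaled back to the caller with a webhook.
                requests.post(webhook, json={"id": job_id, "status": "completed"})
        time.sleep(1)
```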
58:47 Yeah, very cool. Are you using Postgres or MySQL or something like that?
58:52 Postgres for the primary DB. Because we're on AWS, we use DynamoDB for a couple of things, like records we need to keep around. When you send something in, it goes to DynamoDB, and that's where we keep track of basically your request and what parameters you have on and off. That kicks off a bunch of things. But the primary DB is Postgres. Yeah, at this point it's getting pretty large.
59:16 There are a few billion records in there, because we process a couple of million audio files a day with the API. Sometimes I'll read on Hacker News, like, I think GitHub went down at one point because they couldn't increment the primary key values any higher, the int was overflowing.
59:35 Something like that.
59:36 In the back of my mind, I hope we're thinking about something like that, because it would be really bad if we came up against it.
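For a sense of scale, a hedged SQLAlchemy sketch with an illustrative table (not Assembly AI's actual schema): declaring the primary key as a 64-bit BIGINT up front is what keeps this from ever being the problem at a few million rows a day.

```python
from sqlalchemy import BigInteger, Column, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Transcript(Base):
    """Illustrative table only; names are made up for the example."""
    __tablename__ = "transcripts"
    # BigInteger maps to Postgres BIGINT (signed 64-bit). Rough headroom:
    # at ~3 million new rows a day, (2**63 - 1) / 3_000_000 is about
    # 3 trillion days, while a 32-bit key would last only about 715 days.
    id = Column(BigInteger, primary_key=True, autoincrement=True)
    text = Column(Text)

# Base.metadata.create_all(engine) against Postgres would emit BIGINT here.
```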
59:42 Do you store the audio content in the database, or does it go in, like, some kind of bucket, some object storage thing?
59:49 So we're unique in that we don't store a copy of your audio data, for privacy reasons, for you. So you send something in, it's sort of ephemeral, like in the memory of the machine that's processing your file. And then what's stored is the transcription text, encrypted at rest, because you need to be able to make a GET request to the API to fetch it. But then you can follow up with a DELETE request to permanently delete the transcription text from our database as well. So we try to keep no record of the data that you're processing, because we want to be really privacy focused and sensitive.
01:00:22 Some customers will toggle on that we keep some of their data to continuously improve the models, but by default we don't store anything.
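A small sketch of that fetch-then-delete flow with the requests library. The endpoint paths here just illustrate the GET/DELETE pattern Dylan describes, so check the current Assembly AI docs for the exact API.

```python
import requests

BASE = "https://api.assemblyai.com/v2/transcript"  # see the docs for current paths
HEADERS = {"authorization": "YOUR_API_KEY"}

def fetch_then_delete(transcript_id: str) -> str:
    """Grab the finished transcript, then permanently delete it server-side."""
    # GET the transcription text (stored encrypted at rest on their side).
    resp = requests.get(f"{BASE}/{transcript_id}", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    text = resp.json()["text"]
    # Follow up with a DELETE so the text is removed from their database too.
    requests.delete(f"{BASE}/{transcript_id}", headers=HEADERS, timeout=30).raise_for_status()
    return text
```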
01:00:31 Yeah, that's really cool. Yeah, that's good for privacy. It's also good for you all, because there's just less stuff that you have to be nervous about when you're trying to fall asleep. What if somebody broke in and got all the audio? Oh, wait, we don't have the audio. Okay, so that's not a thing they could get. Things like that, right?
01:00:46 Yeah, definitely.
01:00:50 I hadn't thought about that before, but now I'm imagining what that would be like.
01:00:54 Now you're going to be nervous, because there's probably other stuff like that.
01:00:56 Yeah. Now you got me thinking of that in that space. Like, what are those things we need to lock up now?
01:01:03 We're mostly a team of engineers. Of the 30 people, I think like 70% are engineers, with a lot more experience than me. So we're doing everything by the book, especially with the business that we're in.
01:01:15 Yeah, of course, Dylan, I think we are out of time, if not out of topic. So let's maybe wrap this up a little bit with the final two questions and some packages and stuff. So if you're going to work on some Python code, what editor are you using these days?
01:01:29 I'm still using Sublime.
01:01:31 Right on.
01:01:32 What do you use?
01:01:32 The OG easy ones. I'm mostly PyCharm. If I want to just open a single file and look at it, I'll probably use VS Code for that, when I just want to open that thing and not all the project stuff around it. But if I'm doing proper work, probably PyCharm these days.
01:01:47 Yeah, that makes sense.
01:01:48 Yeah. And then a notable PyPI project.
01:01:51 Some library out there.
01:01:52 You've already talked about some, like TensorFlow and others, but is there anything out there you're like, oh, you should definitely check this out?
01:01:58 I would check out Hugging Face if you haven't yet. It's a pretty cool library.
01:02:01 Yeah.
01:02:02 Pretty cool library.
01:02:03 Yeah. Hugging Face seems like a really interesting idea.
01:02:05 Yeah.
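For the curious, a tiny taste of the Hugging Face Transformers library Dylan mentions: the pipeline helper grabs a pretrained model and runs it in a couple of lines. The exact model it downloads is the library's default for the task, not anything specific to this conversation.

```python
from transformers import pipeline

# Download a default pretrained sentiment model and run it.
classifier = pipeline("sentiment-analysis")
print(classifier("I really enjoyed this episode."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```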
01:02:06 I want to give a quick shout out to one as well. I don't know if you've seen this. Have you seen pls, "please," as an ls replacement?
01:02:14 Chris May told me about this yesterday.
01:02:16 Brian from Python Bytes, check this out. So it's a new ls that has icons and it's all developer focused. If there's a virtual environment, it will show that separately. If you've got a Python file, it has a Python icon. The things that appear in the list are controlled somewhat by the .gitignore file and other things like that. And you can even do a more detailed listing where it will show, like, the git status of the various files. Isn't that crazy?
01:02:41 That's really cool. Yeah.
01:02:43 That's a Python library. Pls.
01:02:44 Pls. That's awesome. I'll check that one out.
01:02:46 Yeah, people will check that out.
01:02:48 Yeah. All right.
01:02:49 Dylan, thank you so much for being on the show. It's really cool to get this look into running ML stuff in production and whatnot. Thanks for having me on. Yeah, you bet. You want to give us a final call to action? For people interested in maybe doing an ML startup, or even if they want to do things with Assembly AI.
01:03:06 If you want to check out our APIs for automatic speech to text, you can go to our website, 'assemblyai.com', get a free API token. You don't have to talk to anyone, you can start playing around. There are a lot of Python code samples that you can grab to get up and running pretty quickly. And then, yeah, if you're interested in ML startups, one of the things that I always recommend, if you want to go the funding route, is to definitely check out Y Combinator as a place to apply, because that really helped us get off the ground. They help you out with a lot of credits around GPUs and resources. It helped us a lot.
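Along the lines of the code samples Dylan mentions, here is a minimal submit-and-poll sketch. The endpoint and field names follow the v2 API as documented around the time of the recording, so treat it as a starting point and check the current docs.

```python
import time

import requests

HEADERS = {"authorization": "YOUR_API_KEY"}  # free token from assemblyai.com

# Kick off a transcription job for a publicly reachable audio file.
job = requests.post(
    "https://api.assemblyai.com/v2/transcript",
    headers=HEADERS,
    json={"audio_url": "https://example.com/episode.mp3"},
    timeout=30,
).json()

# Poll until the asynchronous job finishes (a webhook_url works too).
while True:
    result = requests.get(
        f"https://api.assemblyai.com/v2/transcript/{job['id']}",
        headers=HEADERS,
        timeout=30,
    ).json()
    if result["status"] in ("completed", "error"):
        break
    time.sleep(3)

print(result.get("text"))
```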
01:03:41 You were in 2017, something like that?
01:03:44 Yeah, 2017, so super helpful, and I would highly recommend that. There's also just a big community of other ML people that you can get access to through that. So that really helped, and I would recommend people check that out.
01:03:57 How about if I don't want to go that route, if I want to remain unfunded?
01:04:00 Yeah. So, one more: there's also an online accelerator called 'Pioneer'. I don't know if you've heard of it, but that's also a good one to check out. If you don't want to go the accelerator route, then I would say, really, it's just about getting a model working well enough to close your first customer, then just keep iterating. So don't get caught up in reaching state of the art or in the research. Just think of the MVP model that you need to build, then go win your first customer, and keep going from there.
01:04:30 Yeah. Awesome.
01:04:31 All right.
01:04:32 Well, thanks for sharing all your experience and for being here.
01:04:34 Yeah. Thanks for having me on. This is fun.
01:04:36 Yeah, you bet. All right.
01:04:37 Bye.
01:04:39 This has been another episode of Talk Python to me. Thank you to our sponsors. Be sure to check out what they're offering.
01:04:45 It really helps support the show.
01:04:47 For over a dozen years, the Stack Overflow podcast has been exploring what it means to be a developer and how the art and practice of software programming is changing the world. Join them on that adventure at talkpython.fm/stackoverflow. Take some stress out of your life: get notified immediately about errors and performance issues in your web or mobile applications with Sentry. Just visit talkpython.fm/sentry and get started for free, and be sure to use the promo code 'talkpython', all one word. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm. Be sure to subscribe to the show: open your favorite podcast app and search for Python, we should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.
01:05:48 We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at 'talkpython.fm/youtube'. This is your host, Michael Kennedy. Thanks so much for listening.
01:06:02 I really appreciate it.
01:06:03 Now get out there and write some Python code.