#359: Lifecycle of a machine learning project Transcript
00:00 Are you working on or considering a machine learning project?
00:03 On this episode, you'll meet three people from the MLOps community, Demetrios Brinkmann, Kate Kuznikov, and Vishnu Rachakonda.
00:10 They're here to tell us about the lifecycle of a machine learning project.
00:13 We'll talk about getting started with prototypes and choosing frameworks,
00:16 the development process, finally deployment and moving into production.
00:20 This is Talk Python to Me, episode 359, recorded March 22nd, 2022.
00:25 Welcome to Talk Python to Me, a weekly podcast on Python.
00:42 This is your host, Michael Kennedy.
00:44 Follow me on Twitter where I'm @mkennedy and keep up with the show and listen to past episodes at talkpython.fm.
00:50 And follow the show on Twitter via @talkpython.
00:53 We've started streaming most of our episodes live on YouTube.
00:57 Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.
01:04 This episode is brought to you by Sentry and their awesome error monitoring product,
01:10 as well as the Stack Overflow podcast, bringing you stories about software development.
01:14 Transcripts for this and all of our episodes are brought to you by Assembly AI.
01:19 Do you need a great automatic speech-to-text API?
01:22 Get human-level accuracy in just a few lines of code.
01:24 Visit talkpython.fm/assemblyai.
01:27 Kate, Vishnu, and Demetrios.
01:30 What's up?
01:31 Welcome to Talk Python to Me.
01:32 Hey.
01:32 How you doing, man?
01:33 Great.
01:34 Hey, I'm doing great.
01:34 It's fantastic to have you all here.
01:36 I'm psyched to talk about ML Ops and your community and some best practices and tooling
01:43 and all that kind of stuff.
01:44 And I think it's going to be a lot of fun.
01:46 So hopefully you're looking forward to it as well.
01:48 Yeah, for sure.
01:49 Yeah, for sure.
01:50 So think about one of the hottest areas of technology right now, whether it's trying to
01:55 get a job or it's VC funding or whatever, right?
01:58 The machine learning and AI is like one of the peak buzzwords right now.
02:03 And so I think this is going to be a fun conversation both to talk about this hot topic, but also
02:07 maybe to demystify it a little bit.
02:09 Yeah, I hope we can do that.
02:11 I think we can.
02:11 I definitely think we can.
02:12 And you all built this cool community, mlops.community, which we're going to talk about a lot.
02:18 But before we get into that topic, let's just start with your background and how you got into
02:24 machine learning and all this stuff.
02:25 Kate, let's start with you.
02:26 All right.
02:26 So my background is probably not so conventional for the field, although nothing is conventional
02:32 for ML, I guess, at this point.
02:34 So originally, I actually started studying business and economics back in the day, and
02:39 then I slowly shifted into basically digital analytics for marketing and stuff like that,
02:44 at which point I realized that I'm more interested in the numbers than in influencing people.
02:49 So I went through a bootcamp and then one thing led to another through consulting, internships
02:54 and so forth.
02:55 Once you get your foot in the door, I guess it's easier to establish yourself.
02:58 And then I ended up continuing on that path, mostly data science.
03:03 So not crazy robots or anything.
03:06 It's still very close.
03:08 No self-driving cars right now.
03:09 No self-driving cars yet.
03:10 So it's still mostly basically ML for commercial purposes, like working with sales data, with
03:17 text data and so forth.
03:19 But that was it.
03:20 I think there's a lot of people that come from economics in that area and get into computation
03:26 and then get into data science.
03:27 It's a bit of a gateway, I think.
03:29 Yeah.
03:29 And I mean, economics is also kind of this easy choice of subject.
03:33 When you're finishing school, you're kind of like, I can do math, so I'm not going to
03:37 do just social science.
03:39 So, OK, let's go for economics.
03:40 And then you're like, what can I actually do with it?
03:43 Do I want to work in a bank or maybe something else?
03:46 Yeah.
03:46 Yeah, exactly.
03:47 I had a similar experience with my math degree, but.
03:49 Exactly.
03:50 Yeah.
03:51 Fantastic.
03:51 So you said you did a bootcamp, which I think is interesting.
03:54 When I think bootcamps, I think JavaScript front end, not necessarily the best thing to
04:01 just produce a whole world full of JavaScript front end folks.
04:04 But that's sort of the main bootcamp story.
04:07 It sounds like yours was in data science.
04:09 Was that a good experience?
04:10 What was that like?
04:11 So we actually did have those paths as well with Java and front end.
04:15 I just decided to go for the data science one.
04:19 I think they even called it big data in AI or something like that, but technically it was
04:23 just like pure nice Jupyter notebooks where you are massaging the data a little bit and
04:28 then running your first couple of models.
04:29 It was very nice in a sense that it was like all of us struggling together was very intense.
04:35 So you kind of can test drive your idea of whether it's actually for you or not in a very short
04:41 time.
04:41 And ours lasted like, I think three or four weeks.
04:44 And after that internship, the same company, sorry, bootcamp would get you into an internship.
04:49 If you pass the whole bootcamp, you pass the last test, do the project and then do the interview,
04:55 you can be a candidate there.
04:57 And I think that was very appealing to me because it had some consequence.
05:00 It wasn't just, oh, let's try this out.
05:02 And then I don't know what I'm going to do.
05:04 Who's going to hire me, whatever.
05:05 It was very well thought out.
05:07 So in that sense, I think it's a great program that they have put together.
05:12 I think they're not doing it anymore, like specifically for data scientists, because they
05:17 found out it's very hard, even in consulting, to employ very, very junior data scientists and
05:23 get value out of them.
05:24 So now they're training people in data engineering and then a little flavor of data science so
05:31 that later you can develop the skill and maybe apply that knowledge down the road.
05:35 Yeah.
05:36 Interesting.
05:36 I do think it probably is a little bit challenging to get in as a junior because there's not a
05:40 huge team of data scientists a lot of times.
05:43 So interesting.
05:43 Yeah.
05:44 How about you?
05:45 Yeah.
05:45 So I fell into this in a very, very happenstance kind of way, which was I was
05:53 working on the sales side of a company that was selling tools to machine learning engineers.
05:58 And so I was that guy that would spam people.
06:02 And so if I spammed you, please forgive me back in the day.
06:06 If I tried to reach out to you on LinkedIn and I tricked you into connecting with me, that
06:10 was my past life.
06:12 I have repented for my sins.
06:14 And now I do the community stuff full on.
06:19 But yeah, the company was doing ML tooling and we were really focused on provenance and
06:25 trying to be able to reproduce your runs or your data, your models, everything that comes
06:31 along with the modeling side of machine learning.
06:34 And so what happened was though, when the pandemic hit, that company went out of business.
06:41 And just about like three weeks before the company went out of business, nobody was picking
06:47 up the phone.
06:47 Nobody was connecting with me on LinkedIn.
06:49 And so our CEO said, why don't we try and do something around a community?
06:54 So people don't want us to come to them.
06:55 Maybe we can make a place for them to come to, right?
06:57 Exactly.
06:58 Like, and so our CEO, I got to hand it to him.
07:00 His name's Luke Marsden, back in those days.
07:03 And he was the CEO of this company called Dotscience, which is now defunct.
07:07 He said, you know what?
07:09 He has open source in his blood.
07:11 Like he can't help but do the open source thing.
07:15 Even though this company wasn't open source, he wanted the community to be very open source
07:20 and vendor neutral and open to anyone.
07:22 And in the beginning, I didn't understand that.
07:24 I was like, wait a minute, you're going to let our competitors come in here.
07:27 And not only that, you're going to let them give a talk?
07:30 No way, man.
07:31 This is crazy.
07:32 And so he was all about it.
07:34 And he helped me to understand like the value of community and what it's about.
07:37 And then because of that, those first like weeks, I just was interviewing people as a podcast
07:45 host because I needed to ramp up.
07:47 And so I was interviewing people.
07:49 Basically, I was learning on the air and I was talking to different machine learning engineers
07:53 when we would have the meetups and we would run it like almost like a podcast slash meetup.
07:58 And it would be a live podcast, kind of not too dissimilar to what we are doing right now.
08:03 And then I was able to talk to enough people and then more people started joining the community.
08:08 And then I started learning more about it.
08:10 And now here we are two years later and I'm still running with the community.
08:15 It became something that is like my pride and joy.
08:17 And I can say it's part of my identity now, which I don't know if that's a good or bad thing,
08:22 but it's part of it.
08:23 It sounds really fun.
08:25 And these communities are super fun to build.
08:27 And two years, that's a long time to be building.
08:29 And so it's the kind of stuff that doesn't necessarily explode overnight.
08:33 But if you sort of take a long view and then look back, you're like, wow, look what we built.
08:38 That's pretty neat.
08:39 Yeah, totally.
08:39 I mean, and Vishnu was in it at the very beginning and Vishnu and I talked quite a bit.
08:43 And it was like when there was, I remember in the first couple months, I was sitting there and thinking, wow, if we hit 600 people in Slack, that will be like
08:55 incomprehensible.
08:56 And so, yeah, then it's kind of like you said, it's just grown from there.
09:03 And you look back and you go, how did we get here?
09:05 Wow.
09:06 This is incredible.
09:07 Yeah.
09:07 I think one of the things that's easy to lose sight of for creators and community members
09:12 or organizers and stuff, it's really easy to get tied up.
09:16 And you look at like a famous YouTube person with a million views on their videos,
09:20 or you look at some crazy TikTok thing or something.
09:23 But 600 people, that's pretty close to a keynote for a lot of conferences.
09:28 That's like the premier group where you try to get them all together.
09:33 Like that's pretty, pretty amazing.
09:35 So that's so true.
09:36 Which is awesome.
09:37 Yeah.
09:37 Vishnu?
09:38 How about you?
09:39 Well, how I got started in ML, I think if I really had to trace it back, it was maybe
09:44 around 2017, you know, around that time, I was really kind of thinking about what I should
09:51 try and do in my career.
09:52 I was really passionate about, I am continuing to be very passionate about healthcare and biotech
09:57 and sort of life sciences field.
09:59 And I was reading a lot and kind of thinking, what do I want to do with my career?
10:02 It was maybe, you know, kind of similar to Dimitri.
10:05 I was actually contemplating some of the very more, sales or business development or product management side of thing in the industry.
10:11 But I was like, you know, it doesn't feel tangible enough.
10:13 It doesn't feel like I'm actually helping make people healthier in a very hands-on way.
10:18 And so I started to read more papers as part of a master's program on the impact of machine
10:23 learning in, you know, health, healthcare and biotechnology and decided, okay, this is what
10:27 I want to do.
10:28 And kind of just spent some time teaching myself how these things work, took some courses,
10:32 became a machine learning engineer at a medical device company.
10:35 And quickly after starting that realized, well, there's a lot more to doing model.
10:39 That, you know, a lot more to machine learning than just doing model.fit.
10:42 And that's when I joined the MLOps community back when it was about 300 people in May of 2020,
10:47 you know, have been in it since.
10:51 Demetrios and I are pretty, pretty tied to that.
10:51 And that's kind of tracked my growth as a machine learning engineer and understanding,
10:54 you know, okay, how do we go from model to model and production to system to, you know,
11:00 business value.
11:01 And that's kind of been my journey in terms of this industry to date.
11:04 Yeah.
11:05 Super neat.
11:05 I think around 2017 was certainly when the promise of machine learning for healthcare really
11:12 started to gain public awareness.
11:14 You know, you had like, I don't remember exactly the timing, but like that XPRIZE around mammography
11:19 images and stuff like that.
11:21 We're like, oh, this is starting to beat doctors, right?
11:23 And it's automatic.
11:25 Like, could it be both fast and sort of, you know, beating radiologists estimates of whether
11:31 or not someone has cancer and things like that.
11:33 And that's pretty neat.
11:34 I haven't been tracking it that closely, those sorts of developments, but I'm sure it's just
11:39 grown from there.
11:39 Yeah.
11:39 I mean, I think around that time you had, it's a good memory with XPRIZE that was on, on sort
11:44 of available.
11:45 There was also a chest X-ray work that was open source from the NIH, big data sets around,
11:50 you know, what's the, what is the promise of machine learning modeling for that problem,
11:54 diagnosing pneumonia.
11:55 And then also AlphaFold came out around that time from DeepMind for biotech, protein engineering.
12:00 And so on all sorts of different vectors, you had a lot of energy and I guess I was, you
12:05 know, sucked in.
12:05 Yeah.
12:06 That's cool.
12:06 I suspect that the protein folding understanding is even more important really than you have this
12:11 problem.
12:12 But like here we can understand how to fix it maybe for sure.
12:14 Yeah.
12:14 Yeah.
12:15 I think the protein folding stuff, it's especially with the open sourcing of AlphaFold 2 recently,
12:20 the buzz is pretty incredible.
12:21 Yeah, absolutely.
12:21 Well, let's start off by talking a bit about the MLOps community some more, and then we can
12:27 dive into a couple of layers of best practices and machine learning in production and stuff like
12:34 that.
12:35 So maybe Demetrios, we talked a bit about it, but you want to introduce what it's all about
12:39 for folks?
12:40 Yeah.
12:41 And so we're really trying to just be like, this landing page is funny because it's something
12:48 that we put together, whatever, years ago.
12:51 And now we definitely need an upgrade.
12:54 And so we're doing that right now because I realized after I surveyed the whole community,
13:00 a lot of people are like, wow, we're in this community, we're in the Slack.
13:03 But we didn't realize that there is newsletters you got going on.
13:08 You've got roundtable discussions happening.
13:10 There's reading groups.
13:11 There's all kinds of different initiatives that people from the community are doing.
13:16 And we don't actually make that very clear for people when they join, right?
13:23 So maybe they know about the Slack or maybe they know about the meetups or the podcasts that
13:27 we have, but that's it.
13:29 And they don't necessarily get to go through all of that and have a better overview.
13:34 So that's what we're doing now.
13:35 And really like the community for us, it's trying to find this place or trying to define
13:42 and create this space where people can come together.
13:45 They can learn, they can engage with practitioners who are on the front lines and get their questions
13:51 answered, but also like meet others and network with people.
13:55 And we're doing that in this virtual space.
13:57 We're trying to really share, collaborate, learn, and trying to make it fun also.
14:04 So it's not just like numbers, crunching numbers and stale and boring.
14:09 This portion of Talk Python to me is brought to you by Sentry.
14:14 How would you like to remove a little stress from your life?
14:17 Do you worry that users may be encountering errors, slowdowns, or crashes with your app
14:23 right now?
14:23 Would you even know it until they sent you that support email?
14:26 How much better would it be to have the error or performance details immediately sent to
14:31 you, including the call stack and values of local variables and the active user recorded
14:36 in the report?
14:37 With Sentry, this is not only possible, it's simple.
14:40 In fact, we use Sentry on all the Talk Python web properties.
14:44 We've actually fixed a bug triggered by a user and had the upgrade ready to roll out
14:49 as we got the support email.
14:50 That was a great email to write back.
14:52 Hey, we already saw your error and have already rolled out the fix.
14:56 Imagine their surprise.
14:57 Surprise and delight your users.
14:59 Create your Sentry account at talkpython.fm/sentry.
15:04 And if you sign up with the code talkpython, all one word, it's good for two free months of Sentry's
15:10 business plan, which will give you up to 20 times as many monthly events as well as other
15:15 features.
15:15 Create better software, delight your users and support the podcast.
15:19 Visit talkpython.fm/sentry and use the coupon code talkpython.
15:26 I'll throw an idea out there.
15:27 You'll tell me what you think.
15:29 I already sort of hinted at this.
15:30 You know, for software developers, a lot of times they're on teams where there's, you know,
15:34 group five or 10 of them working together.
15:36 But I feel like that's less true for machine learning folks and more of a specialized role.
15:42 Not always.
15:43 I know there's teams and stuff, but a lot of companies have a data scientist or one or
15:47 two data scientists.
15:48 And it seems to me like communities like this would be even more valuable than,
15:53 say, a software development one, where it's maybe easier to find people in your environment,
16:00 at a meetup or at a company that you already work at or something like that.
16:03 Totally.
16:03 Totally.
16:04 I resonate with your description there, Michael, because that was me when I found the community.
16:08 Right.
16:09 You know, I was a machine learning engineer who was putting together some models that I
16:14 thought could be used better by the company that I was at, you know, could be integrated
16:18 into our hardware device a little bit better than I thought that, you know, than was currently
16:21 being done.
16:22 And I was trying to talk to our software engineers and realize like, hey, there's kind of a, just
16:27 like a translation gap.
16:28 Like, what is this field of like taking a model that exists and has some value in terms of
16:34 like accuracy and precision or whatever else.
16:36 And I turned that into, you know, an artifact that's consumed by software teams and then
16:40 ultimately by business teams.
16:41 And I read about ML Ops in 2020, but was really not clear what exactly that meant.
16:46 It felt more like a buzzword than it did like a field at that time.
16:49 And I joined the community and it essentially became real-time stack overflow for me, right?
16:54 Where people are asking and answering questions daily about how do you actually implement production
17:00 ML?
17:01 What does it look like?
17:02 What are the best practices?
17:03 And how can we help each other do that at work and be better ML practitioners day to
17:08 day?
17:08 So it's been great from that standpoint.
17:09 Yeah.
17:10 Fantastic.
17:10 Kate, how about you?
17:11 How did you come to the community?
17:13 I'm not actually sure exactly, but I think it was also like the whole pandemic.
17:17 You tried to connect to the world and including ML Ops.
17:20 And at the time, a friend of mine was also starting a community in Riga.
17:24 It wasn't ML Ops.
17:25 It was data science in my hometown.
17:27 Because, again, we couldn't meet in person.
17:29 So we did that.
17:31 And along the same time, we discovered that there are other Slack communities.
17:34 And it was just fun to...
17:36 We had a different focus.
17:38 So it was just fun to be kind of part of both and in different directions.
17:41 And I do agree with the stack overflow notion.
17:45 Like sometimes, especially if you're working alone on a problem, you just go there to feel
17:49 like you're not crazy, first of all.
17:50 To think that's what you're thinking.
17:53 And secondly, just basically, I can search right now.
17:55 There's like such a huge knowledge base because so many questions have been asked.
18:00 So I can go there to quickly validate some ideas, see which angles have been taken into consideration.
18:05 So in that sense, it was fun.
18:07 And definitely the networking aspect.
18:08 Luckily, Berlin is full of ML people.
18:11 And that's definitely one of the places you can reach out for questions.
18:15 I had some people reach out to me, ask about like, what are the companies to apply to?
18:20 Are there any jobs?
18:21 Do you want to refer me?
18:22 And vice versa.
18:23 So it's definitely fun.
18:24 Yeah, nice.
18:25 I don't know why this is, but you're right.
18:27 It does seem like Berlin and sort of generally Germany as well.
18:32 It's just a lot of times I'm talking to people about data science or machine learning.
18:36 I'm like, wait, you're both in Berlin?
18:38 It's like of all the places in the world, it does seem like a hotbed of machine learning.
18:44 That's pretty cool.
18:44 And now there's a lot of companies that just source their talent here and work in completely
18:49 different markets because they know that there are people here that people will be willing
18:53 to relocate here as well.
18:55 So that's definitely adding to the whole dynamic.
18:58 Yeah, absolutely.
18:58 It's a cool place to live.
18:59 And I think probably a lot of folks in Germany, I know the education system there, they end up
19:05 very good and strong backgrounds in math and stuff.
19:08 So, right.
19:09 Compared to other places in the world, like the US, certainly coming out of high school,
19:14 there, you come out with a very strong math background.
19:17 I know that much.
19:17 So pretty cool.
19:18 All right.
19:18 Well, let's start by talking about, I wanted to cover sort of y'all's thoughts on some different
19:25 layers or stages of a machine learning project.
19:28 So we could structure our conversation around that.
19:31 So I want to talk about prototyping, come up with ideas, choosing frameworks, things like
19:35 that.
19:35 And then talk about building the project with development and then running and maintaining
19:40 it over time, the whole ops sort of machine learning and production side of things.
19:45 So let's start with prototyping.
19:47 So I guess the first question I wanted to ask, and this might be, I don't know, y'all can laugh
19:52 at me or whatever, but deciding is a problem actually a machine learning problem?
19:56 Because there is so much hype around AI and machine learning that sometimes I think a series
20:02 of if statements is called artificial intelligence.
20:05 I remember there was this scandal, I think it was in the UK, there was some airline company
20:10 that was specifically finding families traveling together and booking them apart from each other
20:16 and then charging them money to book themselves back like a premium to so they could sit with
20:21 their family.
20:21 Like, sorry, your seat didn't get put next to your family.
20:24 Whoopsie.
20:25 But you can pay 50 bucks or 50 quid or whatever and get back by your family, right?
20:29 And it was in the lawmakers, they were like, they're using artificial intelligence to do this
20:34 terrible.
20:34 Like, that sounds like an if statement.
20:36 So, you know, like I think deciding, should I just encode some decision-making into software,
20:43 which we often call programming versus is it a real machine learning problem where neural
20:48 networks or NLP, like how do you all think about this is a machine learning problem versus
20:54 this is just a software problem, traditional style?
20:56 I'll just say one thing just before I let Vishnu and Kate jump in and it comes from one of the
21:03 community members.
21:04 Who has like a famous tweet and blog posts about like the first rule of machine learning is don't use
21:12 machine learning.
21:14 And so, if you can avoid it at all costs, it's probably better just because you're adding additional
21:20 complexity.
21:21 And if you don't need to do that, then it's going to be infinitely easier on you to not.
21:28 That being said, there are a lot of times where you do need to add it.
21:32 And there are a lot of businesses that are realizing business value from it.
21:36 So, that's like kind of a hand wavy answer, but I'll see if Vishnu or Kate have anything
21:41 more concrete.
21:42 Yeah, absolutely.
21:43 Also, I just want to give a quick shout out.
21:45 I did have Eugene on the podcast in March of last year.
21:50 We talked about the seven lessons that machine learning can teach us about life or something
21:54 like that, which is pretty, he's a very thoughtful guy.
21:56 Yeah, he's great.
21:57 He is awesome.
21:58 Yeah.
21:59 Your question is about whether or not to use machine learning versus maybe more traditional
22:03 approaches.
22:04 Is that correct?
22:05 Right, right.
22:05 Someone comes to you with a problem and says, we need to make our API do this or our website
22:10 do this or our app do that.
22:11 And how do you decide, like, this is a machine learning problem versus just, you know, you
22:15 just need to decide if statement or switch statement or whatever.
22:18 I deal with this problem a lot in my role right now.
22:21 I'm a data scientist at a healthcare services company, actually the first data hire.
22:25 And a lot of times I have people come to me that says, hey, this is a clinical population.
22:30 This is a group of people that we want to serve.
22:32 You know, how do you think about delivering our intervention to them?
22:37 I'm speaking in generalities, but I can talk a little bit more specifically to some examples
22:40 of this later on.
22:42 And to me, what matters most is driving value on a single business metric, because to be honest,
22:49 that's the simplest way that your company does better and you do better, right?
22:53 It's not through, let's say, implementing the most fancy algorithm.
22:57 You know, we're not working in research environments anymore when it comes to machine learning as
23:02 often as we used to, right?
23:03 If you're in a research role, be my guest.
23:05 Use the most advanced algorithm or whatever the coolest, latest model might be.
23:10 But for me, the number one thing is what is the business model metric I'm trying to drive?
23:15 Is that engagement for our population?
23:17 Is that some sort of like quality metric in the healthcare sense that I'm trying to change?
23:22 And then what is the simplest possible way that I can advance that number, right?
23:27 Is it just like a standard SQL query I run?
23:29 Is it, you know, some kind of templated SQL query or is it a machine learning model?
23:33 And the idea is simple is reliable and scalable, much more so than something complicated.
23:39 And that's kind of the way that I triage whether or not machine learning is the right intervention
23:44 for a problem.
23:45 And so to kind of recap, it's like, what's the business model metric I'm trying to push?
23:49 And then what's the simplest possible thing, knowing that it's something that needs to be
23:53 reliable and maintainable for the long term.
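For a concrete flavor of what that "simplest possible thing" can look like before reaching for a model, here's a minimal, hypothetical sketch (made-up file, columns, and thresholds): a rule-based baseline whose effect on the business metric can be measured first.

```python
import pandas as pd

# Hypothetical baseline: instead of training a model, rank members with a
# simple heuristic and measure whether outreach to the top group moves the
# engagement metric. All column and file names are made up for illustration.
members = pd.read_csv("members.csv")

members["priority"] = (
    (members["days_since_last_visit"] > 90).astype(int)
    + (members["chronic_conditions"] >= 2).astype(int)
)

# The "intervention" list: the top-priority members, with no ML involved yet.
outreach = members.sort_values("priority", ascending=False).head(500)
outreach[["member_id", "priority"]].to_csv("outreach_list.csv", index=False)
```

If the plain heuristic already moves the metric, the added complexity of a model has to earn its keep against that baseline.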
23:54 Yeah.
23:54 Excellent.
23:55 Kate?
23:55 I agree with Vishnu.
23:56 Well, my advantage as a newcomer to this career path, I guess, was that I was just unable to go very fancy in many of the cases.
24:07 So it was a no brainer to go, what is the simplest way and what is actually the way I can solve the problem?
24:14 And then just by having adopted that mentality early on, you kind of, you basically hack the problem, right?
24:19 And then also coming from more business marketing background, again, I'm not thinking about what is the fancy, shiny new tool that I want to use.
24:27 Sometimes I do now because I'm bored, but this is more for the cases where nobody cares what I'm really doing.
24:33 And then I show them something sparkly.
24:36 From my side, I would say it's very important to communicate the solution rather than thinking about whether it's an ML problem or not at the end of the day.
24:46 So if you give the people what they want and with the shine and sparkle that they want, they will forget whether there's AI behind it or not.
24:54 Having said that, though, in consulting experience, sometimes people really come to you and they're like, I want this predictive model and just like leave it or take it.
25:03 And you do it for them, even though it's not the most, the best approach, or especially if you try to explain, maybe that is not the best approach.
25:11 But sometimes people just need to, I don't know, maybe they have a KPI of having developed one and introduced one ML project into their teams.
25:21 And this is it, like whatever way.
25:23 So there are some cases, but I think most people can be reasoned with.
25:28 Yeah, the stockholders demand more AI.
25:30 So we need more AI.
25:31 I don't know what that means, but we definitely need it.
25:33 Yeah, that's great.
25:34 And I do resonate with the idea of just start simple and evolve.
25:39 Like so many people perceive software and software projects in the broader sense as I have to think about it for a long time and get it right.
25:48 Rather than let me try to build something over a day.
25:51 And if that doesn't work, I'll try something else.
25:54 Or we can evolve it over time.
25:56 Maybe that comes from back when software was harder to change and the tools were crummier or whatever.
26:01 Yeah, I think just jump in, try something and go with that.
26:04 Yeah, I wanted to say something real fast on machine learning, especially.
26:09 And this is something that I think a lot of people don't realize until later on.
26:15 And I see it time and time again from the veterans in the community coming and answering questions that are being proposed.
26:23 And that is like really taking it back when you're proposing a solution or when you're asking a question in the community.
26:31 It's like, but what is it exactly?
26:33 What is the business value here that you're trying to affect?
26:37 Because machine learning is so closely tied to business that you can't really...
26:43 It's like that Venn diagram is so overlapping on the business side.
26:48 And it also has the technology side.
26:50 But a lot of times we just take for granted that it's technology.
26:54 But you can't do that, right?
26:56 Like especially with machine learning.
26:58 Like if you're an SRE and you're just focusing on the SLOs and all of that, it's a lot easier to be like, okay, cool.
27:05 This is a very technology problem.
27:07 But when you're machine learning and like Vishnu said, like you really got to know what kind of metrics are you trying to move the needle on?
27:14 And how are you going to go about that?
27:16 So that was just something that I noticed comes up quite a bit from the veterans saying like, wait a minute, slow down.
27:24 Before you go for that next one, what is it that you're trying to do here?
27:28 Yeah.
27:28 Probably someone comes in with a question like, I'm trying to make PyTorch do this thing.
27:33 And I'm having a hard time.
27:34 Like, whoa, whoa, whoa.
27:35 But do you really need to do it?
27:36 Like, is that really?
27:37 Maybe it's cool or whatever.
27:40 But is that really what you need in this situation?
27:42 Could you just use Pandas?
27:43 That's exactly what it is.
27:45 And oftentimes it's something even more complicated than that, where it's like, I'm trying to have, you know, I'm trying to use Kubernetes to do this retraining job 100,000 times.
27:54 So I can see which of these models is most effective at this particular metric.
28:00 You're like, whoa, whoa.
28:01 You're about to undertake a huge engineering project with all kinds of complicated tools.
28:05 Why?
28:06 Yeah.
28:07 But if you ask why you send people into existential crisis, so that is the interest as well.
28:12 I guess the audience, Cam asks, or says rather, a lot of times I've had researchers propose algorithms that are too computationally complex to run at the speed we need for production.
28:23 And I think that's probably also an interesting thing to consider when you're doing the sort of prototyping, getting things started is maybe here's a algorithm.
28:33 I'll give you a great answer.
28:34 But, you know, Vishnu, you talked about training 100,000 ways, right?
28:38 Like, can you really spend $25,000 to decide, you know, on compute in the cloud or something to decide if this is even going to work, right?
28:47 Or could you do that in production and over the data you have or something, right?
28:50 What are your thoughts on this?
28:51 A lot of times, and I'd say like a lot of times machine learning, our default in terms of technical requirements is pretty unilateral, you could say, right?
28:59 It's like increased accuracy or the precision or whatever the score is.
29:03 But in software engineering and in other forms of engineering, they're much better at understanding multivariate requirements, right?
29:09 And I think that's where you see the tension, right?
29:12 The data scientist says, but this model works at the problem.
29:15 It does the thing it's supposed to.
29:16 It's better on this one thing.
29:17 And the engineer says, well, wait, no, there are a million things that I think about.
29:20 I need to think about latency.
29:21 I need to think about my capability.
29:22 I need to think about, you know, how big the file might be.
29:25 There might be a million different things.
29:27 And I think that's where you see the kind of thing that Cam's talking about.
29:30 Yeah, interesting.
29:30 And I think it comes from academia, actually, because just recently, I remember a year ago, I was doing a meta-science course that discussed ML, basically model assessment, and how the same researchers could not reproduce their results because they realized they used different compute.
29:49 So they would be having different results.
29:52 And in each of those cases, they realize they need to bring that information in, in the academic paper to have comparable scenarios.
30:02 So maybe you have a little compute, and then this is the best model.
30:05 And in the other scenario, the other one is the better model.
30:08 And I think now that there's changes in academia, it will kind of become more standard to consider elsewhere as well.
30:15 Yeah, it's a good point.
30:16 How many times have we heard of stories, and I'm sure people listening have lived these stories of a data scientist coming with a model, and it's like, or they go, they get tasked with a problem.
30:27 They go work on a model for a while, and they come back with the accuracy or F score is like, perfect.
30:35 And then they give it to whoever the stakeholder is, another stakeholder, and that stakeholder is like, but this is useless to me.
30:41 I don't care what your accuracy score, you're not actually solving the problem here.
30:45 And it just goes back to this, like, make sure you're very clear on what you're optimizing for.
30:52 And then another point I wanted to make, which I feel like is super important, is depending on your use case, you really have to be vigilant about what you're trying to do and what you use to get there.
31:05 Because if you're doing like autonomous vehicles, that is a whole nother world as compared to what Vishnu is doing or what Kate's doing, right?
31:14 Like, even if you're just dealing with unstructured data, and you have a computer vision problem that's in healthcare, and it's deciding if someone has cancer, that takes a lot longer.
31:24 You can put out like one model, and it has to get approved by the FDA.
31:28 And that takes a long time for it to go through that process.
31:31 So you don't really care about like, updating that model in real time and gathering that data right on the fly, right?
31:37 But if you're in autonomous vehicles, that's a whole nother set of problems that you need to get into.
31:43 And so really like recognizing what is your use case?
31:46 What is the big things that you need to take into account as you're looking into these use cases?
31:51 And where do you want to optimize?
31:54 For sure. And I definitely want to talk about that tradeoff of like, building the perfect model versus evolving it later.
32:00 This portion of Talk Python to Me is brought to you by the Stack Overflow podcast.
32:06 There are few places more significant to software developers than Stack Overflow.
32:11 But did you know they have a podcast?
32:13 For a dozen years, the Stack Overflow podcast has been exploring what it means to be a developer,
32:19 and how the art and practice of software programming is changing our world.
32:23 Are you wondering which skills you need to break into the world of technology or level up as a developer?
32:28 Curious how the tools and frameworks you use every day were created?
32:32 The Stack Overflow podcast is your resource for tough coding questions and your home for candid conversations with guests from leading tech companies about the art and practice of programming.
32:43 From Rails to React, from Java to Python, the Stack Overflow podcast will help you understand how technology is made and where it's headed.
32:51 Hosted by Ben Popper, Cassidy Williams, Matt Kiernander, and Ceora Ford, the Stack Overflow podcast is your home for all things code.
32:59 You'll find new episodes twice a week wherever you get your podcasts.
33:02 Just visit talkpython.fm/stackoverflow and click your podcast player icon to subscribe.
33:09 And one more thing.
33:10 I know you're a podcast veteran and you could just open up your favorite podcast app and search for the Stack Overflow podcast and subscribe there.
33:16 But our sponsors continue to support us when they see results and they'll only know you're interested from Talk Python if you use our link.
33:23 So if you plan on listening, do use our link, talkpython.fm/stackoverflow to get started.
33:28 Thank you to Stack Overflow for sponsoring the show.
33:31 So another area that I think is interesting to consider as you're getting started has to do with how much compute does it take to solve some of these problems and build some of these models?
33:44 Like more than a lot of other areas of software development, training models takes a ton of energy, which can either mean money or time or both.
33:53 You're putting aside like just the carbon costs of like spending a lot of time on servers, but just, you know, how are you going to accomplish that?
33:59 Right.
33:59 So how do you think about that tradeoff?
34:01 Like, oh, can I train this up on my laptop or do I need to get a GPU cluster or what are your thoughts there?
34:08 It's funny.
34:09 I actually think the compute part of MLOps is perhaps the most solved portion of the stack.
34:15 We usually think about things in terms of data model and code in machine learning, right?
34:22 You know, your data is changing.
34:23 You want your model to change as the data changes.
34:25 And then your code is just sort of a way that you control the model that you're creating based on that data.
34:30 Right.
34:31 And, you know, when you think about what AWS has done over the last few years to make it more possible than ever, companies like Paperspace as well.
34:39 And even, you know, Google making Colab and GPU instances free.
34:44 Like it's very easy to experiment with compute much more so than ever before.
34:48 And I think that is the hardest part of the process, right?
34:52 It's freedom to experiment is usually what is the constraint, right?
34:57 Right.
34:57 And, you know, I'll just contrast that with data.
35:00 It's very difficult to, I would say, experiment in the machine learning process right now with different data sets, with trying to work with sort of synthetic data or use different cuts of data that are correlated with interpretability.
35:13 That is a very much more complicated area.
35:16 I would say that I think a lot of machine learning professionals spend a lot of time doing a lot of manual work on as opposed to compute nowadays, which is by and large, I think like a very experimental and solved problem.
35:26 Yeah.
35:27 Excellent.
35:27 Do you do any edge computing with your, do you do like medical devices or anything like little, little devices that people walk around with that you've got to put real time stuff onto?
35:37 I used to, I used to work at a medical device company that was doing an eye imaging device.
35:41 And we used to put a machine learning model on an ophthalmic imager.
35:45 And it was interesting.
35:47 It really gave me a lot of appreciation for, you know, what some of the folks at, at Apple and Nvidia and Fitbit and some of the other, you know, sort of bigger companies that do a lot of, a lot of machine learning on device do.
36:01 It takes a particularly talented group of hardware and software and machine learning professionals to make all that stuff work together.
36:08 And I would definitely suggest to anybody interested in this field, like check out what Nvidia is doing.
36:13 They're doing a lot of really cool stuff.
36:14 From what I learned in my time doing it, I learned that, you know, simplicity is really, it's the way that you can get things done.
36:21 We were just exporting model weights to a pickle file and then doing some very low level computation to make it as fast as we needed.
36:28 So that's about my experience working on that level.
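As a rough illustration of the pattern Vishnu describes (not the actual device code), here's a minimal sketch: pickle the trained weights, then do the inference by hand with a plain dot product instead of shipping a whole framework. The weights and file name are hypothetical.

```python
import pickle
import numpy as np

# Hypothetical trained weights for a simple linear model.
weights = {"coef": np.array([0.42, -1.3, 0.07]), "intercept": 0.5}

# Export once, after training.
with open("model_weights.pkl", "wb") as f:
    pickle.dump(weights, f)

# On the device: load the weights and compute the prediction directly.
with open("model_weights.pkl", "rb") as f:
    w = pickle.load(f)

def predict(features: np.ndarray) -> float:
    # A plain dot product plus intercept is all a linear model needs at runtime.
    return float(features @ w["coef"] + w["intercept"])

print(predict(np.array([1.0, 0.2, 3.5])))
```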
36:31 Interesting.
36:31 Okay.
36:31 Yeah.
36:32 I'm always blown away that pickle is still used, but especially in this area, it's pretty interesting.
36:36 It works.
36:37 I know.
36:37 It definitely does.
36:38 It's easy.
36:38 It's just whatever.
36:39 Just save that.
36:40 We'll deal with that.
36:41 Let's talk real quick about one thing you have over on the community.
36:46 You have a couple of things up at the top here.
36:48 And one of them is called this feature store.
36:51 It sounds like it might be helpful for people to get started.
36:54 Do you want to tell us about this a little bit?
36:56 Yeah.
36:56 So we like set out to demystify a space.
37:00 And like, I have so many stories of how this just was like, I didn't know what I was getting myself into when I started creating this.
37:09 Because the MLOps, first of all, like right now in machine learning, there's not clear spaces of like, oh, this is this tool.
37:20 And you need this tool if you're doing X, Y, Z.
37:23 And here's the space that it occupies.
37:26 There's like this tooling sprawl.
37:28 And some tools do a little bit of this.
37:30 And other tools do a little bit of that.
37:31 And maybe this tool does all of these things, but it doesn't get you with that.
37:35 And so what I've been able to congregate on or like I landed on a few things that are different spaces that are clearly defined.
37:46 And one is the monitoring space, because I think that's just really easy for people to comprehend.
37:53 Monitoring is it's like software development monitoring.
37:56 But then you add the machine learning aspect to it.
37:59 And you add that data.
38:00 And then boom, you've got lots of new stuff to monitor.
38:04 And then there's the feature store part.
38:05 There's also like the metadata management or experiment tracking piece.
38:10 And then there's the deploy piece.
38:11 And like you can ask Vishnu, I've gone back and forth on how we're going to present these things, what we're going to do in order to create.
38:21 Should we create the framework and basically do Gartner's job for them and say like, these are what you should be looking for in the tooling if you're looking for a feature store or if you're looking for a monitoring tool.
38:34 And that's what we kind of have been trying to do.
38:37 And we've been working with all the different companies in the community and then also like practitioners who use this.
38:43 And we've been asking them like, hey, is this what you would expect from a feature store or a monitoring company or metadata management or deployment tool?
38:52 The problem is that like there's a lot of tools right now that are doing like they're specializing in optimizing compute.
38:59 And then they also have deployment as a feature.
39:02 So is that a deployment tool?
39:04 Yeah.
39:04 So, yeah, it's a tricky one.
39:07 But I guess the takeaway maybe is over on your website, you've got like some different categories that sort of try to do that comparison to help people pick some of these tools.
39:14 Yeah, exactly.
39:16 And so that's where me pulling my hair out for the last like year and a half has been just trying to figure out like what because I know that if I'm struggling differentiating all of these tools, I'm not the only one.
39:28 This is what I go on and people reach out to me quite a bit to tell me about their new MLOps tool.
39:34 Then I go on to their website and I'm like, what exactly do they do?
39:37 It's always the same like stat.
39:40 You hear like 80% of machine learning models never make it into production.
39:43 And that's like what they have is they're like H1 on their website.
39:46 And then it's like, but what do you do?
39:49 I don't understand.
39:50 So we tried to like take a non-biased approach to figuring that out.
39:54 Yeah, very cool.
39:55 Let's move over to the development side.
39:59 So you sort of figured out your path.
40:01 You decided it is a machine learning problem, how you want to approach it and so on.
40:05 What are some of the recommendations or techniques that you have for people in that stage of these ML projects?
40:14 You know, the one that I hear a lot of times is, or the category, I guess, is there's a lot of software engineering practices that don't necessarily get brought into data science as often as they should.
40:26 For example, like unit testing or version control or stuff like that.
40:29 Like what are your thoughts?
40:30 Kate and Vishnu especially.
40:31 I'll let Kate go first.
40:33 Yeah, it's a good one.
40:34 I wouldn't call myself a super awesome engineer.
40:38 I'm still on that path trying to just be tolerant and understanding to my other folks on the team who actually know what they're doing and not to make their lives too hard.
40:47 My personal super tiny hack was, especially in the days where I think people are still using notebooks, that I'm using VS Code with this # %% sign, which turns your script into cells.
41:01 And when I'm done, I just clean it up and it becomes a normal script again.
41:06 Or at least I can basically copy it and do that.
41:10 You can still run it, yeah.
41:10 And still run it and test it out that way.
41:12 And that I think really speeds up the process.
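For reference, the hack Kate mentions looks roughly like this in a plain .py file; VS Code's Python extension treats each # %% marker as a runnable cell. The file and column names below are made up.

```python
# %% Load the data (each "# %%" marker starts an interactive cell in VS Code)
import pandas as pd

df = pd.read_csv("sales.csv")  # hypothetical file

# %% Explore while prototyping
print(df.describe())

# %% Fit a quick baseline model
from sklearn.linear_model import LinearRegression

model = LinearRegression().fit(df[["price", "ads"]], df["units_sold"])
print(model.score(df[["price", "ads"]], df["units_sold"]))
```

Because it's an ordinary script, it also runs top to bottom and diffs cleanly in version control once the exploration is cleaned up.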
41:15 And of course, yeah, version control.
41:17 Currently, I'm working with teams or just basically either data engineer or SREs who are helping me make things a bit more polished or deploy them a bit better.
41:29 So I don't have to think about it that much.
41:31 So I'm happy to hear from Vishnu.
41:34 Yeah.
41:34 To me, components of the software engineering workflow that I suggest data scientists and machine learning engineers heavily leverage are version control, CI/CD, testing, and in general, you know, clean code best practices.
41:47 Right.
41:48 You know, I'm not.
41:49 Yeah.
41:50 Code reuse functions rather than just top to bottom with copy paste.
41:54 Yeah.
41:55 I mean, if you do, if you do those three, four things, like, you know, that's actually, you know, I would say like a good bit of a hard day-to-day stuff, right?
42:03 Just following those best practices.
42:04 Yeah.
42:04 And they'll really accelerate you.
42:06 I try to set those things up early on in the process of running a project, you know, setting up the repo, the Git repo, the right way, setting up CI/CD to the extent that makes sense.
42:16 You know, you don't want to over-engineer infrastructure around your project without having a project too early on.
42:21 But I think those kinds of things are very helpful.
42:24 And, you know, we had a guest on our podcast, the MLOps Community podcast, Demetrios and I, a guy named Jesse Johnson, who is the VP of software engineering and data science at a company called Dewpoint Therapeutics.
42:35 And he has this concept known as building software from the outside in.
42:38 And what that means is you start sort of at the finish: when you're building a product, maybe one that involves an API, like define what the API looks like.
42:48 Define what you'd like your end user, what their experience to be like, and then build backwards to all the things that allow for that to happen.
42:56 And I like to think about the same thing with the machine learning model or with whatever kind of data science output that I have, which is what I want that end experience to look like, how elegant should it feel, how well, how reliable, maintainable and well architected should it be.
43:10 And then build in, you know, build inwards towards that.
43:13 So that's sort of another software sort of engineering concept that I try to apply.
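A minimal sketch of that outside-in idea, with entirely hypothetical names: write the interface you want consumers to depend on first, then fill in the pieces behind it.

```python
from dataclasses import dataclass

@dataclass
class RiskScore:
    patient_id: str
    score: float          # between 0.0 and 1.0
    model_version: str

def score_patient(patient_id: str) -> RiskScore:
    """The contract we want callers (an app, a service) to depend on.

    The feature lookup, model loading, and prediction behind it get built
    afterwards, shaped by this interface.
    """
    features = _load_features(patient_id)
    model = _load_model()
    return RiskScore(patient_id, float(model.predict([features])[0]), "v1")

def _load_features(patient_id: str):
    raise NotImplementedError  # filled in as the pipeline gets built

def _load_model():
    raise NotImplementedError  # e.g., load weights from storage later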
43:16 Yeah.
43:17 A lot of people talk about how unit testing makes your code better.
43:21 And I think that's actually, you've touched on the key to why that is.
43:25 Because instead of just thinking about the details of the algorithm, you have to think about like, well, how is this going to look when somebody tries to use it?
43:31 Right.
43:31 And like, because that's what you got to kind of do in the test is use it a little bit.
43:34 Exactly.
43:35 Kate, do you want to go?
43:35 I just wanted to do a quick joke that it runs on my machine.
43:39 So it must be good.
43:40 It must work.
43:42 Do you know about the, it works on my machine certification program?
43:45 No.
43:47 What is that?
43:48 I need this.
43:48 Oh, it's brilliant.
43:50 It has some interesting rules.
43:52 So yeah, anytime somebody is sort of, you can do this, you can give them this stamp.
43:58 And it's brilliant.
43:59 To be certified, you have to compile your application code.
44:02 This is like a language of files.
44:03 Getting the latest version of any code changes from any other developer is purely optional.
44:07 Launch the app.
44:09 Cause at least one path to be executed.
44:11 Preferred way to do this is ad hoc manual testing.
44:15 You can omit this step if the code change was fewer than five lines or in the developer's
44:20 professional opinion.
44:21 The code change could not possibly result in an error.
44:23 Check in your code.
44:24 Certified.
44:25 Oh boy.
44:26 Yeah, I know.
44:27 It's the problem that we all run into, right?
44:29 Yeah.
44:29 Yeah.
44:30 Reproducibility and the shareability and so on.
44:34 And you're right that like source control and CI and these things are absolutely, absolutely
44:39 important.
44:39 One thing, Demetrios, go ahead and jump in with your thought.
44:42 Then I want to talk about a tool here real quick.
44:44 Oh yeah.
44:44 I was just going to say that one of the guests on the community podcast made an excellent point.
44:50 We've had two that come on, that have come on and talked about testing specifically for
44:54 machine learning.
44:55 And one of them, Svet, was talking about how when you properly test, and actually,
45:01 Mohamed, both of them, they kind of said the same thing in different words.
45:04 When you have tests set up and they're properly there, not just unit tests, but also like testing
45:09 the data and this kind of stuff.
45:11 One of the side effects that you get from that is it's documentation.
45:15 It's like you have these artifacts that are left.
45:18 So you, as someone who's coming in and jumping in new, you get to see, oh, okay, so this is
45:24 what's happening with this slice of data.
45:27 And this is when you are trying to debug something, it gives you a much better picture and it's
45:33 more clearly laid out than if you were to just leave it all up to chance, we could say.
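As a rough illustration of that tests-as-documentation idea for data, here's a small pytest-style sketch; the file, columns, and thresholds are hypothetical, but each assertion doubles as a note about what a healthy slice of training data should look like.

```python
# test_training_data.py -- run with pytest; everything here is hypothetical.
import pandas as pd

def load_training_data() -> pd.DataFrame:
    # In a real project this would read from the warehouse or feature store.
    return pd.read_parquet("training_set.parquet")

def test_expected_columns_present():
    df = load_training_data()
    assert {"customer_id", "signup_date", "churned"}.issubset(df.columns)

def test_label_has_no_nulls():
    df = load_training_data()
    assert df["churned"].isna().sum() == 0

def test_label_is_binary():
    df = load_training_data()
    assert set(df["churned"].unique()) <= {0, 1}

def test_reasonable_class_balance():
    # Documents the slice we expect: churn rate between 1% and 40%.
    df = load_training_data()
    assert 0.01 < df["churned"].mean() < 0.40
```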
45:39 And, you know, the open source angle, if you're looking at some package or library to use for
45:45 your code and you go and check it out and it has no tests, there's a good chance you're like,
45:50 ah, this thing's not really ready.
45:51 People are not putting the effort to make sure it's good enough to remain good enough over time
45:56 and to help onboard new contributors that you really want to depend upon it, right?
46:01 So I think there's that angle that's important as well.
46:03 The two tools I wanted to talk about really quick in this area are this thing called DVC,
46:10 which is open source version control for machine learning projects,
46:14 maybe, and then also before that, nbdev from fast.ai.
46:19 Vishnu, you spoke about using Git and checking in yourself and so on.
46:23 And one of the challenges, I think notebooks are great, but one of the challenges of them is they
46:28 kind of save their output in the file, and every run potentially generates a different output,
46:33 which means some meaningless Git merge conflict problems you've got to deal with.
46:38 Do you have, do you have ideas to fix some of that stuff?
46:41 Or do you use anything like nbdev to sort of make that simpler?
46:43 Good question.
46:44 I like Papermill a lot.
46:46 I follow the Netflix approach for working with notebooks.
46:49 I think that they're a very useful tool.
46:51 I think that if you're prescriptive... Give us the quick summary of what the Netflix philosophy there is.
46:57 Sure.
46:57 Yeah.
46:57 Maybe they should do that.
46:58 Basically, Netflix's point is notebooks are a really effective, interactive prototyping tool that if given really prescriptive guardrails can be a part of a production process.
47:12 And they use notebooks in production at Netflix and they really encourage their use internally.
47:16 And they've developed a lot of tools that help make sure that the known flaws of notebooks, such as the state element that you pointed out, don't lead to things falling off the rails.
47:27 And that's the way I like to think about it as well.
47:29 I don't use nbdev myself, but I do think it's a great project that, for people who are thinking about using notebooks, they should check out.
47:38 I think another tool that I've heard of is called Plumber.
47:41 But the most important thing is when you're using notebooks, I think rather than getting into a holy war about whether they're the best tool or not, it's understanding that is it the right tool for my job?
47:53 And I view the right context for that as being when iteration and speed is of the essence.
47:59 There is no tool that is better than a Jupyter notebook.
48:02 It's just there isn't.
48:03 And then we have the empirical proof for that.
48:04 So let's just find ways of putting process around it to make it work.
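Papermill, which Vishnu mentions as part of that Netflix-style approach, executes a notebook programmatically and injects parameters into it, so the notebook becomes a pipeline step instead of a hand-run document. A minimal sketch, with hypothetical notebook names and parameters (the input notebook would need a cell tagged "parameters"):

```python
import papermill as pm

# Run a parameterized notebook and keep the executed copy, outputs included,
# as an artifact of this particular run. Paths and parameters are made up.
pm.execute_notebook(
    "train_model.ipynb",
    "runs/train_model_2022-03-22.ipynb",
    parameters={"train_date": "2022-03-22", "learning_rate": 0.01},
)
```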
48:08 Sure.
48:08 Yeah, that makes sense.
48:10 What about tools like DVC, some of these version control systems?
48:15 You know, part of the problem is the data sets and data science in general, but in machine learning as well, it's just like it doesn't actually make sense to check them directly into Git.
48:24 To Git, yeah.
48:25 You know, yeah, yeah, yeah.
48:26 So some tools like this have sprung up to allow for sort of a side-by-side store of your data, but it's also tracked in Git, but not entirely stored in Git.
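For context on that side-by-side idea: DVC keeps small metafiles and pointers in Git while the data itself lives in separate storage, and a pinned version can be read back later. A minimal sketch using DVC's Python API, with a hypothetical repo, path, and tag:

```python
import dvc.api
import pandas as pd

# Read a specific, Git-pinned version of a dataset that DVC tracks.
# The repo URL, file path, and revision below are hypothetical.
with dvc.api.open(
    "data/training_set.csv",
    repo="https://github.com/example/ml-project",
    rev="v1.2.0",  # Git tag or commit that pins this version of the data
) as f:
    df = pd.read_csv(f)

print(df.shape)
```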
48:35 So do you, Kate or Vishnu, or are you using this?
48:39 I'm trying not to, but this is exactly the space that that company that went out of business at the beginning of the podcast, this is what they were playing in and trying to make like this happen.
48:49 Interesting, okay.
48:49 But yeah, I'll let Kate and Vishnu give their opinions on DVC.
48:54 Yeah, it's also the kind of tool that you would have to categorize into your Gartner four quadrant.
49:00 Yeah, yeah.
49:01 Somehow.
49:01 What I find so interesting, and this is horrible to say, is that it feels like people have so
49:09 many other problems with getting their machine learning into production that this is almost
49:14 like a secondary thing that comes afterwards, or they don't necessarily care as much about
49:21 it because it's not mission critical.
49:24 There are bigger, more significant problems, right?
49:26 Ones where it's like, we've got to solve this before this even matters.
49:29 That's not to say there's not a lot of people out there.
49:30 I know there's a ton of people that use DVC.
49:32 Their community is huge.
49:33 It's a great tool.
49:35 But I do think about that sometimes.
49:38 Like, yeah, there are mission critical things that are really top of mind.
49:43 And everybody likes to talk about how reproducibility is very important for them.
49:47 But unless you're in, like, banking and you need it because of the law, it's not necessarily
49:54 the most important thing for you.
49:56 Sure.
49:57 I would say in my experience, versioning and lineage and stuff tends to be the province of
50:01 data engineering teams a lot more than it is machine learning engineers.
50:04 And the way that we think about data in a machine learning context as sort of one-off artifacts
50:11 that we need to associate with a particular experiment is not compatible with the way a lot of data
50:16 engineering teams think about their data sets, as being sort
50:20 of like a baked cake, you know, that is ready for consistent use going forward, right?
50:24 To use that sort of cake, cake analogy that's common in data engineering.
50:28 I've personally used DVC.
50:30 I think it's a well-made tool.
50:32 It didn't exactly solve my problems.
50:35 I'm a big believer.
50:35 Like if you're a machine learning engineer and you're having data versioning problems, like
50:38 go learn a little bit about data engineering and learn about data warehouses and databases
50:43 and how you can leverage the existing tools in that field, rather than trying to use, you
50:47 know, maybe something that might have been spun up for a very specific purpose or adopt a platform
50:52 like Pachyderm that has, you know, an end-to-end approach to thinking about the entire
50:56 machine learning pipeline that does data set versioning that helps you do your experiment
51:00 tracking and helps you with, you know, sort of your deployment.
51:03 That's kind of my approach to thinking about data set versioning as a component of the overall ML workflow.
51:10 Yeah, we have a similar approach, yeah.
51:10 So we don't use any specific tools.
51:12 We are going, as many larger companies do, with building our own set of tools, of course, built by people
51:20 who might not ever touch machine learning in their whole lives, together with some other people who
51:25 complain about how their files don't work.
51:27 Trying to make everyone happy.
51:28 But essentially, yeah, it's basically our data engineers trying to integrate some kind of
51:33 extra metadata into our deployment tools so that we know we can trace those specific
51:38 training sets back to the models that we're training or deploying and so forth.
51:43 So basically it's, as Vishnu said, in the realm of data engineers, who are making things happen;
51:50 they're the greatest people, all of them.
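(The general idea Kate describes, stripped of any particular tool, is just recording enough metadata next to each model artifact to trace it back to its training data. A minimal sketch; the paths and fields are illustrative.)

```python
import hashlib
import json
from pathlib import Path

def fingerprint(path: str) -> str:
    """Content hash of a training file, so the exact data can be traced later."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

# Write a small metadata file alongside the saved model artifact.
metadata = {
    "model_artifact": "models/churn_v3.pkl",
    "training_data": "data/train.csv",
    "training_data_sha256": fingerprint("data/train.csv"),
    "code_commit": "abc1234",  # Git commit of the training code
}
Path("models/churn_v3.meta.json").write_text(json.dumps(metadata, indent=2))
```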
51:53 Two thoughts.
51:54 One, before people get too tied up about trying to be perfect, like putting just your notebooks
52:00 and your scripts into version control, straight into Git, is like 80% of the way there.
52:04 It's got to be so much better than not doing anything.
52:08 And then two, I'm fascinated by this comment, Kate, about tools built by people who don't do
52:13 machine learning, for machine learning.
52:15 Like, how does that sort of interaction go?
52:17 Is it challenging to cross that boundary, or whatever, to work with folks at companies like
52:23 that who are not quite fully experienced with what you're building?
52:25 I think it's funny for me because I come from consulting where we kind of closely work together
52:30 with data engineers.
52:33 And there was kind of always an overlap of our tasks.
52:33 And now I'm coming to a larger organization where people's tasks are pretty well outlined and
52:38 their responsibilities are more narrow.
52:41 And sometimes people treat me like, I don't know, something between an alien and a dinosaur
52:45 because they have never interacted with like a data scientist or machine learning person
52:50 and they ask me questions
52:53 like, I don't know, I'm just going to tell them the meaning of life all of a sudden.
52:57 We have this data, find the answers.
52:58 And usually it's very basic stuff, but I think it's very important to have those people in
53:04 the organization.
53:04 Some people call them data strategists.
53:06 Some people call them just fun people to talk to, like Demetrius, you know, who will bridge
53:11 that gap and make sure that everybody's heard and translated to each other's language to just
53:17 get that set of requirements and not build what data engineers want to build or build just what
53:23 data scientists want to build and create that connection between the two.
53:26 So that is important.
53:27 Yeah, very cool.
53:28 So let's move over to the operation side.
53:31 Now, on the ops side, I guess the first thing I really want to think about is where do you
53:36 run your code?
53:37 Where do you run your models and APIs that back them and stuff?
53:40 And I looked around here and I see a lot of stuff about Kubernetes in the community for
53:48 you all.
53:48 So, Kubernetes, Linux virtual machines running in the cloud.
53:53 Does that take too much to run in terms of compute costs?
53:57 Would you rather run it locally on your own servers?
53:58 What are your thoughts on this?
54:00 I mean, to me, it depends on how intensive your training efforts are and how many models
54:04 you have in production, right?
54:06 Setting up a Kubernetes cluster for one model seems to me like overkill.
54:10 But, you know, for a company that has tens or hundreds of models in production, it might
54:15 make a lot of sense, right, to try and set up some sort of distributed training architecture,
54:19 et cetera.
54:19 Especially if someone else is maintaining that cluster for you.
54:22 Yeah.
54:22 Yeah.
54:23 Yeah.
54:24 It's one thing to say, I want to create and babysit a Kubernetes cluster, versus
54:28 I want to be able to give my job to Kubernetes.
54:30 Yeah.
54:30 It's not the same, right?
54:32 Exactly.
54:32 Yeah.
54:32 Yeah.
54:33 So I think it really depends on, you know, what scale you're at.
54:36 That's kind of what we talk a lot about in the community, and if you're confused about what scale
54:40 you're at or what maturity level you're at,
54:42 I invite you, please join the Slack, the MLOps community Slack.
54:45 And, you know, I'm sure people will be able to provide some very technical suggestions
54:49 for your size.
54:49 But I guess my answer to you is it depends.
54:51 Okay.
54:52 I would say I use the path of least resistance.
54:54 So whatever is there, whatever is available, what is the easiest way to source your data?
55:02 What are the connections that are already set up?
55:07 Again, because sometimes you work with different teams, or in consulting, with different companies.
55:07 That's about it.
55:08 And if I'm lucky enough, like right now, to have a cluster that is babysat by another
55:13 person, well, good.
55:15 Then it's a deal, you know, just take it.
55:18 This runs on my machine.
55:19 Exactly.
55:20 We should like pour out some whiskey or whatever, some drink for lots of data scientists that
55:27 have gone into the Kubernetes world thinking they were going to pick it up and get their
55:32 models out because they needed to.
55:34 For some reason, they got brainwashed into thinking like, oh yeah, I should just learn Kubernetes
55:39 real fast.
55:39 And they never were able to come back.
55:42 Yeah.
55:43 And it's like such a huge detour, right?
55:46 And so we talk about this a lot, and that is, first of all, what should the
55:51 team composition look like for a machine learning squad or team, or for someone trying to just
55:58 get value out of it?
55:59 And should a data scientist have to know Kubernetes?
56:02 And so that's kind of like the joke here is like Kubernetes is a gateway drug.
56:07 If you can get past it, if you can get into it, then you're probably going to go a lot
56:11 deeper.
56:12 But it's really hard to pick up, especially if you're coming from a data science background.
56:17 Yeah.
56:17 Yeah.
56:18 It's a whole nother thing to learn.
56:19 Yes.
56:20 Right on, Kate.
56:21 Exactly.
56:22 Pamphal in the audience has an interesting comment that I would love to echo as well.
56:26 Also, don't forget you're not Google, right?
56:28 And you're not Facebook and you're not Instagram and you're not, you know, all these companies
56:33 that are trying to run at such an insane scale, right?
56:36 That they have to come up with and use very interesting deployment and DevOps setups, whereas maybe just running
56:43 on a server is fine.
56:44 I don't know, you know?
56:44 The most shared blog post in the community is probably that You Are Not Google post.
56:49 Yeah.
56:50 Yeah.
56:50 That was a great article.
56:52 It came out like a year and a half ago or something like that.
56:54 He's a member of our community, but he's also very well known in the general ML
56:58 community.
57:02 Jacopo Tagliabue, who's the director of AI at Coveo.
57:02 He's been putting out a series of articles known as ML Ops at reasonable scale.
57:06 And I think it talks a lot about this entire concept, which is that a lot of the discussion
57:12 around ML is driven by ML at unreasonable scale.
57:16 It's the Googles and the Ubers that are doing hundreds of millions and billions of events.
57:21 And for many of us, that's not the reality. What, you know, Mike just pulled up here
57:24 is a great series that I encourage everyone to read who feels like, hey, that's not applicable
57:28 to me.
57:28 What is?
57:29 This answers that.
57:30 And we had him on for a podcast as well.
57:32 Nice.
57:32 Yeah.
57:32 The subtitle here is ML Ops without too much ops.
57:35 That's good.
57:37 I like it.
57:38 And it's the same idea.
57:39 It sounds just very focused on machine learning rather than just broad software development.
57:43 Do any of you all use like software as a service type places?
57:48 I'm thinking of stuff like Streamlit, for example, like Streamlit Cloud and these types
57:53 of things where you kind of build your model and you're just like, here, go run.
57:56 I mean, Heroku would be sort of the similar website equivalent of that.
58:00 So we do use Streamlit, but we don't use the cloud.
58:03 We use it just as an interface.
58:05 Right.
58:06 You can self-host it.
58:08 We host it ourselves because then authentication, everything else is kind of linked to it.
58:13 And then we just get a nice interface.
58:16 Yeah.
58:16 What are your thoughts on Streamlit?
58:17 It's awesome.
58:18 Yeah.
58:18 I haven't had to use it for anything, but my impression is that the amount of
58:23 code that you have to write to create these really interactive dashboards
58:27 is super short.
58:28 Yeah.
58:28 And I mean, the alternative usually in the organization is that, oh, we have this Tableau
58:32 dashboard.
58:33 Right.
58:33 And that requires licenses and training for people.
58:37 And again, it has its own limitations; at some point, with a certain amount of data, it just
58:43 starts crashing.
58:43 And this is an easy way for machine learning professionals to showcase whatever they're working
58:50 on.
58:51 And it's been a lifesaver for us.
58:53 Sometimes people just don't get what the hell you're talking about.
58:56 So we were working on an NLP tool that would analyze a lot of unstructured text in a lot of
59:02 different ways.
59:03 And we just plugged in all the possible ways to visualize the results.
59:07 And it just clicked for people.
59:09 And then you had a nice little video next to it.
59:11 You know, it's just so flexible.
59:12 It's really cool.
59:13 Yeah.
59:14 The nice part is it's not just a static notebook output, right?
59:17 You can sort of play with the dropdowns, click in the places, and it becomes a live
59:21 little interactive thing without you having to learn some sort of front-end programming
59:25 framework like Vue.js or something.
59:26 Yeah, exactly.
59:27 And also, on the other hand, we had developed this package to do the analysis.
59:31 But this gives a no-code interface for people to completely just do it by themselves.
59:36 And they feel really cool about interfacing with machine learning without having that background,
59:41 without having to learn.
59:43 That's cool.
59:43 Think of it as like a hybrid, right?
59:45 Like, build the real software development bit and then expose it through things like Streamlit.
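(A minimal sketch of that hybrid: the analysis lives in a regular Python package, and Streamlit is just the no-code front end on top. The package import is hypothetical, and the placeholder result only exists so the sketch runs on its own. Launch with `streamlit run app.py`.)

```python
import streamlit as st
# from our_nlp_package import analyze   # hypothetical in-house analysis package

st.title("Text analysis demo")
text = st.text_area("Paste some text to analyze")
view = st.selectbox("View", ["Summary", "Keywords", "Sentiment"])

if text:
    # result = analyze(text, view=view)               # the real call would go here
    result = {"view": view, "characters": len(text)}  # placeholder so this runs
    st.write(result)
```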
59:50 Yeah.
59:51 Cool.
59:51 Vishnu, what are your thoughts?
59:52 No, I agree with everything Kate said.
59:54 I think it's...
59:55 I remember when it came out.
59:56 It came out with a lot of great examples.
01:00:00 You know, at the time, it was still TensorFlow 1.0.
01:00:04 2.0 hadn't come out.
01:00:05 And it was really, really hard to work with machine learning models.
01:00:08 You know, you had these underlying graphs that
01:00:12 you had to kind of struggle with.
01:00:13 And people spent so much time just wrangling, like, how do I get an output for this model
01:00:19 training exercise?
01:00:20 But to have this, like, really neatly written package that would just take that as a backend
01:00:24 and then let me make, you know, interactive visualizations felt like magic.
01:00:28 You know, all great tools feel like magic.
01:00:29 And I think that's what Streamlit did well.
01:00:31 As Kate said, like, it allows, you know, non-technical users to interact with outputs of
01:00:35 models very naturally, which I think is a really important part of connecting some of
01:00:39 that technical work to business value.
01:00:41 There's a whole set of tools like Streamlit that have since come out, you know, that are
01:00:44 making various parts of the sort of machine learning workflow or the operations workflow
01:00:48 a lot easier, you know, whether it's sort of, you know, letting you set up APIs on top
01:00:53 of those models or visualization or all kinds of other sort of aspects.
01:00:58 So starting a revolution.
01:00:59 Yeah, absolutely.
01:01:00 You know, for me, the magic is you write a function that looks like it just takes arguments
01:01:04 and you don't have to deal with callbacks and interactive stuff.
01:01:07 And it just kind of adds that into there.
01:01:10 What are some of the other tools like this that people might be using out there to get
01:01:14 their ML models online?
01:01:15 FastAPI right now is really popular as like a way of, you know, serving your model.
01:01:19 FastAPI?
01:01:20 Not fast AI, right?
01:01:21 Yeah, FastAPI.
01:01:22 Yeah.
01:01:22 Yeah, absolutely.
01:01:23 Yeah.
01:01:24 FastAPI is really popular.
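(A minimal sketch of serving a model behind FastAPI. The model loading and scoring lines are hypothetical placeholders; run it with `uvicorn app:app`.)

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
# model = load_model("models/churn_v3.pkl")   # hypothetical loading step

class Features(BaseModel):
    values: list[float]

@app.post("/predict")
def predict(features: Features) -> dict:
    # score = float(model.predict([features.values])[0])   # real scoring would go here
    score = sum(features.values)  # placeholder so the sketch runs without a model
    return {"score": score}
```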
01:01:25 There's a new tool I heard of called Banana, which also does this.
01:01:28 Banana.dev.
01:01:29 Banana.
01:01:30 Okay.
01:01:30 Yeah.
01:01:30 I mean, I think they're still early on.
01:01:33 I'm struggling to remember off the top of my head right now, but just in general, this
01:01:37 paradigm of saying like, hey, if I can like write a neat sort of like modern Python library
01:01:43 where I can write, as you said, a function with a couple arguments and then do things that
01:01:48 are associated with the machine learning workflow.
01:01:50 That's a model that works.
01:01:51 So like Great Expectations kind of does that with data testing.
01:01:53 Right, right, right.
01:01:54 You know, it kind of works like pytest for data.
01:01:56 And so I think what we're trying to do overall, the big picture is bring, you know, very Pythonic
01:02:02 ways of working with code into the realm of data and machine learning models to the extent
01:02:07 that makes sense.
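(The "pytest for data" idea, sketched with Great Expectations' older pandas-flavored API, which was current around the time of this recording; newer releases restructure this around validators and expectation suites. The file and column names are hypothetical.)

```python
import great_expectations as ge

# Wrap a CSV as a dataset whose columns can be checked like test assertions.
df = ge.read_csv("data/train.csv")
df.expect_column_values_to_not_be_null("user_id")
df.expect_column_values_to_be_between("age", min_value=0, max_value=120)

# Run all the expectations at once and check whether they passed.
results = df.validate()
print(results["success"])
```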
01:02:08 Yeah.
01:02:08 Banana.
01:02:09 Banana.dev is totally new to me.
01:02:13 Interesting.
01:02:13 ML model hosting as an API.
01:02:16 Instant, scalable inference hosting for your machine learning
01:02:21 models on serverless GPUs.
01:02:23 There's a lot of, a lot of words there that mean interesting stuff about running and production.
01:02:27 Kate, are you familiar with Banana or do you have other tools like this you think are neat?
01:02:30 No, I cannot name anything off the top of my head.
01:02:32 Just like, you know, I always blank out on the question.
01:02:35 I'm like, what's the main thing?
01:02:36 I have no idea.
01:02:37 Google Colab is the most amazing thing that ever happened, I think, to data scientists,
01:02:43 apart from, like, the original notebooks, because it's just right there with all your
01:02:47 random files in the Drive.
01:02:49 You can just hook it up, test something quickly.
01:02:51 So, but that's not related to this.
01:02:53 What roles do places like Google Colab and these other hosted notebook places serve in the
01:03:00 production side?
01:03:01 Like, I clearly see the value when I'm doing like some development and prototyping and trying
01:03:06 to figure stuff out.
01:03:07 But would you ever try to make that the final version in production?
01:03:11 I'm not sure how it interacts with GCP production.
01:03:15 So I tried it for my pet projects.
01:03:17 It was very fun because I had to transcribe some random lectures and I decided, hmm, what
01:03:24 if Google can do that for me?
01:03:25 And I literally had it all in my Drive and ran some code in Colab, linked it to GCP, and it
01:03:31 gave me all the results.
01:03:32 So that was like pretty neat because I really didn't have to do any kind of configuration,
01:03:37 or put any kind of thought into it.
01:03:39 So for pet projects, that was amazing.
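(A sketch of what that Colab plus GCP setup might look like, assuming the Cloud Speech-to-Text API was the service behind it; the episode doesn't name the exact API, and the bucket path here is hypothetical.)

```python
from google.cloud import speech

client = speech.SpeechClient()  # in Colab, credentials come from the linked GCP project
audio = speech.RecognitionAudio(uri="gs://my-bucket/lecture.flac")
config = speech.RecognitionConfig(language_code="en-US")

# Long recordings go through the asynchronous (long-running) endpoint.
operation = client.long_running_recognize(config=config, audio=audio)
response = operation.result(timeout=3600)
for result in response.results:
    print(result.alternatives[0].transcript)
```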
01:03:41 For production, Databricks notebooks kind of work.
01:03:45 Well, again, especially if you're in Azure; I'm not sure how it interacts with others, if
01:03:51 it's as convenient.
01:03:52 But with Azure cloud, it was, it was just amazing.
01:03:55 It was like literally as soon as you have your credentials figured out, everything works in
01:04:00 the background, you just do it as you would do it in a normal notebook and you can put
01:04:05 it in prod, no problem.
01:04:06 Yeah.
01:04:06 Fantastic.
01:04:07 That's a good recommendation.
01:04:07 All right.
01:04:08 I think we are just out of time.
01:04:10 So I want to let you all get back to things before you got to run.
01:04:14 But Kate, Vishnu, thank you for being here.
01:04:17 Demetrius had to run off, but thank you to him as well.
01:04:19 Final thoughts.
01:04:21 Maybe people want to get started with the ML Ops community and maybe get their models in
01:04:26 production.
01:04:26 Give us your final thoughts.
01:04:28 Kate, you want to start?
01:04:29 I think the best way, if you're really just starting out, is to go and meet people
01:04:32 in person and the ML Ops community or any other community that is available in your area
01:04:38 will give you the right boost of motivation and also knowledge and just create a support
01:04:43 network for you to be able to get through the blocks that you will definitely have on your
01:04:49 path.
01:04:49 So that would be my word of advice.
01:04:51 Don't stay alone, isolated.
01:04:53 Yeah, for sure.
01:04:54 Vishnu?
01:04:55 I think that's a great tip, Kate.
01:04:56 I mean, to anybody starting out in ML Ops, I would say certainly join the community and
01:05:02 search for the answers to that question, because it's been asked before in the Slack community.
01:05:06 And there are a lot of great answers.
01:05:07 I would also say like, you know, check out some established resources.
01:05:10 Like I think a guy named Goku Mohandas has been like, you know, putting together a website.
01:05:15 Made with ML.
01:05:16 Made with ML is a great resource.
01:05:17 Chip Huyen's Stanford class on machine learning systems design.
01:05:22 Eugene Yan, you know, our Slack community.
01:05:24 These are places that you can just pick up, you know, years worth of knowledge very quickly.
01:05:28 It's organized for you.
01:05:29 It's structured for you.
01:05:30 So definitely make use of those existing resources.
01:05:32 Thank you for being here.
01:05:33 It's been really great to hear your experience and thoughts.
01:05:36 Thanks for having me.
01:05:37 Thanks for coming by.
01:05:37 Thanks for having me too.
01:05:38 My pleasure.
01:05:39 This has been another episode of Talk Python to Me.
01:05:43 Thank you to our sponsors.
01:05:45 Be sure to check out what they're offering.
01:05:46 It really helps support the show.
01:05:48 Take some stress out of your life.
01:05:50 Get notified immediately about errors and performance issues in your web or mobile applications with
01:05:55 Sentry.
01:05:55 Just visit talkpython.fm/sentry and get started for free.
01:06:00 And be sure to use the promo code talkpython, all one word.
01:06:04 For over a dozen years, the Stack Overflow podcast has been exploring what it means to be a developer
01:06:09 and how the art and practice of software programming is changing the world.
01:06:13 Join them on that adventure at talkpython.fm/Stack Overflow.
01:06:17 Want to level up your Python?
01:06:19 We have one of the largest catalogs of Python video courses over at Talk Python.
01:06:23 Our content ranges from true beginners to deeply advanced topics like memory and async.
01:06:28 And best of all, there's not a subscription in sight.
01:06:31 Check it out for yourself at training.talkpython.fm.
01:06:34 Be sure to subscribe to the show.
01:06:36 Open your favorite podcast app and search for Python.
01:06:39 We should be right at the top.
01:06:40 You can also find the iTunes feed at /itunes, the Google Play feed at /play,
01:06:45 and the direct RSS feed at /rss on talkpython.fm.
01:06:49 We're live streaming most of our recordings these days.
01:06:53 If you want to be part of the show and have your comments featured on the air,
01:06:56 be sure to subscribe to our YouTube channel at talkpython.fm/youtube.
01:07:01 This is your host, Michael Kennedy.
01:07:03 Thanks so much for listening.
01:07:04 I really appreciate it.
01:07:05 Now get out there and write some Python code.
01:07:07 Bye.
01:07:07 Bye.