#298: Building ML teams and finding ML jobs Transcript

Recorded on Wednesday, Nov 18, 2020.

00:00 Are you building or running an internal machine learning team? How about looking for a new ML position? On this episode, I talk with Chip Huyen from Snorkel AI about building ML teams, finding ML positions, and teaching machine learning. This is Talk Python To Me, episode 298, recorded November 18, 2020.

00:31 Welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy, and keep up with the show and listen to past episodes at talkpython.fm. And follow the show on Twitter via @talkpython. This episode is sponsored by Datadog and Linode. Please check out what they're offering during their segments. It really helps support the show. Chip, welcome to Talk Python To Me.

00:57 Hi, Michael. Nice to meet you. Thanks so much for having me.

01:00 Nice to meet you. And thanks for coming on the show. It's gonna be a lot of fun to talk about ML and putting ML into production and building ML teams. We're gonna talk a lot, probably cover a lot of buzzwords, right, like AI, in an hour or so. Top of mind and all of the

01:16 niche impress people by throwing out all the buzzwords. Yeah,

01:19 That's right, right. IBM had this really funny commercial, which is ironic that it was IBM. But it had this really funny commercial like 10 years ago called Buzzword Bingo. I don't know if you ever saw it. Really hilarious. I'll see if I can link to it in the show notes, if I can dig it back up on YouTube. But yeah, so we could definitely win that one today, just because it's such a growing and interesting topic. Yeah. But before we get to all that, of course, let's start with your story. How'd you get into working with programming and Python and machine learning?

01:48 It's really funny, because like, when I was younger, I thought like being a programmer was like the most boring job in the world. Like, why would anyone want to spend the rest of their life sitting in like a basement looking at a screen, you know? So

02:02 This is for the antisocial people. They don't want to go outside. They just, like you said, sit in a basement. Yeah,

02:08 don't have friends. Like, yeah, no. Yes,

02:10 exactly. Yeah. So

02:12 the view's changed over the last 20 years or so, right? Like, yeah, society has sort of viewed that differently. But yeah, that's how it was.

02:20 I think as part of growing up, you just realize you become the person that you used to make fun of, you know? So that's the story of my life. When I was younger, I actually came from a writing background. I was traveling the world. I know it sounds, like, nice and stuff. It's, like, not that nice. I was traveling the world and writing a lot about culture, people, a lot of food. So I came to Stanford thinking I would major in creative writing, because I thought it would be fun. But then I took some CS courses, and I talked to CS friends, and I was like, what? This is what you make for an internship? That's like what my family makes in an entire year. So I was like, what is so cool about it? So yes, I took some CS courses. And I think Stanford did a really great job of getting people interested in computer science, because the introductory courses are extremely well designed. And the exercises are not just boring things; it's like trying to design a button to increase on the piano. And instead of boring lectures, the exercises are like building games, and I could play games there. So a lot of fun things. So I took the courses, really enjoyed them, and then took more courses. I think initially it was more like financial need, because I needed to TA to get some pocket money. But then I TA'd, and I met some wonderful people who were also TAs. I think I TA'd for the course, actually, and it was really fun. And I think I just got sucked into it. And yeah, fast forward four years, I'm majoring in computer science.

03:57 Yeah, that's amazing. Do you still do any writing?

04:00 Oh, yes, I still write a lot. I think it's a bit tricky, because technical writing is very different from the kind of writing I did before, so I can't really switch between them easily. For me, it's easier to spend, like, a month just focused on technical writing, like blog posts or documentation. I also write a lot of, for example, paper summaries, or about some new concepts I learned. I think recently I did something where I looked up, like, 200 machine learning tools I could find and tried to analyze them. Wow, it took me so much time. So that kind of writing is very different. Because in technical writing, people want you to get to the point, right? But sometimes I still like to write stories, like nonfiction, and when you write stories, people want you to take them on a journey. They don't want you to show the destination right away. So I try, and it takes a while to switch from one mindset to another. So yes, it's a struggle.

04:59 That's cool. It sounds like a lot of fun. So one of the things that is interesting about your story is, like, you decided to do some programming classes and get into it. And at first, you were not so sure that that was the full-on path that you wanted to go down. And as you got deeper into it, you saw it as more interesting. You made friends and connections, and you sort of saw the human side of it and got sucked more and more into it, right?

05:22 Yeah, like, yeah, programmers do have friends. Yes, I learned that. Oh,

05:26 yeah, that's good. Right.

05:27 Encouraging. Yeah.

05:29 Well, the thing that's interesting to me is, a lot of times you hear people early in their career thinking about, like, well, what should they study? What should they go into? Or if they're changing careers, what should they go into? And a lot of times the advice is, follow your passion. Like, what are you passionate about? Like, well, I'm really passionate about soccer. Okay, well, go into soccer. What I think I've seen over the years is a lot of people who are actually super passionate about what they're doing didn't go into it because they just knew from the beginning that that was it. It's like somehow, as you get pulled in, as you master a topic and you learn more about it, this mastery and understanding leads to passion, not the other way around.

06:07 Yeah, I totally agree with you. It's something I think about sometimes. What I realized over the years is there are things I thought I would enjoy doing, that I thought were my passion. At some point I thought, like, I would totally want to do research. And then I took three months off to travel, and I realized that I would read every day and write every day, but I didn't read a single paper in those three months. So it's not something I enjoy doing. I realized that I wanted to be the person who does research, but I didn't want to actually be doing it. So I think with those things you just need to spend time and think about it, and just observe what you do and pay attention to know what you're passionate about. And so I don't think of passion as something you find. I think passion is something you cultivate. Have you ever had that moment where you study, you learn something, and it's so weird, you don't know anything, it's just miserable? And then there's an aha moment: everything just makes so much sense. And you just go deeper and deeper into it, and it slowly becomes a passion. So I do think that in the beginning it's really useful to just try new things, but don't do it for too short a time. Give it some time to actually see how much you learn, how much you grow in it. And, you know, there's no shame in leaving something you don't think is for you. But you definitely need to give things time. That's good advice.

07:23 Yeah, a lot of people want to find that thing that they love and just go for it. And I think actually the answer might be just experience a lot of things. Yeah. And then decide, right, which is awesome. So you took some CS courses and did a bunch of writing. And now all of a sudden, you're on the other side of the podium, right? You're doing a little teaching as well.

07:42 Yes, I don't think learning and teaching have to be mutually exclusive. So I started teaching when I was a student. And I started out not because I was an expert,

07:51 As a TA?

07:52 No, as an instructor for the course. I started doing it not because I was an expert, but rather because I wanted to become an expert. My first course I taught as an instructor was TensorFlow for Deep Learning Research. At that point, TensorFlow was fairly new. I was using it in my own internship, and I couldn't really find good training material. So I went to my professors and was like, hey, can you, like, teach a course on it? And they were like, we don't have time. Like, you know, for a full professor with their name on the line, they'd have to make a lot of investment to make it good. And they were like, why don't you do it? I was like, what? And so yeah, they have this thing where they allow students to initiate a course. So I did it. And in the process, it's not really teaching; it's like having a group of people who also want to learn TensorFlow and learning with them. So I just tried to anticipate a lot of questions by just googling a lot. I spent, like, half of my waking hours on Stack Overflow or something. So that's how I started, and now I still continue doing it.

08:54 That's cool. I think you could just learn so much. When you try to teach something. It's a really valuable way to just get deeper and deeper into it. And this is at Stanford, right?

09:02 Yeah, it was Stanford. I think the thing about teaching is that you realize what you don't know. Because, you know, sometimes you think that you know something, but then you start explaining it to people and you're like, I have no idea how it goes.

09:13 Yeah, for sure. The way that I think about it is, like, if you were, say, a consultant at a company, and you had a programming problem to solve, like, let's say you do something with multithreading, right? If you find one way that works, you're done. You move on to the next thing. Yeah. But if there's three ways you could have done it, as a teacher, you have to know what the three ways are. When should you use one versus the other? What are the trade-offs? These are questions that a lot of times you just don't have time or energy to dig into. But once you start teaching, you're like, well, I better know it, because they're gonna ask me. There's more ways than one. How do I do it, and why? Right? Like, it really just gives you this other perspective on trying to learn about stuff.

09:49 Yeah, no, I think there must be some, like, teaching rule: if there's a question you don't want to answer, students will ask it.

09:55 Oh, they can, like... Yes, yes. Teaching classes, like, you know that they're just gonna hone right in on that one. Please don't ask me this thing. Don't, don't.

10:06 They can smell fear, like,

10:08 exactly. Yeah, exactly. Alright, so you are doing teaching at Stanford right now. But you're also working at a new startup. Right. So what are you doing today?

10:17 That is a great question. I've been asking myself that question ever since I joined the startup. I think startup life is great. It's so dynamic. We have been growing so much. The company has increased in size multiple times since I joined in December, so in something like less than a year. So it's a blessing. When I joined Snorkel AI, I told the team that I was looking for an environment where I could learn different aspects of the business, because eventually I want to start my own company. And they've been extremely supportive. In the beginning, before we launched, we were very much heads-down building the platform. So my job was entirely on the engineering side, like building the modeling service and other features. And then the company launched, and we suddenly had a lot of interest from people. It's a humblebrag, but we had so much interest that we didn't have time to talk to everyone. So we needed more people to, like, respond, and I just talked to potential customers. So I have been spending more and more time on that side. So yeah, recently I just decided to switch most of my time to the go-to-market side.

11:30 You said your ultimate goal in the long run is to do something on your own, potentially. Yeah, those two things you talked about, those are the two really hard aspects of starting: one, the technical side, because you have to sort of bootstrap it and get it going. But the other is, like, marketing and getting the word out, positioning, all that stuff. So often on the technical side that just gets ignored until you build it, no one comes, and you're like, all right, well, now what are we gonna do? Right? Yeah,

11:54 I think these are definitely both really difficult aspects of business management. Another aspect is, like, recruiting. I think making a bad hire can basically bankrupt a company early on. So one reason why I joined this company was the team. I was like, wow, how did you manage to convince such great people to join? I feel like everyone on the team is pretty great. I feel like it's a really strong team. A lot about startup life is that you don't just stick to an idea from beginning to end. You try out different things, you try to pivot. In the beginning, you have a hypothesis, right? You think that this might be something people want, but you don't know for sure until you're actually working with customers. So over time, you learn different things, you change ideas, or maybe there's some giant company launching the exact same thing, and how do you compete with that? So sometimes it's not about the idea. It's not even about the product. It's about the people, because with a strong team, even if you throw out the existing product and build a brand new product, it can still have a chance of competing. But with a bad team, if the current idea proves to be wrong, you can't really recover from it.

13:13 Yeah, one of the things I think is really interesting about working in a small company like a startup is you get exposure to so many different things, right? You're not just the person that does billing in this way, or builds that part of, like, this pipeline. You have to really get your hands into many parts. It's stressful, but also I think you grow a lot if you get that opportunity.

13:33 Yeah, I think it's a discussion people have been having, like the difference between working at a big company and a small company, right? At big companies, you can focus on one small thing and go really deep into it and spend many, many of your waking hours on it. But at a startup, you have many things going on, and you have to, like, cycle among them quickly. So big companies can afford to hire specialists who can do one small thing really well. But a lot of startups might want somebody who can do a lot of things okay, even if they might not be an expert. Yeah,

14:09 yeah. A lot of prototypes, do it quickly. Try it out. Didn't work. Do something else. Find a gap fill that hole? All that kind of stuff. Right?

14:16 Yeah, I think it really depends on, like, the phase of life. I know people who really just want to keep their head down and focus on one thing. I mean, there's no shame in it. I have so much respect for people who can do it, as I have so much respect for people who can adapt quickly and learn things quickly.

14:36 Yeah, you got to find the one that works for you. So one of the things that you've been writing a lot about, and you're working on a book, actually, is basically building teams in the ML space and hiring people. And I think that this is a big challenge right now, because so much of the focus in the data science space is so hard. So many people are coming from different areas, right? Like, there might be somebody doing ML, but three years ago that person was, I don't know, working in finance. Or maybe another person, she was like a biologist, but she got into programming, and now she's doing machine learning, right? So I think it's actually a little bit challenging to hire people in this space. Because it's not just, well, show me your machine learning PhD and we'll talk about it, right? Like, yeah, there's probably not that many people in that realm. Like, there's not as much traditional education in the workforce yet.

15:27 Yeah. So I agree with you that hiring is hard. I think hiring is hard right now for machine learning for many reasons. The first reason is probably that companies don't even know what they're hiring for yet. Machine learning is really new. Imagine a company that has never deployed a machine learning model before, and now you're trying to start a new team. So what do you need to build machine learning models? You have no idea. So you come up with some very generic wish list, like you said: somebody who's doing state-of-the-art research, somebody who can code really well, somebody who can explain what they are doing. People like this simply just don't exist. Yeah, that's a good point. The second thing is that machine learning itself is not new, but machine learning in production is, especially since the explosion of AI and deep learning in 2012. I think the first major application of deep learning in industry is probably Google Translate, in 2016. And since then, a lot of companies have been looking into it. So I think the industry is lagging behind research by a few years, right? Research grows, and you have a lot of people knowing how to do machine learning in an academic environment. And then industry says, oh wait, we can actually use that to improve our business, so let's do it. So companies are looking into it. But most people who know machine learning come from an academic environment. They are familiar with how to do machine learning research, but they might not be familiar with doing machine learning in production. And there are not many people who can teach them, because usually you need hands-on experience. So we are at a very early phase of machine learning adoption in the industry. That's why we are lacking people, but I think in a few years we will have a lot more. And hopefully, a better understanding of machine learning in production, plus the availability of people with actual hands-on experience, will make hiring a lot less difficult for companies.

17:28 Yeah, that's really interesting. I can imagine if I was hiring, say, a couple more web developers or somebody to do database ETL, like, bring in the data and clean it up. Yeah, you already have people in your organization who do that. And you can say, well, what do you need, and please talk to this person. But if you're creating an ML team, like, I know there's really large companies out there that don't have a single person who's doing, like, productionized machine learning. So, like you pointed out, it's starting from zero. And a lot of times, you know, is that person who you're talking to really competent to make that decision or make the right trade-offs, right? Or even know what you're hiring the person for?

18:05 Yeah. So knowing what you hire the person for, and also having people who can evaluate the skills, can be very hard. But I actually do see some shifts in the future. I think a lot of aspects of machine learning are being commoditized. For example, you're seeing a lot of pre-trained models, right? People just train the model for you already, and they open-source it. And you have a lot of pre-built models, like Hugging Face. So you can just call an API, and then you can incorporate some machine learning model in your systems. There are also already tools to help you with feature engineering or rule-based systems, like monitoring tools and deployment tools. So I do believe that the bottleneck for machine learning in production now will be in the engineering part. I'm not saying that we'll stop doing research. I'm saying that a lot fewer companies will do research. I think the machine learning part will be done by a few very large, established companies who know what they're doing, and a lot of other companies that use machine learning will just use existing tools and platforms. So the challenges will be, like, engineering challenges and not machine learning challenges.

19:18 Yeah, there's a lot of stuff that's getting pre-built out there. Like, I think Apple ships some pre-trained models for running on iOS, you've got, like, Azure Cognitive Services, stuff like that, right? Where you just, yeah, you kind of just bring that in. This portion of Talk Python To Me is brought to you by Datadog. Are you having trouble visualizing latency and CPU or memory bottlenecks in your app? Not sure where the issue is coming from or how to solve it? Datadog seamlessly correlates logs and traces at the level of individual requests, allowing you to quickly troubleshoot your Python application. Plus, their continuous profiler allows you to find the most resource-consuming parts of your production code all the time, at any scale, with minimal overhead. Be the hero that got that app back on track at your company. Get started today with a free trial at talkpython.fm/datadog, or just click the link in your podcast player show notes. Get the insight you've been missing with Datadog. So I guess one question, just thinking sort of reversing it a little bit, I guess you could see it in both ways. If I was somebody who was looking for a machine learning job, or I was hiring somebody, how much do you think that engineering side should matter? It sounds like it's pretty important. So should, say, being competent with Git and source control be important? Continuous integration? Should you know something like FastAPI or Flask or something like that to build a service around your model? What are the skills you think are really important

20:41 there? I think it really depends on what type of job you want. I want to say "the traditional machine learning engineering job", but I don't think machine learning engineering is old enough to deserve the term "traditional". But I've still been using it for machine learning engineering as in, like, feature engineering, creating models, training the models, and babysitting models. I think that part requires a lot of machine learning and less engineering. But if you work to set up the pipeline, like, how can you process data in parallel, how can you train models, how can you deploy a model so that it can serve a lot of requests at the same time with low latency, then you probably need more systems and databases than machine learning. And if you are in the part where you want to monitor the system, like maintenance and monitoring, so how do you push updates without interrupting the service, or how can you be alerted when some bad thing happens so you can address it as quickly as you can, then I think a lot of it is very similar to DevOps, so you need a lot more things. Right? So it really depends on what role you want and the company you want, because one thing you'll notice is that AI companies have very different structures for their machine learning teams. Companies like, for example, Netflix have a separate algorithm team that focuses on the algorithmic aspect of machine learning. Right? But so

22:12 Yeah, the recommender engine is so important, right? They even had that million-dollar prize to see who could recommend the movies you should watch next best, right?

22:24 Yeah. But it's really funny, because I think these competitions are great, but the results are very hard to actually deploy. I'm not saying Netflix is not using the winning result. I'm just saying that in a lot of these competitions, even though the winning solutions perform well on the leaderboards, they tend to be very hard to deploy, because they are way too complex to be reliably deployable. Yes. Oh, interesting.

22:53 Yeah. Or maybe it's overtrained exactly on that one thing, and it's perfect for that, but it's not generalized enough for something?

22:59 Yes, that's one thing. People have been talking about how competitions or leaderboard-oriented work are actually not very close to real life. Because when you have a leaderboard, right, you tend to have one single objective you work toward, for instance, the model with the best performance. Whereas in production, you don't have one single objective. You might have different stakeholders in the company, and they all want different things. One person might say, hey, we want the best performance. Another is going to say, hey, we want the lowest latency. And another is like, hey, how can we show the most ads without being obnoxious? So there are a lot of things. And sometimes if you just optimize for one thing, you can't really go further. I think there's an interesting example of how a machine learning model that does very well on leaderboards is not going to be usable in real life. I don't know if you remember that thing about 10 years ago: this giant retail company was trying to predict whether someone was pregnant, so that they could advertise directly to them, right?

24:05 Yes, yes. Uh huh.

24:06 And they found out, and then they sent, like, ads for baby products to this teenage girl, to her family, and they didn't necessarily know about it yet. So this example, like,

24:18 yeah, they got really angry, like, Oh, my gosh, that's, yeah, that's so good for her.

24:25 No, no, it's not. So that's an example of how it can be so good, it's creepy, and you don't want that. Or think about how we have a machine learning model that can help users solve their problem really well. There are two things that can happen here. One is that it solves the problem so well that users are done; they never have to come back to you ever again, and you lose business. Or it works so well that users just love the system, and they keep coming back for more. So it's really hard to find a linear relationship between model performance and business performance. Yeah,

24:59 I didn't think about the relationship of these, like, competition-winning algorithms and models and stuff. But yeah, that makes a lot of sense that just because it solves that one problem, it might not be practical to run in production or to maintain or whatever, and evolve it.

25:13 Yeah, I think this is why it's important for people who are in charge to have a good sense of what they want, and how to balance between the different objectives of different stakeholders in a project.
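The contrast discussed above, a leaderboard that rewards one metric versus production where several stakeholders each care about something different, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate models, their accuracy and latency numbers, and the `leaderboard_pick` and `pick_model` helpers are all hypothetical and do not come from the episode.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical model candidate with made-up offline metrics."""
    name: str
    accuracy: float    # higher is better
    latency_ms: float  # lower is better


def leaderboard_pick(candidates: Sequence[Candidate]) -> Candidate:
    """Leaderboard view: a single objective, so the highest accuracy wins."""
    return max(candidates, key=lambda c: c.accuracy)


def pick_model(candidates: Sequence[Candidate],
               max_latency_ms: float) -> Optional[Candidate]:
    """Production view: best accuracy among models within a latency budget."""
    eligible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    if not eligible:
        return None  # nothing satisfies the stakeholder who owns latency
    return max(eligible, key=lambda c: c.accuracy)


# Illustrative numbers only; they are not measurements of any real system.
CANDIDATES = [
    Candidate("big_ensemble", accuracy=0.95, latency_ms=900.0),
    Candidate("single_model", accuracy=0.92, latency_ms=40.0),
    Candidate("tiny_model", accuracy=0.85, latency_ms=5.0),
]
```

With these made-up numbers, `leaderboard_pick(CANDIDATES)` selects `big_ensemble`, while `pick_model(CANDIDATES, max_latency_ms=100.0)` selects `single_model`. A hard latency budget is just one way to combine objectives; a weighted score would be another, and which one fits depends on what the stakeholders actually agree on.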

25:27 Yeah. So let me put the hiring hat on for a minute. If you're at a company that does not yet have an internal, in-house machine learning team, but you think maybe you want to, maybe we can analyze all this data we have, and we can find some trends and do more interesting stuff. Yeah. And you want to create an in-house ML team, like, what advice do you have for those people?

25:47 Okay, so it really depends on who you are. If you're, like, Walmart or McDonald's, right, you can acquire a very promising ML startup to just have an in-house team. And I think a lot of the big companies are going for that approach. Another approach that I think a lot of companies are taking to transition to ML is to use some existing talent in the company. Machine learning is new, but data science is not. Companies have had data science teams for a long time. Data science teams also work with data, a lot of it; they probably already have a set of data, and they also try to get patterns from data. So I think a lot of companies, in the beginning, transition using the data science team. It's like, hey, why don't you learn machine learning and try to, oh,

26:36 maybe a couple of you could learn PyTorch?

26:41 You might joke about it, but I think it's pretty much how people do it. I see a lot of people in data science move into machine learning, especially now with the abundance of machine learning courses. There are just so many courses online for free. I think it's great that people are taking advantage of it. I was looking it up, like, the course, do you know Andrew Ng's machine learning course?

27:03 Yeah, I haven't taken it. But I've heard of it. Yeah,

27:05 I think, like, more than 2 million people have taken the course already.

27:09 Two million students, right? Yeah.

27:10 Yeah. Crazy. Yeah.

27:11 Yeah. It's pretty new. Right? I think it's pretty new. Yeah, that's really crazy.

27:15 Yeah, it's new compared to other disciplines. But I think for machine learning it's, like, one of the older courses. So I think a lot of teams do that. But for companies who do that, I hope that they look into the difference between data science and machine learning. Data science is about looking at data to output insights that help make decisions about the business, so you can predict, like, how much customer demand there will be in the future. But for machine learning, the goal is to build a product, so it's more like engineering. So for data science, you want people with strong statistics skills, because you look at the data and get insight. But machine learning is more engineering, so you want somebody with strong engineering skills and less stats. Yeah,

28:03 right. So as a data scientist, maybe your output might be, here's a Jupyter notebook with a Plotly, yeah, analysis of what we're thinking. Whereas as a machine learning person, your output is, here's the API that gives you the answer.

28:17 Yes, something like that. Okay, so I think this is where the focus differs. I'm not trying to make a general statement here. I'm just saying that, from talking to other people, I tend to notice that data scientists are much better statisticians, whereas machine learning engineers are much better engineers.

28:36 Yeah, yeah, I've seen that as well in some of the trends. So yeah, it seems totally reasonable. Let's reverse this a little bit. We were talking about if you want to build a team, and you did point out, by the way, bringing someone in from the inside. I feel like data science, more than software development, is a role that needs to be intimately familiar with the way the business works, the way the data is collected, and all the little idiosyncrasies around it. So having somebody who already knows all that stuff, and now just has to adapt that to machine learning, might be easier than getting somebody who's good but has no experience in the business.

29:12 I make a living out of saying that I know machine learning, right? So of course I want to make my skills sound as valuable as possible. But I have to admit that for a lot of simple models, you don't need to spend years and years learning to be able to use them. One thing I notice is that it's actually a lot easier for good engineers to pick up machine learning than for machine learning experts to pick up good engineering. So if I were to start a team, I would probably try to get really good engineers and have them learn machine learning and then apply it, rather than hire machine learning experts and have them spend several decades becoming good engineers.

29:59 Yeah, that's a really good perspective. All right. So switching the role here to being interviewed for a machine learning job: you're working on this book on machine learning interviews, and it gives you a sense of, if you're going to go apply for one of these jobs, what are some of the skills and things you might expect to be asked about, and so on. Right? Want to tell us quickly about that?

30:20 Yeah. So that's a book I've been working on for, oh my God, a year and a half now. You know how everyone had so many great plans for 2020 and none of them happened? I think that's the case with my book. I had such great plans for it, and then boom. So yeah, it's slow, but it's coming along. My book is not just a book of the questions they will ask you and how you would answer them. Part of what I want to do with the book is bring some standardization, or understanding, to the process. It's new in the industry, so it's new for both interviewee and interviewer. First of all, a lot of people ask me, and are confused about, what is a machine learning engineer versus a data scientist? What is the difference between a big company and a small company? What is the hiring process? What skills do you need? There are just so many skills that one might need, but usually you don't need all of them for a single role. So my book first starts from the differences. I think, last year,

31:21 Sorry, I don't know what happened to my network. It just said that it was lost and all my stuff disconnected, but we're back. Yeah. So we were talking about the book, and you said it wasn't just for people to know what the questions and the answers were, but that it's such a new industry that it's new for both interviewers and interviewees. I think we were going from there.

31:40 Okay. Yeah. So part of the book is to give some understanding of the interview process and the differences between different types of roles. What is a data scientist versus a machine learning engineer, or what is a research engineer? And the differences between, first of all, machine learning engineering, data science, and MLOps. Then, once you get a good picture of the process, what skills are needed for each role, and then the interview pipeline, building the pipeline. Yeah,

32:09 a lot more. Yeah, that sounds

32:12 like we need all the help I think we can get for fixing the interview process. Oh my God, software development, data science, it seems so broken to me. I know. I've had some friends who have gone through it recently, and it just seemed really, really rough. I actually did an episode. Wait, let me do a quick search. Well, what

32:31 are some of the highlights of the, like, pain points?

32:33 I think a lot of it is you get asked to explain, create, or recreate low-level algorithms, sometimes even just on a whiteboard. Like, you know, go create quicksort. And then never, ever in your job will you go and create quicksort or something like that, right? You would just go to the list and say .sort() and it would be done. I interviewed Susan Tan a while ago, way, way back in Episode 123. She did a talk called Lessons from 100 Straight Developer Job Interviews. I think she was in San Francisco as well. I'm pretty sure, if I remember correctly.
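Editor's note: the contrast described above can be sketched in a few lines of Python. This is only an illustration, not anything shown on the episode; the `quicksort` function and the sample list are made up for the example.

```python
def quicksort(items):
    """Whiteboard-style quicksort: returns a new sorted list."""
    if len(items) <= 1:
        return list(items)
    pivot, *rest = items
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)


numbers = [5, 2, 9, 1, 5, 6]

# What an interview might ask you to write by hand.
by_hand = quicksort(numbers)

# What you would actually do on the job: sort the list in place.
numbers.sort()
```

Both approaches give the same ordering; the built-in is simply the one you would reach for in production code.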

33:12 100 is so much.

33:14 Yeah, she literally did 100, and then took notes about what worked and, oh my god, yeah. You'll get these big homework projects, right, like work on this for a week. And then, you know, there are 100 applicants, so the energy put into that is often not, you know. Anyway, I think helping both sides of that story would be really good.

33:33 Yeah, I would definitely love to hear that interview, because it sounds exactly like what I have been working on. I'm curious, did she propose some things that work?

33:43 She did. And it's been, gosh, it's been like two or three years since I spoke to her about it. But I know she had some advice for like, these things were really bad. And these things I experienced were really good. Yeah. And so she basically laid out like, what are some bad interviews I had? What are some good ones and why? And I think probably in there, you could pull out some good advice.

34:01 Yeah, that's pretty interesting. Before, when I was still interviewing for jobs, I had so much to complain about in the interview process, right? But now I'm part of a startup and we're trying to build recruiting pipelines, and we realize it's really hard. Even though we complain about existing pipelines, it's really hard to come up with something better. I think there are just too many sources of noise, because interviews are a proxy for evaluating somebody's skills, right? I know this example might not be very exact, but think about dating. You try to find somebody nice and approximate whether the person is a good fit for you, and you might date for years and still end up with a bad partner, right? So like,

34:50 Exactly. The divorce rate's like 50%. We're not totally getting this nailed.

34:56 Yeah. So for job interviews, admittedly the stakes are lower for a job than for a partner, but you also have much less time, right? You have a resume, and people say not to keep the resume longer than one page. Then you maybe go on LinkedIn and social media and look things up, and then you have a few hours. It's really hard to get a good picture from that. And a lot of it is biases, because interviewers are human. Even though we try not to, and we are taught that you shouldn't judge people based on biases, it's something we grew up with, something we do without even being conscious of doing so. It's also very different for different people, because something that works for one group of people may not work for another group. For example, we have been looking at take-home challenges. A lot of candidates told us, oh my god, interviews are so stressful, one-on-one is really hard, why don't you just give a take-home challenge? Make it hard, let me spend a day on it, and when it's done you can see how good we are. But then we thought about it and talked to people, and we realized that people who have a lot of responsibilities outside of work, especially single mothers or people with small kids, can't spend a day on a take-home challenge. So what might work for you may not work for someone else.

36:18 Yeah, and if they apply to 100 jobs, then all of a sudden, that's half a year or something like that, right?

36:24 Yeah, so it's very hard. Some people told me that they like this concept at one company where they bring you on as an intern for like a month and pay you, and if you do well, then you can get a job. And some say, oh, that's great, because now everyone gets a chance to show how good they are on the job. But not everyone can afford to just go to a job without any commitment for like four months, right? And it's going to totally exclude all immigrants. For example, if somebody needs visa sponsorship, they can't just go and work for

36:52 right. It's a very precarious situation. If your presence in the country is on, you know, you have to have a job. And if it lapses for more than a month or two, then yeah, gotta leave. That's really stressful. Yeah,

37:04 Yeah. So I think it's really hard to find a method that works for everyone.

37:07 Yeah. So a couple of thoughts. One, it's been a very long time since I hired anybody. But I used to help with hiring people who would become trainers, basically teaching professional development courses for software developers. We'd obviously go through the resumes and see if they made any sense, and then we would just do a quick, like 30-minute call. I would say, okay, imagine this person says, I'm a Python expert and I've specialized in Flask. All right, we're on a Zoom call, share your screen, build me a Flask app that has one function that returns JSON: if I give it two numbers, it adds them up. Anybody who's ever worked with Flask should be able to knock that out in five minutes. And you can tell from one minute in whether that person is on the path to get there, because they know you start with importing Flask, app equals Flask, or whether they're just flailing about, right? They just have no idea. And there's no way that they can both be an expert in Flask and not be able to create a Hello World app in it. Right? So that was the first sort of filter we used before we actually would ask them the equivalent of, here's a take-home project or something. Just show me live that you're semi-competent with the tools you claim to be top notch in. And that actually worked pretty well. I was blown away at how many people would claim, I've done five years of this and I'm a super expert and I'm ready to teach it to other people, and then they can't even begin to touch it.
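Editor's note: a minimal sketch of the screening exercise described above, assuming Flask is installed. The route name `/add` and the query parameters `a` and `b` are choices made for this illustration; the episode does not specify them.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/add")
def add():
    # Read the two numbers from the query string, e.g. /add?a=2&b=3.
    # The parameter names "a" and "b" are assumptions for this sketch.
    a = request.args.get("a", type=float)
    b = request.args.get("b", type=float)
    if a is None or b is None:
        return jsonify(error="pass numeric query parameters a and b"), 400
    # Return the sum as JSON.
    return jsonify(result=a + b)
```

Saved as `app.py`, this could be started with `flask --app app run` and exercised at `/add?a=2&b=3`.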

38:32 So I think that's an interesting interview approach. But I also find that if we do interviews that are tailored very specifically to specific tools, we might find someone who's really good at one tool but doesn't really have the skill of learning new skills. And for startups, a fast-changing company where you have a lot of new problems, we have to keep learning new things, rather than just get someone who is really good at one thing. We'd want more of a generalist.

39:02 Sure. What we were trying to discern was more this: they said they were an expert at this thing. Are they actually? How much can you trust them? If they can show they're an expert at this thing, then for the other stuff they said they're pretty good at, they're probably also in that realm. But if they're really far from how they describe themselves on one axis, then it's probably not going to be a good fit. So yeah, I don't know, it worked okay. We didn't do that much hiring. This portion of Talk Python To Me is sponsored by Linode. Simplify your infrastructure and cut your cloud bills in half with Linode's Linux virtual machines. Develop, deploy, and scale your modern applications faster and easier. Whether you're developing a personal project or managing large workloads, you deserve simple, affordable, and accessible cloud computing solutions. As listeners of Talk Python To Me, you'll get a $100 free credit. You can find all the details at talkpython.fm/linode. Linode has data centers around the world with the same simple and consistent pricing regardless of location. Just choose the data center that's nearest to your users. You'll also receive 24/7/365 human support with no tiers or handoffs, regardless of your plan size. You can choose shared and dedicated compute instances, or you can use your $100 in credit on S3-compatible object storage, managed Kubernetes clusters, and more. If it runs on Linux, it runs on Linode. Visit talkpython.fm/linode or click the link in your show notes, then click that create free account button to get started. The other thing that I wanted to bring up is, did you hear that Guido van Rossum just joined Microsoft?

40:37 Oh my God, yes, I think, yes. It was really interesting. What was your thought on it?

40:44 So he said, basically, he'd been retired for six months and wanted to go back. There's a ton of cool open source stuff there now. And, you know, he gets to work with some of the other language teams and basically just focus on Python and be around it. And that's all interesting. I think it's actually kind of a big deal that that happened, and it's a really big contrast from Microsoft 10 years ago that this is even possible. But the thing that I want to bring up specifically now is somebody on Twitter asked him, so did you actually have to send in a resume before they hired you? And he said, yes. He had to send in a resume. He went through a bunch of interviews. The interviews make sense, but he had to send in his resume, and he had to provide the degree he got in university, and his transcript, like the grades and stuff he got in college.

41:31 Yeah, don't get

41:33 and I'm just thinking, what? Who cares if he got an F in, like, literature or whatever? Look what he's accomplished.

41:41 Yeah.

41:42 Since then, it doesn't matter. But that's just another one of these hiring things. Right? Well, we got to check the box that we need is like university degree and transcript.

41:49 This is so funny. It reminds me of a few years ago. Do you know Malala? She was the youngest Nobel recipient. Malala?

42:01 Yes, yes, I do. Huh?

42:02 Yeah. It's really funny, because at the time she wanted to study at Stanford, and Stanford was like, yes, but what's your SAT score? And it was like, she's the youngest recipient of the Nobel Peace Prize, and you're asking her for her SAT score? Because

42:18 she's gonna cram them. Yeah. All right. So what are some of the other takeaways that you're hoping to give in this book? And you also have a chapter that's open on GitHub people can download, right?

42:32 Yes. So this is that chapter. One part of the interview a lot of people ask about is machine learning system design. The questions are usually like, if you want to build a system to do X, how would you do it? So they're very high-level design questions. For example, one question could be: if you're trying to build a system to predict which keywords are trending on Twitter, how would you go about it? And what counts as trending, and so on. I think this kind of question is very interesting, because it usually tries to measure your understanding of the different parts of the system, and not just the machine learning. But these questions can be very hard, especially for junior candidates, because they don't have a good grasp of what a production environment is. So some company

43:24 Yeah, a lot of times, you have to see examples of that, or have built examples of that to know like, well, these are the five pieces we got to put together. And then then you do it. Right.

43:31 Yeah. So originally, I wrote it as part of the interviews book. But then as I started writing more about it, and I learned more about it, I was like, oh my god, there's so much more in machine learning system design. So now it's actually become a full-blown book on its own. That's why it's taking me longer. And I'm actually teaching a course on it, machine learning system design, just on that part.

43:54 Oh, that's cool. And you're teaching that in January, right? At Stanford?

43:57 Yes, at Stanford. Yes. It's going to be strange, because I'm not sure how teaching online is going to go. I'm a little bit nervous about that.

44:07 Yeah. It's not the same as standing in front of the class and having that experience, that's for sure.

44:11 Yeah. But some people have told me that it's a different experience, and some students like it more, especially the introverts. Now they can just ask questions anonymously without having to raise their hands and have everyone stare at them. So it's going to be interesting. I'm looking forward to it.

44:26 Yeah, it should definitely be interesting. All right. I think we're just about out of time, but maybe just real quickly, you could give us the elevator pitch on Snorkel AI and what you guys have going on there?

44:36 Oh, okay. So for the pitch, I think I can just say why I decided to join Snorkel. It's funny, because it's a startup that came out of the Stanford AI Lab, and I'd heard of them for a while. When they first approached me, I was like, oh my god, another startup from Stanford. I know it sounds super spoiled, but I was like, startup, AI, whatever. But then I came across their paper. Most of the founding team were PhD students, and they have been publishing a lot. I read one of their papers and I was like, this is really smart. The key idea of the paper was that, instead of manually labeling the data, you can encode some heuristics into a bunch of programming functions and apply them to the data all at once.

45:22 That's a really hard thing about training your models: you get all this data, and you have to say car, bicycle, ball, tree, right? You've just got to go through it and teach it, basically. Yeah. So

45:33 you know, take labels for spam. You see an email and say spam or not spam, right? You read it, and you notice you have some heuristics, like, hey, if it says please send me money, Nigerian prince, or something like that, it's probably spam. So you have these heuristics in your brain. If you can encode those heuristics, you don't have to manually label everything one by one. Then the question is how to combine them, because the heuristics are still going to be noisy, overlapping, and conflicting with each other. So the core algorithm was how to combine all of them and figure out which are most likely to be the correct ground truths, because you don't have ground truth to compare to. So then you generate a set of labels. And they open sourced the repo, so anyone can just go on GitHub and use it. I looked at the code for a while and thought, these people are good engineers. You think of PhD students as bad engineers, but their code is good, very clean. They have testing and everything, like unit tests. Oh my God. So I think the product now is not just that part; there's actually a lot more to Snorkel than the labeling part. We're actually an end-to-end platform, from data to modeling and training. We do a lot with monitoring and analysis, because we believe that in machine learning things are constantly changing, so you have to update your models constantly. So we believe in iterative development. You train a model and it's not good, so you go back and see what's wrong and how to improve it, and maybe you need more data. And we version everything in the process, by the way. So that's cool.
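Editor's note: the idea of writing heuristics as functions and combining their noisy votes can be sketched in plain Python. This is not Snorkel's API or its algorithm. Snorkel's approach, as described above, estimates which heuristics are most likely correct, whereas this sketch swaps in a simple majority vote. All function names, label constants, and sample emails here are hypothetical.

```python
from collections import Counter

SPAM, NOT_SPAM, ABSTAIN = 1, 0, -1


def lf_mentions_prince(text):
    # Heuristic from the conversation: "Nigerian prince" suggests spam.
    return SPAM if "nigerian prince" in text.lower() else ABSTAIN


def lf_asks_for_money(text):
    return SPAM if "send me money" in text.lower() else ABSTAIN


def lf_short_message(text):
    # Hypothetical heuristic: very short messages tend to be personal.
    return NOT_SPAM if len(text.split()) < 6 else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_prince, lf_asks_for_money, lf_short_message]


def majority_label(text):
    """Combine the heuristics' votes; abstain on no votes or on a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


emails = [
    "Hey, please send me money, I am a Nigerian prince and will repay you",
    "Lunch at noon?",
    "The quarterly report is attached for your review before the meeting",
]
labels = [majority_label(e) for e in emails]
```

The point of the design is that each heuristic may abstain or be wrong on its own; the combining step is where the noise and conflicts get resolved, and a learned model would replace `majority_label` in a real system.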

47:16 It's like agile.

47:18 We don't use agile, but it is one of the buzzwords, so maybe we can adopt it. Let's just do this, because it's a buzzword. Let's get it. My God. Somebody from Snorkel, please don't fire me. But yeah, so it's an end-to-end platform for people to build AI applications, and it goes from data to modeling, monitoring, and analysis. And I think it's pretty dope. You guys should totally check it out.

47:44 Yeah. Right on Awesome. Well, it sounds like a cool company to work for. And definitely nice applied machine learning stuff. So building tools for machine learning, folks, right?

47:54 Yeah. Awesome. For machine learning folks, right. It lowers entry barriers for a bunch of applications. Our platform is actually no-code, so you have the option to just use the application without any code at all. But we also have our SDK, so for people who want more flexibility, you can also, like,

48:15 Yeah, very nice. All right. Well, good luck with the whole company and the startup. Hopefully it takes off. It sounds nice.

48:21 Yeah. Yeah. Thank you.

48:22 So we're pretty much out of time. But yeah, thanks for all the advice on building machine learning teams or getting to be part of one. Now before you go, there are always the two questions I ask at the end of the show. One is, if you're going to write some code, some Python code, what editor would you use these days?

48:37 So sometimes I really want to seem smart and say I use Vim, but actually I just use VS Code.

48:44 VS Code is definitely the most popular, and it's all good. And a notable PyPI package, like some Python library or package that you've come across, like, oh, this was so cool, people should know about X.

48:56 I'm not sure if this is notable, but I think I like Papermill. It allows you to parametrize notebooks. Yeah, I think it's pretty cool. Usually I do a lot of experiments with, like, Jupyter notebook

49:07 thing.

49:08 Yeah. And Papermill comes out of Netflix, so it's a neighbor of yours. Nice. Yeah, the idea is you can almost treat notebooks like functions, right? You can pass arguments to them, run them, get something out, and then even chain them together. One of the things I heard was really nice about it is if you create data pipelines of one notebook going to the next to the next and something goes wrong, the notebook actually contains all the data that came in, what it tried to do, and how far it got. It's almost a record, instead of just, server failed with 500. Like, no, here are all the details. You can go back and look at it.

49:40 Yeah, it's pretty dope. I think it's really cool. I think there's been so much exciting work in the notebook space. And then there's Streamlit, and Streamlit is really cool. You can create very quick applications. I mean, there are so many.

49:55 Yeah, that's really nice as well.

49:56 Yeah. What is yours? Which is your favorite?

49:59 My favorite? Oh my gosh, you know, there are all these different ones that always blow me away. I go through so many of them. One that I came across recently that was pretty neat is called backoff. Someone told me about it. With backoff, you just put a decorator on one of your functions and say, if I get this kind of error, this type of exception, wait five seconds and then try again, and then wait 10 seconds and then try again. So if you're doing testing against an API and you get a too many requests error, instead of failing the test, just wait one second and try again.
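Editor's note: the retry-with-waiting pattern described above can be sketched with the standard library alone. This is not the backoff package's API; `retry_with_backoff`, `TooManyRequests`, and `flaky_api_call` are hypothetical names used only for illustration, and the delays are kept tiny so the example runs quickly.

```python
import functools
import time


def retry_with_backoff(exc_type, max_tries=3, base_delay=0.01):
    """Retry the wrapped function on exc_type, doubling the wait each time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == max_tries:
                        raise  # Out of attempts: let the error propagate.
                    time.sleep(delay)
                    delay *= 2
        return wrapper
    return decorator


class TooManyRequests(Exception):
    """Stand-in for an HTTP 429 style error from a rate-limited API."""


calls = {"count": 0}


@retry_with_backoff(TooManyRequests, max_tries=4)
def flaky_api_call():
    # Fails on the first two calls, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TooManyRequests("slow down")
    return "ok"


result = flaky_api_call()
```

The caller never sees the first two failures; it only sees the eventual result, or the exception if every attempt fails.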

50:29 This sounds pretty dope, I think. I'll check it out.

50:33 yeah. Yeah, it's super easy to use. But it kind of solves that problem of like, mostly reliable, but not all the time reliable stuff. Do you

50:40 like to star things on GitHub when you see repos that you like?

50:43 I do. Yeah, I star stuff all the time.

50:46 Yeah. Can you just go to your star list, and let's see what you have been looking at?

50:51 Yeah, so github.com/mikeckennedy. And let's see, I'll pull up my stars. Where are the things that I've starred? There we go. So the things that I have up here right now. That's a really good question, by the way, a really cool way to look at it. So I have pip-chill. You know, if you do pip freeze, it'll show you what you've installed. pip freeze will include everything that was installed, including the dependencies; pip-chill will show you just what you manually installed, not the dependencies, which is cool. Nice. Then linqit, L-I-N-Q-I-T, adds LINQ-like functionality to the Python language, which is cool.

51:28 I love the name, by the way, pip chill.

51:34 And then I have fastapi-chameleon and fastapi-jinja, which add those templating languages to FastAPI as a decorator.

51:42 Oh, nb_black is cool.

51:44 Yeah, yeah. nb_black adds Black to notebooks. Yeah. So those are the ones I starred recently, I guess.

51:49 That's so funny. So those are all good. So you starred a lot of FastAPI stuff. FastAPI or Flask, do you prefer one or the other?

51:58 Yeah, I do. I really like FastAPI. I've

52:00 been liking it a lot. Oh, it's brilliant. Yeah,

52:03 I think it's, yeah, it's so brilliant. It takes all the cool modern features of Python and puts them together. So let me make one recommendation for you. Check this out; I just want to get your reaction to this as a data science, machine learning person: handcalcs. Have you seen handcalcs? No? Handcalcs is crazy. What this does is, you create a Jupyter notebook, and you write some sort of math equation that is actually just the computation. Then you can ask it to show you, and it'll show it as if it were written in LaTeX, step by step, how it solved the problem. So you'd have the nice square root symbol and so on. If you're doing some kind of computation that's somewhat technical and hard in Jupyter, it'll actually show you what you would put into a math textbook or a physics textbook to derive the equations, and even the steps you might take to go from one

52:52 Does it do proofs for you?

52:54 I don't know how far it can go with a proof. But if you just go to like Google hand calc,

52:58 wait, how do you spell it?

52:59 There's a bunch of animated GIFs. It's H-A-N-D-C-A-L-C-S.

53:05 Like C-A-L...

53:09 C-S, like hand calculations.

53:11 Okay, this is from Connor Ferster. Yes. Oh, that's dope. Oh, yeah.

53:17 If you just page down here, you can see all these amazing steps. You can render symbolic mathematics and the steps between various things. Yeah, it's really, really cool. So if you're doing complex calculations where you want to make sure you got it right, reading the Python code to do it is harder than reading the symbolic mathematics

53:36 of it. Wait, how does it do this? Yes, I'd

53:38 have no idea, but it uses, like, SymPy and a bunch of other stuff, LaTeX and all sorts of crazy things. So as a data scientist, this thing is killer, I think.

53:47 That is pretty dope. Nice. Thanks for showing me. I'm going to show it to my friends.

53:51 Oh, yeah. Oh, yeah. There you go. So there's, there's a topical recommendation. How's that?

53:56 Nice. That's so fun. Thank you. That's so cool.

54:01 Cool. All right. Well, I think we're about out of time but I just want to say thank you for being on the show and sharing your all of your advice. And I guess one final question if people are interested if they're out there looking to do some machine learning. Are you guys hiring?

54:14 Yes, we are hiring a lot, actually. That's actually one of our challenges, hiring a large quantity of very good people.

54:27 Yeah, that is definitely a challenge. But I will put a link maybe over to the jobs page or something, so if people want, they can check it out. This has been another episode of Talk Python To Me. Our guest in this episode has been Chip Huyen, and it has been brought to you by Datadog and Linode. Datadog gives you visibility into the whole system running your code. Visit talkpython.fm/datadog and see what you've been missing. They'll throw in a free t-shirt with your free trial. Simplify your infrastructure and cut your cloud bills in half with Linode's Linux virtual machines. Develop, deploy, and scale your modern apps faster and easier. Visit talkpython.fm/linode and click the create free account button to get started. Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps course. Or if you're looking for something more advanced, check out our new async course that digs into all the different types of async programming you can do in Python. And of course, if you're interested in more than one of these, be sure to check out our everything bundle. It's like a subscription that never expires. Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
