#455: Land Your First Data Job Transcript

Recorded on Thursday, Jan 18, 2024.

00:00 Are you interested in data science but you're not quite working in it yet?

00:03 In software, getting that very first job can truly be the hardest one to land.

00:08 On this episode, we have Avery Smith from Data Career Jumpstart here to share his advice

00:13 for getting your first data job. This is Talk Python to Me, episode 455, recorded January 18th, 2024.

00:21 [Music]

00:35 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy.

00:40 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython,

00:45 both on fosstodon.org. Keep up with the show and listen to over seven years of past episodes at

00:51 talkpython.fm. We've started streaming most of our episodes live on YouTube. Subscribe to our

00:57 YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of

01:03 that episode. This episode is brought to you by Sentry. Don't let those errors go unnoticed.

01:09 Use Sentry like we do here at Talk Python. Sign up at talkpython.fm/sentry. And it's brought to you

01:17 by Posit Connect from the makers of Shiny. Publish, share, and deploy all of your data projects that

01:22 you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, reports,

01:29 dashboards, and APIs. Posit Connect supports all of them. Try Posit Connect for free by going to

01:35 talkpython.fm/posit. Hey folks, before we jump in and talk about data science jobs and careers,

01:44 I want to tell you really quickly about some awesome news. Back in February, I gave the

01:48 keynote at PyCon Philippines. It was entitled "The State of Python in 2024." Well, that is now out on

01:56 YouTube. The team at PyCon Philippines did a great job. The video came out great. If you want to

02:01 check out "The State of Python in 2024," according to me, just click on the link in the show notes to

02:06 watch it over on YouTube. Now let's talk to Avery. Avery, welcome to Talk Python to Me.

02:11 Thanks so much. I'm so excited to be here and be part of the show.

02:15 I'm excited to have you here as well. You know, one of the things that people reach out to me

02:19 often is, you know, how do you get into data science? How do you get into programming? How

02:25 do you get into Python? You know, I've been trying, or maybe they got a degree or they

02:30 took some training program, bootcamp or something, and going from zero to one, I think, is the biggest

02:38 career step you have to make. That next job and the one after that, it only gets to be smaller

02:44 steps, not bigger steps. And it's really tough because that first big step, you're brand new at

02:48 it. You have no experience, right? It's your first data science job or your first programming job.

02:53 So hopefully we can give some folks out there a little bit of a hand up to help them make that

02:59 jump. Yeah, totally. I like to show this graphic that says, it's a circle and it's a circle of

03:04 text. And it says, I can't get a job because I don't have experience because, and then it restarts,

03:11 I can't get a job. And that's the tricky part. It's like, how do you get a data science job when

03:15 you have no data science experience? Because to get data science experience, that seems like you

03:20 have to have a job as the prerequisite and vice versa. So it is very tricky. So happy to chime

03:25 in on that today. The industry can take it too far. They can take it way too far. So a few years

03:31 ago, there was a really funny tweet that went around back when they call them tweets. I don't

03:35 know what they're called anymore. Sebastián Ramírez, the guy who created FastAPI, saw a job

03:42 posting when FastAPI was like a year and a half old. It said, you must have four years of experience

03:47 with FastAPI to apply. And he said, Hey, look, I'm the creator of FastAPI and I'm unqualified for

03:53 this job. What kind of world are we living in? Yeah. I don't want to live in that world, but

03:58 that's unfortunately where we're at. That's so tough. And it's hilarious. These job descriptions

04:04 are getting out of hand. That's for sure. Yeah. Well, with AI, it's probably not going to get

04:08 better. We can talk about that more later. But before we get into that, let's just jump in with

04:13 a little bit of background on you before we get to the topic. Tell us a bit about yourself. What

04:18 do you do? How'd you get into Python? Things like that. Yeah, absolutely. So I'm currently

04:23 a data science consultant and also a data science instructor. I run some online programs where I

04:29 teach people to become data analysts mostly is what I'm focused on. But I also have this practice

04:35 where I help companies solve data problems with different techniques. I started actually by

04:41 studying chemical engineering in college in my undergraduate degree. And about a semester in,

04:47 I realized, crap, I hate this. This is not for me. Do you agree? Have you felt something similar?

04:54 I did a semester of chemical engineering as well. I thought, I love chemistry. I love math.

04:58 Put them together. Somehow they don't go together. It's like ice cream and eggs or something. No,

05:04 they don't go together for me at least. Yeah. It wasn't good for me either. I was just like,

05:08 oh man, I'm actually not interested in refineries or manufacturing. But I, like you,

05:14 liked chemistry. I liked math. I thought this is perfect. But I quickly realized, oh man,

05:18 I really like this whole programming part that I get to do in MATLAB at the time when I was an

05:23 undergrad. And I was on a time crunch to get through college quickly through eight semesters.

05:29 And the other issue I had was I didn't know what to do instead. It was like, I don't really want

05:34 to study computer science. Part of the reason why is they kind of have this weed-out course at

05:39 the beginning, which you had to build Excel from scratch, basically like some sort of a

05:44 spreadsheeting tool. I was like, wow, why would I rebuild something that already exists that I

05:48 don't even like using in the first place? I wasn't really into it. So I didn't know what to do. And

05:53 luckily I was working as a lab technician at this company, the really cool company that makes the

06:00 sensors that basically have the ability to smell. So they can sniff what's in the air and it has

06:05 applications for finding drugs or bombs and airports and stuff like that. And there was

06:10 a data scientist on staff and that data scientist was awesome. He was showing me all these cool

06:15 algorithms he was writing for these sensors. And then one day he got up and left and he left the

06:20 company. And we tried to hire another data scientist for six months, but they were really

06:25 expensive. We were a small company and none of them really wanted to move to Utah where I lived

06:30 in Salt Lake City. And so we couldn't really find someone that would be able to do it.

06:33 And so finally I was like, well, I really like this programming stuff. And the data scientist

06:38 showed me a thing or two, maybe I could take a stab at this. And I wrote my first machine learning

06:43 algorithm. And I was like, oh my gosh, I'm addicted to this. And then I never looked back

06:47 and have been doing data science since, basically. What a great story. I think a lot of people

06:52 fall into programming that way. And for some reason, not unexpectedly, but for some reason,

06:57 a lot of people fall into Python that way as well. They're like, I have a job and I got this thing I

07:02 got to do. I just need a little bit more than maybe like an Excel spreadsheet or something.

07:08 And put it together and like, actually, this is cool. After a while, like this is cooler than

07:11 what I've been doing. Or maybe I'll make it a good part of what I do. Right?

07:15 Yeah. A hundred percent. Even just making, it was in MATLAB, which is basically

07:19 engineer's version of Python or college version of Python 10 years ago. Right?

07:24 Yeah.

07:24 And I made like tic-tac-toe and I remember playing tic-tac-toe against the computer.

07:28 I think that's what it was, or maybe it was hangman. I can't remember. But I remember the

07:31 idea of like being able to play, to program games and play against the computer. And I built it.

07:37 I was like, this is the coolest thing ever. I got to do more of this.

07:40 Absolutely. I've done some MATLAB too when I was younger and it's not that different from Python,

07:45 but it's, I think one of the big differences other than it just being like embedded in a

07:50 big expensive app is it's not a general purpose programming language. Right? You wouldn't go,

07:55 that was fun, but let me go build this website in MATLAB or let me create Airbnb in MATLAB or,

08:02 you know, like there's, you just don't want to sort of, it sure has this like

08:06 self-prescribed limit to what you can do with it.

08:08 That's one of the coolest parts about Python is it's really a Swiss army knife

08:12 and you can pretty much do, I don't want to say anything, but pretty close to anything

08:17 in Python, which makes it really neat. And obviously one of the huge limitations of MATLAB

08:21 is one, it costs thousands of dollars, but two, you're right. It's not going to do cybersecurity

08:26 for you. It's not going to build websites, but the syntax at the end of the day was,

08:29 was really quick. It was, it was easy for me to transition from MATLAB to Python because the

08:34 syntax isn't all that different.

08:36 No, it's not all that different. More math focused, but pretty similar. So I think maybe

08:40 that's a good place to start discussing and exploring this topic of your first data science

08:45 job. And I wouldn't necessarily plan on starting here, but let's, let's start with before you

08:49 even necessarily know programming language, right? Maybe you've dabbled in MATLAB or you've

08:55 dabbled in Excel or even dabbled in, I don't know, JavaScript or something. This thing

08:59 we've been talking about with MATLAB and it applies to other areas as well, like true

09:03 programming languages per se, like Julia or something like that is how, if you invest

09:10 your time into learning one of these things really well, like how broadly industry-wide

09:16 of a skill, high demand skill is that going to be, right? You learn MATLAB, you put yourself

09:20 in a box, you learn a more general programming language, you kind of have more options afterwards,

09:25 right?

09:25 Yeah, totally. I think like the more broad of a language you learn, the more useful you

09:30 are to, to more industries in general. But I might take that even a step further and

09:36 just say, you know, learning MATLAB, not a whole lot of companies use MATLAB, but just

09:41 like landing your first data job, going from zero to one is the hardest. Learning your

09:45 first language zero to one is the hardest as well. And then once you have that first

09:49 language, the next language becomes so much easier. So one of the first things I learned

09:54 was, was MATLAB and then I moved to Python and that was easier. And then I learned SQL

09:58 and then I learned R and then I learned JavaScript. And every time I added like a new tool to

10:02 my toolkit, it was, I don't want to say it was easy, but it got easier with

10:06 each one. I think that's true with foreign languages as well. Once you learn one foreign

10:10 language, then the third and the fourth become quite easy. At least that's, that's what

10:14 I heard. I speak kind of two and a half languages, but like there's people who speak like seven

10:19 and they always say like the sixth and the seventh become easy.

10:21 Yeah. You wonder how could you possibly, because learning the first one is so hard,

10:25 first foreign language. So you're like, well, how could you possibly take that on for this

10:29 many languages? And it's that it's not the same challenge each time, right?

10:32 Yeah, exactly.

10:32 Yeah. So I think when people are considering getting into data science, they really want

10:37 to consider what language they choose and where they go. Like you're coming out of a

10:42 college program, you might feel like MATLAB or something like that's real popular. And

10:46 yet that's because it's popular amongst professors who force their students to do it. That doesn't

10:51 necessarily mean that's the world, the broad worldview. What do you think about R? You

10:56 know, both.

10:57 I like R. I'm not, I sometimes troll R on LinkedIn. So I guess that's another thing

11:02 I should say is I post a lot on LinkedIn, kind of a LinkedIn guy. And so a lot of the

11:06 times, honestly, just for jokes and kicks and giggles, I'll kind of roast R on LinkedIn

11:11 just to get the trolls angry in the comments. And it's quite fun. It's quite a fun experience,

11:15 but I'm not that big of a hater. I definitely think R has its place.

11:18 The one thing that's really interesting about R versus Python, which is obviously a big debate

11:22 in the data science community, is that R kind of does one thing really well. And it's

11:28 getting a little less of that as more packages and libraries are added to R, but R does the

11:33 statistics and machine learning very well. But obviously I don't think once again, I

11:37 don't know any websites, any like super functioning websites that are built on R.

11:42 I don't know any cybersecurity that's really done, done via R. So I think R does what it

11:46 does well. The syntax sometimes is a lot easier for people to go from Excel, which a lot of

11:51 people are more familiar with in the finance or banking world. For example, the syntax

11:56 in R is a little bit more similar to those Excel formulas than it is to Python. So I

12:01 think sometimes people have a little bit more success just because, oh, this kind of feels

12:04 like R formulas, or sorry, this feels like Excel formulas. And so people really get there.

12:09 I think what you're kind of alluding to is if you're going to learn one skill, you

12:13 might as well learn the one skill that's applicable to the most, the widest net. Right?

12:18 And so that way you're fishing in the biggest lake you possibly could versus in a smaller

12:24 pond of R. I think that's worth looking at. And one of the things I actually really enjoy

12:28 doing, because you mentioned, oh, you might think MATLAB is popular because that's what

12:32 the professors taught you. And there's actually not a whole lot of data out there about, well,

12:37 what should you learn? So I don't know if you know who Luke Barousse is. He's a data analyst

12:41 YouTuber. I was going to say YouTuber on YouTube, but that's kind of redundant on YouTube.

12:45 And one of the things he's done is he's actually built this tool where he's web scraping

12:49 thousands of jobs, different data jobs every week, and then displaying and analyzing the

12:55 skills required for those jobs. So it's actually like a data-driven way of saying, if you want

12:59 to be a data scientist, what skills should you actually be focusing on as you go, as

13:04 opposed to just listening to what a professor will say or what a LinkedIn influencer will

13:10 say or what your bootcamp will say. Like actually getting some data on, I think is pretty neat.
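
As a rough sketch of the kind of analysis described here, the share of postings that mention each skill can be computed in a few lines of Python. The postings, the skill list, and the `skill_share` helper below are all made up for illustration; this is not the code behind the tool mentioned in the episode.

```python
from collections import Counter

# Made-up postings for illustration only; not real scraped data.
postings = [
    "Data scientist: Python, SQL, scikit-learn",
    "Data analyst: SQL, Excel, Tableau",
    "Data scientist: Python, R, statistics",
    "Data engineer: SQL, Python, Airflow",
]
skills = ["python", "sql", "r", "excel"]


def skill_share(postings, skills):
    """Return {skill: percent of postings that mention it}."""
    counts = Counter()
    for text in postings:
        # Compare whole lowercase tokens so "r" does not match inside other words.
        tokens = {tok.strip(",.:;()").lower() for tok in text.split()}
        for skill in skills:
            if skill in tokens:
                counts[skill] += 1
    return {s: 100 * counts[s] / len(postings) for s in skills}


shares = skill_share(postings, skills)
for skill, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{skill}: {pct:.0f}%")
```

A real pipeline would also need to handle synonyms and multi-word skills, which this token match ignores.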

13:14 That is super cool. And I'm not familiar with Luke, so we're going to dig him up and put

13:20 him in the show notes for later so people can check that out. Do you remember any of

13:23 the trends he's recently talked about?

13:24 It's datanerd.tech, I think is the website there. I look at it mostly for data analysts

13:30 because that's who I work with the most. So I know the data analyst data very well.

13:35 SQL is number one at 50%. I think Python is number two at like 30%.

13:40 I think Python might've jumped it.

13:43 Well, this is for all data positions right here. So the job title you can choose.

13:47 Ah, okay. So which one are you thinking I should pick here? Data?

13:51 Maybe data scientist.

13:52 Data scientist, yeah. Right.

13:53 That's what the data says.

13:54 Yeah, you're right.

13:55 Wow.

13:56 Whoa, Python 69%. Look at that.

13:59 That's huge. So like that's even, that's even what? 20% more than SQL, which a lot of people

14:04 are like, if you want to be a data scientist, you have to know SQL. If you look at the job

14:07 descriptions, Python's mentioned a lot more. So if you're going to learn, if you're brand new

14:12 and you're going to learn one, you might as well start with Python because that's probably the

14:16 most in-demand skill that there is right now for a data scientist.

14:18 Yeah. And it's pretty easy, right? It's not like, well, why don't you just learn C++

14:22 for embedded devices? You're like, you know what? Maybe I'll pick something else to start with.

14:26 Right. But you know, Python's pretty easy.

14:27 I agree with you. I think Python's great. I actually think, I think SQL is probably easier

14:32 to learn if I'm being honest, because really, especially for like data science stuff, there's

14:37 only about like 20 commands that you need to know in SQL, but it's once again, SQL is a lot more,

14:41 there's no websites built on SQL. I'll tell you that much. So it's a lot more limited on what it

14:46 can do.
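
The episode does not list the roughly 20 commands, so the following is only a guess at a few of the core ones, run through Python's built-in `sqlite3` module. The `jobs` table and its salary figures are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (title TEXT, city TEXT, salary INTEGER);
    INSERT INTO jobs VALUES
        ('Data Analyst',   'Salt Lake City', 70000),
        ('Data Scientist', 'Salt Lake City', 110000),
        ('Data Analyst',   'Houston',        75000),
        ('Data Scientist', 'Houston',        120000);
""")

# SELECT + WHERE + ORDER BY: filter rows, then sort them.
well_paid = conn.execute(
    "SELECT title, city FROM jobs WHERE salary > 72000 ORDER BY salary DESC"
).fetchall()

# GROUP BY + aggregate functions: one summary row per title.
by_title = conn.execute(
    "SELECT title, COUNT(*), AVG(salary) FROM jobs GROUP BY title ORDER BY title"
).fetchall()

print(well_paid)
print(by_title)
```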

14:46 It's a skill, but not the language. It's not enough on its own, generally. I mean,

14:52 you can do reports and quite a bit with it, but you know, it's like when you see these programming

14:57 popularity rankings, you're like, what's the most popular language? Oh, look, CSS is the third most

15:01 popular. Like that's not a language. That's a thing that you use with other languages, right?

15:05 Like use it with all the other languages. That's why it's high up, but that doesn't mean it's high

15:09 in demand. Exactly. It's just like table stakes, you know? So you kind of got to distinguish table

15:14 stakes from like picking an area, I think.

15:16 That's totally true. And really I think Pythonistas could make the argument that there's

15:22 really nothing in SQL that you couldn't do in Python. That's a little somewhat true, depending

15:28 on data size and stuff like that. But regardless, there is ways that you can do most of the SQL

15:33 commands in Python one way or another. It could be, when I first became a data scientist, I didn't

15:39 even know SQL and I was doing SQL commands or I was doing the aggregations or the where functions

15:44 or the window functions using Python. So you definitely can, as long as your data is not like

15:49 super big, then you'll totally be fine.
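
A minimal sketch of what doing those SQL operations in plain Python can look like for small, in-memory data. The rows are made up, and the standard library is used here as an assumption; the episode does not say which Python tools were actually used.

```python
from collections import defaultdict
from itertools import accumulate

# Made-up rows standing in for a small table.
rows = [
    {"sensor": "a", "day": 1, "reading": 4.0},
    {"sensor": "b", "day": 1, "reading": 7.0},
    {"sensor": "a", "day": 2, "reading": 6.0},
    {"sensor": "b", "day": 2, "reading": 9.0},
]

# WHERE reading > 5
high = [r for r in rows if r["reading"] > 5]

# GROUP BY sensor with AVG(reading)
groups = defaultdict(list)
for r in rows:
    groups[r["sensor"]].append(r["reading"])
avg_by_sensor = {s: sum(v) / len(v) for s, v in groups.items()}

# Window function: SUM(reading) OVER (PARTITION BY sensor ORDER BY day)
running = {}
for sensor in groups:
    ordered = sorted(
        (r for r in rows if r["sensor"] == sensor), key=lambda r: r["day"]
    )
    running[sensor] = list(accumulate(r["reading"] for r in ordered))

print(len(high), avg_by_sensor, running)
```

Everything here lives in memory, which is why the approach stops working once the data gets large.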

15:51 Right. Like some kind of generator or even slices or yeah, things like that, right? List

15:56 comprehensions, set comprehensions, all that kind of stuff. Kind of like, gosh, I really wish,

16:02 a little bit of a sidebar, but I wish like list comprehensions and all those things had just a

16:07 few more SQL features, right? Like in a list comprehension, I say, give me this thing,

16:12 maybe give me this property of this class modified, like give me the user's name,

16:16 uppercase, right? So that's like select. And then for thing in collection, that's like

16:22 from table or whatever. Right. And then you have the where clause with the if statement, but boy,

16:28 wouldn't it be cool to have like a sort also in there and other things like that, you know?

16:34 Oh, well, totally.

16:35 It's so close.

16:36 The cool thing is, is if you want that sort, it's what one extra line, like it's not too bad. So

16:41 it, Python, I mean, I don't want to say this necessarily to hate all, to make all the data

16:46 scientists and SQL lovers mad, but, but really Python can do a lot of the things that SQL can.

16:52 That's for sure.

16:52 Yeah, that's for sure.
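
The select / from / where mapping discussed above, plus the sort as one extra step, can be sketched like this. The `User` class and its sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    active: bool


users = [User("zoe", True), User("adam", False), User("mia", True)]

# SELECT upper(name) FROM users WHERE active
names = [u.name.upper() for u in users if u.active]

# ORDER BY is the extra step: wrap the same expression in sorted().
ordered = sorted(u.name.upper() for u in users if u.active)

print(names)
print(ordered)
```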

16:53 This portion of Talk Python to Me is brought to you by Sentry. Code breaks. It's a fact of life.

16:59 With Sentry, you can fix it faster. As I've told you all before, we use Sentry on many of our apps

17:05 and APIs here at Talk Python. I recently used Sentry to help me track down one of the weirdest

17:10 bugs I've run into in a long time. Here's what happened. When signing up for our mailing list,

17:16 it would crash under non-common execution paths, like situations where someone was already

17:21 subscribed or entered an invalid email address or something like this. The bizarre part was that our

17:28 logging of that unusual condition itself was crashing. How is it possible for a log to crash?

17:35 It's basically a glorified print statement. Well, Sentry to the rescue.

17:39 I'm looking at the crash report right now, and I see way more information than you'd expect to find

17:44 in any log statement. And because it's production, debuggers are out of the question. I see the

17:50 traceback, of course, but also the browser version, client OS, server OS, server OS version,

17:56 whether it's production or QA, the email and name of the person signing up. That's the person who

18:01 actually experienced the crash. Dictionaries of data on the call stack and so much more.

18:05 What was the problem? I initialized the logger with the string info for the level rather than

18:11 the enumeration.info, which was an integer based enum. So the logging statement would crash

18:18 saying that I could not use less than or equal to between strings and ints. Crazy town. But with

18:25 Sentry, I captured it, fixed it, and I even helped the user who experienced that crash.
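
The bug described here can be reproduced in miniature. The logging library involved is not named, so `TinyLogger` and `Level` below are hypothetical stand-ins; the point is only that comparing a string level with an integer-based enum raises a `TypeError` the first time a message is actually logged.

```python
from enum import IntEnum


class Level(IntEnum):
    DEBUG = 10
    INFO = 20
    ERROR = 40


class TinyLogger:
    """Hypothetical logger that keeps messages at or above its level."""

    def __init__(self, level):
        self.level = level  # nothing validates the type here
        self.lines = []

    def log(self, level, message):
        # The threshold check is where a string level blows up.
        if self.level <= level:
            self.lines.append(f"{level.name}: {message}")


good = TinyLogger(Level.INFO)
good.log(Level.ERROR, "already subscribed")

bad = TinyLogger("info")  # string instead of the enum value
try:
    bad.log(Level.ERROR, "already subscribed")
    crash = None
except TypeError as exc:
    crash = str(exc)

print(good.lines)
print(crash)
```

Because the constructor accepts anything, the mistake only surfaces on the rarely executed path that logs, which matches the behavior described in the story.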

18:31 Don't fly blind. Fix code faster with Sentry. Create your Sentry account now at talkpython.fm/Sentry.

18:37 And if you sign up with the code TALKPYTHON, all capital, no spaces, it's good for two free months

18:44 of Sentry's business plan, which will give you up to 20 times as many monthly events as well as

18:49 other features. Probably the biggest, it's a bit of a diversion, but the biggest similarity to that

18:57 I've seen in the languages is C#'s LINQ, where they actually have almost all the query operators,

19:04 including joins and stuff like that, built into the programming language. I'd love to see more of

19:08 that kind of inspiration into Python, but you know, that's all right. It's still really good.

19:12 Got a lot of cool SQL-like features, but you're right, once you are no longer working with data

19:17 in memory or you want indexes, right? Like this concept of indexes is not sufficiently well

19:23 understood, I think. Every time I hit a website that takes five seconds to load, I'm like,

19:27 somebody is not doing all the things they should be doing. I just know it.

19:31 That's totally true.

19:33 What about SQL? You know, let's talk about that for a bit, right? The SQL, the query language

19:38 for databases and other things. There's ways to SQL query, not just relational databases.

19:44 But you said you got away with not quite learning that, but do you think if you could start over,

19:50 maybe making an effort to learn that would be really valuable? Like how important is this

19:54 a thing in the beginning of your career?

19:56 The interesting thing, you know, about landing a data job is your skills only play, I'd say,

20:03 a third of the role. Your portfolio or the way that you portray your skills and your network,

20:09 I think, are the other two thirds and are actually more important than your skills. And that's kind

20:12 of how I got away with not knowing SQL and not even being, to be honest, that good at Python

20:18 at the time was because I used my network to be in the situation to get my lab technician job in

20:24 the first place. And then once again, I use that same network, in this case, my coworkers to land

20:29 that first data scientist position after we couldn't hire anyone. And if I would have been

20:34 applying externally for that role, chances are I wouldn't have gotten that role. I probably didn't

20:39 know enough at the time to land that type of a role, but because they knew I was hardworking,

20:43 they knew I wasn't like a total idiot and I really liked to learn, they took that chance on me.

20:49 It paid off really well for them because at the time I was still in college. And so I wasn't

20:53 getting paid that much and I was not getting paid like a data scientist, but I was getting results

20:58 like a data scientist for them. So I think it paid off for both of us. But I think if that was

21:03 an external job and I applied for it, I probably didn't have enough skills for it. So I definitely

21:07 think learning SQL, if you want to land a data science job, isn't a bad place to start, especially

21:12 because like I said, I mean, any programming language, I like to think of like the iceberg,

21:17 kind of like the Titanic, right? There's the parts that you see and then there's the parts that you

21:22 don't even know that are there. And really you could spend the rest of your life trying to master

21:27 SQL or the rest of your life trying to learn Python. But the cool thing is, is a lot of the

21:31 time you only need that top little bit that's sitting at the top of the surface of the water

21:36 to actually get stuff done. And so for SQL, I think that's like 20 commands. And I think you

21:41 could learn it honestly in like a month. You could learn those 20 commands pretty easily.

21:47 But it worked out for me and I didn't have to use it that much at the time until I was probably about

21:51 almost three years into my job. And I actually had switched jobs to a bigger company. The other

21:56 thing is that I was working for a smaller company where we didn't have a ton of data. So we could

22:00 use CSVs kind of as our database, which is not great practice. But when I eventually became a

22:05 data scientist at ExxonMobil, I was going to say they didn't use Excel as a database, but they

22:10 still did. But the point is they had much larger SQL databases with hundreds of thousands, actually

22:15 millions of rows of data that I had to query. Yeah. Then you got to be really, you need to

22:20 understand it at a much deeper level. You're like, if you do a query like this, it's going to be

22:24 super slow. But if you do it like that, it can use the composite index for the sort and then

22:28 all right, then you're getting to the bottom of the iceberg in SQL, or maybe not the bottom,

22:33 maybe like the middle chunk under the water, but there's so much to learn for both of them.
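
To make the index point concrete, here is a small sketch using SQLite through Python's `sqlite3` module. The `readings` table, its columns, and the index name are assumptions for illustration. It asks SQLite for its query plan before and after adding a composite index that covers both the filter and the sort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, sensor_id INTEGER, "
    "recorded_at INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO readings (sensor_id, recorded_at, value) VALUES (?, ?, ?)",
    [(i % 50, i, i * 0.1) for i in range(5000)],
)

query = "SELECT value FROM readings WHERE sensor_id = ? ORDER BY recorded_at"


def plan(sql):
    """Return the human-readable steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (7,)).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


before = plan(query)
# Composite index: filter column first, then the sort column.
conn.execute("CREATE INDEX idx_sensor_time ON readings (sensor_id, recorded_at)")
after = plan(query)

print("before:", before)
print("after: ", after)
```

The exact plan wording varies between SQLite versions, so the useful signal is simply whether the index name appears in the plan.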

22:37 Amira, in the audience, asks, when you talk data job, what kind of jobs are out there? So we talked

22:43 both about how we did chemical engineering, and then we saw chemical factories. You're like,

22:47 yeah, I don't really want to work here anymore. I'm out. So thinking about, well, what are the

22:52 kind of jobs you do? I think that's really important because it's easy to get focused in

22:58 on the FAANG companies. I want to work for some super big tech company. I want to move to San

23:04 Francisco and like that, that, that, right. Like there's not just plenty of other jobs,

23:09 but the opportunities, just like you described, and as well as my first job, I worked at a company

23:13 that had like eight people and it was awesome. Right. They didn't expect me to be, you know,

23:19 running Kubernetes clusters and doing all sorts of crazy. They're just like, I need you to make

23:23 this thing happen. Can you do like, I'm pretty new, but that thing I can make that happen.

23:27 Like, let's go. Right. And I feel like the, the possibilities to get in, especially with these,

23:32 maybe more niche type of industries and companies might even be easier for a first job.

23:38 People seem to be really obsessed with, with the FAANG. And I don't know if that's like a

23:42 societal thing or if it's just, those are the companies that we use a lot. And so we're excited

23:46 about them, but yeah, there's so many more data jobs outside of FAANG than there are inside of

23:51 FAANG, even though there's, there's quite a bit inside of FAANG. And oftentimes those roles can be

23:57 much more interesting and you can make a much bigger impact. When I was working at this

24:02 small company, Vaporsens, I had so much power. I didn't even realize it. I had such a big

24:08 effect on the company. I was presenting to, you know, fortune 500 companies and what I did really

24:13 made a difference. And when it came to the point where ExxonMobil offered me to go be a data

24:19 scientist for Exxon, I said, Oh, I want to go work for the big company with the nice desk and the

24:24 nice laptop and, you know, try something new. And when I got there, I really, I had some pretty cool

24:30 opportunities when I was at ExxonMobil, but ultimately I left pretty shortly after two years

24:34 of being there because I just felt like a cog in the machine. And I didn't feel like I was actually

24:39 making a difference. And that was really important to my work satisfaction of like, is what I'm doing

24:43 being used? Is it being used to better the world? Do I feel valued? And the answer was kind of no

24:48 for me when I was there. So there's definitely a trade-off between the small companies and the,

24:51 and the big companies, but also to go back to your original question, there are so many freaking

24:56 roles in the data world that you're not even thinking of that. Like, I'm not even thinking of,

25:01 I saw a new one the other day when I was helping one of my students, it was like, it wasn't data

25:05 janitor, but it was something like that where I was like, I don't even know what that role is,

25:09 but there's, there's so many roles. When I was a data scientist at

25:12 Vaporsens, the small company, my actual title was junior chemometrician, which basically means

25:19 you're doing data science with chemistry. When I was at ExxonMobil, when I was first there,

25:24 I was doing data science, but my actual title was optimization engineer. And so there's so

25:28 many titles that we don't even think to search of, or even to look up, but those are all data

25:33 science roles. I was doing machine learning every day in both those roles. And you would maybe never

25:37 guess from those titles. - Yeah. You would never guess. No, that's awesome. What machine learning libraries, frameworks were you using?

25:43 - At Vaporsens, once again, because it's a smaller company, I had a lot more say in what

25:48 I was doing. We were building a bunch of machine, we were building classification models to basically

25:53 to take the data from our sensors and sniff if something was in the air. Sometimes that was a

25:57 yes, no, like, oh yes, there is ammonia in the semiconductor factory and that's bad. So that's

26:03 a yes classification kind of binary, right? Other times it was, what drug is this? Is this meth or

26:10 is this heroin? One of the use cases we had was, this is binary once again, but is this recreational

26:16 marijuana or medicinal marijuana? And can we tell the difference between those? So we were usually

26:21 using classification models, usually built in scikit-learn in Python the majority of the time

26:27 there. When I was at Exxon, we had a lot less say, like the data scientists had a lot less say

26:32 in the decision making process. We were doing a lot of multivariate linear regression with a lot of

26:38 crazy hacks and transformations kind of in the meantime for one of my positions there.

26:43 And then the other time, the other position I did there, we were doing a lot of auto ML

26:48 using PyCaret and letting it kind of decide what type of models to do. So.

26:52 - Okay. The unsupervised learning type stuff, huh?

26:54 - It was awesome. It was really fun to, I love PyCaret because it's like, okay,

26:58 go make 25 models and tell me which one's the best. It's like, makes my job easy, I guess.
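[Editor's sketch] The two ideas in this answer, a yes/no classifier on sensor readings and "make many models and tell me which one's the best," can be illustrated with a small standard-library example. Everything in it is an assumption for illustration: the toy two-feature "readings," the hand-written k-nearest-neighbors classifier, and the candidate values of k. It is not the scikit-learn or AutoML code used at either company; a real project would more likely call a library for both steps.

```python
import math
import random


def make_toy_readings(n_per_class=20, seed=0):
    """Hypothetical sensor readings: label 0 clusters near (0, 0), label 1 near (5, 5)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((point, label))
    return data


def knn_predict(train, point, k):
    """Binary k-nearest-neighbors: majority vote among the k closest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if votes_for_one * 2 > k else 0


def leave_one_out_accuracy(data, k):
    """Score a candidate model by predicting each point from all the others."""
    hits = 0
    for i, (point, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += knn_predict(train, point, k) == label
    return hits / len(data)


def pick_best_k(data, candidates=(1, 3, 5, 7)):
    """Try every candidate model and keep the one with the highest score."""
    scores = {k: leave_one_out_accuracy(data, k) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


data = make_toy_readings()
best_k, scores = pick_best_k(data)
print("scores per candidate:", scores)
print("best k:", best_k)
```

The comparison loop in `pick_best_k` is the part that mirrors the AutoML workflow: the person running it mostly decides what "best" means and interprets the winner.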

27:03 - We're going to be creative with sheer numbers. That's how we're going to come up with a solution.

27:09 Got it.

27:09 - Exactly.

27:10 - Well, Diego is asking like, what are some of the common stats methods as in mathematical type stuff you would use? So one of the things I know that some people

27:19 getting into programming think is you got to be really good at math to be a programmer. I think

27:24 you got to be really good at logical thinking, but you need almost zero math to be, like, a web

27:30 developer. You know, we're talking percents for CSS, incrementing numbers from one to two to two

27:37 to three for IDs and stuff like that. But for data science, maybe there's a little bit more like,

27:42 where do you see that kind of background?

27:45 - I like what you said. You have to think logically, but maybe the math isn't as important.

27:49 And I think it's actually somewhat similar in data science. I will say you probably need a

27:53 little bit more math than a web developer, but I think it's a lot less than most people think.

27:57 And it's probably less about being able to do the math and maybe more about understanding

28:02 the mathematical concepts. And what I mean by that is a lot of, a lot of, so I also have a

28:08 master's degree in data analytics. A lot of master's degrees in data science and data analytics

28:13 will say you need calculus and linear algebra as kind of a background for your math. And that kind

28:18 of stops people. I don't want to do any calculus. I don't want to do any linear algebra. And while

28:22 both those concepts do exist in data science principles, the majority of the time the computer,

28:28 Python is doing the math. You just have to be able to interpret the results of the math and kind of

28:34 know what different directions, like this is going down on an optimization problem. Okay. That's the

28:39 derivative, getting closer to zero. It's really less about knowing how to do the math by hand

28:44 and more just understanding what the math of the computer is actually doing.
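[Editor's sketch] A small example of "the derivative getting closer to zero" on an optimization problem. The function f(x) = (x - 3)^2, the starting point, and the step size are assumptions chosen for illustration; nothing here comes from a real project. The computer does the arithmetic, and the reader's job is to notice that the slope shrinks as the search approaches the minimum.

```python
def derivative(x):
    # Derivative of the illustrative function f(x) = (x - 3) ** 2.
    return 2 * (x - 3)


def gradient_descent(start=0.0, step_size=0.1, steps=50):
    """Walk downhill, recording the size of the slope at each step."""
    x = start
    slopes = []
    for _ in range(steps):
        slope = derivative(x)
        slopes.append(abs(slope))
        x -= step_size * slope  # move against the slope
    return x, slopes


x_min, slopes = gradient_descent()
print(f"x ends near {x_min:.4f}")
print("first few slope sizes:", [round(s, 3) for s in slopes[:5]])
print(f"last slope size: {slopes[-1]:.6f}")
```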

28:48 So I think it's actually a lot easier than most people say. That being said,

28:51 knowing how to do a derivative or take an integral, those concepts, I think is probably underlying

28:56 pretty important. But other than that, a lot of the times I'm doing linear regression because

29:02 it's awesome. It gets the job done. A lot of the time I'm doing hypothesis testing and statistics,

29:08 where you have to look at, like, a p-value, nothing all that crazy. At Exxon, I had to do a lot of

29:13 linear programming, but that's honestly, that's like the exception versus the rule. There's not

29:18 a whole lot of linear programming for most data scientists. So I really don't think the math is

29:23 all that hard. Now, of course, that's coming from someone who got a chemical engineering degree,

29:29 who had to take all the calculus, all the linear algebra. So I did go through those courses. I

29:33 haven't really done it from scratch. A lot of my students are teachers, for example, who never

29:38 took those courses in college. So I can't speak from that perspective, but a lot of my students

29:42 are able to figure it out at the end of the day and transfer. So it happens.
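[Editor's sketch] To make the "linear regression gets the job done" point concrete, here is a one-variable least-squares fit written with the standard library only. The five data points are invented for illustration. In practice a library would do the fitting; what matters is reading the result: the slope says how much y changes per unit of x.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up measurements that lie close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"y is roughly {slope:.2f} * x + {intercept:.2f}")
print(f"prediction at x = 6: {slope * 6 + intercept:.2f}")
```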

29:45 Yeah, yeah, for sure. I think you make a good point. I think it's about knowing, okay, this

29:51 formula or this algorithm or this test means this thing. It applies in this situation. It doesn't

29:57 apply in that situation. Here's what you're trying to get from it, right? Like I know I need to do a

30:01 fast Fourier transform. So, and this is what it tells me when I get out the other side, but do I

30:07 need to be able to sit down and recreate the integral and the calculus behind it and do that

30:15 on like a home, like as a homework example, and like, give me a function and I'll do the Fourier

30:19 transform and I'll actually do the symbolic integration. Like, no, you probably don't need

30:23 that. Right. But you need to know, I do the Fourier transform in this situation, and this is

30:28 why. And then I just say, call the function, do it. Right.
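[Editor's sketch] An illustration of "call the function and know what it tells you" for a Fourier transform. To stay dependency-free this uses a naive discrete Fourier transform, which is slower than a true fast Fourier transform but gives the same answer; in real work one would typically call a library routine instead. The test signal (a strong 5-cycle wave plus a weaker 12-cycle wave over 64 samples) is an assumption made up for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive O(n^2) discrete Fourier transform; returns magnitudes for the first n/2 bins."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):
        total = sum(
            value * cmath.exp(-2j * math.pi * k * i / n)
            for i, value in enumerate(samples)
        )
        magnitudes.append(abs(total))
    return magnitudes


N = 64
# Illustrative signal: 5 cycles per window at amplitude 1, plus 12 cycles at amplitude 0.5.
signal = [
    math.sin(2 * math.pi * 5 * i / N) + 0.5 * math.sin(2 * math.pi * 12 * i / N)
    for i in range(N)
]

magnitudes = dft_magnitudes(signal)
dominant = max(range(len(magnitudes)), key=magnitudes.__getitem__)
print("strongest frequency bin:", dominant)
```

The interpretation is the skill being described: the output says which frequencies are present and how strong each one is, without anyone doing the integral by hand.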

30:31 And interpret the results. Really? That's what being a data scientist is all about is, yeah.

30:36 What does the business use case, what's the desired business use case? How do I relate that

30:41 use case to the data? What technique can I use to get the outcome that I need? Computer go do it,

30:48 interpret results, present to stakeholders. That's a data scientist right there.

30:52 I think one of the challenges with that is going to be, not that it's not good, but I think it's

30:57 going to be challenging because how do you learn when to use a certain statistical test or when to

31:04 do some kind of funky transformation, like a Fourier transform without more traditional

31:10 mathematical backgrounds and all the academics will not just go, oh, we're just going to give

31:14 you like five minute overview and they'll help you understand. They're like, nope, we're going

31:18 to start with this axiom or this theorem from differential equations. I'm going to work up.

31:23 You're like, no, no, no, no, no. I don't need that. I don't, I'm not on a four year plan.

31:28 I'm on a four week plan. How do I, how do I get value from a couple of the mathematical things

31:33 without being sucked into like, yeah, now I'm in differential equations at Harvard online and I

31:38 don't understand how I got there. It's such a big problem. And I'm so glad you brought this up.

31:43 And I'll be vulnerable because yeah, I felt the same, the same way. And I was like, there has to

31:47 be a better way. And so about, what was it? Three years ago now, two and a half years ago, three

31:51 years ago, I said, oh my gosh, I'm going to solve this problem. And I'm going to start my own data

31:55 science bootcamp. And so I spent about six months making the curriculum, making all the videos.

31:59 I opened it up. I got some students in there and I ran it for about six months and I looked at the

32:04 results and man, we weren't getting anyone into data science jobs. And I thought, ah, what the

32:09 heck am I doing wrong? I had this brilliant idea of like, we're going to be less theory, more

32:13 project, more hands-on. And I realized, man, the truth is people just learn better at work.

32:20 That's where you learn that whole technique that you just like, how does someone learn that? The

32:24 answer is by getting experience and learning it at work. And when I looked back and I said, okay,

32:28 well, we have had students get jobs. What jobs did they get? And it turns out most of them were

32:33 getting like business intelligence engineer jobs or data analysts or financial analyst jobs that

32:39 were a little bit below a data scientist job. And I realized, oh man, if we can just help people go

32:45 from zero to one and get their foot in the door, they can go from one to five much quicker at work

32:51 because work is just, I don't know, it's this magical place where like you said, whatever you

32:56 were working at earlier, and they're like, hey, can you do this Kubernetes thing? They just kind

33:00 of throw you in the fire and you're like, figure it out. And that's somehow you do. I don't know

33:04 what it is about work, but you figure it out and that's where you learn. So that's kind of why I

33:08 changed my curriculum to be more focused on, okay, maybe people aren't going to become data

33:12 scientists, but can we get them to zero to one quickly? And then they can get paid to learn the

33:17 rest of the data science stuff when they're actually in that first position. - How much do

33:20 you know about what you actually want to do in the industry before you've done it as well?

33:25 - Right. - You're like, oh, I thought everybody said machine learning was awesome and I've used ChatGPT and I loved it, but it turns

33:33 out actually like API is better, but I've never had a chance to build an API. So until I started,

33:38 I didn't even learn that one, it was a thing, two, that it was cool or vice versa, right? Whatever.

33:43 But until you get kind of in, you don't even know like, actually this part is where I really

33:48 am enjoying it. And so just getting that first step, that's a big deal. - A hundred percent. You

33:53 don't know what you don't know until you know it. That's why, I mean, really when it comes to,

33:58 if we go back to just SQL or just Python, you could spend, I tell people this, if you tried

34:04 to master Python before you applied to a job, you'd be like 80 years old before you ever applied

34:09 to a job. Same with SQL, same with machine learning. The cool thing about data is we're

34:13 never going to know it all. And so just learn the bare minimum to get your foot in the door.

34:18 And then you have this place where you're going to get paid to learn what you want to learn

34:22 eventually. If you learn, oh, I love APIs. I promise you that there's a company out there

34:26 that will hire you and you can learn APIs on the job. Like that's going to happen. But that first

34:31 step is so true. - There's a company out there that it doesn't know it needs APIs, but you could

34:35 help them. And you know, they don't have huge expectations because this is the thing they just

34:39 learned they needed. Right? - A hundred percent. - Yeah. It's wild. Right? This portion of Talk

34:45 Python to Me is brought to you by Posit, the makers of Shiny, formerly RStudio, and especially

34:51 Shiny for Python. Let me ask you a question. Are you building awesome things? Of course you are.

34:56 You're a developer or data scientist. That's what we do. And you should check out Posit Connect.

35:01 Posit Connect is a way for you to publish, share, and deploy all the data products that you're

35:06 building using Python. People ask me the same question all the time. Michael, I have some cool

35:12 data science project or notebook that I built. How do I share it with my users, stakeholders,

35:17 teammates? Do I need to learn FastAPI or Flask or maybe Vue or ReactJS? Hold on now. Those are cool

35:24 technologies and I'm sure you'd benefit from them, but maybe stay focused on the data project.

35:29 Let Posit Connect handle that side of things. With Posit Connect, you can rapidly and securely

35:33 deploy the things you build in Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto,

35:40 Reports, Dashboards, and APIs. Posit Connect supports all of them. And Posit Connect comes with all the

35:47 bells and whistles to satisfy IT and other enterprise requirements. Make deployment the

35:52 easiest step in your workflow with Posit Connect. For a limited time, you can try Posit Connect for

35:57 free for three months by going to talkpython.fm/posit. That's talkpython.fm/posit. The

36:04 link is in your podcast player show notes. Thank you to the team at Posit for supporting Talk Python.

36:09 All right. Let's talk about some career advice. I mean, I know you talked about being connected on

36:18 LinkedIn pretty well and certainly having some kind of social network support. And maybe it's

36:24 not that you would call it not social, but a real world network of actual human beings that you

36:29 physically know somehow. What's that? I don't know what that is.

36:32 I know. We gave that up back in 2020, I thought. Anyway, there was some stat that I saw somewhere

36:38 that over half of the jobs are fulfilled before it even becomes a job posting. Maybe some of the

36:45 best ones is like, "Hey, who knows somebody who can do this?" We need some like your data scientist,

36:50 data scientist example. They quit. We need somebody. Anybody know a good data scientist?

36:55 I don't want to just go put it out on the open job market and have to have a hundred interviews and

36:59 who knows what I'm going to get. If you can recommend somebody, let's start there. Right?

37:04 So being in that group to be recommended, it's important.

37:07 It's the key. There was a really interesting survey done on LinkedIn and they said,

37:12 it was done by the same person, Jordan Nelson, by the way. He said, "How do you approach getting a job?" And then the next day he said, "How did you get your last job?"

37:21 And 80% of people, they use what I call the spray and pray method, which basically means you go and

37:27 you apply to as many jobs as you possibly can and hope for the best, cross your fingers. That was 80%

37:32 of what people were doing. And then on the next poll the next day, it was a total, I think of what,

37:36 70% were either headhunted, recruited or referred. Yeah. Yeah. Yeah.

37:41 And so it's like the Pareto principle here where 80% of the effort is only getting you 20% of the

37:47 results and really 20% of the effort gets 80% of the results. So it's okay. We know networking and

37:52 getting recruited is really important, but how do we do it? It's easier said than done. And like

37:56 you said- In the industry, how do I make friends who are, right? It's like, well, my neighbors

38:01 don't do it, so I guess I'm out. That's the tricky thing is, yeah, if you're not in the industry yet,

38:05 how do you get recruited into it or how do you know someone? And what I've come to learn is it

38:10 actually doesn't even matter. So like, for instance, let's take your neighbor, right?

38:14 Your neighbor is probably not a data scientist. Maybe you're lucky and they are, and they can

38:18 refer you to a company. But what's really cool is I've learned that companies really come to trust

38:23 their employees and their employees' recommendations. And so even if your neighbor, let's

38:28 say, is a web developer or maybe even less technical, let's just say your neighbor is in

38:32 finance, right? And if there's an opening, like a data science opening at that company, a lot of

38:38 the times they will actually take their employee referrals much more seriously than any sort of

38:43 cold application that they get. And so a lot of the times I've had students who just know someone

38:48 that works at the company. They saw a job opening pop up. They're quickly, they message their

38:52 friends, "Hey, do you know a recruiter or a hiring manager I could talk to more about this role? Could

38:57 you do an internal referral for me?" And they were able to land jobs that they probably wouldn't

39:01 have. No, they definitely wouldn't have without that internal referral. So it is tricky. It's

39:06 that old cliche. It's not what you know, it's who you know.

39:08 I think there's still plenty of ways, COVID notwithstanding. I think that these days there's

39:13 plenty of ways to get those connections, right? But maybe people don't know. Like

39:17 meetup.com is really good. If you live in a non-tiny city, there's many, many things going on

39:24 around data science, around Python, around other data engineering, whatever, right? You could go

39:30 to those things. They're typically even free. Often they are free with food. They even feed

39:34 you, right? And make connections or regional conferences or national conferences, right?

39:40 We probably, many people have heard of PyCon, right? There's US PyCon, there's EuroPython,

39:46 and then there's, but those are the ones that are often talked about, but there's 10, 20 little

39:52 smaller regional ones in the US and many more that I'm not aware of throughout the world. Probably

39:56 one of those within driving distance, right? That you could go to, make connections and just also

40:01 kind of take the temperature of actually what you see on the internet versus what you see

40:06 and actually talking to real people. So I'd also say just get out there.

40:10 A hundred percent. Those places have the people who probably want to hire you because they're

40:16 local, right? Which is one thing that's trouble on LinkedIn. I'm big on networking on LinkedIn,

40:21 but a lot of the times you're going to be networking with people who in all likelihood

40:25 might never have a role that's even open to you. But the people that you're like, for instance,

40:29 we have, I'm in Utah and we have Silicon Slopes that has a tech meetup. We have a local Python

40:36 meetup chapter. We have the big data and developers conference that's free every year

40:41 with tons of food. And the people who go there are people from companies around there that have

40:46 the openings that you're trying to find and they want to hire people like you who are in the area.

40:51 So at least you can maybe come to the office once a week or maybe once a month or whatever, right?

40:55 And so really, like you said, going to those meetups, it's tough because networking is always

41:00 difficult either online or in person, but at least in those situations, you know, Hey, these are

41:04 people that are tied to real companies that exist around me that do make data hires. So I have a

41:10 chance. Definitely a much higher chance than just shooting out a resume. All right, well, let's see.

41:14 We talked about job hunting already. What about like applications and resumes? What are your

41:20 thoughts on that? I think once again, with the applications, the more targeted that you can make

41:25 it, the better, right? So if you can really hone in on, I really want this job, I'm going to cold

41:31 message five people at this company and see if I can get that internal referral one way or another,

41:36 make a real connection with them. I think that's really key. And then with resumes, resumes are

41:41 more of an art than they are a science. I feel like they are so difficult to figure out. And

41:46 these ATSs that are trying to match you and see if you're a good fit. I've tried a lot of them and a

41:51 lot of them suck. Whoever's the data scientist behind those, we need to have a conversation

41:55 with them because it's a little tricky sometimes. But one of the coolest concepts I've been

41:59 introduced to recently, and I have a whole episode on my podcast about it, is A/B testing your resume.

42:06 And basically the idea is a resume's job is just to get you a screener interview or like a beginner

42:14 interview, basically, right? That's all an interview. No one's seeing a resume and then

42:17 hiring you. They're always going to interviews. So if you think about it, a resume's job,

42:21 the only job it has is to convince someone to get on the phone and talk to you. And it's just a

42:26 piece of paper. And guess what? You can put whatever you want on that piece of paper.

42:30 Now, I'm not saying to lie, but I'm just saying you could theoretically make a perfect resume for

42:36 whatever job you're trying to go for and send it out there and see what happens, right? But I'm not

42:40 saying to do that. I'm not saying to lie. My point in saying this is that the resume is just to get

42:45 you the interview. And if you're not getting interviews, something's probably wrong with your

42:49 resume. And so tweak something, apply to 10 more jobs, see what happens. Tweak something, apply to

42:55 10 more jobs, see what happens. Until you finally have the right combination of skills, of experiences,

43:00 of different keywords. Because a lot of the time you're just trying to beat the ATS. And that's the

43:05 sad part about it. It's like, how do I prove to this random computer algorithm that they should

43:10 talk to me on the phone? That's a hard game to beat. And there's a whole bunch of advice from

43:14 all these different people. What I've come to learn is it's different for every company. It's

43:18 different for every person. It's kind of a numbers game until you get lucky and you figure it out.
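[Editor's sketch] One way to judge whether a resume tweak actually changed the interview rate is a two-proportion z-test, a standard hypothesis test for comparing two rates. The function name and all the counts below are hypothetical. The example also shows why batches as small as ten applications often cannot separate a real improvement from luck, so the tweak-and-reapply loop may need several rounds before the signal is clear.

```python
import math
from statistics import NormalDist


def interview_rate_p_value(apps_a, interviews_a, apps_b, interviews_b):
    """Two-sided p-value for 'resume A and resume B get interviews at the same rate'."""
    rate_a = interviews_a / apps_a
    rate_b = interviews_b / apps_b
    pooled = (interviews_a + interviews_b) / (apps_a + apps_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / apps_a + 1 / apps_b))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (rate_b - rate_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical small batches: 1 of 10 versus 3 of 10 interviews.
small = interview_rate_p_value(10, 1, 10, 3)
# Hypothetical larger batches: 2 of 50 versus 9 of 50 interviews.
large = interview_rate_p_value(50, 2, 50, 9)

print(f"10 applications each: p = {small:.3f}")
print(f"50 applications each: p = {large:.3f}")
```

A smaller p-value means the difference is less likely to be chance alone; by the common 0.05 convention, only the larger hypothetical batch would count as evidence that the new resume is doing better.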

43:22 That's good advice. I guess two thoughts. One is, I know that, not speaking specifically to anyone,

43:28 but in general, women wait until they match all the requirements of a position where a guy's like,

43:34 I know three of those things. I'm taking a flyer. I'm sending it. I would just like to

43:39 encourage the women out there to just send it as well.

43:43 I a hundred percent agree with that. And I think if you reach 60% of the requirements,

43:48 I think you have a chance. A lot of the times those are wishlists and not actual requirements.

43:53 And depending on, are you local to the area? Do you have a domain experience in this company?

43:59 There's lots of other factors. What about contributing to open source or having GitHub repos that can be projects that you can show off? Or what's your advice there?

44:08 I'm a huge proponent of projects in the portfolio. I think if you don't have experience with

44:13 something, you create your own by building a project. And if you can do that with open source,

44:18 I think you should totally do that because I've benefited so much from open source.

44:22 I have not given back as much as I should to open source development and projects. I definitely

44:29 should do that. But if you can find a project that you're passionate about that you can help with,

44:33 I think you should totally do that. Even if it's not open source and you're just building a project

44:37 to showcase your skills, I'm all about that. I think you can do projects that are super fun,

44:42 maybe that are good for your community or good for your life. I'm a huge fan of personal projects.

44:47 I've put a Fitbit on my dog before and looked at her steps. I've found the healthiest meal

44:52 at McDonald's. I've looked at, visualized my weight over time and tried to create different

44:58 forecasting models and stuff like that. There's so much data in our lives that you can use to

45:02 make really cool projects. Oh, absolutely. You talked about, okay, you get your first job and that's where you kind of really learn. But if you don't have

45:09 your first job, you can effectively simulate that. Say I would have gone to a job and been given a

45:15 project to analyze something. I'm just interested in this thing. I've got two hours a day until I

45:20 get a job that I can be inspired about this and just get going on it. Maybe create a website and

45:24 publish your results and it can draw more people in to actually see that and start to appreciate.

45:31 They could even ask, all right, who's behind this cool project? Maybe I want them to come

45:35 work for me. Little did they know you're doing all this work because you got some spare time

45:40 and you're trying to build up your experience and a self-guided study, right?

45:43 Yeah. If you can build a cool project and flip the job hunt where you're not applying for jobs,

45:48 but jobs start to apply for you, you're in such a good position and doing really cool projects

45:53 can help you get there. Now it's hard to do cool projects. It's hard to publish projects,

45:58 which is one of the things that people really struggle with. For all you Python listeners out

46:02 there, let me just tell you, Streamlit is absolutely amazing because it makes the deployment

46:08 process so easy. It's free. It's a little tricky to deploy at first, but compared to what you used

46:15 to have to do back in the day, I'm saying back in the day, like four years ago, basically,

46:21 but it was really hard to deploy something where you could send someone a URL, "Hey,

46:25 check out my web application, machine learning application." Streamlit is such a cool app that

46:29 makes it so easy and so intuitive to make these cool little apps that you can just put on your

46:35 resume, put on your portfolio, send to recruiters. I'm such a fan of the Streamlit app. I love it.

46:40 Yeah. It's super cool. There's a couple of those and Streamlit's definitely one of the really nice

46:44 ones there. They host it. There's also, there's like some hosting behind Streamlit as well these

46:50 days, right? Like you don't even have to set up a server or anything. You just create it and put it

46:54 up there. That's what I'm saying is like back in the day, I use Dash a lot and I'm still a big fan

46:58 of Dash. Dash is more customizable than Streamlit and can do quite a bit more, but it's a lot more

47:05 work to deploy it. It's more like programming. Yeah, it is. It is more, a lot more programming.

47:10 Programming the UI rather than just the behind the scenes. Yeah.

47:12 It's, you have to do, yeah, you have to do both and you have to like know a little bit about like

47:16 systems and data engineering and stuff like that versus Streamlit kind of takes that,

47:20 abstracts that away. But yeah, back in the day, I used to make Dash web applications and deploy

47:24 them on Heroku back when they had a free tier of hosting and they've taken that away. So I don't

47:29 even know what the go-to free hosting platform is nowadays. I just, I moved most of my things

47:34 to Streamlit and it's so nice. Yeah. We got Shiny for Python now, which is also nice.

47:38 I haven't checked that out. How is it? I haven't done too much with it either,

47:41 but they're, Joe and the team over there are doing pretty cool stuff, like adding more like

47:46 dynamic interactive stuff to Jupyter, like running it inside Jupyter and things. Yeah. Pretty cool.

47:51 I'll have to check it out. I think they also do a bunch of hosting stuff

47:54 over there as well, is why it came to mind. What other advice do you got for folks out there? So

47:58 AI, is AI, not studying AI or learning to use AI, machine learning, but is there a benefit of trying

48:05 to, you know, like use ChatGPT to help you get this job? Or is there a danger? Like I'm thinking,

48:10 for example, like have a ChatGPT, write me a awesome resume. And then the tools are like,

48:16 well, we've detected this as AI generated and it's out. You know what I mean? Like,

48:20 what do you see happening there? A lot of people see AI as like an all or nothing tool as in it's,

48:27 it's either you, the human doing the work or it's the AI doing the work. But whenever,

48:32 I don't know about you, but whenever I'm using ChatGPT for anything, it's very rare that

48:36 it's copy and paste for me, or at least that it's not iterative, where I'm doing multiple prompts,

48:41 prompt after prompt, after prompt, trying to tweak it exactly what I want. And so the way I look at

48:46 ChatGPT and other gen AI that will be coming out, that's only inevitable is instead of looking at,

48:51 does this replace me? Does this, like, for instance, am I going to build my whole resume

48:56 using ChatGPT? Am I going to build, can ChatGPT build, you know, take a data scientist's job and

49:01 build the whole model for them? I like to see it more as like a hammer. It's like a tool for the

49:06 data scientist or a tool for the job searcher to use in conjunction with your screwdriver or

49:13 it's like something to be wielded by a human, not a replacement for the human. That makes sense.

49:18 You know, it's really good for stuff like, Hey, here's, I know a regular expression will do this.

49:22 Yeah. The last time I studied, I completely forgot what this is about and I know it's gnarly,

49:27 but if I just ask, here's an example, here's what I want. Boom. And traditionally what you would end

49:32 up doing is you'd be on stack overflow. Yeah. Be all over the internet. You'd be trying to piece

49:36 it together from external information anyway. And so code is something that's a little bit

49:42 more in the wheelhouse of the generative AI because it can't really make it up as much.

49:47 I know it could like do something insecure and you didn't know it was or whatever, but

49:51 it's not like asking for legal advice where it makes up cases that didn't exist. Like

49:56 it gives you code, you put it in the runtime or the compiler and it runs or it doesn't,

50:00 the output comes like you did. Yeah. It works or not. Yeah. So it's, it's pretty,

50:04 pretty effective for that. But yeah, for resumes, I would be more like, let me ask it,

50:09 what are the in demand things? And if I know these three skills, what other skills should I know to

50:15 get a, you could sort of use it in an explorative way to then come up with what you might write for

50:20 yourself. Right. Something like this. I find it really useful for brainstorming, like action

50:26 verbs on your resume bullets. Like, I think it's really good at that. What's 10 different ways to

50:30 say led. So I don't say led five times on my resume and I use some different action bullets.

50:35 I think it's great at that. I personally, it's pretty rare that I start any Python code from

50:41 scratch nowadays. I'm either starting hopefully from a template that I've already written,

50:46 or I'm starting from a ChatGPT. Like this is what I kind of want to accomplish,

50:50 right? Like the outline for it. Like one of the things I hate doing is I make a lot of

50:55 Streamlit apps. I probably make a Streamlit app a month right now. And I hate starting from scratch

50:59 with Streamlit. It's super easy to start from scratch, but I'll say, Hey, ChatGPT, I want to

51:03 build a Streamlit app. This is like the component I want here. This is the component I want here.

51:06 This is the component I want here. And it's almost like a warmup for me as a programmer

51:11 and it will create, it'll create something that works. It's not what I want. And I spend the

51:16 next five hours trying to make it what I want, you know, without ChatGPT, but it kind of gives me a

51:21 warm start to my programming process. So I really like it. I think it's something that everyone

51:27 should use. And I think if you're thinking about getting into any sort of programming, you know,

51:31 whether it's data science or, or web development, I think you should be a little bit less worried

51:37 about it taking your job and job security. I think you should almost be more excited that,

51:42 wow, the bar has never been lower to break into tech. Like this is a step up gift from the

51:48 programming gods that I get to use to break into tech.

51:51 Another thing to keep in mind is I imagine a lot of people listening to this podcast are not just

51:56 starting a college program, right? They're coming from possibly other experiences, other specialties.

52:03 You know, what's really good for job security, knowing the intersection of two things,

52:07 the intersection of chemistry and programming, the intersection of geology and programming

52:13 for Exxon potentially, right? Like those things take you from a pool of a thousand to a pool of

52:19 tens. Right. And so what's awesome about that is it means two things. You don't throw away.

52:24 If you got a degree in something else like biology or whatever, you don't throw away like,

52:28 well, that was wasted four years. That's out. And it slices the pool of people who could apply

52:33 for certain jobs way, way smaller. Right. Sounds like you agree.

52:36 Oh, a thousand percent. I'll just tell a quick little anecdote. When I was at ExxonMobil,

52:41 there were a lot of things I did not like at ExxonMobil, but this is something I really

52:44 liked is about once a quarter, they would do a crowdsourced data science competition for the

52:51 whole organization, like around the entire world. And they would say, this is a business problem

52:54 we're trying to solve. And at Exxon, we have data scientists all over the world and like all

52:59 sorts of different teams and things like that. So I did not know all the data scientists at Exxon.

53:04 And they'd say, this is the problem we're facing. Here's the data, go. Right.

53:08 Nice. Yeah.

53:09 I loved participating in these. It was like right up my wheelhouse of like,

53:13 I really enjoy exploration and all this stuff. At the time I was getting my master's degree,

53:17 but I didn't have my master's degree and I was competing against, so I'm just a chemical

53:21 engineering grad, right? And I'm competing against people with PhDs in computer science

53:26 and in data science and all these people who have way more experience than me.

53:30 And I actually won a few of these competitions. Thank you. I appreciate it. And it's not because

53:36 I was a better programmer or a better data scientist. It's because I majored in chemical

53:40 engineering and I knew the business problem, the domain extremely well. And I kind of knew

53:46 the programming and the data science stuff, but the combination of them made me very valuable.

53:51 Like one of the best examples I have is we were looking at crude oil properties. And I remember

53:57 there was a forum where you'd ask your questions. And one of the data scientists asked, "Hey,

54:01 is sulfur bad? There's lots of sulfur in this. Is it bad?" And to a chemical engineer,

54:06 that's like the most obvious thing. Yes, sulfur is very bad in crude oil. That's a no-no.

54:11 That's like such a fundamental thing to me, and to him or her that was like groundbreaking.

54:17 And so, yeah, your domain can become your superpower in your career.

54:20 Yeah. And it makes it way harder for ChatGPT and other types of tools to just automate you out of

54:26 a job because you bring in all these skills together, which is awesome. But it also makes

54:30 it easier for you to get the job. It makes it easier for you to continue your momentum of

54:34 whatever you've been up to. It's good all around. Yeah. I think it's more fun too,

54:38 because once again, when I was trying to decide if I should study computer science, I was like,

54:43 "Man, I don't really want to build an Excel workbook for building an Excel workbook's sake."

54:50 That's still true for me today. I don't want to do data science for data science's sake.

54:54 I only like machine learning or data science when I'm doing it to solve a really fun problem I'm

54:59 passionate about. That's why it's more fun. So if you can be excited about the domain

55:03 and excited about the algorithms, I think that's a great place to be.

55:06 Absolutely agree. All right. We're getting short on time, but maybe tell us a bit about

55:11 your Data Career Jumpstart. You've referred to it a couple of times.

55:14 Yeah. I have a company called Data Career Jumpstart. I just try to do a lot of education.

55:18 So the education happens on LinkedIn, happens on YouTube. And I actually forgot to mention this at

55:23 the beginning, but I have my own podcast called the Data Career Podcast, where I help people

55:28 land their first data job. We're about at a hundred episodes. So not quite the groundwork

55:32 that you've put in. That's still a ton. That's awesome.

55:35 Yeah. We're getting there. And then, yeah, I also have a bootcamp where I try to affordably help

55:40 people land their first data analyst position by teaching them the skills, the networking,

55:46 and the project and portfolio building that they need to do so.

55:50 Like the long version of this show. Yeah. Basically, yeah. Just take what

55:53 we talked about today, expand on it, make it like 350 unique lessons. And that's exactly what it is.

56:00 Yeah. Very cool. All right. Well, we're about out of time. So maybe just, Avery, final call to action,

56:06 people, maybe you're inspired. I see Diego out in the audience said awesome talk very much. So

56:11 what's next? It's easy to be inspired, but you got to take action.

56:15 Yeah. I love that. I think it's always fun to listen to podcasts, but you probably benefit way

56:21 more from the action you take after a podcast. So for you guys who are maybe interested in a data

56:26 analytics or a data science career, explore that. If you're like, yes, I'm in, make a plan,

56:31 make a roadmap. If you need help, I have a webinar that will help you make a roadmap.

56:36 What skills should you learn? How should you be networking and stuff like that?

56:39 But really probably if you're just getting started, trying to figure out what skills you

56:42 should learn, what are the top skills that you should be learning? And then learning those

56:45 skills. And then not only learning those skills, but take action and learning and build some sort

56:49 of a project that we talked about that you could put on a portfolio, make a Streamlit app or

56:53 something like that. That's probably the best action you could possibly take. If you need any

56:58 ideas, advice, feel free to check out my website, datacareerjumpstart.com or the podcast,

57:02 Data Career Podcast. Hopefully there's lots of free resources for you guys to check that out.

57:06 If you've never seen Streamlit before, I have some YouTube videos about Streamlit that you

57:10 guys can check out, but I love it. Just take action somehow, do something.

57:13 That's one of the huge, huge differentiators is like, you might be inspired, but you just

57:18 got to start taking those steps and it becomes a snowball. So thanks for sharing all your experience

57:23 and your advice. Hopefully some people out there are taking action and yeah, I'll put

57:27 out everything we talked about in the show notes, of course. So thanks for being here, Avery.

57:30 Yeah, thank you. Thanks for having me. Appreciate it.

57:32 You bet. Bye all.

57:34 This has been another episode of Talk Python to Me. Thank you to our sponsors. Be sure to check

57:39 out what they're offering. It really helps support the show. Take some stress out of your life. Get

57:44 notified immediately about errors and performance issues in your web or mobile applications with

57:49 Sentry. Just visit talkpython.fm/sentry and get started for free. And be sure to use the promo

57:56 code, talkpython, all one word. This episode is sponsored by Posit Connect from the makers of

58:02 Shiny. Publish, share and deploy all of your data projects that you're creating using Python.

58:07 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards and APIs. Posit Connect

58:15 supports all of them. Try Posit Connect for free by going to talkpython.fm/posit. P-O-S-I-T.

58:22 Want to level up your Python? We have one of the largest catalogs of Python video courses over at

58:27 Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async.

58:33 And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm.

58:39 Be sure to subscribe to the show, open your favorite podcast app and search for Python.

58:43 We should be right at the top. You can also find the iTunes feed at /iTunes, the Google Play feed

58:49 at /play, and the direct RSS feed at /rss on talkpython.fm. We're live streaming most of our

58:56 recordings these days. If you want to be part of the show and have your comments featured on the

59:00 air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. This is your host,

59:06 Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and

59:10 write some Python code.

59:12 [Music]
