00:00 Are you interested in data science but you're not quite working in it yet?
00:03 In software, getting that very first job can truly be the hardest one to land.
00:08 On this episode, we have Avery Smith from Data Career Jumpstart here to share his advice
00:13 for getting your first data job. This is Talk Python to Me, episode 455, recorded January 18th, 2024.
00:21 [Music]
00:35 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy.
00:40 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython,
00:45 both on fosstodon.org. Keep up with the show and listen to over seven years of past episodes at
00:51 talkpython.fm. We've started streaming most of our episodes live on YouTube. Subscribe to our
00:57 YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of
01:03 that episode. This episode is brought to you by Sentry. Don't let those errors go unnoticed.
01:09 Use Sentry like we do here at Talk Python. Sign up at talkpython.fm/sentry. And it's brought to you
01:17 by Posit Connect from the makers of Shiny. Publish, share, and deploy all of your data projects that
01:22 you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports,
01:29 Dashboards, and APIs. Posit Connect supports all of them. Try Posit Connect for free by going to
01:35 talkpython.fm/posit. Hey folks, before we jump in and talk about data science jobs and careers,
01:44 I want to tell you really quickly about some awesome news. Back in February, I gave the
01:48 keynote at PyCon Philippines. It was entitled "The State of Python in 2024." Well, that is now out on
01:56 YouTube. The team at PyCon Philippines did a great job. The video came out great. If you want to
02:01 check out "The State of Python in 2024," according to me, just click on the link in the show notes to
02:06 watch it over on YouTube. Now let's talk to Avery. Avery, welcome to Talk Python to Me.
02:11 Thanks so much. I'm so excited to be here and be part of the show.
02:15 I'm excited to have you here as well. You know, one of the things that people reach out to me
02:19 often is, you know, how do you get into data science? How do you get into programming? How
02:25 do you get into Python? You know, I've been trying, or maybe they got a degree or they
02:30 took some training program, bootcamp or something, and going from zero to one, I think, is the biggest
02:38 career step you have to make. That next job and the one after that, it only gets to be smaller
02:44 steps, not bigger steps. And it's really tough because that first big step, you're brand new at
02:48 it. You have no experience, right? It's your first data science job or your first programming job.
02:53 So hopefully we can give some folks out there a little bit of a hand up to help them make that
02:59 jump. Yeah, totally. I like to show this graphic that says, it's a circle and it's a circle of
03:04 text. And it says, I can't get a job because I don't have experience because, and then it restarts,
03:11 I can't get a job. And that's the tricky part. It's like, how do you get a data science job when
03:15 you have no data science experience? Because to get data science experience, that seems like you
03:20 have to have a job as the prerequisite and vice versa. So it is very tricky. So happy to chime
03:25 in on that today. The industry can take it too far. They can take it way too far. So a few years
03:31 ago, there was a really funny tweet that went around back when they call them tweets. I don't
03:35 know what they're called anymore. Sebastian Ramirez, the guy who created FastAPI, saw a job
03:42 posting when FastAPI was like a year and a half old. It said, you must have four years of experience
03:47 with FastAPI to apply. And he said, Hey, look, I'm the creator of FastAPI and I'm unqualified for
03:53 this job. What kind of world are we living in? Yeah. I don't want to live in that world, but
03:58 that's unfortunately where we're at. That's so tough. And it's hilarious. These job descriptions
04:04 are getting out of hand. That's for sure. Yeah. Well, with AI, it's probably not going to get
04:08 better. We can talk about that more later. But before we get into that, let's just jump in with
04:13 a little bit of background on you before we get to the topic. Tell us a bit about yourself. What
04:18 do you do? How'd you get into Python? Things like that. Yeah, absolutely. So I'm currently
04:23 a data science consultant and also a data science instructor. I run some online programs where I
04:29 teach people to become data analysts mostly is what I'm focused on. But I also have this practice
04:35 where I help companies solve data problems with different techniques. I started actually by
04:41 studying chemical engineering in college in my undergraduate degree. And about a semester in,
04:47 I realized, crap, I hate this. This is not for me. Do you agree? Have you felt something similar?
04:54 I did a semester of chemical engineering as well. I thought, I love chemistry. I love math.
04:58 Put them together. Somehow they don't go together. It's like ice cream and eggs or something. No,
05:04 they don't go together for me at least. Yeah. It wasn't good for me either. I was just like,
05:08 oh man, I'm actually not interested in refineries or manufacturing. But I, like you,
05:14 liked chemistry. I liked math. I thought this is perfect. But I quickly realized, oh man,
05:18 I really like this whole programming part that I get to do in MATLAB at the time when I was an
05:23 undergrad. And I was on a time crunch to get through college quickly through eight semesters.
05:29 And the other issue I had was I didn't know what to do instead. It was like, I don't really want
05:34 to study computer science. Part of the reason why is they had this weed-out course at
05:39 the beginning, in which you had to build Excel from scratch, basically, like some sort of a
05:44 spreadsheeting tool. I was like, wow, why would I rebuild something that already exists that I
05:48 don't even like using in the first place? I wasn't really into it. So I didn't know what to do. And
05:53 luckily I was working as a lab technician at this company, the really cool company that makes the
06:00 sensors that basically have the ability to smell. So they can sniff what's in the air and it has
06:05 applications for finding drugs or bombs in airports and stuff like that. And there was
06:10 a data scientist on staff and that data scientist was awesome. He was showing me all these cool
06:15 algorithms he was writing for these sensors. And then one day he got up and left and he left the
06:20 company. And we tried to hire another data scientist for six months, but they were really
06:25 expensive. We were a small company and none of them really wanted to move to Utah where I lived
06:30 in Salt Lake City. And so we couldn't really find someone that would be able to do it.
06:33 And so finally I was like, well, I really like this programming stuff. And the data scientist
06:38 showed me a thing or two, maybe I could take a stab at this. And I wrote my first machine learning
06:43 algorithm. And I was like, oh my gosh, I'm addicted to this. And then I never looked back
06:47 and have been doing data science ever since, basically. What a great story. I think a lot of people
06:52 fall into programming that way. And for some reason, not unexpectedly, but for some reason,
06:57 a lot of people fall into Python that way as well. They're like, I have a job and I got this thing I
07:02 got to do. I just need a little bit more than maybe like an Excel spreadsheet or something.
07:08 And put it together and like, actually, this is cool. After a while, like this is cooler than
07:11 what I've been doing. Or maybe I'll make it a good part of what I do. Right?
07:15 Yeah. A hundred percent. Even just making, it was in MATLAB, which is basically
07:19 engineer's version of Python or college version of Python 10 years ago. Right?
07:24 Yeah.
07:24 And I made like tic-tac-toe and I remember playing tic-tac-toe against the computer.
07:28 I think that's what it was, or maybe it was hangman. I can't remember. But I remember the
07:31 idea of like being able to play, to program games and play against the computer. And I built it.
07:37 I was like, this is the coolest thing ever. I got to do more of this.
07:40 Absolutely. I've done some MATLAB too when I was younger and it's not that different from Python,
07:45 but it's, I think one of the big differences other than it just being like embedded in a
07:50 big expensive app is it's not a general purpose programming language. Right? You wouldn't go,
07:55 that was fun, but let me go build this website in MATLAB or let me create Airbnb in MATLAB or,
08:02 you know, like, you just don't want to. It sort of has this, like,
08:06 self-prescribed limit to what you can do with it.
08:08 That's one of the coolest parts about Python is it's really a Swiss army knife
08:12 and you can pretty much do, I don't want to say anything, but pretty close to anything
08:17 in Python, which makes it really neat. And obviously one of the huge limitations of MATLAB
08:21 is one, it costs thousands of dollars, but two, you're right. It's not going to do cybersecurity
08:26 for you. It's not going to build websites, but the syntax at the end of the day was,
08:29 was really quick. It was, it was easy for me to transition from MATLAB to Python because the
08:34 syntax isn't all that different.
08:36 No, it's not all that different. More math focused, but pretty similar. So I think maybe
08:40 that's a good place to start discussing and exploring this topic of your first data science
08:45 job. And I wouldn't necessarily plan on starting here, but let's start with before you
08:49 even necessarily know a programming language, right? Maybe you've dabbled in MATLAB or you've
08:55 dabbled in Excel or even dabbled in, I don't know, JavaScript or something. This thing
08:59 we've been talking about with MATLAB, and it applies to other areas as well, like true
09:03 programming languages per se, like Julia or something like that, is how, if you invest
09:10 your time into learning one of these things really well, like how broadly industry-wide
09:16 of a skill, high demand skill is that going to be, right? You learn MATLAB, you put yourself
09:20 in a box, you learn a more general programming language, you kind of have more options afterwards,
09:25 right?
09:25 Yeah, totally. I think like the more broad of a language you learn, the more useful you
09:30 are to, to more industries in general. But I might take that even a step further and
09:36 just say, you know, learning MATLAB, not a whole lot of companies use MATLAB, but just
09:41 like landing your first data job, going from zero to one is the hardest. Learning your
09:45 first language zero to one is the hardest as well. And then once you have that first
09:49 language, the next language becomes so much easier. So one of the first things I learned
09:54 was, was MATLAB and then I moved to Python and that was easier. And then I learned SQL
09:58 and then I learned R and then I learned JavaScript. And every time I added like a new tool to
10:02 my toolkit, it was, I don't want to say it was easy, but it got easier with
10:06 each one. I think that's true with foreign languages as well. Once you learn one foreign
10:10 language, then the third and the fourth become quite easy. At least that's, that's what
10:14 I heard. I speak kind of two and a half languages, but like there's people who speak like seven
10:19 and they always say like the sixth and the seventh become easy.
10:21 Yeah. You wonder how could you possibly, because learning the first one is so hard,
10:25 first foreign language. So you're like, well, how could you possibly take that on for this
10:29 many languages? And it's that it's not the same challenge each time, right?
10:32 Yeah, exactly.
10:32 Yeah. So I think when people are considering getting into data science, they really want
10:37 to consider what language they choose and where they go. Like you're coming out of a
10:42 college program, you might feel like MATLAB or something like that's real popular. And
10:46 yet that's because it's popular amongst professors who force their students to do it. That doesn't
10:51 necessarily mean that's the world, the broad worldview. What do you think about R? You
10:56 know, both.
10:57 I like R. I'm not, I sometimes troll R on LinkedIn. So I guess that's another thing
11:02 I should say is I post a lot on LinkedIn, kind of a LinkedIn guy. And so a lot of the
11:06 times, honestly, just for jokes and kicks and giggles, I'll kind of roast R on LinkedIn
11:11 just to get the trolls angry in the comments. And it's quite fun. It's quite a fun experience,
11:15 but I'm not that big of a hater. I definitely think R has its place.
11:18 The one thing that's really interesting about R versus Python, which is obviously a big debate
11:22 in the data science community, is that R kind of does one thing really well. And it's
11:28 getting a little less of that as more packages and libraries are added to R, but R does the
11:33 statistics and machine learning very well. But obviously I don't think once again, I
11:37 don't know any websites, any like super functioning websites that are built on R.
11:42 I don't know any cybersecurity that's really done, done via R. So I think R does what it
11:46 does well. The syntax sometimes is a lot easier for people to go from Excel, which a lot of
11:51 people are more familiar with in the finance or banking world. For example, the syntax
11:56 in R is a little bit more similar to those Excel formulas than it is to Python. So I
12:01 think sometimes people have a little bit more success just because, oh, this kind of feels
12:04 like R formulas, or sorry, this feels like Excel formulas. And so people really get there.
12:09 I think what you're kind of alluding to is if you're going to learn one skill, you
12:13 might as well learn the one skill that's applicable to the most, the widest net. Right?
12:18 And so that way you're fishing in the biggest lake you possibly could versus in a smaller
12:24 pond of R. I think that's worth looking at. And one of the things I actually really enjoy
12:28 doing, because you mentioned, oh, you might think MATLAB is popular because that's what
12:32 the professors taught you. And there's actually not a whole lot of data out there about, well,
12:37 what should you learn? So I don't know if you know who Luke Barousse is. He's a data analyst
12:41 YouTuber. I was going to say YouTuber on YouTube, but that's kind of redundant on YouTube.
12:45 And one of the things he's done is he's actually built this tool where he's web scraping
12:49 thousands of jobs, different data jobs every week, and then displaying and analyzing the
12:55 skills required for those jobs. So it's actually like a data-driven way of saying, if you want
12:59 to be a data scientist, what skills should you actually be focusing on as you go, as
13:04 opposed to just listening to what a professor will say or what a LinkedIn influencer will
13:10 say or what your bootcamp will say. Like actually getting some data on, I think is pretty neat.
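A minimal sketch of the kind of analysis described here: counting how often each skill shows up across job postings. The postings and the skill list below are made up for illustration, and this is not how the actual tool is built; a real pipeline would load scraped text instead.

```python
from collections import Counter

# Made-up postings for illustration only; a real pipeline would load scraped text.
postings = [
    "Data scientist: Python, SQL, and scikit-learn required.",
    "Data analyst with strong SQL and Excel skills.",
    "Senior data scientist, Python and R, cloud experience a plus.",
    "Analytics engineer: SQL, Python, dbt.",
]

# Hypothetical skill list. Matching whole tokens avoids counting "r" inside other words.
skills = ["python", "sql", "r", "excel"]


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def skill_share(postings, skills):
    """Return {skill: fraction of postings that mention it at least once}."""
    counts = Counter()
    for text in postings:
        tokens = tokenize(text)
        counts.update(s for s in skills if s in tokens)
    return {s: counts[s] / len(postings) for s in skills}


shares = skill_share(postings, skills)
for skill, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{skill:<6} {share:.0%}")
```

The percentages only mean something relative to the postings you feed in, which is why filtering by job title changes the ranking so much.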
13:14 That is super cool. And I'm not familiar with Luke, so we're going to dig him up and put
13:20 him in the show notes for later so people can check that out. Do you remember any of
13:23 the trends he's recently talked about?
13:24 It's datanerd.tech, I think is the website there. I look at it mostly for data analysts
13:30 because that's who I work with the most. So I know the data analyst data very well.
13:35 SQL is number one at 50%. I think Python is number two at like 30%.
13:40 I think Python might've jumped it.
13:43 Well, this is for all data positions right here. So the job title you can choose.
13:47 Ah, okay. So which one are you thinking I should pick here? Data?
13:51 Maybe data scientist.
13:52 Data scientist, yeah. Right.
13:53 That's what the data says.
13:54 Yeah, you're right.
13:55 Wow.
13:56 Whoa, Python 69%. Look at that.
13:59 That's huge. So like that's even, that's even what? 20% more than SQL, which a lot of people
14:04 are like, if you want to be a data scientist, you have to know SQL. If you look at the job
14:07 descriptions, Python's mentioned a lot more. So if you're going to learn, if you're brand new
14:12 and you're going to learn one, you might as well start with Python because that's probably the
14:16 most in-demand skill that there is right now for a data scientist.
14:18 Yeah. And it's pretty easy, right? It's not like, well, why don't you just learn C++
14:22 for embedded devices? You're like, you know what? Maybe I'll pick something else to start with.
14:26 Right. But you know, Python's pretty easy.
14:27 I agree with you. I think Python's great. I actually think, I think SQL is probably easier
14:32 to learn if I'm being honest, because really, especially for like data science stuff, there's
14:37 only about like 20 commands that you need to know in SQL, but it's once again, SQL is a lot more,
14:41 there's no websites built on SQL. I'll tell you that much. So it's a lot more limited on what it
14:46 can do.
14:46 It's a skill, but not the language. It's not enough on its own, generally. I mean,
14:52 you can do reports and quite a bit with it, but you know, it's like when you see these programming
14:57 popularity rankings, you're like, what's the most popular language? Oh, look, CSS is the third most
15:01 popular. Like that's not a language. That's a thing that you use with other languages, right?
15:05 Like use it with all the other languages. That's why it's high up, but that doesn't mean it's high
15:09 in demand. Exactly. It's just like table stakes, you know? So you kind of got to distinguish table
15:14 stakes from like picking an area, I think.
15:16 That's totally true. And really I think Pythonistas could make the argument that there's
15:22 really nothing in SQL that you couldn't do in Python. That's somewhat true, depending
15:28 on data size and stuff like that. But regardless, there are ways that you can do most of the SQL
15:33 commands in Python one way or another. It could be, when I first became a data scientist, I didn't
15:39 even know SQL and I was doing SQL commands or I was doing the aggregations or the where functions
15:44 or the window functions using Python. So you definitely can, as long as your data is not like
15:49 super big, then you'll totally be fine.
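A rough sketch of what "doing the SQL commands in Python" can look like for data that fits in memory. It uses only the standard library and made-up rows; the speaker does not say which tools he used, so treat the approach as an assumption. Each step is labeled with the SQL it roughly corresponds to.

```python
from collections import defaultdict
from itertools import accumulate

# Made-up rows standing in for a small table (for example, loaded from a CSV).
rows = [
    {"sensor": "a", "day": 1, "reading": 4.0},
    {"sensor": "a", "day": 2, "reading": 6.0},
    {"sensor": "b", "day": 1, "reading": 1.0},
    {"sensor": "b", "day": 2, "reading": 3.0},
    {"sensor": "a", "day": 3, "reading": 8.0},
]

# WHERE reading > 2
filtered = [r for r in rows if r["reading"] > 2]

# SELECT sensor, AVG(reading) ... GROUP BY sensor
groups = defaultdict(list)
for r in rows:
    groups[r["sensor"]].append(r["reading"])
averages = {sensor: sum(vals) / len(vals) for sensor, vals in groups.items()}

# SUM(reading) OVER (PARTITION BY sensor ORDER BY day): a running total per sensor
running = {}
for sensor in groups:
    ordered = sorted((r for r in rows if r["sensor"] == sensor), key=lambda r: r["day"])
    running[sensor] = list(accumulate(r["reading"] for r in ordered))

print(averages)
print(running)
```

This works because everything is held in memory at once, which is exactly the limit mentioned above: once the data no longer fits, a database does this work better.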
15:51 Right. Like some kind of generator or even slices or yeah, things like that, right? List
15:56 comprehensions, set comprehensions, all that kind of stuff. Kind of like, gosh, I really wish,
16:02 a little bit of a sidebar, but I wish like list comprehensions and all those things had just a
16:07 few more SQL features, right? Like in a list comprehension, I say, give me this thing,
16:12 maybe give me this property of this class modified, like give me the user's name,
16:16 uppercase, right? So that's like select. And then for thing in collection, that's like
16:22 from table or whatever. Right. And then you have the where clause with the if statement, but boy,
16:28 wouldn't it be cool to have like a sort also in there and other things like that, you know?
16:34 Oh, well, totally.
16:35 It's so close.
16:36 The cool thing is, is if you want that sort, it's what one extra line, like it's not too bad. So
16:41 it, Python, I mean, I don't want to say this necessarily to hate all, to make all the data
16:46 scientists and SQL lovers mad, but really Python can do a lot of the things that SQL can.
16:52 That's for sure.
16:52 Yeah, that's for sure.
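The comparison between a list comprehension and a SQL query can be sketched like this. The `User` class and its data are hypothetical; the point is only how the pieces line up with select, from, and where, and that the sort is one extra step with `sorted()`.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int
    active: bool


# Hypothetical data for illustration.
users = [User("ana", 40, True), User("bo", 30, True), User("cy", 25, False)]

# SELECT upper(name) FROM users WHERE active
names = [u.name.upper() for u in users if u.active]

# The same query with ORDER BY age: sort the collection first, then comprehend over it.
names_by_age = [u.name.upper() for u in sorted(users, key=lambda u: u.age) if u.active]

print(names)         # ['ANA', 'BO']
print(names_by_age)  # ['BO', 'ANA']
```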
16:53 This portion of Talk Python to Me is brought to you by Sentry. Code breaks. It's a fact of life.
16:59 With Sentry, you can fix it faster. As I've told you all before, we use Sentry on many of our apps
17:05 and APIs here at Talk Python. I recently used Sentry to help me track down one of the weirdest
17:10 bugs I've run into in a long time. Here's what happened. When signing up for our mailing list,
17:16 it would crash under non-common execution paths, like situations where someone was already
17:21 subscribed or entered an invalid email address or something like this. The bizarre part was that our
17:28 logging of that unusual condition itself was crashing. How is it possible for a log to crash?
17:35 It's basically a glorified print statement. Well, Sentry to the rescue.
17:39 I'm looking at the crash report right now, and I see way more information than you'd expect to find
17:44 in any log statement. And because it's production, debuggers are out of the question. I see the
17:50 traceback, of course, but also the browser version, client OS, server OS, server OS version,
17:56 whether it's production or QA, the email and name of the person signing up. That's the person who
18:01 actually experienced the crash. Dictionaries of data on the call stack and so much more.
18:05 What was the problem? I initialized the logger with the string info for the level rather than
18:11 the enumeration.info, which was an integer based enum. So the logging statement would crash
18:18 saying that I could not use less than or equal to between strings and ints. Crazy town. But with
18:25 Sentry, I captured it, fixed it, and I even helped the user who experienced that crash.
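A small sketch of how a bug like this can happen. `TinyLogger` and `Level` are hypothetical stand-ins, not the logging library used on the site. The logger compares its threshold to the message level, so a string threshold only blows up when a log call is actually reached, which is why it stayed hidden on the common path.

```python
from enum import IntEnum


class Level(IntEnum):
    DEBUG = 10
    INFO = 20
    ERROR = 40


class TinyLogger:
    """Hypothetical logger: keeps a message when its threshold <= the message level."""

    def __init__(self, level):
        self.level = level  # no validation, which is what lets the bug through
        self.lines = []

    def log(self, level, message):
        if self.level <= level:  # a str threshold compared with an int level raises TypeError
            self.lines.append(f"{level.name}: {message}")


good = TinyLogger(Level.INFO)
good.log(Level.ERROR, "already subscribed")

bad = TinyLogger("info")  # the string instead of the enum member
try:
    bad.log(Level.ERROR, "already subscribed")
    failure = None
except TypeError as exc:
    failure = str(exc)

print(good.lines)
print(failure)
```

Validating the level in the constructor would move the failure to startup, where it is much easier to notice.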
18:31 Don't fly blind. Fix code faster with Sentry. Create your Sentry account now at talkpython.fm/sentry.
18:37 And if you sign up with the code TALKPYTHON, all capital, no spaces, it's good for two free months
18:44 of Sentry's business plan, which will give you up to 20 times as many monthly events as well as
18:49 other features. Probably the biggest, it's a bit of a diversion, but the biggest similarity to that
18:57 I've seen in the languages is C#'s LINQ, where they actually have almost all the query operators,
19:04 including joins and stuff like that, built into the programming language. I'd love to see more of
19:08 that kind of inspiration into Python, but you know, that's all right. It's still really good.
19:12 Got a lot of cool SQL-like features, but you're right, once you are no longer working with data
19:17 in memory or you want indexes, right? Like this concept of indexes is not sufficiently well
19:23 understood, I think. Every time I hit a website that takes five seconds to load, I'm like,
19:27 somebody is not doing all the things they should be doing. I just know it.
19:31 That's totally true.
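To make the index point concrete, here is a small sketch using Python's built-in `sqlite3` module. The table and column names are made up. It asks SQLite for its query plan before and after adding an index, so you can see the lookup change from scanning every row to searching through the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT name FROM users WHERE email = ?"


def plan(sql):
    """Return SQLite's query plan as one string (the last column is the description)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("user500@example.com",)).fetchall()
    return " | ".join(row[-1] for row in rows)


before = plan(query)  # no index yet: SQLite has to scan the whole table
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query)   # now it can search through idx_users_email

print(before)
print(after)
```

The exact wording of the plan varies between SQLite versions, so the sketch prints it rather than promising specific text.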
19:33 What about SQL? You know, let's talk about that for a bit, right? The SQL, the query language
19:38 for databases and other things. There's ways to SQL query, not just relational databases.
19:44 But you said you got away with not quite learning that, but do you think if you could start over,
19:50 maybe making an effort to learn that would be really valuable? Like how important is this
19:54 a thing in the beginning of your career?
19:56 The interesting thing, you know, about landing a data job is your skills only play, I'd say,
20:03 a third of the role. Your portfolio or the way that you portray your skills and your network,
20:09 I think, are the other two thirds and are actually more important than your skills. And that's kind
20:12 of how I got away with not knowing SQL and not even being, to be honest, that good at Python
20:18 at the time was because I used my network to be in the situation to get my lab technician job in
20:24 the first place. And then once again, I use that same network, in this case, my coworkers to land
20:29 that first data scientist position after we couldn't hire anyone. And if I would have been
20:34 applying externally for that role, chances are I wouldn't have gotten that role. I probably didn't
20:39 know enough at the time to land that type of a role, but because they knew I was hardworking,
20:43 they knew I wasn't like a total idiot and I really liked to learn, they took that chance on me.
20:49 It paid off really well for them because at the time I was still in college. And so I wasn't
20:53 getting paid that much and I was not getting paid like a data scientist, but I was getting results
20:58 like a data scientist for them. So I think it paid off for both of us. But I think if that was
21:03 an external job and I applied for it, I probably didn't have enough skills for it. So I definitely
21:07 think learning SQL, if you want to land a data science job, isn't a bad place to start, especially
21:12 because like I said, I mean, any programming language, I like to think of like the iceberg,
21:17 kind of like the Titanic, right? There's the parts that you see and then there's the parts that you
21:22 don't even know that are there. And really you could spend the rest of your life trying to master
21:27 SQL or the rest of your life trying to learn Python. But the cool thing is, is a lot of the
21:31 time you only need that top little bit that's sitting at the top of the surface of the water
21:36 to actually get stuff done. And so for SQL, I think that's like 20 commands. And I think you
21:41 could learn it honestly in like a month. You could learn those 20 commands pretty easily.
21:47 But it worked out for me and I didn't have to use it that much at the time until I was probably about
21:51 almost three years into my job. And I actually had switched jobs to a bigger company. The other
21:56 thing is that I was working for a smaller company where we didn't have a ton of data. So we could
22:00 use CSVs kind of as our database, which is not great practice. But when I eventually became a
22:05 data scientist at ExxonMobil, I was going to say they didn't use Excel as a database, but they
22:10 still did. But the point is they had much larger SQL databases with hundreds of thousands, actually
22:15 millions of rows of data that I had to query. Yeah. Then you got to be really, you need to
22:20 understand it at a much deeper level. You're like, if you do a query like this, it's going to be
22:24 super slow. But if you do it like that, it can use the composite index for the sort and then
22:28 all right, then you're getting to the bottom of the iceberg in SQL, or maybe not the bottom,
22:33 maybe like the middle chunk under the water, but there's so much to learn for both of them.
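As one illustration of the "composite index for the sort" idea, this sketch uses `sqlite3` with a made-up `readings` table. The query filters on one column and orders by another. Without an index SQLite needs a separate sorting step; with a composite index on both columns, in that order, the rows come out of the index already sorted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, site TEXT, taken_at TEXT, value REAL)"
)

query = "SELECT value FROM readings WHERE site = ? ORDER BY taken_at"


def plan(sql):
    """Return SQLite's query plan as one string (the last column is the description)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("plant-1",)).fetchall()
    return " | ".join(row[-1] for row in rows)


no_index = plan(query)  # full scan plus a temporary b-tree to do the ORDER BY
conn.execute("CREATE INDEX idx_site_time ON readings (site, taken_at)")
composite = plan(query)  # filter and ordering both served by idx_site_time

print(no_index)
print(composite)
```

Column order in the index matters here: the filter column comes first and the sort column second. Other databases have their own planners and plan formats, so this is specific to SQLite.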
22:37 Amira, in the audience, asks, when you talk data job, what kind of jobs are out there? So we talked
22:43 both about how we did chemical engineering, and then we saw chemical factories. You're like,
22:47 yeah, I don't really want to work here anymore. I'm out. So thinking about, well, what are the
22:52 kind of jobs you do? I think that's really important because it's easy to get focused in
22:58 on the FAANG companies. I want to work for some super big tech company. I want to move to San
23:04 Francisco and like that, that, that, right. Like there's not just plenty of other jobs,
23:09 but the opportunities, just like you described, and as well as my first job, I worked at a company
23:13 that had like eight people and it was awesome. Right. They didn't expect me to be, you know,
23:19 running Kubernetes clusters and doing all sorts of crazy. They're just like, I need you to make
23:23 this thing happen. Can you do like, I'm pretty new, but that thing I can make that happen.
23:27 Like, let's go. Right. And I feel like the, the possibilities to get in, especially with these,
23:32 maybe more niche type of industries and companies might even be easier for a first job.
23:38 People seem to be really obsessed with, with the FAANG. And I don't know if that's like a
23:42 societal thing or if it's just, those are the companies that we use a lot. And so we're excited
23:46 about them, but yeah, there's so many more data jobs outside of FAANG than there are inside of
23:51 FAANG, even though there's, there's quite a bit inside of FAANG. And oftentimes those roles can be
23:57 much more interesting and you can have a lot bigger of an impact. When I was working at this
24:02 small company, VaporSense, I like, I had so much power. I didn't even realize it. I had such a big
24:08 effect on the company. I was presenting to, you know, fortune 500 companies and what I did really
24:13 made a difference. And when it came to the point where ExxonMobil offered me to go be a data
24:19 scientist for Exxon, I said, Oh, I want to go work for the big company with the nice desk and the
24:24 nice laptop and, you know, try something new. And when I got there, I really, I had some pretty cool
24:30 opportunities when I was at ExxonMobil, but ultimately I left pretty shortly after two years
24:34 of being there because I just felt like a cog in the machine. And I didn't feel like I was actually
24:39 making a difference. And that was really important to my work satisfaction of like, is what I'm doing
24:43 being used? Is it being used to better the world? Do I feel valued? And the answer was kind of no
24:48 for me when I was there. So there's definitely a trade-off between the small companies and the,
24:51 and the big companies, but also to go back to your original question, there are so many freaking
24:56 roles in the data world that you're not even thinking of that. Like, I'm not even thinking of,
25:01 I saw a new one the other day when I was helping one of my students, it was like, it wasn't data
25:05 janitor, but it was something like that where I was like, I don't even know what that role is,
25:09 but there's so many roles. When I was a data scientist at
25:12 VaporSense, the small company, my actual title was junior chemometrician, which basically means
25:19 you're doing data science with chemistry. When I was at ExxonMobil, when I was first there,
25:24 I was doing data science, but my actual title was optimization engineer. And so there's so
25:28 many titles that we don't even think to search of, or even to look up, but those are all data
25:33 science roles. I was doing machine learning every day in both those roles. And you would maybe never
25:37 guess from those titles. - Yeah. You would never guess. No, that's awesome. What machine learning libraries, frameworks were you using?
25:43 - At VaporSense, once again, because it's a smaller company, I had a lot more say in what
25:48 I was doing. We were building a bunch of machine, we were building classification models to basically
25:53 to take the data from our sensors and sniff if something was in the air. Sometimes that was a
25:57 yes, no, like, oh yes, there is ammonia in the semiconductor factory and that's bad. So that's
26:03 a yes classification kind of binary, right? Other times it was, what drug is this? Is this meth or
26:10 is this heroin? One of the use cases we had was, this is binary once again, but is this recreational
26:16 marijuana or medicinal marijuana? And can we tell the difference between those? So we were usually
26:21 using classification models, usually built in scikit-learn in Python the majority of the time
26:27 there. When I was at Exxon, we had a lot less say, like the data scientists had a lot less say
26:32 in the decision making process. We were doing a lot of multivariate linear regression with a lot of
26:38 crazy hacks and transformations kind of in the meantime for one of my positions there.
26:43 And then the other time, the other position I did there, we were doing a lot of auto ML
26:48 using PyCaret and letting it kind of decide what type of models to do. So.
26:52 - Okay. The unsupervised learning type stuff, huh?
26:54 - It was awesome. It was really fun to, I love PyCaret because it's like, okay,
26:58 go make 25 models and tell me which one's the best. It's like, makes my job easy, I guess.
27:03 - We're going to be creative with sheer numbers. That's how we're going to come up with a solution.
27:09 Got it.
27:09 - Exactly.
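The "make a bunch of models and tell me which one's the best" idea can be sketched in plain Python. This is not the API of any AutoML library, and the sensor readings are made up. It trains a few k-nearest-neighbor classifiers on a yes/no problem and keeps the one that scores best on held-out data.

```python
# Made-up two-feature sensor readings; label 1 = compound present, 0 = absent.
train = [((1.0, 1.2), 0), ((0.8, 1.0), 0), ((1.2, 0.9), 0), ((1.1, 1.1), 0),
         ((3.0, 3.2), 1), ((3.2, 2.9), 1), ((2.8, 3.1), 1), ((3.1, 3.0), 1)]
holdout = [((0.9, 1.1), 0), ((1.3, 1.0), 0), ((2.9, 3.0), 1), ((3.3, 3.1), 1)]


def knn_predict(k, point):
    """Classify a point by majority vote among its k nearest training rows."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0  # a tie counts as "absent"


def accuracy(k):
    """Fraction of held-out points the k-NN model labels correctly."""
    hits = sum(knn_predict(k, x) == y for x, y in holdout)
    return hits / len(holdout)


# Candidate models differ only in k; a real tool would try many model families.
candidates = {f"knn(k={k})": accuracy(k) for k in (1, 3, 5, 8)}
best = max(candidates, key=candidates.get)

for name, score in candidates.items():
    print(f"{name}: {score:.0%}")
print("best:", best)
```

With k=8 every training row votes, the vote always ties, and the model predicts "absent" for everything, so it scores only half. That is the kind of weak candidate an automated comparison weeds out.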
27:10 - Well, Diego is asking like, what are some of the common stats methods as in mathematical type stuff you would use? So one of the things I know that some people
27:19 getting into programming think is you got to be really good at math to be a programmer. I think
27:24 you got to be really good at logical thinking, but you need almost zero math to be, like, a web
27:30 developer. You know, we're talking percents for CSS, incrementing numbers from one to two to two
27:37 to three for IDs and stuff like that. But for data science, maybe there's a little bit more like,
27:42 where do you see that kind of background?
27:45 - I like what you said. You have to think logically, but maybe the math isn't as important.
27:49 And I think it's actually somewhat similar in data science. I will say you probably need a
27:53 little bit more math than a web developer, but I think it's a lot less than most people think.
27:57 And it's probably less about being able to do the math and maybe more about understanding
28:02 the mathematical concepts. And what I mean by that is a lot of, a lot of, so I also have a
28:08 master's degree in data analytics. A lot of master's degrees in data science and data analytics
28:13 will say you need calculus and linear algebra as kind of a background for your math. And that kind
28:18 of stops people. I don't want to do any calculus. I don't want to do any linear algebra. And while
28:22 both those concepts do exist in data science principles, the majority of the time the computer,
28:28 Python is doing the math. You just have to be able to interpret the results of the math and kind of
28:34 know what different directions, like this is going down on an optimization problem. Okay. That's the
28:39 derivative, getting closer to zero. It's really less about knowing how to do the math by hand
28:44 and more just understanding what the math of the computer is actually doing.
28:48 So I think it's actually a lot easier than most people say. That being said,
28:51 knowing how to do a derivative or take an integral, those concepts, I think is probably underlying
28:56 pretty important. But other than that, a lot of the times I'm doing linear regression because
29:02 it's awesome. It gets the job done. A lot of the time I'm doing hypothesis testing and statistics,
29:08 which you have to look like at a P score, nothing all that crazy. At Exxon, I had to do a lot of
29:13 linear programming, but that's honestly, that's like the exception versus the rule. There's not
29:18 a whole lot of linear programming for most data scientists. So I really don't think the math is
29:23 all that hard. Now, of course, that's coming from someone who got a chemical engineering degree,
29:29 who had to take all the calculus, all the linear algebra. So I did go through those courses. I
29:33 haven't really done it from scratch. A lot of my students are teachers, for example, who never
29:38 took those courses in college. So I can't speak from that perspective, but a lot of my students
29:42 are able to figure it out at the end of the day and transfer. So it happens.
29:45 Yeah, yeah, for sure. I think you make a good point. I think it's about knowing, okay, this
29:51 formula or this algorithm or this test means this thing. It applies in this situation. It doesn't
29:57 apply in that situation. Here's what you're trying to get from it, right? Like I know I need to do a
30:01 fast Fourier transform. So, and this is what it tells me when I get out the other side, but do I
30:07 need to be able to sit down and recreate the integral and the calculus behind it and do that
30:15 on like a home, like as a homework example, and like, give me a function and I'll do the Fourier
30:19 transform and I'll actually do the symbolic integration. Like, no, you probably don't need
30:23 that. Right. But you need to know, I do the Fourier transform in this situation, and this is
30:28 why. And then I just say, call the function, do it. Right.
30:31 And interpret the results. Really? That's what being a data scientist is all about is, yeah.
30:36 What does the business use case, what's the desired business use case? How do I relate that
30:41 use case to the data? What technique can I use to get the outcome that I need? Computer go do it,
30:48 interpret results, present to stakeholders. That's a data scientist right there.
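[Editor's sketch] The "call the function, then interpret the results" idea from the Fourier transform discussion might look like the following with NumPy. The signal here is made up for illustration (a 5 Hz tone plus a weaker 40 Hz tone sampled at an assumed 200 samples per second); none of these numbers come from the episode.

```python
# Sketch: let NumPy do the Fourier transform, then read off the answer.
import numpy as np

sample_rate = 200  # samples per second (assumed for this example)
t = np.arange(sample_rate) / sample_rate  # one second of sample times

# Hypothetical signal: a 5 Hz tone plus a weaker 40 Hz tone.
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

# "Call the function": real-input FFT and the frequency of each output bin.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
magnitudes = np.abs(spectrum)

# "Interpret the results": which two frequencies carry the most energy?
strongest = np.argsort(magnitudes)[-2:]
top_two = sorted(round(f, 6) for f in freqs[strongest].tolist())
print("dominant frequencies (Hz):", top_two)
```

No integral is worked by hand. What you need to know is that a peak in `magnitudes` means "this frequency is present in the signal", and that this reading only makes sense for evenly sampled data.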
30:52 I think one of the challenges with that is going to be, not that it's not good, but I think it's
30:57 going to be challenging because how do you learn when to use a certain statistical test or when to
31:04 do some kind of funky transformation, like a Fourier transform without more traditional
31:10 mathematical backgrounds and all the academics will not just go, oh, we're just going to give
31:14 you like five minute overview and they'll help you understand. They're like, nope, we're going
31:18 to start with this axiom or this theorem from differential equations. I'm going to work up.
31:23 You're like, no, no, no, no, no. I don't need that. I don't, I'm not on a four year plan.
31:28 I'm on a four week plan. How do I, how do I get value from a couple of the mathematical things
31:33 without being sucked into like, yeah, now I'm in differential equations at Harvard online and I
31:38 don't understand how I got there. It's such a big problem. And I'm so glad you brought this up.
31:43 And I'll be vulnerable because yeah, I felt the same, the same way. And I was like, there has to
31:47 be a better way. And so about, what was it? Three years ago now, two and a half years ago, three
31:51 years ago, I said, oh my gosh, I'm going to solve this problem. And I'm going to start my own data
31:55 science bootcamp. And so I spent about six months making the curriculum, making all the videos.
31:59 I opened it up. I got some students in there and I ran it for about six months and I looked at the
32:04 results and man, we weren't getting anyone into data science jobs. And I thought, ah, what the
32:09 heck am I doing wrong? I had this brilliant idea of like, we're going to be less theory, more
32:13 project, more hands-on. And I realized, man, the truth is people just learn better at work.
32:20 That's where you learn that whole technique that you just like, how does someone learn that? The
32:24 answer is by getting experience and learning it at work. And when I looked back and I said, okay,
32:28 well, we have had students get jobs. What jobs did they get? And it turns out most of them were
32:33 getting like business intelligence engineer jobs or data analysts or financial analyst jobs that
32:39 were a little bit below a data scientist job. And I realized, oh man, if we can just help people go
32:45 from zero to one and get their foot in the door, they can go from one to five much quicker at work
32:51 because work is just, I don't know, it's this magical place where like you said, whatever you
32:56 were working at earlier, and they're like, hey, can you do this Kubernetes thing? They just kind
33:00 of throw you in the fire and you're like, figure it out. And that's somehow you do. I don't know
33:04 what it is about work, but you figure it out and that's where you learn. So that's kind of why I
33:08 changed my curriculum to be more focused on, okay, maybe people aren't going to become data
33:12 scientists, but can we get them to zero to one quickly? And then they can get paid to learn the
33:17 rest of the data science stuff when they're actually in that first position. - How much do
33:20 you know about what you actually want to do in the industry before you've done it as well?
33:25 - Right. - You're like, oh, I thought everybody said machine learning was awesome and I've used ChatGPT and I loved it, but it turns
33:33 out actually like API is better, but I've never had a chance to build an API. So until I started,
33:38 I didn't even learn that one, it was a thing, two, that it was cool or vice versa, right? Whatever.
33:43 But until you get kind of in, you don't even know like, actually this part is where I really
33:48 am enjoying it. And so just getting that first step, that's a big deal. - A hundred percent. You
33:53 don't know what you don't know until you know it. That's why, I mean, really when it comes to,
33:58 if we go back to just SQL or just Python, you could spend, I tell people this, if you tried
34:04 to master Python before you applied to a job, you'd be like 80 years old before you ever applied
34:09 to a job. Same with SQL, same with machine learning. The cool thing about data is we're
34:13 never going to know it all. And so just learn the bare minimum to get your foot in the door.
34:18 And then you have this place where you're going to get paid to learn what you want to learn
34:22 eventually. If you learn, oh, I love APIs. I promise you that there's a company out there
34:26 that will hire you and you can learn APIs on the job. Like that's going to happen. But that first
34:31 step is so true. - There's a company out there that it doesn't know it needs APIs, but you could
34:35 help them. And you know, they don't have huge expectations because this is the thing they just
34:39 learned they needed. Right? - A hundred percent. - Yeah. It's wild. Right? This portion of Talk
34:45 Python to Me is brought to you by Posit, the makers of Shiny, formerly RStudio, and especially
34:51 Shiny for Python. Let me ask you a question. Are you building awesome things? Of course you are.
34:56 You're a developer or data scientist. That's what we do. And you should check out Posit Connect.
35:01 Posit Connect is a way for you to publish, share, and deploy all the data products that you're
35:06 building using Python. People ask me the same question all the time. Michael, I have some cool
35:12 data science project or notebook that I built. How do I share it with my users, stakeholders,
35:17 teammates? Do I need to learn FastAPI or Flask or maybe Vue or ReactJS? Hold on now. Those are cool
35:24 technologies and I'm sure you'd benefit from them, but maybe stay focused on the data project.
35:29 Let Posit Connect handle that side of things. With Posit Connect, you can rapidly and securely
35:33 deploy the things you build in Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto,
35:40 Reports, Dashboards, and APIs. Posit Connect supports all of them. And Posit Connect comes with all the
35:47 bells and whistles to satisfy IT and other enterprise requirements. Make deployment the
35:52 easiest step in your workflow with Posit Connect. For a limited time, you can try Posit Connect for
35:57 free for three months by going to talkpython.fm/posit. That's talkpython.fm/posit. The
36:04 link is in your podcast player show notes. Thank you to the team at Posit for supporting Talk Python.
36:09 All right. Let's talk about some career advice. I mean, I know you talked about being connected on
36:18 LinkedIn pretty well and certainly having some kind of social network support. And maybe it's
36:24 not that you would call it not social, but a real world network of actual human beings that you
36:29 physically know somehow. What's that? I don't know what that is.
36:32 I know. We gave that up back in 2020, I thought. Anyway, there was some stat that I saw somewhere
36:38 that over half of the jobs are fulfilled before it even becomes a job posting. Maybe some of the
36:45 best ones is like, "Hey, who knows somebody who can do this?" We need some like your data scientist,
36:50 data scientist example. They quit. We need somebody. Anybody know a good data scientist?
36:55 I don't want to just go put it out on the open job market and have to have a hundred interviews and
36:59 who knows what I'm going to get. If you can recommend somebody, let's start there. Right?
37:04 So being in that group to be recommended, it's important.
37:07 It's the key. There was a really interesting survey done on LinkedIn and they said,
37:12 it was done by the same person, Jordan Nelson, by the way. He said, "How do you approach getting a job?" And then the next day he said, "How did you get your last job?"
37:21 And 80% of people, they use what I call the spray and pray method, which basically means you go and
37:27 you apply to as many jobs as you possibly can and hope for the best, cross your fingers. That was 80%
37:32 of what people were doing. And then on the next poll the next day, it was a total, I think of what,
37:36 70% were either headhunted, recruited or referred. Yeah. Yeah. Yeah.
37:41 And so it's like the Pareto principle here where 80% of the effort is only getting you 20% of the
37:47 results and really 20% of the effort gets 80% of the results. So it's okay. We know networking and
37:52 getting recruited is really important, but how do we do it? It's easier said than done. And like
37:56 you said- In the industry, how do I make friends who are, right? It's like, well, my neighbors
38:01 don't do it, so I guess I'm out. That's the tricky thing is, yeah, if you're not in the industry yet,
38:05 how do you get recruited into it or how do you know someone? And what I've come to learn is it
38:10 actually doesn't even matter. So like, for instance, let's take your neighbor, right?
38:14 Your neighbor is probably not a data scientist. Maybe you're lucky and they are, and they can
38:18 refer you to a company. But what's really cool is I've learned that companies really come to trust
38:23 their employees and their employees' recommendations. And so even if your neighbor, let's
38:28 say, is a web developer or maybe even less technical, let's just say your neighbor is in
38:32 finance, right? And if there's an opening, like a data science opening at that company, a lot of
38:38 the times they will actually take their employee referrals much more seriously than any sort of
38:43 cold application that they get. And so a lot of the times I've had students who just know someone
38:48 that works at the company. They saw a job opening pop up. They quickly message their
38:52 friend, "Hey, do you know a recruiter or a hiring manager I could talk to more about this role? Could
38:57 you do an internal referral for me?" And they were able to land jobs that they probably wouldn't
39:01 have. No, they definitely wouldn't have without that internal referral. So it is tricky. It's
39:06 that old cliche. It's not what you know, it's who you know.
39:08 I think there's still plenty of ways, COVID notwithstanding. I think that these days there's
39:13 plenty of ways to get those connections, right? But maybe people don't know. Like
39:17 meetup.com is really good. If you live in a non-tiny city, there's many, many things going on
39:24 around data science, around Python, around other data engineering, whatever, right? You could go
39:30 to those things. They're typically even free. Often they are free with food. They even feed
39:34 you, right? And make connections or regional conferences or national conferences, right?
39:40 We probably, many people have heard of PyCon, right? There's US PyCon, there's EuroPython,
39:46 and then there's, but those are the ones that are often talked about, but there's 10, 20 little
39:52 smaller regional ones in the US and many more that I'm not aware of throughout the world. Probably
39:56 one of those within driving distance, right? That you could go to, make connections and just also
40:01 kind of take the temperature of actually what you see on the internet versus what you see
40:06 and actually talking to real people. So I'd also say just get out there.
40:10 A hundred percent. Those places have the people who probably want to hire you because they're
40:16 local, right? Which is one thing that's trouble on LinkedIn. I'm big on networking on LinkedIn,
40:21 but a lot of the times you're going to be networking with people who in all likelihood
40:25 might never have a role that's even open to you. But the people that you're like, for instance,
40:29 we have, I'm in Utah and we have Silicon Slopes that has a tech meetup. We have a local Python
40:36 meetup chapter. We have the big data and developers conference that's free every year
40:41 with tons of food. And the people who go there are people from companies around there that have
40:46 the openings that you're trying to find and they want to hire people like you who are in the area.
40:51 So at least you can maybe come to the office once a week or maybe once a month or whatever, right?
40:55 And so really, like you said, going to those meetups, it's tough because networking is always
41:00 difficult either online or in person, but at least in those situations, you know, Hey, these are
41:04 people that are tied to real companies that exist around me that do make data hires. So I have a
41:10 chance. Definitely a much higher chance than just shooting out a resume. All right, well, let's see.
41:14 We talked about job hunting already. What about like applications and resumes? What are your
41:20 thoughts on that? I think once again, with the applications, the more targeted that you can make
41:25 it, the better, right? So if you can really hone in on, I really want this job, I'm going to cold
41:31 message five people at this company and see if I can get that internal referral one way or another,
41:36 make a real connection with them. I think that's really key. And then with resumes, resumes are
41:41 more of an art than they are a science. I feel like they are so difficult to figure out. And
41:46 these ATSs that are trying to match you and see if you're a good fit. I've tried a lot of them and a
41:51 lot of them suck. Whoever's the data scientist behind those, we need to have a conversation
41:55 with them because it's a little tricky sometimes. But one of the coolest concepts I've been
41:59 introduced to recently, and I have a whole episode on my podcast about it, is A/B testing your resume.
42:06 And basically the idea is a resume's job is just to get you a screener interview or like a beginner
42:14 interview, basically, right? That's all an interview. No one's seeing a resume and then
42:17 hiring you. They're always going to interviews. So if you think about it, a resume's job,
42:21 the only job it has is to convince someone to get on the phone and talk to you. And it's just a
42:26 piece of paper. And guess what? You can put whatever you want on that piece of paper.
42:30 Now, I'm not saying to lie, but I'm just saying you could theoretically make a perfect resume for
42:36 whatever job you're trying to go for and send it out there and see what happens, right? But I'm not
42:40 saying to do that. I'm not saying to lie. My point in saying this is that the resume is just to get
42:45 you the interview. And if you're not getting interviews, something's probably wrong with your
42:49 resume. And so tweak something, apply to 10 more jobs, see what happens. Tweak something, apply to
42:55 10 more jobs, see what happens. Until you finally have the right combination of skills, of experiences,
43:00 of different keywords. Because a lot of the time you're just trying to beat the ATS. And that's the
43:05 sad part about it. It's like, how do I prove to this random computer algorithm that they should
43:10 talk to me on the phone? That's a hard game to beat. And there's a whole bunch of advice from
43:14 all these different people. What I've come to learn is it's different for every company. It's
43:18 different for every person. It's kind of a numbers game until you get lucky and you figure it out.
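[Editor's sketch] One way to check whether a resume tweak actually changed the interview rate is a two-proportion z-test, which ties back to the hypothesis testing mentioned earlier in the episode. This is a minimal sketch using only the Python standard library. The function name `two_proportion_z_test` and the application counts are hypothetical, and the normal approximation it relies on is rough for counts this small.

```python
# Sketch: compare callback rates for two resume versions.
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, total_a, successes_b, total_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = successes_a / total_a
    rate_b = successes_b / total_b
    # Pooled rate under the null hypothesis that both versions perform the same.
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: version A got 3 callbacks from 50 applications,
# version B got 11 callbacks from 50 applications.
z, p = two_proportion_z_test(3, 50, 11, 50)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With batches of only 10 applications, a test like this will rarely show a clear difference, so in practice the tweak-and-resend loop is closer to informed trial and error than to a rigorous experiment.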
43:22 That's good advice. I guess two thoughts. One is, I know that, not speaking specifically to anyone,
43:28 but in general, women wait until they match all the requirements of a position where a guy's like,
43:34 I know three of those things. I'm taking a flyer. I'm sending it. I would just like to
43:39 encourage the women out there to just send it as well.
43:43 I a hundred percent agree with that. And I think if you reach 60% of the requirements,
43:48 I think you have a chance. A lot of the times those are wishlists and not actual requirements.
43:53 And depending on, are you local to the area? Do you have a domain experience in this company?
43:59 There's lots of other factors. What about contributing to open source or having GitHub repos that can be projects that you can show off? Or what's your advice there?
44:08 I'm a huge proponent of projects in the portfolio. I think if you don't have experience with
44:13 something, you create your own by building a project. And if you can do that with open source,
44:18 I think you should totally do that because I've benefited so much from open source.
44:22 I have not given back as much as I should to open source development and projects. I definitely
44:29 should do that. But if you can find a project that you're passionate about that you can help with,
44:33 I think you should totally do that. Even if it's not open source and you're just building a project
44:37 to showcase your skills, I'm all about that. I think you can do projects that are super fun,
44:42 maybe that are good for your community or good for your life. I'm a huge fan of personal projects.
44:47 I've put a Fitbit on my dog before and looked at her steps. I've found the healthiest meal
44:52 at McDonald's. I've looked at, visualized my weight over time and tried to create different
44:58 forecasting models and stuff like that. There's so much data in our lives that you can use to
45:02 make really cool projects. Oh, absolutely. You talked about, okay, you get your first job and that's where you kind of really learn. But if you don't have
45:09 your first job, you can effectively simulate that. Say I would have gone to a job and been given a
45:15 project to analyze something. I'm just interested in this thing. I've got two hours a day until I
45:20 get a job that I can be inspired about this and just get going on it. Maybe create a website and
45:24 publish your results and it can draw more people in to actually see that and start to appreciate.
45:31 They could even ask, all right, who's behind this cool project? Maybe I want them to come
45:35 work for me. Little did they know you're doing all this work because you got some spare time
45:40 and you're trying to build up your experience and a self-guided study, right?
45:43 Yeah. If you can build a cool project and flip the job hunt where you're not applying for jobs,
45:48 but jobs start to apply for you, you're in such a good position and doing really cool projects
45:53 can help you get there. Now it's hard to do cool projects. It's hard to publish projects,
45:58 which is one of the things that people really struggle with. For all you Python listeners out
46:02 there, let me just tell you, Streamlit is absolutely amazing because it makes the deployment
46:08 process so easy. It's free. It's a little tricky to deploy at first, but compared to what you used
46:15 to have to do back in the day, I'm saying back in the day, like four years ago, basically,
46:21 but it was really hard to deploy something where you could send someone a URL, "Hey,
46:25 check out my web application, machine learning application." Streamlit is such a cool app that
46:29 makes it so easy and so intuitive to make these cool little apps that you can just put on your
46:35 resume, put on your portfolio, send to recruiters. I'm such a fan of the Streamlit app. I love it.
46:40 Yeah. It's super cool. There's a couple of those and Streamlit's definitely one of the really nice
46:44 ones there. They host it. There's also, there's like some hosting behind Streamlit as well these
46:50 days, right? Like you don't even have to set up a server or anything. You just create it and put it
46:54 up there. That's what I'm saying is like back in the day, I use Dash a lot and I'm still a big fan
46:58 of Dash. Dash is more customizable than Streamlit and can do quite a bit more, but it's a lot more
47:05 work to deploy it. It's more like programming. Yeah, it is. It is more, a lot more programming.
47:10 Programming the UI rather than just the behind the scenes. Yeah.
47:12 It's, you have to do, yeah, you have to do both and you have to like know a little bit about like
47:16 systems and data engineering and stuff like that versus Streamlit kind of takes that,
47:20 abstracts that away. But yeah, back in the day, I used to make Dash web applications and deploy
47:24 them on Heroku back when they had a free tier of hosting and they've taken that away. So I don't
47:29 even know what the go-to free hosting platform is nowadays. I just, I moved most of my things
47:34 to Streamlit and it's so nice. Yeah. We got Shiny for Python now, which is also nice.
47:38 I haven't checked that out. How is it? I haven't done too much with it either,
47:41 but they're, Joe and the team over there are doing pretty cool stuff, like adding more like
47:46 dynamic interactive stuff to Jupyter, like running it inside Jupyter and things. Yeah. Pretty cool.
47:51 I'll have to check it out. I think they also do a bunch of hosting stuff
47:54 over there as well, is why it came to mind. What other advice do you got for folks out there? So
47:58 AI, is AI, not studying AI or learning to use AI, machine learning, but is there a benefit of trying
48:05 to, you know, like use ChatGPT to help you get this job? Or is there a danger? Like I'm thinking,
48:10 for example, like have a ChatGPT, write me a awesome resume. And then the tools are like,
48:16 well, we've detected this as AI generated and it's out. You know what I mean? Like,
48:20 what do you see happening there? A lot of people see AI as like an all or nothing tool as in it's,
48:27 it's either you, the human doing the work or it's the AI doing the work. But whenever,
48:32 I don't know about you, but whenever I'm using ChatGPT for anything, it's very rare that
48:36 it's copy and paste for me, or at least not iterative, where I'm doing multiple prompts,
48:41 prompt after prompt, after prompt, trying to tweak it exactly what I want. And so the way I look at
48:46 ChatGPT and other gen AI that will be coming out, that's only inevitable is instead of looking at,
48:51 does this replace me? Does this, like, for instance, am I going to build my whole resume
48:56 using ChatGPT? Am I going to build, can ChatGPT build, you know, take a data scientist's job and
49:01 build the whole model for them? I like to see it more as like a hammer. It's like a tool for the
49:06 data scientist or a tool for the job searcher to use in conjunction with your screwdriver or
49:13 it's like something to be wielded by human, not replaced for the human. That makes sense.
49:18 You know, it's really good for stuff like, Hey, here's, I know a regular expression will do this.
49:22 Yeah. The last time I studied, I completely forgot what this is about and I know it's gnarly,
49:27 but if I just ask, here's an example, here's what I want. Boom. And traditionally what you would end
49:32 up doing is you'd be on stack overflow. Yeah. Be all over the internet. You'd be trying to piece
49:36 it together from external information anyway. And so code is something that's a little bit
49:42 more in the wheelhouse of the generative AI because it can't really make it up as much.
49:47 I know it could like do something insecure and you didn't know it was or whatever, but
49:51 it's not like asking for legal advice where it makes up cases that didn't exist. Like
49:56 it gives you code, you put it in the runtime or the compiler and it runs or it doesn't,
50:00 the output comes like you did. Yeah. It works or not. Yeah. So it's, it's pretty,
50:04 pretty effective for that. But yeah, for resumes, I would be more like, let me ask it,
50:09 what are the in demand things? And if I know these three skills, what other skills should I know to
50:15 get a, you could sort of use it in an explorative way to then come up with what you might write for
50:20 yourself. Right. Something like this. I find it really useful for brainstorming, like action
50:26 verbs on your resume bullets. Like, I think it's really good at that. What's 10 different ways to
50:30 say led. So I don't say led five times on my resume and I use some different action bullets.
50:35 I think it's great at that. I personally, it's pretty rare that I start any Python code from
50:41 scratch nowadays. I'm either starting hopefully from a template that I've already written,
50:46 or I'm starting from a ChatGPT. Like this is what I kind of want to accomplish,
50:50 right? Like the outline for it. Like one of the things I hate doing is I make a lot of
50:55 Streamlit apps. I probably make a Streamlit app a month right now. And I hate starting from scratch
50:59 with Streamlit. It's super easy to start from scratch, but I'll say, Hey, ChatGPT, I want to
51:03 build a Streamlit app. This is like the component I want here. This is the component I want here.
51:06 This is the component I want here. And it's almost like a warmup for me as a programmer
51:11 and it will create, it'll create something that works. It's not what I want. And I spend the
51:16 next five hours trying to make it what I want, you know, without ChatGPT, but it kind of gives me a
51:21 warm start to my programming process. So I really like it. I think it's something that everyone
51:27 should use. And I think if you're thinking about getting into any sort of programming, you know,
51:31 whether it's data science or, or web development, I think you should be a little bit less worried
51:37 about it taking your job and job security. I think you should almost be more excited that,
51:42 wow, the bar has never been lower to break into tech. Like this is a step up gift from the
51:48 programming gods that I get to use to break into tech.
51:51 Another thing to keep in mind is I imagine a lot of people listening to this podcast are not just
51:56 starting a college program, right? They're coming from possibly other experiences, other specialties.
52:03 You know, what's really good for job security, knowing the intersection of two things,
52:07 the intersection of chemistry and programming, the intersection of geology and programming
52:13 for Exxon potentially, right? Like those things take you from a pool of a thousand to a pool of
52:19 tens. Right. And so what's awesome about that is it means two things. You don't throw away.
52:24 If you got a degree in something else like biology or whatever, you don't throw away like,
52:28 well, that was wasted four years. That's out. And it slices the pool of people who could apply
52:33 for certain jobs way, way smaller. Right. Sounds like you agree.
52:36 Oh, a thousand percent. I'll just tell a quick little anecdote. When I was at ExxonMobil,
52:41 there was a lot of not things, things I did not like at ExxonMobil, but this is something I really
52:44 liked is about once a quarter, they would do a crowdsourced data science competition for the
52:51 whole organization, like around the entire world. And they would say, this is a business problem
52:54 we're trying to solve. And at Exxon, we have data scientists all over the world and like all
52:59 sorts of different teams and things like that. So I did not know all the data scientists at Exxon.
53:04 And they'd say, this is the problem we're facing. Here's the data, go. Right.
53:08 Nice. Yeah.
53:09 I loved participating in these. It was like right up my wheelhouse of like,
53:13 I really enjoy exploration and all this stuff. At the time I was getting my master's degree,
53:17 but I didn't have my master's degree and I was competing against, so I'm just a chemical
53:21 engineering grad, right? And I'm competing against people with PhDs in computer science
53:26 and in data science and all these people who have way more experience than me.
53:30 And I actually won a few of these competitions. Thank you. I appreciate it. And it's not because
53:36 I was a better programmer or a better data scientist. It's because I majored in chemical
53:40 engineering and I knew the business problem, the domain extremely well. And I kind of knew
53:46 the programming and the data science stuff, but the combination of them made me very valuable.
53:51 Like one of the best examples I have is we were looking at crude oil properties. And I remember
53:57 there was a forum where you'd ask your questions. And one of the data scientists asked, "Hey,
54:01 is sulfur bad? There's lots of sulfur in this. Is it bad?" And to a chemical engineer,
54:06 that's like the most obvious thing. Yes, sulfur is very bad in crude oil. That's a
54:11 no-no. That's such a fundamental thing to me, and to him or her it was groundbreaking.
54:17 And so, yeah, your domain can become your superpower in your career.
54:20 Yeah. And it makes it way harder for ChatGPT and other types of tools to just automate you out of
54:26 a job because you bring in all these skills together, which is awesome. But it also makes
54:30 it easier for you to get the job. It makes it easier for you to continue your momentum of
54:34 whatever you've been up to. It's good all around. Yeah. I think it's more fun too,
54:38 because once again, when I was trying to decide if I should study computer science, I was like,
54:43 "Man, I don't really want to build an Excel workbook for the sake of building an Excel workbook."
54:50 That's still true for me today. I don't want to do data science for data science's sake.
54:54 I only like machine learning or data science when I'm doing it to solve a really fun problem I'm
54:59 passionate about. That's why it's more fun. So if you can be excited about the domain
55:03 and excited about the algorithms, I think that's a great place to be.
55:06 Absolutely agree. All right. We're getting short on time, but maybe tell us a bit about
55:11 your Data Career Jumpstart. You've referred to it a couple of times.
55:14 Yeah. I have a company called Data Career Jumpstart. I just try to do a lot of education.
55:18 So the education happens on LinkedIn, happens on YouTube. And I actually forgot to mention this at
55:23 the beginning, but I have my own podcast called the Data Career Podcast, where I help people
55:28 land their first data job. We're about at a hundred episodes. So not quite the groundwork
55:32 that you've put in. That's still a ton. That's awesome.
55:35 Yeah. We're getting there. And then, yeah, I also have a bootcamp where I try to affordably help
55:40 people land their first data analyst position by teaching them the skills, the networking,
55:46 and the project and portfolio building that they need to do so.
55:50 Like the long version of this show. Yeah. Basically, yeah. Just take what
55:53 we talked about today, expand on it, make it like 350 unique lessons. And that's exactly what it is.
56:00 Yeah. Very cool. All right. Well, we're about out of time. So maybe just, Avery, final call to action,
56:06 people, maybe you're inspired. I see Diego out in the audience said awesome talk very much. So
56:11 what's next? It's easy to be inspired, but you got to take action.
56:15 Yeah. I love that. I think it's always fun to listen to podcasts, but you probably benefit way
56:21 more from the action you take after a podcast. So for you guys who are maybe interested in a data
56:26 analytics or a data science career, explore that. If you're like, yes, I'm in, make a plan,
56:31 make a roadmap. If you need help, I have a webinar that will help you make a roadmap.
56:36 What skills should you learn? How should you be networking and stuff like that?
56:39 But really, if you're just getting started, probably the first step is figuring out what skills you
56:42 should learn, what the top skills are that you should be learning. And then learning those
56:45 skills. And then not only learning those skills, but taking action and building some sort
56:49 of project like we talked about that you could put on a portfolio, make a Streamlit app or
56:53 something like that. That's probably the best action you could possibly take. If you need any
56:58 ideas or advice, feel free to check out my website, datacareerjumpstart.com, or the podcast,
57:02 the Data Career Podcast. Hopefully there are lots of free resources there for you guys to check out.
57:06 If you've never seen Streamlit before, I have some YouTube videos about Streamlit that you
57:10 guys can check out, but I love it. Just take action somehow, do something.
57:13 That's one of the huge, huge differentiators: you might be inspired, but you just
57:18 got to start taking those steps and it becomes a snowball. So thanks for sharing all your experience
57:23 and your advice. Hopefully some people out there are taking action. And yeah, I'll put
57:27 everything we talked about in the show notes, of course. So thanks for being here, Avery.
57:30 Yeah, thank you. Thanks for having me. Appreciate it.
57:32 You bet. Bye all.
57:34 This has been another episode of Talk Python to Me. Thank you to our sponsors. Be sure to check
57:39 out what they're offering. It really helps support the show. Take some stress out of your life. Get
57:44 notified immediately about errors and performance issues in your web or mobile applications with
57:49 Sentry. Just visit talkpython.fm/sentry and get started for free. And be sure to use the promo
57:56 code, talkpython, all one word. This episode is sponsored by Posit Connect from the makers of
58:02 Shiny. Publish, share and deploy all of your data projects that you're creating using Python.
58:07 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards and APIs. Posit Connect
58:15 supports all of them. Try Posit Connect for free by going to talkpython.fm/posit. P-O-S-I-T.
58:22 Want to level up your Python? We have one of the largest catalogs of Python video courses over at
58:27 Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async.
58:33 And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm.
58:39 Be sure to subscribe to the show, open your favorite podcast app and search for Python.
58:43 We should be right at the top. You can also find the iTunes feed at /iTunes, the Google Play feed
58:49 at /play, and the direct RSS feed at /rss on talkpython.fm. We're live streaming most of our
58:56 recordings these days. If you want to be part of the show and have your comments featured on the
59:00 air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. This is your host,
59:06 Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and
59:10 write some Python code.
59:12 [Music]