Land Your First Data Job

Episode #455, published Thu, Apr 4, 2024, recorded Thu, Jan 18, 2024

Interested in data science but you're not quite working in it yet? In software, getting that very first job can truly be the hardest one to land. On this episode, we have Avery Smith from Data Career Jumpstart here to share his advice for getting your first data job.

Episode Deep Dive

Guest Background: Avery Smith

Avery Smith is a data scientist, instructor, and consultant who took a unique path into data careers. He started out studying chemical engineering, discovered a passion for programming through MATLAB, and later transitioned into data science. Avery has worked at both small startups and large enterprises (including ExxonMobil), implementing machine learning solutions and translating data insights into real business impact. Now, he runs Data Career Jumpstart, mentoring aspiring data professionals on how to land their first data job.

1. Navigating the “Zero-to-One” Challenge

Breaking into a data role can feel like the toughest leap. Avery’s journey from lab technician to junior data scientist proves non-traditional routes truly can work. Once you land that first job, opportunities for growth multiply quickly, making future moves far easier.

  • Many encounter the catch-22 of needing experience to get a job, yet needing a job for experience.
  • Smaller companies or internal transfers (like Avery’s lab-tech-to-data-scientist story) can help you break in with fewer prerequisites.
  • Once you have that first position on your résumé, each subsequent career move becomes much smoother.

2. The Power of Networking

Networking exposes you to hidden job openings and referral paths you’d never see by just applying online. Often, roles are filled before they’re publicly posted, emphasizing the value of genuine connections. Whether you meet people at local conferences or within your own organization, building relationships can significantly boost your job-hunt success.

  • Many opportunities are filled via referrals before ever hitting public job boards.
  • Local meetups and conferences are prime avenues for making personal connections.
  • Even non-technical coworkers can help you find referrals within their companies.

3. The Role of Python

Python’s expansive ecosystem, user-friendly syntax, and vast community make it the go-to language in data science. From scikit-learn and PyCaret for machine learning, to Streamlit for quick app deployments, Python offers a streamlined way to build and share data projects. For those starting out, it’s one of the fastest paths to tangible, demonstrable success.

  • Tools like Data Nerd show how in-demand Python is for data jobs.
  • Its libraries and frameworks, including scikit-learn, PyCaret, and Streamlit, cover a broad spectrum of applications.
  • Python’s lower barrier to entry helps new data professionals learn quickly and showcase value fast.

4. SQL as “Table Stakes”

Although you can do a lot with Python, SQL remains essential for working with large, structured data. Companies typically house critical information in relational databases, so querying proficiency can make or break your day-to-day success. When data extends beyond basic CSVs, SQL skills often become indispensable.

  • Mastering fundamental commands (SELECT, JOIN, WHERE, GROUP BY, etc.) already makes you valuable.
  • Many companies store massive amounts of data in relational databases, making SQL knowledge essential.
  • Once data grows beyond basic CSVs, SQL queries are often indispensable for efficiency and performance.
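
As a rough illustration of the fundamental commands listed above, here is a minimal sketch using Python's built-in sqlite3 module. The tables, column names, and values are invented for the example and do not come from the episode.

```python
import sqlite3

# In-memory database with two small, made-up tables (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'West'), (2, 'East'), (3, 'West');
    INSERT INTO orders VALUES
        (1, 1, 120.0), (2, 1, 80.0), (3, 2, 40.0), (4, 3, 200.0), (5, 2, 15.0);
""")

# SELECT, JOIN, WHERE, and GROUP BY in one query:
# order count and total amount per region, ignoring orders under 20.
query = """
    SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.amount >= 20
    GROUP BY c.region
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for region, n_orders, total in rows:
    print(region, n_orders, total)
conn.close()
```

The same query text would work largely unchanged against most relational databases, which is part of why these few commands transfer so well between jobs.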

5. The Power of Domain Knowledge

Technical prowess matters, but so does understanding your industry. Avery’s chemical engineering background helped him excel at ExxonMobil, because he grasped nuances—like sulfur levels in oil—that pure data scientists might overlook. Blending domain expertise with data skills can make you a vital asset in any organization.

  • Avery excelled in internal data competitions at ExxonMobil because he knew how sulfur levels affected oil refining.
  • Combining data expertise with fields like biology, finance, or engineering gives you a unique edge.
  • Clear insight into the “why” behind data leads to more accurate models and solutions.

6. Creating a Portfolio with Accessible Tools

Portfolios let you prove your abilities instead of just claiming them. By building small, focused data projects or interactive dashboards, you offer tangible evidence of your skills. Tools like Streamlit or Plotly Dash make deployment more approachable, allowing you to share your work with potential employers.

  • Tangible demos, such as interactive dashboards or web apps, let your skills speak for themselves.
  • Streamlit and Plotly Dash make it simpler to convert notebooks into live apps.
  • Deployment is far easier today—once-challenging steps can now be done quickly to showcase your work publicly.

7. AI as a Helper, Not a Replacement

Generative AI tools like ChatGPT can streamline coding tasks and brainstorming, but they don’t replace human judgment or creativity. Avery routinely uses AI-generated scaffolding for his Python or Streamlit projects, then refines the code manually. This lets him move faster without sacrificing the critical thinking that ensures robust solutions.

  • Avery uses ChatGPT as a starting point for quick code scaffolding in Python or Streamlit projects.
  • It’s an iterative process: AI suggestions are refined or corrected as you build.
  • AI can also help brainstorm resume phrasing, highlight key skills, and speed up research.

8. A/B Testing Your Resume

A résumé’s main function is landing interviews. If it’s not working, iterate on it as you would a data experiment. Make small changes—like tweaking keywords or clarifying an accomplishment—then apply to a few more jobs to see if your response rate improves.

  • Change one aspect, apply to 10 openings, track results, and iterate.
  • Align keywords with role requirements to get past Applicant Tracking Systems (ATS).
  • Showcase specific outcomes—how you improved processes, tackled real data sets, or deployed a working model.
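
The "change one aspect, apply, track results" loop above can be sketched in a few lines of Python. The function names and the callback counts below are hypothetical, and the significance check (a one-sided Fisher exact test) is an addition for illustration rather than something recommended in the episode. With only about 10 applications per version, the numbers will be noisy.

```python
from math import comb


def response_rate(callbacks: int, applications: int) -> float:
    """Fraction of applications that led to a callback."""
    return callbacks / applications if applications else 0.0


def fisher_one_sided(a_hits: int, a_total: int, b_hits: int, b_total: int) -> float:
    """Chance of version B getting at least b_hits callbacks if both
    résumé versions were equally effective (hypergeometric tail)."""
    hits = a_hits + b_hits
    total = a_total + b_total
    denom = comb(total, b_total)
    p = 0.0
    for k in range(b_hits, min(hits, b_total) + 1):
        p += comb(hits, k) * comb(total - hits, b_total - k) / denom
    return p


# Made-up tracking data: version A got 1 callback from 10 applications,
# version B (with reworked keywords) got 4 from 10.
rate_a = response_rate(1, 10)
rate_b = response_rate(4, 10)
p_value = fisher_one_sided(1, 10, 4, 10)
print(f"A: {rate_a:.0%}  B: {rate_b:.0%}  p = {p_value:.3f}")
```

Even a jump from 10% to 40% is only weak evidence at this sample size, so it may be worth running a promising change through a second batch of applications before treating it as a real improvement.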

Notable Links and Resources

Below are the tools, libraries, and websites explicitly mentioned or referenced:


Overall Takeaway

Landing your first data job calls for a balanced blend of networking, skill-building, and showcasing real-world applications. Python and SQL remain central pillars, while domain expertise can elevate you far beyond what purely technical skills can achieve. Whether you’re spinning up a Streamlit demo or optimizing your résumé with an A/B testing mindset, tangible evidence of your capabilities is key. And remember—once you push through “zero to one,” every step afterward tends to get a little easier.

Avery Smith: www.linkedin.com
Data Career Jumpstart: www.datacareerjumpstart.com
Data Nerd Site: datanerd.tech
Write C# LINQ queries to query data: learn.microsoft.com
A faster way to build and share data apps: streamlit.io
Plotly Dash: dash.plotly.com

Michael's Keynote: State of Python in 2024: youtube.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

Episode Transcript

00:00 Are you interested in data science, but you're not quite working in it yet?

00:03 In software, getting that very first job can truly be the hardest one to land.

00:08 On this episode, we have Avery Smith from Data Career Jumpstart here to share his advice

00:13 for getting your first data job. This is Talk Python to Me, episode 455, recorded January 18th,

00:20 2024.

00:21 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy.

00:40 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython,

00:45 both on fosstodon.org. Keep up with the show and listen to over seven years of past episodes

00:51 at talkpython.fm. We've started streaming most of our episodes live on YouTube. Subscribe to our

00:57 YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of

01:03 that episode. This episode is brought to you by Sentry. Don't let those errors go unnoticed.

01:09 Use Sentry like we do here at Talk Python. Sign up at talkpython.fm/sentry.

01:15 And it's brought to you by Posit Connect from the makers of Shiny. Publish, share, and deploy all

01:21 of your data projects that you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI,

01:27 Flask, Quarto, Reports, Dashboards, and APIs. Posit Connect supports all of them. Try Posit Connect

01:34 for free by going to talkpython.fm/Posit, P-O-S-I-T.

01:39 Hey folks, before we jump in and talk about data science jobs and careers, I want to tell you

01:44 really quickly about some awesome news. Back in February, I gave the keynote at PyCon Philippines.

01:50 It was entitled The State of Python in 2024. Well, that is now out on YouTube. The team at

01:58 PyCon Philippines did a great job. The video came out great. If you want to check out The

02:02 State of Python in 2024, according to me, just click on the link in the show notes to watch

02:07 it over on YouTube. Now let's talk to Avery. Avery, welcome to Talk Python to Me.

02:11 Thanks so much. I'm so excited to be here and be part of the show.

02:15 I'm excited to have you here as well. You know, one of the things that people reach out to me often

02:19 is how do you get into data science? How do you get into programming? How do you get into Python?

02:26 You know, I've been trying, or maybe they got a degree or they took some training program,

02:32 bootcamp or something. And going from zero to one, I think is the biggest career step you have to make.

02:40 That next job and the one after that, it only gets to be smaller steps, not bigger steps.

02:45 And it's really tough because that first big step, you're brand new at it. You have no experience,

02:49 right? It's your first data science job or your first programming job. And so hopefully we can

02:54 give some folks out there a little bit of a hand up to help them make that jump.

02:59 Yeah, totally. I like to show this graphic that says, it's a circle and it's a circle of text. And it says,

03:05 I can't get a job because I don't have experience because, and then it restarts,

03:11 I can't get a job. And that's the tricky part. It's like, how do you get a data science job when

03:15 you have no data science experience? Because to get data science experience, that seems like you have

03:20 to have a job as the prerequisite and vice versa. So it is very tricky. So happy to chime in on that

03:25 today.

03:26 The industry can take it too far. They can take it way too far. So a few years ago, there was a

03:32 really funny tweet that went around back when they called them tweets. I don't know what they're called

03:36 anymore. Sebastián Ramírez, the guy who created FastAPI, saw a job posting. When FastAPI was like a

03:44 year and a half old, it said, you must have four years of experience with FastAPI to apply. He said,

03:50 hey, look, I'm the creator of FastAPI and I'm unqualified for this job. What kind of world are

03:55 we living in?

03:56 Yeah. I don't want to live in that world, but that's unfortunately where we're at. That's so tough.

04:01 And it's hilarious. These job descriptions are getting out of hand. That's for sure.

04:06 Yeah. Well, with AI, it's probably not going to get better.

04:08 We could talk about that more later. But before we get into that, let's just jump in with a little

04:13 bit of background on you before we get to the topic. Tell us a bit about yourself. What do you do?

04:18 How'd you get into Python? Things like that.

04:20 Yeah, absolutely. So I'm currently a data science consultant and also a data science instructor.

04:27 I run some online programs where I teach people to become data analysts mostly is what I'm focused on.

04:34 But I also have this practice where I help companies solve data problems with different techniques.

04:39 I started actually by studying chemical engineering in college in my undergraduate degree.

04:45 And about a semester in, I realized, crap, I hate this. This is not for me.

04:50 But I was in a little bit of a tough spot. Yeah. Do you agree? Have you felt something similar?

04:54 I did a semester of chemical engineering as well. I thought, I love chemistry. I love math. Put them

04:59 together. Somehow they don't go together. It's like ice cream and eggs or something. No,

05:04 they don't go together for me at least.

05:06 Yeah. It wasn't good for me either. I was just like, oh man, I'm actually not interested

05:10 in refineries or like manufacturing. But I, like you, liked chemistry. I liked math. I thought this

05:16 is perfect. But I quickly realized, oh man, I really liked this whole programming part that I get to do

05:21 in MATLAB at the time when I was an undergrad. And I was on a time crunch to get through college kind of

05:27 quickly through eight semesters. And the other issue I had was I didn't know what to do instead. It was

05:33 like, I don't really want to study computer science. Part of the reason why is they kind of

05:37 had this weed out course at the beginning, which you had to build Excel from scratch, basically like

05:43 some sort of a spreadsheeting tool. And I was like, why would I rebuild something that already exists

05:48 that I don't even like using in the first place? I wasn't really into it. So I didn't, I didn't know

05:52 what to do. And luckily I was working as a lab technician at this company, the really cool company

05:58 that makes the sensors that basically have the ability to smell. So they can sniff what's in

06:04 the air and it has applications for finding drugs or bombs in airports and stuff like that.

06:09 And there was a data scientist on staff and that data scientist was awesome. He was like showing me

06:14 all these cool algorithms he was writing for these sensors. And then one day he got up and left and he

06:20 left the company and we tried to hire another data scientist for like six months, but they were really

06:26 expensive. We were a small company and none of them really wanted to move to Utah where I lived in

06:30 Salt Lake City. And so we couldn't, we couldn't really find someone that would be able to do it.

06:33 And so finally I was like, well, I really liked this programming stuff. And I, you know, the data

06:38 scientist showed me a thing or two, maybe I could take a stab at this. And I started, I wrote like my

06:42 first machine learning algorithm and I was like, oh my gosh, I'm addicted to this. And then I never

06:46 looked back and have been in data science since, basically.

06:49 What a great story. Yeah. I think, I think a lot of people fall into programming that way. And for some reason,

06:55 not unexpectedly, but for some reason, a lot of people fall into Python that way as well. They're

07:00 like, you know, I have a job and I got this thing I got to do. I just need a little bit more than maybe

07:05 like an Excel spreadsheet or something and put it together. And you're like, actually, this is cool.

07:10 After a while, like, this is cooler than what I've been doing, or maybe I'll make it a good part of what

07:14 I do. Right.

07:15 Yeah. A hundred percent. Even just making, it was in MATLAB, which is basically engineers version

07:20 of Python or college version of Python 10 years ago. Right. And I made like tic-tac-toe and I remember

07:26 playing tic-tac-toe against the computer. I think that's what it was. Or maybe it was,

07:29 maybe it was Hangman. I can't remember. But I remember like the idea of like being able to play,

07:33 to program games and play against the computer. And I built it. I was like, this is the coolest thing

07:38 ever. I got to, I got to do more of this.

07:40 Absolutely. You know, I think I've done some MATLAB too, when I was younger and it's not that

07:44 different from Python, but it's, I think one of the big differences other than it just being like

07:49 embedded in a big expensive app is it's not a general purpose programming language, right? You

07:54 wouldn't go, you know, that was fun, but let me go build this website in MATLAB or let me create

08:00 Airbnb in MATLAB or, you know, like there's, you just don't want to. It sort of has this like

08:06 self-prescribed limit to what you can do with it.

08:08 That's one of the coolest parts about Python is it's really a Swiss army knife and you can pretty

08:13 much do, I don't want to say anything, but pretty close to anything in Python, which makes it really

08:19 neat. And obviously one of the huge limitations of MATLAB is one, it costs thousands of dollars,

08:23 but two, you're right. It's not going to do cybersecurity for you. It's not going to build

08:27 websites, but the syntax at the end of the day was, was really quick. It was, it was easy for me to

08:32 transition from MATLAB to Python because the syntax isn't all that different.

08:35 No, it's not all that different. More math focused, but pretty similar. So I think maybe that's a good

08:41 place to start discussing and exploring the topic of your first data science job. And

08:46 I didn't necessarily plan on starting here, but let's start with before you even necessarily know a

08:50 programming language, right? Maybe you've dabbled in MATLAB or you've dabbled in Excel or even dabbled in,

08:57 I don't know, JavaScript or something. This thing we've been talking about with MATLAB and it applies to other areas as well,

09:02 like true programming languages per se, like Julia or something like that, is how,

09:08 if you invest your time into learning one of these things really well, like how broadly industry-wide

09:16 of a skill, high demand skill is that going to be, right? If you learn MATLAB, you put yourself in a box,

09:21 you learn a more general programming language, you kind of have more options afterwards, right?

09:25 Yeah, totally. I think like the more broad of a language you learn, the more useful you are to,

09:31 to more industries in general. But I might take that even a step further and just say, you know,

09:37 learning MATLAB, not a whole lot of companies use MATLAB, but just like landing your first data job,

09:43 going from zero to one is the hardest, learning your first language, zero to one is the hardest as well.

09:48 And then once you have that first language, the next language becomes so much easier. So

09:52 one of the first things I learned was MATLAB. And then I moved to Python and that was easier. And then

09:57 I learned SQL and then I learned R and then I learned JavaScript. And every time I added like a new tool

10:02 to my toolkit, it was, not quite easy, but it got easier with each one. I think that's true

10:07 with foreign languages as well. Once you learn one foreign language, then the third and the fourth become

10:12 quite easy. At least that's, that's what I heard. I speak, kind of, two and a half languages,

10:17 but like I, there's people who speak like seven and they always say like the sixth and the seventh

10:21 become easier.

10:22 Yeah. You wonder how you could possibly, because learning the first one is so hard,

10:25 first foreign language. So you're like, well, how could you possibly take that on for this many

10:29 languages? And it's that it's not the same challenge each time, right?

10:32 Yeah, exactly.

10:32 Yeah. So I think when people are considering getting into data science, they really want to consider

10:38 what language they choose and where they go. Like you're coming out of a

10:42 college program. You might feel like MATLAB or something like that's real popular. And yet

10:47 that's because it's popular amongst professors who forced their students to do it. That doesn't

10:51 necessarily mean that's the world, the broad worldview. What do you think about R? You know, both.

10:57 I like R. I'm not, I sometimes troll R on LinkedIn. So I guess that's another thing I should say is I post

11:03 a lot on LinkedIn, kind of a LinkedIn guy. And so a lot of the times, honestly, just for jokes and kicks

11:08 and giggles, I'll kind of roast R on LinkedIn just to get the trolls angry in the comments.

11:13 I'll admit it. It's quite fun. It's quite a fun experience, but I'm not that big of a hater. I

11:18 think what's really interesting about R versus Python, which is obviously a big debate in the data

11:23 science community, is that R kind of does one thing really well. And it's getting a little less

11:29 of that as like more packages and libraries are added to R, but R does the statistics and machine

11:34 learning very well. But obviously I don't think once again, I don't know any websites, any like

11:39 super functioning websites that are built on R. I don't know any cybersecurity that's really done

11:44 via R. So I think R does what it does well. The syntax sometimes is a lot easier for

11:49 people to go from Excel, which a lot of people are more familiar with in the finance or banking world,

11:54 for example. The syntax in R is a little bit more similar to those Excel formulas than it is to Python.

12:00 So I think sometimes people have a little bit more success just because, oh, this kind of feels like

12:04 R formulas, or sorry, this feels like Excel formulas. And so people really get there. I think what

12:10 you're kind of alluding to is if you're going to learn one skill, you might as well learn the one

12:14 skill that's applicable to the most, the widest net, right? And so that way you're fishing in the

12:20 biggest lake you possibly could versus in a smaller pond of R. I think that's worth looking at. And one

12:27 of the things I actually really enjoy doing, because you know, you mentioned, oh, you might think MATLAB

12:31 is popular because that's what the professors taught you. And there's actually not a whole lot of

12:35 data out there about, well, what should you learn? So I don't know if you know who Luke Barousse is. He's a data

12:41 analyst YouTuber. I was going to say YouTuber on YouTube, but that's kind of redundant on YouTube. And

12:45 one of the things he's done is he's actually built this tool where he's web scraping thousands of jobs,

12:50 different data jobs every week, and then displaying and analyzing the skills required for those jobs.

12:56 So it's actually like a data driven way of saying, if you want to be a data scientist, what skills should you

13:01 actually be focusing on as you go, as opposed to just listening to what a professor will say,

13:07 or what a LinkedIn influencer will say, or what your bootcamp will say. Like actually getting some

13:13 data on, I think is pretty neat. That is super cool. And I'm not familiar with Luke. So we're going to dig

13:19 him up and put him in the show notes for later so people can check that out. For sure. Do you remember

13:23 any of the trends you've recently talked about? It's datanerd.tech, I think is the website there.

13:27 I look at it mostly for data analysts because that's who I work with the most. So I know the

13:33 data analyst data very well. SQL is number one at 50%. I think Python is number two at like 30%.

13:40 I think Python might've jumped it. Well, this is for all data positions right here. So the job title,

13:47 you can choose. So which one do you think I should pick here? Data?

13:51 Maybe data scientist. Data scientist. Yeah. Right.

13:53 What's that? Yeah, you're right. Wow.

13:56 Whoa, Python 69%. Look at that. That's huge. So like, that's even, that's even what? 20% more than SQL,

14:04 which a lot of people are like, if you were going to be a data scientist, you have to know SQL.

14:07 Yeah. If you look at the job descriptions, Python's mentioned a lot more. So if you're going to learn,

14:11 if you're brand new and you're going to learn one, you might as well start with Python. Because that's

14:15 probably the most in demand skill that there is right now for a data scientist.

14:18 Yeah. And it's pretty easy, right? It's not like, well, why don't you just learn C++ for

14:22 embedded devices? You're like, you know what? Maybe I'll pick something else to start with. Right. But

14:26 you know, Python's pretty easy. I agree with you. I think Python's great. I actually think,

14:31 I think SQL is probably easier to learn if I'm being honest, because really, especially for like

14:35 data science stuff, there's only about like 20 commands that you need to know in SQL. But it's,

14:40 once again, SQL's a lot more, there's no websites built on SQL. I'll tell you that much. So

14:44 it's a lot more limited on what it can do.

14:46 It's a skill, but not the language. It's not enough on its own, generally. I mean,

14:52 you can do reports and quite a bit with it. But you know, it's like, when you see these programming

14:57 popularity, like what's the most popular language? Oh, look, CSS is the third most popular. That's not

15:02 a language. That's a thing that you use with other languages, right? Like use it with all the other

15:06 languages. That's why it's high up. But that doesn't mean it's high in demand. Exactly. It's just like

15:11 table stakes, you know? Yeah.

15:12 So you kind of got to distinguish table stakes from like picking an area, I think.

15:17 That's totally true. And really, I think Pythonistas could make the argument that there's

15:22 really nothing in SQL that you couldn't do in Python. That's a little somewhat true,

15:28 depending on data size and stuff like that. But regardless, there are ways that you can do

15:32 most of the SQL commands in Python one way or another. Yeah. Yeah.

15:36 It could be when I first became a data scientist, I didn't even know SQL and I was doing SQL commands

15:41 or I was doing the aggregations or the where functions or the window functions using Python.

15:47 So you definitely can. As long as your data is not like super big, then you'll totally be fine.

15:51 Right. Like some kind of generator or even slices or yeah, things like that, right? List comprehensions,

15:57 set comprehensions, all that kind of stuff. Kind of like, gosh, I really wish, a little bit of a

16:02 sidebar, but I wish like list comprehensions and all those things had just a few more SQL features,

16:08 right? Like in a list comprehension, I say, give me this thing, maybe give me this property of this class

16:14 modified, like give me the user's name, uppercase. Right. So that's like select. And then for thing

16:20 in collection, that's like from table or whatever. Right. And then you have the where clause with the

16:27 if statement, but boy, wouldn't it be cool to have like a sort also in there and other things like that,

16:33 you know? Oh, well, totally. It's so close. The cool thing is, is if you want that sort,

16:38 it's what one extra line. Like it's, it's not, it's not too bad. So it, Python, I mean,

16:43 I don't want to say this necessarily to hate all, to make all the data scientists and SQL lovers

16:47 mad, but really Python can do a lot of the things that SQL can, that's for sure.

16:52 Yeah, that's for sure.
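
The comparison drawn above between a list comprehension and a SQL query can be sketched roughly as follows. The User class and the sample data are hypothetical; the point is only to show how each clause lines up, with sorted() supplying the "sort" that the comprehension itself lacks.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


users = [User("maria", 34), User("li", 19), User("omar", 27), User("ann", 41)]

# Roughly: SELECT upper(name) FROM users WHERE age >= 21 ORDER BY age DESC
names = [
    u.name.upper()                                             # SELECT
    for u in sorted(users, key=lambda u: u.age, reverse=True)  # FROM ... ORDER BY
    if u.age >= 21                                             # WHERE
]
print(names)
```

Everything here runs over data already held in memory, which is the caveat raised in the conversation: this approach is fine until the data gets too big.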

16:53 This portion of Talk Python to Me is brought to you by Sentry. Code breaks. It's a fact of life. With Sentry,

17:00 you can fix it faster. As I've told you all before, we use Sentry on many of our apps and APIs here at

17:06 Talk Python. I recently used Sentry to help me track down one of the weirdest bugs I've run into in a long

17:12 time. Here's what happened. When signing up for our mailing list, it would crash under a non-common execution

17:19 path, like situations where someone was already subscribed or entered an invalid email address or

17:25 something like this. The bizarre part was that our logging of that unusual condition itself was crashing.

17:32 How is it possible for a log to crash? It's basically a glorified print statement. Well, Sentry to the

17:39 rescue. I'm looking at the crash report right now, and I see way more information than you'd expect to

17:44 find in any log statement. And because it's production, debuggers are out of the question.

17:48 I see the traceback, of course, but also the browser version, client OS, server OS, server OS version,

17:56 whether it's production or QA, the email and name of the person signing up. That's the person who

18:01 actually experienced the crash. Dictionaries of data on the call stack and so much more. What was the

18:06 problem? I initialized the logger with the string info for the level rather than the enumeration dot info,

18:14 which was an integer based enum. So the logging statement would crash saying that I could not use

18:20 less than or equal to between strings and ints. Crazy town. But with Sentry, I captured it, fixed it,

18:27 and I even helped the user who experienced that crash. Don't fly blind. Fix code faster with Sentry.

18:33 Create your Sentry account now at talkpython.fm/sentry. And if you sign up with the code

18:39 TALKPYTHON, all capital, no spaces, it's good for two free months of Sentry's business plan,

18:46 which will give you up to 20 times as many monthly events as well as other features.
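
The kind of bug described above, a log level passed as a string where an integer-based enum was expected, can be reproduced in a small sketch. TinyLogger and Level are hypothetical stand-ins written for this example; they are not the logging library used on the show, and the exact error message will differ.

```python
from enum import IntEnum


class Level(IntEnum):
    DEBUG = 10
    INFO = 20
    ERROR = 40


class TinyLogger:
    """Hypothetical logger that, like the bug described, never validates its level."""

    def __init__(self, level):
        self.level = level

    def log(self, level: Level, message: str) -> bool:
        # Comparing a str threshold with an int-based enum raises TypeError.
        if self.level <= level:
            print(f"[{level.name}] {message}")
            return True
        return False


good = TinyLogger(Level.INFO)
good.log(Level.ERROR, "signup failed")  # printed: ERROR is at or above INFO

bad = TinyLogger("info")  # wrong: a string instead of Level.INFO
crashed = False
try:
    bad.log(Level.ERROR, "signup failed")
except TypeError as exc:
    crashed = True
    print("logging itself crashed:", exc)
```

The mistake stays invisible until the first log call actually runs, which fits the story of it only surfacing on an uncommon execution path.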

18:51 Probably the biggest, it's a bit of a diversion, but the biggest similarity to that I've seen in

18:57 the languages is C#'s LINQ, where they actually have almost all the query operators, including

19:04 joins and stuff like that built into the programming language. I'd love to see more of that kind of

19:09 inspiration into Python, but you know, that's all right. It's still really good. It's got a lot of

19:13 cool SQL-like features, but you're right. Once you are no longer working with data in memory,

19:17 or you want indexes, right? Like this concept of indexes is not sufficiently well understood. I think

19:24 every time I hit a website that takes five seconds to load, I'm like, somebody is not doing all the

19:28 things they should be doing. I just know it. That's totally true. What about SQL? You know,

19:34 let's talk about that for a bit, right? The SQL, the query language or databases and other things,

19:40 there's ways to SQL query, not just relational databases. But you said you got away with not

19:47 quite learning that, but do you think if you could start over, maybe making an effort to learn that

19:52 would be really valuable? Like how, how important is this a thing in the beginning of your career?

19:56 The interesting thing, you know, about landing a data job is your skills only plays, I say,

20:03 a third of the role. Your portfolio or the way that you portray your skills and your network,

20:09 I think are the other two thirds and they're actually more important than your skills. And that's

20:12 kind of how I got away with not knowing SQL and not even being, to be honest, that good at

20:17 Python at the time was because I used my network to be in the situation to get my lab technician job

20:24 in the first place. And then once again, I use that same network, in this case, my coworkers,

20:28 to land that first data scientist position after we couldn't hire anyone. And if I would have been

20:33 applying externally for that role, chances are I wouldn't have gotten that role. I probably didn't

20:39 know enough at the time to land that type of a role, but because they knew I was hardworking,

20:43 they knew I wasn't like a total idiot and I really liked to learn. They took that chance on me. It

20:49 paid off really well for them because at the time I was still in college. And so I wasn't getting paid

20:53 that much. And I was getting, I was not getting paid like a data scientist, but I was getting results like

20:58 a data scientist for them. So I think it was, it paid off for both of us. But I think if that was an

21:03 external job and I applied for it, I probably didn't have enough skills for it. So I definitely think

21:07 learning SQL, if you want to land data science job, isn't a bad place to start, especially because,

21:12 like I said, there, I mean, any programming language, I like to think of like the iceberg,

21:17 kind of like the Titanic, right? There's the parts that you see, and then there's the parts that you,

21:21 that you don't even know that you, that are there. And, and really you could spend the rest of your life

21:26 trying to master SQL or the rest of your life trying to learn Python. But the cool thing is,

21:31 is a lot of the time you only need that top little bit that's sitting at the,

21:34 the top of the surface of the water to actually get stuff done. And so for SQL, I think that's like

21:40 20 commands. And I think you could learn it honestly in like a month, you could learn those,

21:45 those 20 commands pretty easily, but it worked out for me. And I, I didn't have to use it that much

21:49 at the time until I was probably about almost three years into my job. And I actually had switched jobs

21:54 to a bigger company. The other thing that I was working for a smaller company where we didn't have a ton

21:58 of data. So we could use CSVs kind of as our, our database, which is not great practice. But when I,

22:05 when I eventually became a data scientist at ExxonMobil, I was going to say they didn't use Excel as

22:09 a database, but they still did. But the point is they had much larger SQL databases with hundreds of

22:14 thousands, actually millions of rows of data that I had to query.

22:17 Yeah. Then you gotta be really, you need to understand it at a much deeper level. You're

22:22 like, if you do a query like this, it's going to be super slow. But if you do it like that,

22:26 it can use the composite index for the sort and then blah, blah, blah, blah. All right. Then

22:30 you're getting to the bottom of the, the iceberg in SQL, or maybe not the bottom,

22:33 maybe like the middle chunk under the water, but there's so much to learn for both of them.
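As a rough illustration of the composite-index point above, here is a minimal sketch using Python's built-in sqlite3 module. The table name `readings`, its columns, and the index name `idx_site_time` are made up for this example, and other database engines report their query plans differently.

```python
import sqlite3

# In-memory database with a hypothetical table of sensor readings.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "id INTEGER PRIMARY KEY, site TEXT, taken_at TEXT, value REAL)"
)
rows = [
    (f"site{i % 50}", f"2024-01-{(i % 28) + 1:02d}", i * 0.1)
    for i in range(10_000)
]
conn.executemany(
    "INSERT INTO readings (site, taken_at, value) VALUES (?, ?, ?)", rows
)

# Filter on one column, sort on another.
query = "SELECT taken_at, value FROM readings WHERE site = ? ORDER BY taken_at"


def plan(sql):
    """Return SQLite's query plan for the statement as one readable string."""
    cursor = conn.execute("EXPLAIN QUERY PLAN " + sql, ("site7",))
    return " | ".join(row[3] for row in cursor)  # row[3] is the detail text


before = plan(query)  # no index yet: full scan plus a separate sort step

# Composite index: equality column first, then the sort column.
conn.execute("CREATE INDEX idx_site_time ON readings (site, taken_at)")
after = plan(query)  # the index now serves both the filter and the ordering

print("before:", before)
print("after: ", after)
```

Comparing the two printed plans is the kind of "if you do it like this it is slow, like that it is fast" reasoning described above: the query text is identical, only the index changes.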

22:37 Amir in the audience asks, you know, like when you talk data job, like what kind of jobs are out

22:42 there? Right. So we talked to both about how we did chemical engineering and then we saw like

22:46 chemical factories, like, yeah, I don't really want to work here anymore. I'm out.

22:49 So thinking about like, well, what are the kinds of jobs you do? I think that's really important because

22:55 it's easy to get focused in on like the FANG companies. Like I want to work for like some super

23:02 big tech company. I want to move to San Francisco and like that, that, that, right. Like there's not just

23:07 plenty of other jobs, but the opportunities, just like you described, and as well as

23:12 like my first job, I worked at a company that had like eight people and it was awesome. Right.

23:17 They didn't expect me to be, you know, running Kubernetes clusters and doing all sorts of great.

23:22 They're just like, I need you to make this thing happen. Can you do like, I'm pretty new,

23:26 but that thing I can make that happen. Like, let's go. Right. And I feel like the, the possibilities

23:30 to get in, especially with these maybe more niche type of industries and companies might even be easier

23:37 for a first job. People seem to be really obsessed with, with the FANG. And I don't know if that's

23:41 like a societal thing, or if it's just, those are the companies that we use a lot. And so we're

23:46 excited about them, but yeah, there's so many more data jobs outside of FANG than there are inside of

23:51 FANG, even though there's, there's quite a bit inside of FANG. And oftentimes those roles can be

23:57 much more interesting and you can do a lot bigger of an impact. When, when I was working at the small

24:02 company, VaporSense, I like, I had so much power. I didn't even realize it. I had such a big effect on the

24:08 company. I was presenting to, you know, Fortune 500 companies and what I did really made a difference.

24:14 And when it came to the point where ExxonMobil offered me to go be a data scientist for Exxon,

24:20 I said, Oh, I want to go work for the big company with the nice desk and the nice laptop and, you know,

24:26 try something new. And when I got there, I really, I had some pretty cool opportunities when I was at

24:31 ExxonMobil, but ultimately I left pretty shortly after two years of being there because I just felt like a

24:36 cog in the machine and I didn't feel like I was actually making a difference. And that was really

24:40 important to my work satisfaction of like, is what I'm doing being used? Is it being used to better the

24:45 world? Do I feel valued? And the answer was kind of no for me when I was there. So there's definitely

24:50 a trade-off between the small companies and the big companies, but also to go back to your original

24:54 question, there's so many freaking roles in the data world that you're not even thinking of that. Like,

24:59 I'm not even thinking of, I saw a new one the other day when I was helping one of my students.

25:04 It was like, it wasn't data janitor, but it was something like that where I was like, I don't

25:08 even know what that role is, but there's, there's so many roles. When I was, when I was a data scientist at

25:12 VaporSense, the small company, my actual title was junior chemometrician, which basically means

25:19 you're doing data science with chemistry. When I was at ExxonMobil, when I was first there,

25:24 I was doing data science, but my actual title was optimization engineer. And so there's so many

25:29 titles that we don't even think to search of, or even to look up, but those are all data science

25:33 roles. I was doing machine learning every day in both those roles. And you would maybe never guess

25:37 from those titles. Yeah. You would never guess. No, that's awesome. What machine learning libraries,

25:42 frameworks were you using? At VaporSense, once again, because it's a smaller company,

25:46 I had a lot more say in what I was doing. We were building a bunch of machine, we were building

25:51 classification models to basically to take the data from our sensors and sniff if something was in the

25:56 air. Sometimes that was a yes, no, like, oh yes, there is ammonia in the semiconductor factory and

26:02 that's bad. So that's a yes classification kind of binary, right? Other times it was, what drug is this?

26:09 Is this meth or is this heroin? One of the use cases we had was, this is binary once again, but is this

26:15 recreational marijuana or medicinal marijuana? And can we tell the difference between, between those?

26:20 So we are usually using classification models, usually built in scikit-learn in Python, the majority of the

26:26 time there. When I was at Exxon, we had a lot less say, like the data scientists had a lot less say in

26:32 the decision making process. We were doing a lot of multivariate linear regression with a lot of crazy

26:39 hacks and transformations kind of in the meantime for one of my positions there. And then the other time,

26:44 the other position I did there, we were doing a lot of auto ML using PyCaret and letting it kind of

26:50 decide what type of models to do. So. Okay. The unsupervised learning type stuff, huh?

26:54 It was awesome. It was really fun to, to, I love PyCaret because it's like, okay, go make 25 models and

27:00 tell me which one's the best. It's like, takes, makes my job easy, I guess.

27:03 We're going to be creative with sheer numbers. That's how we're going to come up with a solution.

27:08 Got it. Exactly.
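The "go make a bunch of models and tell me which one's the best" idea can be sketched without any particular library. This is not PyCaret's API and not the models used at either company; it is a toy comparison in plain Python on synthetic two-feature data, and every name in it is hypothetical.

```python
import random

random.seed(0)


def make_sample(label):
    # Two made-up sensor readings, centered differently for each class.
    center = 1.5 if label else -1.5
    return [random.gauss(center, 1.0), random.gauss(center, 1.0)], label


data = [make_sample(i % 2) for i in range(200)]
train, test = data[:150], data[150:]


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fit_majority(rows):
    # Baseline: always predict the most common training label (ties give 0).
    top = int(sum(y for _, y in rows) * 2 > len(rows))
    return lambda x: top


def fit_nearest_centroid(rows):
    # Predict the class whose average training point is closest.
    centroids = {}
    for label in (0, 1):
        pts = [x for x, y in rows if y == label]
        centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]
    return lambda x: min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))


def make_knn(k):
    # k-nearest neighbors: majority vote among the k closest training points.
    def fit(rows):
        def predict(x):
            nearest = sorted(rows, key=lambda row: sq_dist(x, row[0]))[:k]
            return int(sum(y for _, y in nearest) * 2 > k)
        return predict
    return fit


candidates = {
    "majority baseline": fit_majority,
    "nearest centroid": fit_nearest_centroid,
    "3-NN": make_knn(3),
    "7-NN": make_knn(7),
}


def accuracy(fit):
    predict = fit(train)
    return sum(predict(x) == y for x, y in test) / len(test)


# Train every candidate on the same data and rank by held-out accuracy.
scores = {name: accuracy(fit) for name, fit in candidates.items()}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {score:.2f}")
print("best:", best)
```

The baseline is there on purpose: a ranking only means something if the winner clearly beats "always guess the same answer".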

27:09 Well, Diego is asking like, what are some of the common stats methods as in mathematical type stuff

27:16 you would use? So one of the things I know that some people getting into programming think is you've

27:22 got to be really good at math to be a programmer. I think you've got to be really good at logical

27:26 thinking, but you need almost zero math to be like a web developer. You know, we're talking percents

27:33 for CSS, incrementing numbers from one to two to two to three for IDs and stuff like that. But for

27:40 data science, maybe there's a little bit more like, where do you see that kind of background?

27:45 I like what you said, you have to think logically, but maybe the math isn't as important. And I think

27:49 it's actually somewhat similar in data science. I will say you probably need a little bit more math

27:54 than a web developer, but I think it's a lot less than most people think. And it's probably less

27:59 about being able to do the math and maybe more about understanding the mathematical concepts.

28:04 And what I mean by that is a lot of, a lot of, so I also have a master's degree in data analytics.

28:10 A lot of master's degrees in data science and data analytics will say you need calculus and linear

28:15 algebra as kind of a background for your math. And that kind of stops people. I don't want to do any

28:19 calculus. I don't want to do any linear algebra. And while both those concepts do exist in data

28:24 science principles, the majority of the time, the computer, Python is doing the math. You just have

28:30 to be able to interpret the results of the math and kind of know what different directions, like this is

28:36 going down, an optimization problem, you know, okay, that's the derivative, you know, getting closer to

28:40 zero. Like it's really less about knowing how to do the math by hand and more just understanding what the

28:46 math the computer is actually doing. So I think it's actually a lot easier than most people say.
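To make the "derivative getting closer to zero" remark concrete, here is a small sketch of gradient descent on a made-up one-variable cost function. The function, starting point, and learning rate are all assumptions chosen for illustration; the point is only that the computer does the arithmetic and you read the trend.

```python
def cost(x):
    # Hypothetical cost curve with its minimum at x = 3.
    return (x - 3.0) ** 2 + 2.0


def derivative(x):
    # Slope of the cost curve at x.
    return 2.0 * (x - 3.0)


x = 0.0               # arbitrary starting guess
learning_rate = 0.1   # how far to move against the slope on each step
history = []

for step in range(50):
    slope = derivative(x)
    history.append((step, x, slope))
    x -= learning_rate * slope  # step downhill

# The slope shrinks toward zero as x approaches the minimum.
for step, x_at_step, slope in history[::10]:
    print(f"step {step:2d}  x={x_at_step:.4f}  slope={slope:.4f}")
print(f"final x={x:.4f}  cost={cost(x):.4f}")
```

Reading the output is the skill being described: a slope heading toward zero means the optimizer is settling near a minimum, with no calculus done by hand.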

28:50 That being said, knowing how to do a derivative or taking integral, those concepts, I think is

28:55 probably underlying pretty important. But other than that, like a lot of the times I'm doing linear

29:00 regression because it's, it's awesome. It gets the job done. A lot of the time I'm doing hypothesis

29:06 testing and statistics, which you have to look like at a P score, nothing all that crazy. At Exxon,

29:12 I had to do a lot of linear programming, but that's honestly, that's like the exception versus the rule.

29:17 There's not a whole lot of linear programming for most data science, most data scientists. So

29:22 I really don't think the math is, is all that hard. Now, of course, that's coming from someone who

29:27 got a chemical engineering degree, who had to take all the calculus, all the linear algebra. So I did go

29:32 through those courses. I haven't really done it from scratch from like a lot of my students are teachers,

29:37 for example, who never took those courses in college. So I can't speak from that perspective,

29:41 but a lot of my students are able to figure it out at the end of the day and transfer. So it happens.

29:45 Yeah, yeah, for sure. I think there, you make a good point. I think it's about knowing,

29:50 okay, this formula or this algorithm or this test means this thing. It applies in this situation.

29:56 It doesn't apply in that situation. Here's what you're trying to get from it, right? Like,

30:00 I know I need to do a fast Fourier transform. So, and this is what it tells me when I get out the other

30:06 side. But do I need to be able to sit down and recreate the integral and the calculus behind it

30:14 and do that on like a home, like as a homework example, like, give me a function and I'll do the

30:18 Fourier transform and I'll actually do the symbolic integration. Like, no, you probably don't need that,

30:23 right? But you need to know, I do the Fourier transform in this situation and this is why. And then I just say,

30:30 call the function, do it, right? And interpret the results. Really, that's what being a data

30:35 scientist is all about is, yeah, what does the business use case, what's the desired business

30:40 use case? How do I relate that use case to the data? What technique can I use to get the outcome

30:45 that I need? Computer, go do it. Interpret results, present to stakeholders. That's a data scientist,

30:52 right? I think one of the challenges with that is going to be, not that it's not good, but I think

30:57 it's going to be challenging because how do you learn when to use a certain statistical test or when to do

31:04 some kind of funky transformation, like a Fourier transform without more traditional mathematical

31:10 backgrounds? And all the academics will not just go, oh, we're just going to give you like

31:14 five minute overview and they'll help you understand. They're like, nope, we're going to start with this

31:19 axiom or this theorem from differential equations. I'm going to work up. You're like, no, no, no, no,

31:24 no, I don't need that. I don't, I'm not on a four-year plan. I'm on a four-week plan. How do I,

31:30 how do I get value from a couple of the mathematical things without being sucked into like, yeah, now I'm in

31:36 differential equations at Harvard online and I don't understand how I got there.
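The "call the function and interpret the results" workflow described for the Fourier transform might look like the sketch below. In practice you would call an optimized routine such as `numpy.fft.fft`; the slow, naive transform here is written out only so the example runs on the standard library. The signal, sample rate, and frequencies are invented for illustration.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)), for illustration only."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


sample_rate = 64  # samples per second; one second of data in total

# Made-up signal: a strong 5 Hz wave plus a weaker 12 Hz wave.
signal = [
    math.sin(2 * math.pi * 5 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(sample_rate)
]

spectrum = dft(signal)

# With one second of data, bin k corresponds to k Hz. Only the first half
# of the bins is meaningful for a real-valued signal.
magnitudes = [abs(c) for c in spectrum[: sample_rate // 2]]
ranked = sorted(range(len(magnitudes)), key=lambda k: -magnitudes[k])
dominant_hz = ranked[0]

print("strongest frequencies (Hz):", ranked[:2])
```

Knowing when to reach for the transform (you suspect repeating patterns) and how to read the peaks (which frequencies dominate) is the part that carries over; the integral behind it stays inside the function.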

31:39 It's such a big problem and I'm so glad you brought this up and I'll be vulnerable because yeah, I felt

31:45 the same, the same way. And I was like, there has to be a better way. And so about, what was it? Three

31:49 years ago now, two and a half years ago, three years ago, I said, oh my gosh, I'm going to solve this

31:53 problem and I'm going to start my own data science bootcamp. And so I spent about six months making the

31:58 curriculum, making all the videos. I opened it up. I got some students in there and I ran it for about

32:03 six months and I looked at the results and man, we weren't getting anyone into data science jobs.

32:08 And I thought, ah, what the heck am I doing wrong? I had this brilliant idea of like, we're going to be

32:12 less theory, more project, more hands-on. And I realized, man, the truth is people just learn better

32:19 at work. That's where you learn that whole technique that you just like, how does someone learn that?

32:24 The answer is by getting experience and learning it at work. And when I looked back and I said, okay,

32:28 well, we have had students get jobs. What jobs did they get? And it turns out most of them were

32:33 getting like business intelligence engineer jobs or data analyst or financial

32:38 analyst jobs that were a little bit below a data scientist job. And I realized, oh man,

32:43 if we can just help people go from zero to one and get their foot in the door, they can go from one to

32:49 five much quicker at work because work is just, I don't know, it's this magical place, right? Like,

32:54 like you said, they, whatever you were working at earlier, and they're like, hey, can you do this

32:58 Kubernetes thing? They just kind of throw you in the fire and you're like, figure it out. And that's

33:03 somehow you do, I don't know what it is about work, but you figure it out and that's where you learn.

33:07 So that's kind of what I've, why I changed my curriculum to be more focused on, you know, okay,

33:11 maybe people aren't going to become data scientists, but can we get them to zero to one quickly?

33:15 And then they can get paid to learn the rest of the data science stuff when they're actually in that

33:19 first position. How much do you know about what you actually want to do in the industry

33:23 before you've done it as well? Right? Like, you're like, oh, I thought everybody said machine

33:28 learning was awesome. And I've used ChatGPT and I loved it, but it turns out actually like API is

33:34 better, but I've never had a chance to build an API. So until I started, I didn't even learn, one, that

33:39 it was a thing, or two, that it was cool, or vice versa, right? Whatever. But until you get kind of in,

33:45 you don't even know, like, actually this part is where I really am enjoying it. And so just getting that

33:50 first step, that's a big deal. A hundred percent. You don't know what you don't know until you know

33:55 it. That's why, I mean, really when it comes to, if we like, we go back to just SQL or just Python,

34:00 you could spend, I tell people this, if you tried to master Python before you applied to a job,

34:06 you'd be like 80 years old before you ever applied to a job. Same with SQL, same with machine learning.

34:12 The cool thing about data is we're never going to know it all. And so just learn the bare minimum to get

34:17 your foot in the door. And then you have this place where you're going to get paid to learn what

34:21 you want to learn. Eventually, if you learn, oh, I love APIs. I promise you that there's a company out

34:26 there that will hire you and you can learn APIs on the job. Like that's going to happen. But that first

34:31 step is... No, true. There's a company out there that doesn't know it needs APIs, but you could help them.

34:36 And you know, they don't have huge expectations because this is the thing they just learned they needed.

34:40 Right. A hundred percent. Yeah. It's wild, right?

34:42 This portion of Talk Python to Me is brought to you by Posit, the makers of Shiny, formerly RStudio,

34:50 and especially Shiny for Python. Let me ask you a question. Are you building awesome things? Of

34:56 course you are. You're a developer or data scientist. That's what we do. And you should check out Posit

35:00 Connect. Posit Connect is a way for you to publish, share, and deploy all the data products that you're

35:06 building using Python. People ask me the same question all the time. Michael, I have some cool

35:12 data science project or notebook that I built. How do I share it with my users, stakeholders, teammates?

35:18 Do I need to learn FastAPI or Flask or maybe Vue or ReactJS? Hold on now. Those are cool technologies,

35:25 and I'm sure you'd benefit from them, but maybe stay focused on the data project. Let Posit Connect handle

35:30 that side of things. With Posit Connect, you can rapidly and securely deploy the things you build in Python.

35:35 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, reports, dashboards, and APIs.

35:42 Posit Connect supports all of them. And Posit Connect comes with all the bells and whistles to satisfy

35:48 IT and other enterprise requirements. Make deployment the easiest step in your workflow with Posit Connect.

35:54 For a limited time, you can try Posit Connect for free for three months by going to talkpython.fm/posit.

36:01 That's talkpython.fm/posit. The link is in your podcast player show notes. Thank you to the team at

36:08 Posit for supporting Talk Python.

36:13 Let's talk about some career advice. I mean, I know you talked about being

36:18 connected on LinkedIn pretty well and certainly having some kind of social network is important. And they

36:23 maybe, it's not that you would call it not social, but a real world network of actual human beings that

36:29 you're, you know, physically know somehow. What's that? I don't know what that is.

36:32 I know. Like, we gave that up back in 2020, I thought.

36:35 Yeah. Anyway, like, there was some stat that I saw somewhere that,

36:38 you know, over half of the jobs are filled before it even becomes a job posting, right? Maybe

36:44 some of the best ones is like, hey, who knows somebody who can do this? We need some, like your

36:49 data science example, data scientist example. They quit like, oh, we need somebody. Does anybody know

36:54 good data science? I don't want to just go put it out on the open job market and have to have a hundred

36:58 interviews and who knows what I'm going to get. Like, if you can recommend somebody, let's start there,

37:03 right? So being in that group to be recommended, it's important.

37:07 It's the key. There was a really interesting survey done on LinkedIn and they said,

37:12 it was kind of, it was done by the same person, Jordan Nelson, by the way, he said, "How do you

37:16 approach getting a job?" And then the next day he said, "How did you get your last job?" And 80% of

37:22 people, they use what I call the spray and pray method, which basically means you go and you apply to as

37:27 many jobs as you possibly can and hope for the best. Cross your fingers. That was 80% of what

37:32 people were doing. And then on the next poll, the next day, it was a total, I think of what, 70% were

37:38 either headhunted, recruited or referred. And so it's like the Pareto principle here where, you know,

37:43 80% of the effort is only getting you 20% of the results. And really 20% of the effort gets 80% of the

37:50 results. So it's okay. We know networking and getting recruited is really important, but how do we do it?

37:55 It's easier said than done. And like you said- In the industry, how do I make friends who are,

37:59 right? It's like, well, my neighbors don't do it. So I guess I'm out.

38:02 That's the tricky thing is, is yeah, if you're not in the industry yet, how do you get recruited

38:06 into it or how do you know someone? And what I've come to learn is it actually doesn't even matter.

38:11 So like, for instance, let's take, let's take your neighbor, right? Your neighbor is probably not a data

38:16 scientist. Maybe you're lucky and they are, and they can refer you to a company. But what's really cool is

38:20 I've learned that companies really come to trust their employees and their employees'

38:24 recommendations. And so even if your neighbor, let's say is a web developer, or maybe even less

38:30 technical, let's just say your neighbor is in finance, right? And if there's an opening,

38:35 like a data science opening at that company, a lot of the times they will actually take their

38:39 employee referrals much more seriously than any sort of cold application that they get.

38:44 And so a lot of the times I've had students who just know someone that works at the company,

38:49 they saw a job opening pop up. They're quickly, they message their friends. Hey,

38:53 do you know a recruiter or a hiring manager? I could talk more about this role. Could you do

38:57 an internal referral for me? And they were able to land jobs that they probably wouldn't have.

39:01 No, they definitely wouldn't have without that internal referral. So it is tricky. It's the old

39:06 cliche. It's not, it's not what you know, it's who you know.

39:08 I think there's still plenty of ways, COVID notwithstanding. I think that these days,

39:12 there's plenty of ways to get those connections, right? But maybe people don't know, like meetup.com

39:18 is really good. If you live in a non-tiny city, there's many, many things going on that around data

39:25 science, around Python, around other data engineering, whatever, right? You could go to those things.

39:31 They're typically even free. Often they are free with food. They even feed you, right? And make connections,

39:37 or regional conferences or national conferences, right? Like we probably, many people have heard

39:42 of PyCon, right? There's US PyCon, there's EuroPython, and then there's, but that's, those are the ones

39:48 that are often talked about, but there's 10, 20 little smaller regional ones in the US and many more that

39:54 I'm not aware of throughout the world. Probably one of those within driving distance, right? That you could

39:59 go to make connections and just also kind of take the temperature of actually what, what you see on the

40:05 internet versus what you see and actually talking to real people. So I'd also say, just get out there.

40:10 A hundred percent. Those places have the people who probably want to hire you because they're local,

40:17 right? Which is one thing that's, that's trouble on LinkedIn. I'm, I'm big on networking on LinkedIn,

40:21 but a lot of the times you're going to be networking with people who in all likelihood might never have a

40:26 role that's even open to you. But the people that you're like, for instance, we have,

40:30 I'm in Utah and we have Silicon Slopes that has like a tech meetup. We have a local Python

40:36 meetup chapter. We have the big data and developers conference that that's free every year with tons

40:41 of food. And the people who go there are people from companies around there that have the openings

40:47 that you're trying to find. And they want to hire people like you who are in the area. So at least

40:51 you can maybe come to the office once a week or maybe once a month or whatever. Right. And so really,

40:55 like you said, going to those meetups, it's tough because networking is always difficult,

41:00 either online or in person, but at least in those situations, you know, Hey, these are people that

41:05 are tied to real companies that exist around me that do make data hires. So I have a chance.

41:10 Definitely a much higher chance than just shooting out a resume. All right. Well, let's see.

41:13 We talked about job hunting already. What about like applications and resumes? What are your thoughts on

41:20 that? I think once again, with the applications, the more targeted that you can make it, the better,

41:26 right? So if you can really hone in on, I really want this job, I'm going to cold message five people

41:32 at this company and see if I can get that internal referral one way or another, make a real connection

41:37 with them. I think that's really key. And then with resumes, resumes are more of an art than they are a

41:42 science. I feel like they are so difficult to figure out. And these ATSs that are trying to match you

41:49 and see if you're a good fit. I've tried a lot of them and a lot of them suck. Whoever's the data

41:53 scientist behind those, we need to have a conversation with them because it's, it's a little tricky

41:57 sometimes. But one of the coolest concepts I've been introduced to recently, and I have a whole

42:02 episode on my podcast about it is A/B testing your resume. And basically the idea is a resume's job

42:10 is just to get you a screener interview or like a beginner interview, basically. Right. That's all an

42:15 interview. Like no one's seeing a resume and then hiring you. They're always going to interview.

42:19 So if you think about it, a resume's job, the only job it has is to convince someone to get on the phone

42:25 and talk to you. And it's just a piece of paper. And guess what? You can put whatever you want on

42:29 that piece of paper. Now I'm not saying to lie, but I'm just saying you could theoretically make a

42:35 perfect resume for whatever job you're trying to go for and send it out there and see what happens.
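One way to sanity-check a resume A/B test is to ask how likely the difference in callbacks is to be luck. The sketch below uses a one-sided Fisher exact test built from `math.comb`; the callback counts are invented, and the function name is hypothetical rather than anything from the episode.

```python
from math import comb


def fisher_one_sided(hits_a, n_a, hits_b, n_b):
    """Chance that version B gets hits_b or more callbacks if A and B are equally good."""
    total_hits = hits_a + hits_b
    total = n_a + n_b
    denom = comb(total, total_hits)
    upper = min(n_b, total_hits)
    return sum(
        comb(n_b, k) * comb(n_a, total_hits - k)
        for k in range(hits_b, upper + 1)
    ) / denom


# Hypothetical small batch: old resume gets 1 callback from 10 applications,
# tweaked resume gets 3 from 10.
p_small = fisher_one_sided(1, 10, 3, 10)

# Same callback rates, but three times as many applications per version.
p_large = fisher_one_sided(3, 30, 9, 30)

print(f"10 applications each: p is about {p_small:.3f}")
print(f"30 applications each: p is about {p_large:.3f}")
```

With small batches, even a tripled callback rate can easily be noise, which is a reason to change one thing at a time and keep sending applications before drawing conclusions.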

42:39 Right. But I'm not saying to do that. I'm not saying to lie. My point in saying this is that the resume

42:43 is just to get you the interview. And if you're not getting interviews, something's probably wrong

42:48 with your resume. And so, you know, tweak something, apply to 10 more jobs, see what happens. Tweak

42:54 something, apply to 10 more jobs, see what happens. Until you finally have the right combination of skills,

42:59 of experiences, of different keywords. Because a lot of the time you're just trying to beat the ATS. And

43:04 that's the sad part about it is it's like, how do I prove to this random computer algorithm that they should

43:10 talk to me on the phone? That's a hard game to beat. And there's a whole bunch of advice

43:14 from all these different people. What I've come to learn is it's different for every company. It's

43:18 different for every person. It's kind of a numbers game till you get lucky and you figure it

43:22 out. That's good advice. I guess two thoughts. One is, I know this isn't speaking specifically to any

43:28 one person, but in general, women wait until they match all the requirements of a position where a guy's like,

43:34 I know three of those things. I'm taking a flyer. I'm sending it. I would just like to encourage the

43:41 women out there to just send it as well. I 100% agree with that. And I think if you reach 60% of

43:47 the requirements, I think you have a chance. Like it's a lot of the times those are wish lists and not

43:52 actual requirements. And depending on, are you local to the area? Do you have domain experience

43:58 in this company? Like there's lots of other factors. What about contributing to open source

44:02 or having GitHub repos that can be like projects that you can show off or what's your advice there?

44:08 I'm a huge proponent of projects in the portfolio. I think if you don't have experience with something,

44:13 you create your own by building a project. And if you can do that with open source,

44:18 I think you should totally do that because I've benefited so much from open source. I have not

44:23 given back as much as I should to open source development and projects. I definitely should do that.

44:29 But if you can find a project that you're passionate about that you can help with, I think you should

44:33 totally do that. Even if it's not open source and you're just building a project to showcase your skills,

44:38 I'm all about that. I think you can do projects that are super fun, maybe that are good for your

44:43 community or good for your life. I'm a huge fan of personal projects. I've put a Fitbit on

44:48 my dog before and looked at her steps. I've found the healthiest meal at McDonald's. I've looked at,

44:54 like visualized my weight over time and tried to create like different, like forecasting models and

44:59 stuff like that. There's so much data in our lives that you can use to make really cool projects.

45:03 Oh, absolutely. You talked about, okay, you get your first job and that's where you kind of really

45:08 learn. But if you don't have your first job, you can effectively simulate that. Say, I would have gone

45:14 on to a job and been given a project to analyze something. I'm just interested in this thing. I've got

45:18 two hours a day until I get a job that I can be inspired about this and just get going on it. Maybe

45:24 create a website and publish your results and it can draw more people in to actually see that, right?

45:29 And start to appreciate it. They could even ask like, all right, who's behind this cool project?

45:34 Maybe I want them to come work for me. Little did they know you're doing all this work because you got

45:39 some spare time and you're trying to build up your experience and a self-guided study, right?

45:43 Yeah. If you can build a cool project and flip the job hunt where you're not applying for jobs, but jobs

45:49 start to apply for you, you're in such a good position and doing really cool projects can help you get there.

45:54 Now it's hard to do cool projects. It's hard to publish projects, which is one of the things

45:59 that people really struggle with. For all you Python listeners out there, let me just tell you,

46:04 Streamlit is absolutely amazing because it makes the deployment process so easy. It's free. It's a

46:12 little tricky to deploy at first, but compared to what you used to have to do back in the day,

46:17 I'm saying back in the day, like four years ago, basically. But it was really hard to deploy

46:23 something where you could send someone a URL. Hey, check out my web application, machine learning

46:27 application. Streamlit is such a cool app that makes it so easy and so intuitive to make these

46:33 cool little apps that you could just put on your resume, put on your portfolio, send to recruiters. I'm

46:38 such a fan of the Streamlit app. I love it.

46:40 Yeah, it's super cool. There's a couple of those and Streamlit is definitely one of the really nice

46:44 ones there. There's also some hosting behind Streamlit as well these days, right? You don't even

46:51 have to set up a server or anything and just create it and put it up there.

46:54 That's what I'm saying. Back in the day, I used Dash a lot and I'm still a big fan of Dash. Dash

46:59 is more customizable than Streamlit and can do quite a bit more, but it's a lot more work to deploy it.

47:06 It's more like programming.

47:07 Yeah, it is more programming.

47:09 Programming the UI rather than just the behind the scenes.

47:12 Yeah.

47:12 You have to do both and you have to know a little bit about systems and data engineering and stuff

47:18 like that versus Streamlit kind of takes that, abstracts that away. But yeah, back in the day,

47:22 I used to make Dash web applications and deploy them on Heroku back when they had a free tier of

47:27 hosting and they've taken that away. So I don't even know what the go-to free hosting platform is

47:32 nowadays. I just, I moved most of my things to Streamlit and it's so nice.

47:35 Yeah. We got Shiny for Python now, which is also nice.

47:38 I haven't checked that out. How is it?

47:40 I haven't done too much with it either, but Joe and the team over there are doing pretty cool stuff,

47:45 like adding more dynamic interactive stuff to Jupyter, like running inside Jupyter and things. Yeah,

47:50 pretty cool.

47:51 I'll have to check it out.

47:52 I think they also do a bunch of hosting stuff over there as well, is why it came to mind.

47:56 What other advice you got for folks out there?

47:58 So, AI. Not studying AI or learning to use AI or machine learning, but is there a benefit of trying

48:05 to use ChatGPT to help you get this job or is there a danger? I'm thinking, for example, have ChatGPT

48:12 write me an awesome resume and then the tools are like, well, we've detected this is AI generated and

48:18 it's out. You know what I mean? What do you see happening there?

48:21 A lot of people see AI as like an all or nothing tool as in it's either you, the human doing the

48:29 work or it's the AI doing the work. But whenever, I don't know about you, but whenever I'm using

48:33 ChatGPT for anything, it's very rare it's copy and paste for me or at least not iterative where I'm

48:40 doing multiple prompts, prompt after prompt after prompt, trying to tweak it exactly what I want.

48:45 And so the way I look at ChatGPT and other gen AI that will be coming out, that's only inevitable,

48:50 is instead of looking at does this replace me? Does this, like for instance, am I going to build

48:55 my whole resume using ChatGPT? Can ChatGPT build, you know, take a data scientist's job and build the

49:02 whole model for them? I like to see it more as like a hammer. It's like a tool for the data scientist or

49:07 a tool for the job searcher to use in conjunction with your screwdriver or anything else. It's like

49:13 something to be wielded by a human, not a replacement for the human, if that makes sense.

49:18 You know, it's really good for stuff like, hey, I know a regular expression will do this.

49:22 Yeah.

49:23 The last time I studied, I completely forgot what this is about. And I know it's gnarly,

49:27 but if I just ask, here's an example, here's what I want. Boom. And traditionally what you would end up

49:32 doing is you'd be on Stack Overflow. Yeah.

49:35 You'd be all over the internet. You'd be trying to piece it together from external information anyway.

49:38 And so code is something that's a little bit more in the wheelhouse of the generative AI,

49:44 because it can't really make it up as much. I know it could like do something insecure and you didn't

49:50 know it was or whatever, but it's not like asking for legal advice where it makes up cases that didn't

49:55 exist. Like it gives you code. You put it in the runtime or the compiler and it runs or it doesn't.

50:00 And the output comes like you did.

50:01 Yeah. It works or not.
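The "it runs or it doesn't" point can be sketched in a few lines: take a regular expression, run it on a sample, and check the result. The pattern below is a hypothetical one for ISO-style dates, of the sort an assistant might suggest. The pattern, the `extract_dates` helper, and the sample string are all assumptions for illustration.

```python
import re

# Hypothetical pattern for dates written as YYYY-MM-DD.
# The \b word boundaries keep it from matching inside longer digit runs.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def extract_dates(text):
    """Return a (year, month, day) tuple for every ISO-style date in text."""
    return [tuple(int(part) for part in match.groups())
            for match in ISO_DATE.finditer(text)]


sample = "Recorded 2024-01-18, published 2024-04-04; build 12345-67-890 is not a date."
dates = extract_dates(sample)
print(dates)

# The quick check: either the suggested pattern does what you asked, or it doesn't.
assert dates == [(2024, 1, 18), (2024, 4, 4)]
```

Note that the pattern only checks the shape of the text, not whether the month and day are valid, which is the kind of gap a quick test like this helps surface.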

50:02 Yeah. So it's pretty, pretty effective for that. But yeah, for resumes, I would be more like,

50:08 let me ask it. What are the in demand things? And if I know these three skills, what other skills should

50:14 I know to get a, you could sort of use it in an explorative way to then come up with what you

50:20 might write for yourself, right? Something like this.

50:22 I find it really useful for brainstorming like action verbs on your resume bullets. Like I think

50:27 it's really good at that. What's 10 different ways to say lead. So I don't say lead five times on my

50:33 resume and I use some different action bullets. I think it's great at that. I personally, it's pretty

50:38 rare that I start any Python code from scratch nowadays. I'm either starting hopefully from a

50:44 template that I've already written, or I'm starting from a ChatGPT. Like this is what I kind of want

50:49 to accomplish, right? Like the outline for it. Like one of the things I hate doing is I make a lot of

50:55 Streamlit apps. I probably make a Streamlit app a month right now. And I hate starting from scratch

50:59 with Streamlit. It's super easy to start from scratch, but I'll say, Hey, ChatGPT, I want to

51:03 build a Streamlit app. This is like the component I want here. This is the component I want here.

51:06 This is the component I want here. And it's almost like a warmup for me as a programmer. And it will

51:12 create something that works. It's not what I want. And I spend the next five hours trying to make it

51:18 what I want, you know, without ChatGPT, but it kind of gives me a warm start to my programming process.

51:23 So I really like it. I think it's something that everyone should use. And I think if you're thinking

51:29 about getting into any sort of programming, you know, whether it's data science or web development,

51:34 I think you should be a little bit less worried about it taking your job and job security. I think

51:40 you should almost be more excited that, wow, the bar has never been lower to break into tech.

51:45 Like this is a step up gift from the programming gods that I get to use to break into tech.

51:51 Another thing to keep in mind is I imagine a lot of people listening to this podcast are not just

51:56 starting a college program, right? They're coming from possibly other experiences, other specialties.

52:03 You know, what's really good for job security, knowing the intersection of two things, the

52:07 intersection of chemistry and programming, the intersection of geology and programming for Exxon,

52:13 potentially, right? Like those things take you from a pool of a thousand to a pool of tens,

52:19 right? And so what's awesome about that is it means two things. You don't throw away,

52:24 if you got a degree in something else like biology or whatever, you don't throw away like,

52:28 well, that was wasted four years. That's out. And it slices the pool of people who could apply for

52:33 certain jobs way, way smaller, right? Sounds like you agree.

52:36 Oh, a thousand percent. I'll just tell a quick little anecdote. When I was at ExxonMobil,

52:41 there's a lot of things I did not like at ExxonMobil, but this is something I really liked. It's about

52:46 once a quarter, they would do a crowdsourced data science competition for the whole organization,

52:51 like around the entire world. And they would say, this is a business problem we're trying to solve,

52:56 you know, and at Exxon, we have data scientists all over the world and like all sorts of different

53:00 teams and things like that. So I like did not know all the data scientists at Exxon. And they'd say,

53:05 this is the problem we're facing. Here's the data go, right? And I loved participating in these. It was

53:10 like right up my, my wheelhouse of like, I really enjoy exploration and all this stuff.

53:15 At the time I was getting my master's degree, but I didn't have my master's degree.

53:18 And I was competing against, so I'm just a chemical engineering grad, right? And I'm competing against

53:24 people with PhDs in computer science and in data science and all these like people who have way

53:29 more experience than me. And I actually won a few of these competitions. Thank you. I appreciate it.

53:35 And it's not because I was a better programmer or a better data scientist. It's because I majored in

53:40 chemical engineering and I knew the business problem, the domain extremely well. And I kind

53:46 of knew the programming and the data science stuff, but the combination of them made me very valuable.

53:51 Like one of the best examples I have is we're looking at crude oil properties. And I remember

53:56 like there was a forum where you'd like ask your questions. And one of the, one of the data scientists

54:01 asked, Hey, is sulfur bad? There's lots of sulfur in this. Is it bad? And like to a chemical engineer,

54:06 that's like the most obvious thing. No, you, yes. Sulfur is very bad in crude oil. That's very,

54:11 no, no, that's like such a fundamental thing to me and to him or her. That was like groundbreaking.

54:17 And so, yeah, your domain can become your superpower in your career.

54:20 Yeah. And it makes it way harder for ChatGPT and other types of tools to just automate you out of a

54:26 job because you bring in all these skills together, which is awesome. But it also makes it easier for

54:31 you to get the job. It makes it easier for you to continue your momentum of whatever you've been up

54:35 to. It's just, it's good all around. Yeah. I think it's more fun too, because once again,

54:39 like when I was trying to decide if I should study computer science, I was like, man, I don't really want to

54:45 build an Excel workbook for building an Excel workbook's sake. That's still true for me today.

54:52 I don't want to do data science for data science's sake. I only like machine learning or data science

54:57 when I'm doing it to solve a really fun problem I'm passionate about. That's where it's more fun.

55:01 So if you can be excited about the domain and excited about the algorithms, I think that's a

55:05 great place to be. Absolutely agree. All right. We're getting short on time,

55:09 but maybe tell us a bit about your Data Career Jumpstart. You've referred to it a couple of times.

55:14 Yeah. I have a company called Data Career Jumpstart. I just try to do a lot of education. So the

55:18 education happens on LinkedIn, happens on YouTube. And I actually forgot to mention this at the

55:23 beginning, but I have my own podcast called the Data Career Podcast, where I help people land their

55:28 first data job. We're about at a hundred episodes. So not quite the groundwork that you've put in.

55:33 That's still a ton. That's awesome.

55:35 Yeah, we're getting there. And then, yeah, I also have a bootcamp where I try to affordably help

55:40 people land their first data analyst position by teaching them the skills, the networking and the

55:46 project and portfolio building that they need to do something.

55:50 Like the long version of this show.

55:51 Yeah. Basically. Well, yeah. Just take what we talked about today, expand on it,

55:56 make it like 350 unique lessons. And that's exactly what it is.

56:00 Yeah. Very cool. All right. Well, we're about out of time. So maybe just, Avery, final call to action,

56:06 people maybe are inspired. I see Dave go out in the audience. That's an awesome talk. Very much so.

56:11 What's next. It's easy to be inspired, but you got to take action.

56:15 Yeah. I love that. I think it's always fun to listen to podcasts, but you probably benefit way

56:21 more from the action you take after a podcast. So for you guys who are maybe interested in a data

56:26 analytics or a data science career, explore that. If you're like, yes, I'm in, make a plan, make a

56:32 roadmap. If you need help, I have a webinar that will help you make a roadmap. What skills should you

56:36 learn? How should you be networking and stuff like that? But really probably if you're just getting

56:40 started trying to figure out what skills you should learn, like what are the top skills that you should

56:44 be learning and then learning those skills and then not only learning those skills, but take action and

56:48 learning and build some sort of a project that we talked about that you could put on a portfolio,

56:52 make a Streamlit app or something like that. That's probably the best action you could possibly take.

56:57 If you need any ideas, advice, feel free to check out my website, datacareerjumpstart.com, or the

57:02 podcast, the Data Career Podcast. Hopefully there are lots of free resources for you guys to check out.

57:07 If you've never seen Streamlit before, I have some YouTube videos about Streamlit that you guys can check out, but I love it. Just take action somehow, do something.

57:13 That's one of the huge, huge differentiators is like, you might be inspired, but you just got

57:18 to start taking those steps and it becomes a snowball. So thanks for sharing all your experience and your

57:23 advice. Hopefully some people out there are taking action and yeah, I'll put everything we talked about

57:28 in the show notes, of course. So thanks for being here, Avery.

57:30 Yeah. Thank you. Thanks for having me. I appreciate it.

57:32 You bet. Bye all.

57:33 This has been another episode of Talk Python to Me.

57:37 Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the

57:41 show. Take some stress out of your life. Get notified immediately about errors and performance

57:47 issues in your web or mobile applications with Sentry. Just visit talkpython.fm/sentry and get

57:54 started for free and be sure to use the promo code talkpython, all one word. This episode is sponsored

58:00 by Posit Connect from the makers of Shiny. Publish, share and deploy all of your data projects that

58:05 you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, reports, dashboards and APIs.

58:14 Posit Connect supports all of them. Try Posit Connect for free by going to talkpython.fm/posit.

58:21 P-O-S-I-T.

58:22 Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python.

58:28 Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all,

58:34 there's not a subscription in sight. Check it out for yourself at training.talkpython.fm.

58:38 Be sure to subscribe to the show, open your favorite podcast app, and search for Python.

58:43 We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play,

58:50 and the direct RSS feed at /rss on talkpython.fm. We're live streaming most of our recordings these

58:57 days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe

59:01 to our YouTube channel at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much

59:08 for listening. I really appreciate it. Now get out there and write some Python code.

