Land Your First Data Job
Episode Deep Dive
Guest Background: Avery Smith
Avery Smith is a data scientist, instructor, and consultant who took a unique path into data careers. He started out studying chemical engineering, discovered a passion for programming through MATLAB, and later transitioned into data science. Avery has worked at both small startups and large enterprises (including ExxonMobil), implementing machine learning solutions and translating data insights into real business impact. Now, he runs Data Career Jumpstart, mentoring aspiring data professionals on how to land their first data job.
1. Navigating the “Zero-to-One” Challenge
Breaking into a data role can feel like the toughest leap. Avery’s journey from lab technician to junior data scientist proves non-traditional routes truly can work. Once you land that first job, opportunities for growth multiply quickly, making future moves far easier.
- Many encounter the catch-22 of needing experience to get a job, yet needing a job for experience.
- Smaller companies or internal transfers (like Avery’s lab-tech-to-data-scientist story) can help you break in with fewer prerequisites.
- Once you have that first position on your résumé, each subsequent career move becomes much smoother.
2. The Power of Networking
Networking exposes you to hidden job openings and referral paths you’d never see by just applying online. Often, roles are filled before they’re publicly posted, emphasizing the value of genuine connections. Whether you meet people at local conferences or within your own organization, building relationships can significantly boost your job-hunt success.
- Many opportunities are filled via referrals before ever hitting public job boards.
- Local meetups and conferences are prime avenues for making personal connections.
- Even non-technical coworkers can help you find referrals within their companies.
3. The Role of Python
Python’s expansive ecosystem, user-friendly syntax, and vast community make it the go-to language in data science. From scikit-learn and PyCaret for machine learning, to Streamlit for quick app deployments, Python offers a streamlined way to build and share data projects. For those starting out, it’s one of the fastest paths to tangible, demonstrable success.
- Tools like Data Nerd show how in-demand Python is for data jobs.
- Its libraries and frameworks, including scikit-learn, PyCaret, and Streamlit, cover a broad spectrum of applications.
- Python’s lower barrier to entry helps new data professionals learn quickly and showcase value fast.
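A rough sketch of the kind of tally a job-trends site performs: count the share of postings that mention each skill. The postings below are invented for illustration, and `skill_share` is a hypothetical helper, not code from Data Nerd.

```python
import re
from collections import Counter

# Toy postings invented for illustration; a real analysis would use scraped job ads.
postings = [
    "Data scientist: Python, SQL, scikit-learn",
    "Data analyst: SQL, Excel, Tableau",
    "Data scientist: Python, R, statistics",
    "Analytics engineer: SQL, Python, dbt",
]
skills = ["python", "sql", "r", "excel"]


def skill_share(postings, skills):
    """Return the fraction of postings that mention each skill at least once."""
    counts = Counter()
    for text in postings:
        # Whole-word tokens, so "r" does not match inside "engineer".
        tokens = set(re.findall(r"[a-z+#]+", text.lower()))
        for skill in skills:
            if skill in tokens:
                counts[skill] += 1
    return {skill: counts[skill] / len(postings) for skill in skills}


shares = skill_share(postings, skills)
for skill, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{skill:<7}{share:.0%}")
```

Matching on whole tokens rather than substrings matters for short skill names such as R, which would otherwise appear in almost every posting.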
4. SQL as “Table Stakes”
Although you can do a lot with Python, SQL remains essential for working with large, structured data. Companies typically house critical information in relational databases, so querying proficiency can make or break your day-to-day success. When data extends beyond basic CSVs, SQL skills often become indispensable.
- Mastering fundamental commands (SELECT, JOIN, WHERE, GROUP BY, etc.) already makes you valuable.
- Many companies store massive amounts of data in relational databases, making SQL knowledge essential.
- Once data grows beyond basic CSVs, SQL queries are often indispensable for efficiency and performance.
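The fundamental commands listed above fit into a single query. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customers` and `orders` tables and their rows are made up for illustration.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'West'), (2, 'East'), (3, 'West');
    INSERT INTO orders VALUES
        (1, 1, 120.0), (2, 2, 80.0), (3, 3, 50.0), (4, 1, 30.0), (5, 2, 5.0);
""")

# SELECT + JOIN + WHERE + GROUP BY: order count and revenue per region,
# ignoring orders under 10.
rows = conn.execute("""
    SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.amount >= 10
    GROUP BY c.region
    ORDER BY total DESC
""").fetchall()

for region, n_orders, total in rows:
    print(region, n_orders, total)
```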
5. The Power of Domain Knowledge
Technical prowess matters, but so does understanding your industry. Avery’s chemical engineering background helped him excel at ExxonMobil, because he grasped nuances—like sulfur levels in oil—that pure data scientists might overlook. Blending domain expertise with data skills can make you a vital asset in any organization.
- Avery excelled in internal data competitions at ExxonMobil because he knew how sulfur levels affected oil refining.
- Combining data expertise with fields like biology, finance, or engineering gives you a unique edge.
- Clear insight into the “why” behind data leads to more accurate models and solutions.
6. Creating a Portfolio with Accessible Tools
Portfolios let you prove your abilities instead of just claiming them. By building small, focused data projects or interactive dashboards, you offer tangible evidence of your skills. Tools like Streamlit or Plotly Dash make deployment more approachable, allowing you to share your work with potential employers.
- Tangible demos, such as interactive dashboards or web apps, let your skills speak for themselves.
- Streamlit and Plotly Dash make it simpler to convert notebooks into live apps.
- Deployment is far easier today—once-challenging steps can now be done quickly to showcase your work publicly.
7. AI as a Helper, Not a Replacement
Generative AI tools like ChatGPT can streamline coding tasks and brainstorming, but they don’t replace human judgment or creativity. Avery routinely uses AI-generated scaffolding for his Python or Streamlit projects, then refines the code manually. This lets him move faster without sacrificing the critical thinking that ensures robust solutions.
- Avery uses ChatGPT as a starting point for quick code scaffolding in Python or Streamlit projects.
- It’s an iterative process: AI suggestions are refined or corrected as you build.
- AI can also help brainstorm resume phrasing, highlight key skills, and speed up research.
8. A/B Testing Your Resume
A résumé’s main function is landing interviews. If it’s not working, iterate on it as you would a data experiment. Make small changes—like tweaking keywords or clarifying an accomplishment—then apply to a few more jobs to see if your response rate improves.
- Change one aspect, apply to 10 openings, track results, and iterate.
- Align keywords with role requirements to get past Applicant Tracking Systems (ATS).
- Showcase specific outcomes—how you improved processes, tackled real data sets, or deployed a working model.
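Tracking the experiment can be as simple as two counters per résumé version. The sketch below also adds a one-sided Fisher exact test, which is an addition not mentioned in the episode, to show how noisy batches of 10 applications are. The callback counts are hypothetical.

```python
from math import comb


def response_rate(callbacks, applications):
    """Share of applications that led to a callback."""
    return callbacks / applications


def fisher_one_sided(a_hits, a_n, b_hits, b_n):
    """Chance that version B gets at least b_hits callbacks if both versions
    were equally effective (upper tail of the hypergeometric distribution)."""
    total_hits = a_hits + b_hits
    total_n = a_n + b_n
    upper = min(total_hits, b_n)
    favourable = sum(
        comb(total_hits, k) * comb(total_n - total_hits, b_n - k)
        for k in range(b_hits, upper + 1)
    )
    return favourable / comb(total_n, b_n)


# Hypothetical results: version A got 1 callback from 10 applications,
# version B (with tweaked keywords) got 3 from 10.
rate_a = response_rate(1, 10)
rate_b = response_rate(3, 10)
p_value = fisher_one_sided(1, 10, 3, 10)
print(f"A: {rate_a:.0%}  B: {rate_b:.0%}  p = {p_value:.2f}")
```

With these numbers the rate triples, yet the p-value is about 0.29, so a gap that size could easily be luck. Treat each batch of 10 as a hint and keep iterating rather than declaring a winner.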
Notable Links and Resources
Below are the tools, libraries, and websites explicitly mentioned or referenced:
- Avery Smith on LinkedIn: www.linkedin.com
- Data Career Jumpstart: datacareerjumpstart.com
- Data Nerd (Job Trends Site): datanerd.tech
- Write C# LINQ queries: learn.microsoft.com
- Streamlit: streamlit.io
- Plotly Dash: dash.plotly.com
- scikit-learn (machine learning library)
- PyCaret (AutoML library)
- Shiny for Python (RStudio/Posit)
Overall Takeaway
Landing your first data job calls for a balanced blend of networking, skill-building, and showcasing real-world applications. Python and SQL remain central pillars, while domain expertise can elevate you far beyond what purely technical skills can achieve. Whether you’re spinning up a Streamlit demo or optimizing your résumé with an A/B testing mindset, tangible evidence of your capabilities is key. And remember—once you push through “zero to one,” every step afterward tends to get a little easier.
Links from the show
Data Career Jumpstart: www.datacareerjumpstart.com
Data Nerd Site: datanerd.tech
Write C# LINQ queries to query data: learn.microsoft.com
A faster way to build and share data apps: streamlit.io
Plotly Dash: dash.plotly.com
Michael's Keynote: State of Python in 2024: youtube.com
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm
--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy
Episode Transcript
00:00 Are you interested in data science, but you're not quite working in it yet?
00:03 In software, getting that very first job can truly be the hardest one to land.
00:08 On this episode, we have Avery Smith from Data Career Jumpstart here to share his advice
00:13 for getting your first data job. This is Talk Python to Me, episode 455, recorded January 18th,
00:20 2024.
00:21 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy.
00:40 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython,
00:45 both on fosstodon.org. Keep up with the show and listen to over seven years of past episodes
00:51 at talkpython.fm. We've started streaming most of our episodes live on YouTube. Subscribe to our
00:57 YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of
01:03 that episode. This episode is brought to you by Sentry. Don't let those errors go unnoticed.
01:09 Use Sentry like we do here at Talk Python. Sign up at talkpython.fm/sentry.
01:15 And it's brought to you by Posit Connect from the makers of Shiny. Publish, share, and deploy all
01:21 of your data projects that you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI,
01:27 Flask, Quarto, Reports, Dashboards, and APIs. Posit Connect supports all of them. Try Posit Connect
01:34 for free by going to talkpython.fm/Posit, P-O-S-I-T.
01:39 Hey folks, before we jump in and talk about data science jobs and careers, I want to tell you
01:44 really quickly about some awesome news. Back in February, I gave the keynote at PyCon Philippines.
01:50 It was entitled The State of Python in 2024. Well, that is now out on YouTube. The team at
01:58 PyCon Philippines did a great job. The video came out great. If you want to check out The
02:02 State of Python in 2024, according to me, just click on the link in the show notes to watch
02:07 it over on YouTube. Now let's talk to Avery. Avery, welcome to Talk Python to Me.
02:11 Thanks so much. I'm so excited to be here and be part of the show.
02:15 I'm excited to have you here as well. You know, one of the things that people reach out to me often
02:19 is how do you get into data science? How do you get into programming? How do you get into Python?
02:26 You know, I've been trying, or maybe they got a degree or they took some training program,
02:32 bootcamp or something. And going from zero to one, I think is the biggest career step you have to make.
02:40 That next job and the one after that, it only gets to be smaller steps, not bigger steps.
02:45 And it's really tough because that first big step, you're brand new at it. You have no experience,
02:49 right? It's your first data science job or your first programming job. And so hopefully we can
02:54 give some folks out there a little bit of a hand up to help them make that jump.
02:59 Yeah, totally. I like to show this graphic that says, it's a circle and it's a circle of text. And it says,
03:05 I can't get a job because I don't have experience because, and then it restarts,
03:11 I can't get a job. And that's the tricky part. It's like, how do you get a data science job when
03:15 you have no data science experience? Because to get data science experience, that seems like you have
03:20 to have a job as the prerequisite and vice versa. So it is very tricky. So happy to chime in on that
03:25 today.
03:26 The industry can take it too far. They can take it way too far. So a few years ago, there was a
03:32 really funny tweet that went around back when they call them tweets. I don't know what they're called
03:36 anymore. Sebastian Ramirez, the guy who created FastAPI, saw a job posting. When FastAPI was like a
03:44 year and a half old, it said, you must have four years of experience with FastAPI to apply. He said,
03:50 hey, look, I'm the creator of FastAPI and I'm unqualified for this job. What kind of world are
03:55 we living in?
03:56 Yeah. I don't want to live in that world, but that's unfortunately where we're at. That's so tough.
04:01 And it's hilarious. These job descriptions are getting out of hand. That's for sure.
04:06 Yeah. Well, with AI, it's probably not going to get better.
04:08 We could talk about that more later. But before we get into that, let's just jump in with a little
04:13 bit of background on you before we get to the topic. Tell us a bit about yourself. What do you do?
04:18 How'd you get into Python? Things like that.
04:20 Yeah, absolutely. So I'm currently a data science consultant and also a data science instructor.
04:27 I run some online programs where I teach people to become data analysts mostly is what I'm focused on.
04:34 But I also have this practice where I help companies solve data problems with different techniques.
04:39 I started actually by studying chemical engineering in college in my undergraduate degree.
04:45 And about a semester in, I realized, crap, I hate this. This is not for me.
04:50 But I was in a little bit of a tough spot. Yeah. Do you agree? Have you felt something similar?
04:54 I did a semester of chemical engineering as well. I thought, I love chemistry. I love math. Put them
04:59 together. Somehow they don't go together. It's like ice cream and eggs or something. No,
05:04 they don't go together for me at least.
05:06 Yeah. It wasn't good for me either. I was just like, oh man, I'm actually not interested
05:10 in refineries or like manufacturing. But I, like you, liked chemistry. I liked math. I thought this
05:16 is perfect. But I quickly realized, oh man, I really liked this whole programming part that I get to do
05:21 in MATLAB at the time when I was an undergrad. And I was on a time crunch to get through college kind of
05:27 quickly through eight semesters. And the other issue I had was I didn't know what to do instead. It was
05:33 like, I don't really want to study computer science. Part of the reason why is they kind of
05:37 had this weed out course at the beginning, which you had to build Excel from scratch, basically like
05:43 some sort of a spreadsheeting tool. And I was like, why would I rebuild something that already exists
05:48 that I don't even like using in the first place? I wasn't really into it. So I didn't, I didn't know
05:52 what to do. And luckily I was working as a lab technician at this company, the really cool company
05:58 that makes the sensors that basically have the ability to smell. So they can sniff what's in
06:04 the air and it has applications for finding drugs or bombs and airports and stuff like that.
06:09 And there was a data scientist on staff and that data scientist was awesome. He was like showing me
06:14 all these cool algorithms he was writing for these sensors. And then one day he got up and left and he
06:20 left the company and we tried to hire another data scientist for like six months, but they were really
06:26 expensive. We were a small company and none of them really wanted to move to Utah where I lived in
06:30 Salt Lake City. And so we couldn't, we couldn't really find someone that would be able to do it.
06:33 And so finally I was like, well, I really liked this programming stuff. And I, you know, the data
06:38 scientist showed me a thing or two, maybe I could take a stab at this. And I started, I wrote like my
06:42 first machine learning algorithm and I was like, oh my gosh, I'm addicted to this. And then I never
06:46 looked back and have been in data science since, basically.
06:49 What a great story. Yeah. I think, I think a lot of people fall into programming that way. And for some reason,
06:55 not unexpectedly, but for some reason, a lot of people fall into Python that way as well. They're
07:00 like, you know, I have a job and I got this thing I got to do. I just need a little bit more than maybe
07:05 like an Excel spreadsheet or something and put it together. And you're like, actually, this is cool.
07:10 After a while, like, this is cooler than what I've been doing, or maybe I'll make it a good part of what
07:14 I do. Right.
07:15 Yeah. A hundred percent. Even just making, it was in MATLAB, which is basically engineers version
07:20 of Python or college version of Python 10 years ago. Right. And I made like tic-tac-toe and I remember
07:26 playing tic-tac-toe against the computer. I think that's what it was. Or maybe it was,
07:29 maybe it was Hangman. I can't remember. But I remember like the idea of like being able to play,
07:33 to program games and play against the computer. And I built it. I was like, this is the coolest thing
07:38 ever. I got to, I got to do more of this.
07:40 Absolutely. You know, I think I've done some MATLAB too, when I was younger and it's not that
07:44 different from Python, but it's, I think one of the big differences other than it just being like
07:49 embedded in a big expensive app is it's not a general purpose programming language, right? You
07:54 wouldn't go, you know, that was fun, but let me go build this website in MATLAB or let me create
08:00 Airbnb in MATLAB or, you know, like there's, you just don't want to. It sort of has this like
08:06 self-prescribed limit to what you can do with it.
08:08 That's one of the coolest parts about Python is it's really a Swiss army knife and you can pretty
08:13 much do, I don't want to say anything, but pretty close to anything in Python, which makes it really
08:19 neat. And obviously one of the huge limitations of MATLAB is one, it costs thousands of dollars,
08:23 but two, you're right. It's not going to do cybersecurity for you. It's not going to build
08:27 websites, but the syntax at the end of the day was, was really quick. It was, it was easy for me to
08:32 transition from MATLAB to Python because the syntax isn't all that different.
08:35 No, it's not all that different. More math focused, but pretty similar. So I think maybe that's a good
08:41 place to start discussing and exploring the topic of your first data science job. And
08:46 wouldn't necessarily plan on starting here, but let's, let's start with before you even necessarily know
08:50 programming language, right? Maybe you've dabbled in MATLAB or you've dabbled in Excel or even dabbled in,
08:57 I don't know, JavaScript or something. This thing we've been talking about with MATLAB and it applies to other areas as well,
09:02 like true programming languages per se, like Julia or something like that, is how,
09:08 if you invest your time into learning one of these things really well, like how broadly industry-wide
09:16 of a skill, high demand skill is that going to be, right? If you learn MATLAB, you put yourself in a box,
09:21 you learn a more general programming language, you kind of have more options afterwards, right?
09:25 Yeah, totally. I think like the more broad of a language you learn, the more useful you are to,
09:31 to more industries in general. But I might take that even a step further and just say, you know,
09:37 learning MATLAB, not a whole lot of companies use MATLAB, but just like landing your first data job,
09:43 going from zero to one is the hardest, learning your first language, zero to one is the hardest as well.
09:48 And then once you have that first language, the next language becomes so much easier. So
09:52 one of the first things I learned was MATLAB. And then I moved to Python and that was easier. And then
09:57 I learned SQL and then I learned R and then I learned JavaScript. And every time I added like a new tool
10:02 to my toolkit, it was quite, not almost, it was easy, but it got easier with each one. I think that's true
10:07 with foreign languages as well. Once you learn one foreign language, then the third and the fourth become
10:12 quite easy. At least that's, that's what I heard. I speak kind of two and a half languages,
10:17 but like I, there's people who speak like seven and they always say like the sixth and the seventh
10:21 become easier.
10:22 Yeah. You wonder how could you possibly, because learning the first one is so hard,
10:25 first foreign language. So you're like, well, how could you possibly take that on for this many
10:29 languages? And it's that it's not the same challenge each time, right?
10:32 Yeah, exactly.
10:32 Yeah. So I think when people are considering getting into data science, they really want to consider
10:38 what language they choose and where they go. Like you're coming out of a
10:42 college program. You might feel like MATLAB or something like that's real popular. And yet
10:47 that's because it's popular amongst professors who forced their students to do it. That doesn't
10:51 necessarily mean that's the world, the broad worldview. What do you think about R? You know, both.
10:57 I like R. I'm not, I sometimes troll R on LinkedIn. So I guess that's another thing I should say is I post
11:03 a lot on LinkedIn, kind of a LinkedIn guy. And so a lot of the times, honestly, just for jokes and kicks
11:08 and giggles, I'll kind of roast R on LinkedIn just to get the trolls angry in the comments.
11:13 I've invented it. It's quite fun. It's quite a fun experience, but I'm not that big of a hater. I
11:18 think what's really interesting about R versus Python, which is obviously a big debate in the data
11:23 science community, is that R kind of does one thing really well. And it's getting a little less
11:29 of that as like more packages and libraries are added to R, but R does the statistics and machine
11:34 learning very well. But obviously I don't think once again, I don't know any websites, any like
11:39 super functioning websites that are built on R. I don't know any cybersecurity that's really done
11:44 via R. So I think R does what it does well. The syntax sometimes is a lot easier for
11:49 people to go from Excel, which a lot of people are more familiar with in the finance or banking world,
11:54 for example. The syntax in R is a little bit more similar to those Excel formulas than it is to Python.
12:00 So I think sometimes people have a little bit more success just because, oh, this kind of feels like
12:04 R formulas, or sorry, this feels like Excel formulas. And so people really get there. I think what
12:10 you're kind of alluding to is if you're going to learn one skill, you might as well learn the one
12:14 skill that's applicable to the most, the widest net, right? And so that way you're fishing in the
12:20 biggest lake you possibly could versus in a smaller pond of R. I think that's worth looking at. And one
12:27 of the things I actually really enjoy doing, because you know, you mentioned, oh, you might think MATLAB
12:31 is popular because that's what the professors taught you. And there's actually not a whole lot of
12:35 data out there about, well, what should you learn? So I don't know if you know who Luke Barousse is. He's a data
12:41 analyst YouTuber. I was going to say YouTuber on YouTube, but that's kind of redundant on YouTube. And
12:45 one of the things he's done is he's actually built this tool where he's web scraping thousands of jobs,
12:50 different data jobs every week, and then displaying and analyzing the skills required for those jobs.
12:56 So it's actually like a data driven way of saying, if you want to be a data scientist, what skills should you
13:01 actually be focusing on as you go, as opposed to just listening to what a professor will say,
13:07 or what a LinkedIn influencer will say, or what your bootcamp will say. Like actually getting some
13:13 data on, I think is pretty neat. That is super cool. And I'm not familiar with Luke. So we're going to dig
13:19 him up and put him in the show notes for later so people can check that out. For sure. Do you remember
13:23 any of the trends you've recently talked about? It's datanerd.tech, I think is the website there.
13:27 I look at it mostly for data analysts because that's who I work with the most. So I know the
13:33 data analyst data very well. SQL is number one at 50%. I think Python is number two at like 30%.
13:40 I think Python might've jumped it. Well, this is for all data positions right here. So the job title,
13:47 you can choose. So which one do you think I should pick here? Data?
13:51 Maybe data scientist. Data scientist. Yeah. Right.
13:53 What's that? Yeah, you're right. Wow.
13:56 Whoa, Python 69%. Look at that. That's huge. So like, that's even, that's even what? 20% more than SQL,
14:04 which a lot of people are like, if you were going to be a data scientist, you have to know SQL.
14:07 Yeah. If you look at the job descriptions, Python's mentioned a lot more. So if you're going to learn,
14:11 if you're brand new and you're going to learn one, you might as well start with Python. Because that's
14:15 probably the most in demand skill that there is right now for a data scientist.
14:18 Yeah. And it's pretty easy, right? It's not like, well, why don't you just learn C++ for
14:22 embedded devices? You're like, you know what? Maybe I'll pick something else to start with. Right. But
14:26 you know, Python's pretty easy. I agree with you. I think Python's great. I actually think,
14:31 I think SQL is probably easier to learn if I'm being honest, because really, especially for like
14:35 data science stuff, there's only about like 20 commands that you need to know in SQL. But it's,
14:40 once again, SQL's a lot more, there's no websites built on SQL. I'll tell you that much. So
14:44 it's a lot more limited on what it can do.
14:46 It's a skill, but not the language. It's not enough on its own, generally. I mean,
14:52 you can do reports and quite a bit with it. But you know, it's like, when you see these programming
14:57 popularity, like what's the most popular language? Oh, look, CSS is the third most popular. That's not
15:02 a language. That's a thing that you use with other languages, right? Like use it with all the other
15:06 languages. That's why it's high up. But that doesn't mean it's high in demand. Exactly. It's just like
15:11 table stakes, you know? Yeah.
15:12 So you kind of got to distinguish table stakes from like picking an area, I think.
15:17 That's totally true. And really, I think Pythonistas could make the argument that there's
15:22 really nothing in SQL that you couldn't do in Python. That's somewhat true,
15:28 depending on data size and stuff like that. But regardless, there are ways that you can do
15:32 most of the SQL commands in Python one way or another. Yeah. Yeah.
15:36 It could be when I first became a data scientist, I didn't even know SQL and I was doing SQL commands
15:41 or I was doing the aggregations or the where functions or the window functions using Python.
15:47 So you definitely can. As long as your data is not like super big, then you'll totally be fine.
15:51 Right. Like some kind of generator or even slices or yeah, things like that, right? List comprehensions,
15:57 set comprehensions, all that kind of stuff. Kind of like, gosh, I really wish, a little bit of a
16:02 sidebar, but I wish like list comprehensions and all those things had just a few more SQL features,
16:08 right? Like in a list comprehension, I say, give me this thing, maybe give me this property of this class
16:14 modified, like give me the user's name, uppercase. Right. So that's like select. And then for thing
16:20 in collection, that's like from table or whatever. Right. And then you have the where clause with the
16:27 if statement, but boy, wouldn't it be cool to have like a sort also in there and other things like that,
16:33 you know? Oh, well, totally. It's so close. The cool thing is, is if you want that sort,
16:38 it's what one extra line. Like it's, it's not, it's not too bad. So it, Python, I mean,
16:43 I don't want to say this necessarily to hate all, to make all the data scientists and SQL lovers
16:47 mad, but, but really Python can do a lot of the things that SQL can, that's for sure.
16:52 Yeah, that's for sure.
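The mapping Michael describes, and the "one extra line" for sorting, can be sketched as follows. The `User` class and its data are hypothetical stand-ins for "the user's name, uppercase".

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


# Hypothetical data standing in for a table of users.
users = [User("ada", 36), User("linus", 17), User("grace", 45)]

# Roughly: SELECT upper(name) FROM users WHERE age >= 18
names = [u.name.upper() for u in users if u.age >= 18]

# Adding ORDER BY age DESC: wrap the collection in sorted() before projecting.
by_age = [
    u.name.upper()
    for u in sorted(users, key=lambda u: u.age, reverse=True)
    if u.age >= 18
]

print(names)   # ['ADA', 'GRACE']
print(by_age)  # ['GRACE', 'ADA']
```

The expression before `for` plays the role of SELECT, the iterable is the FROM clause, and the trailing `if` is the WHERE clause. Sorting lives outside the comprehension syntax, in the call to `sorted()`.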
16:53 This portion of Talk Python to Me is brought to you by Sentry. Code breaks. It's a fact of life. With Sentry,
17:00 you can fix it faster. As I've told you all before, we use Sentry on many of our apps and APIs here at
17:06 Talk Python. I recently used Sentry to help me track down one of the weirdest bugs I've run into in a long
17:12 time. Here's what happened. When signing up for our mailing list, it would crash under a non-common execution
17:19 path, like situations where someone was already subscribed or entered an invalid email address or
17:25 something like this. The bizarre part was that our logging of that unusual condition itself was crashing.
17:32 How is it possible for a log to crash? It's basically a glorified print statement. Well, Sentry to the
17:39 rescue. I'm looking at the crash report right now, and I see way more information than you'd expect to
17:44 find in any log statement. And because it's production, debuggers are out of the question.
17:48 I see the traceback, of course, but also the browser version, client OS, server OS, server OS version,
17:56 whether it's production or QA, the email and name of the person signing up. That's the person who
18:01 actually experienced the crash. Dictionaries of data on the call stack and so much more. What was the
18:06 problem? I initialized the logger with the string info for the level rather than the enumeration dot info,
18:14 which was an integer based enum. So the logging statement would crash saying that I could not use
18:20 less than or equal to between strings and ints. Crazy town. But with Sentry, I captured it, fixed it,
18:27 and I even helped the user who experienced that crash. Don't fly blind. Fix code faster with Sentry.
18:33 Create your Sentry account now at talkpython.fm/sentry. And if you sign up with the code
18:39 TALKPYTHON, all capital, no spaces, it's good for two free months of Sentry's business plan,
18:46 which will give you up to 20 times as many monthly events as well as other features.
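The class of bug described in the sponsor read can be reproduced in a few lines. The transcript does not name the logging library involved, so `Level` and `TinyLogger` below are hypothetical stand-ins, not that library's API. The point is only that a string level is accepted at setup and fails later, when it is first compared with an integer-based enum.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical integer-based levels.
    DEBUG = 10
    INFO = 20
    ERROR = 40


class TinyLogger:
    def __init__(self, level):
        # No validation here, so a wrong type goes unnoticed at startup.
        self.level = level
        self.lines = []

    def log(self, level, message):
        # The comparison only runs when a message is actually logged.
        if self.level <= level:
            self.lines.append(f"{level.name}: {message}")


good = TinyLogger(Level.INFO)
good.log(Level.ERROR, "already subscribed")

bad = TinyLogger("info")  # string instead of Level.INFO
try:
    bad.log(Level.ERROR, "already subscribed")
    error = None
except TypeError as exc:
    error = str(exc)

print(good.lines)
print(error)
```

Because the failing comparison sits inside `log`, the crash shows up only on code paths that actually log, which matches the "uncommon execution path" in the story.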
18:51 Probably the biggest, it's a bit of a diversion, but the biggest similarity to that I've seen in
18:57 the languages is C#'s LINQ where they actually have almost all the query operators, including
19:04 joins and stuff like that built into the programming language. I'd love to see more of that kind of
19:09 inspiration into Python, but you know, that's all right. It's still really good. It's got a lot of
19:13 cool SQL-like features, but you're right. Once you are no longer working with data in memory,
19:17 or you want indexes, right? Like this concept of indexes is not sufficiently well understood. I think
19:24 every time I hit a website that takes five seconds to load, I'm like, somebody is not doing all the
19:28 things they should be doing. I just know it. That's totally true. What about SQL? You know,
19:34 let's talk about that for a bit, right? The SQL, the query language or databases and other things,
19:40 there's ways to SQL query, not just relational databases. But you said you got away with not
19:47 quite learning that, but do you think if you could start over, maybe making an effort to learn that
19:52 would be really valuable? Like how, how important is this a thing in the beginning of your career?
19:56 The interesting thing, you know, about landing a data job is your skills only play, I'd say,
20:03 a third of the role. Your portfolio or the way that you portray your skills and your network,
20:09 I think are the other two thirds and they're actually more important than your skills. And that's
20:12 kind of how I got away with not knowing SQL and not even being, to be honest, that good at
20:17 Python at the time was because I used my network to be in the situation to get my lab technician job
20:24 in the first place. And then once again, I use that same network, in this case, my coworkers,
20:28 to land that first data scientist position after we couldn't hire anyone. And if I would have been
20:33 applying externally for that role, chances are I wouldn't have gotten that role. I probably didn't
20:39 know enough at the time to land that type of a role, but because they knew I was hardworking,
20:43 they knew I wasn't like a total idiot and I really liked to learn. They took that chance on me. It
20:49 paid off really well for them because at the time I was still in college. And so I wasn't getting paid
20:53 that much. And I was getting, I was not getting paid like a data scientist, but I was getting results like
20:58 a data scientist for them. So I think it was, it paid off for both of us. But I think if that was an
21:03 external job and I applied for it, I probably didn't have enough skills for it. So I definitely think
21:07 learning SQL, if you want to land data science job, isn't a bad place to start, especially because,
21:12 like I said, there, I mean, any programming language, I like to think of like the iceberg,
21:17 kind of like the Titanic, right? There's the parts that you see, and then there's the parts that you,
21:21 that you don't even know that you, that are there. And, and really you could spend the rest of your life
21:26 trying to master SQL or the rest of your life trying to learn Python. But the cool thing is,
21:31 is a lot of the time you only need that top little bit that's sitting at the,
21:34 the top of the surface of the water to actually get stuff done. And so for SQL, I think that's like
21:40 20 commands. And I think you could learn it honestly in like a month, you could learn those,
21:45 those 20 commands pretty easily, but it worked out for me. And I, I didn't have to use it that much
21:49 at the time until I was probably about almost three years into my job. And I actually had switched jobs
21:54 to a bigger company. The other thing that I was working for a smaller company where we didn't have a ton
21:58 of data. So we could use CSVs kind of as our, our database, which is not great practice. But when I,
22:05 when I eventually became a data scientist at ExxonMobil, I was going to say they didn't use Excel as
22:09 a database, but they still did. But the point is they had much larger SQL databases with hundreds of
22:14 thousands, actually millions of rows of data that I had to query.
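As a rough illustration of the "top of the iceberg" idea, here is a minimal sketch using Python's built-in sqlite3 module. The conversation does not list which roughly 20 commands are meant, so the selection here (CREATE TABLE, INSERT, SELECT, JOIN, WHERE, GROUP BY, ORDER BY, LIMIT) is an assumption, and the `sensors` and `readings` tables are hypothetical.

```python
import sqlite3

# Throwaway in-memory database; table names and values are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sensors (id INTEGER PRIMARY KEY, site TEXT);
    CREATE TABLE readings (sensor_id INTEGER, ppm REAL);
    INSERT INTO sensors VALUES (1, 'fab_a'), (2, 'fab_b');
    INSERT INTO readings VALUES (1, 0.5), (1, 1.5), (2, 4.0), (2, 6.0);
""")

# One query that exercises most of the everyday keywords:
# SELECT, JOIN, WHERE, GROUP BY, ORDER BY, and LIMIT.
rows = conn.execute("""
    SELECT s.site, COUNT(*) AS n, AVG(r.ppm) AS avg_ppm
    FROM readings AS r
    JOIN sensors AS s ON s.id = r.sensor_id
    WHERE r.ppm > 0
    GROUP BY s.site
    ORDER BY avg_ppm DESC
    LIMIT 10
""").fetchall()

for site, n, avg_ppm in rows:
    print(site, n, avg_ppm)

conn.close()
```

The same statements carry over to larger database servers with minor dialect differences, which is part of why a small core of SQL goes a long way.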
22:17 Yeah. Then you gotta be really, you need to understand it at a much deeper level. You're
22:22 like, if you do a query like this, it's going to be super slow. But if you do it like that,
22:26 it can use the composite index for the sort and then blah, blah, blah, blah. All right. Then
22:30 you're getting to the bottom of the, the iceberg in SQL, or maybe not the bottom,
22:33 maybe like the middle chunk under the water, but there's so much to learn for both of them.
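The composite index remark can be made concrete with a small sketch. It assumes SQLite (via the standard sqlite3 module) rather than whatever database was used at ExxonMobil, and the `orders` table, its columns, and the index name are hypothetical. It asks SQLite for its query plan before and after adding an index on `(customer_id, created_at)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, created_at TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, created_at, total) VALUES (?, ?, ?)",
    [(i % 50, f"2024-01-{(i % 28) + 1:02d}", float(i)) for i in range(1000)],
)

# Filter on one column, sort on another: the classic composite-index case.
QUERY = "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY created_at"


def plan(connection, sql, params):
    """Return the human-readable steps SQLite reports for a query."""
    cursor = connection.execute("EXPLAIN QUERY PLAN " + sql, params)
    return [row[-1] for row in cursor]


plan_before = plan(conn, QUERY, (7,))

# Composite index: equality column first, sort column second.
conn.execute(
    "CREATE INDEX idx_orders_customer_created "
    "ON orders (customer_id, created_at)"
)
plan_after = plan(conn, QUERY, (7,))

print("before:", plan_before)
print("after: ", plan_after)
```

Without the index, the plan scans the whole table and sorts in a temporary B-tree. With it, SQLite can search the index and read rows already in `created_at` order. Exact plan wording varies between SQLite versions.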
22:37 Amir in the audience asks, you know, like when you talk data job, like what kind of jobs are out
22:42 there? Right. So we both talked about how we did chemical engineering and then we saw like
22:46 chemical factories, like, yeah, I don't really want to work here anymore. I'm out.
22:49 So thinking about like, well, what are the kinds of jobs you do? I think that's really important because
22:55 it's easy to get focused in on like the FANG companies. Like I want to work for like some super
23:02 big tech company. I want to move to San Francisco and like that, that, that, right. Like there's not just
23:07 plenty of other jobs, but the opportunities, just like you described, and as well as
23:12 like my first job, I worked at a company that had like eight people and it was awesome. Right.
23:17 They didn't expect me to be, you know, running Kubernetes clusters and doing all sorts of great.
23:22 They're just like, I need you to make this thing happen. Can you do like, I'm pretty new,
23:26 but that thing I can make that happen. Like, let's go. Right. And I feel like the, the possibilities
23:30 to get in, especially with these maybe more niche type of industries and companies might even be easier
23:37 for a first job. People seem to be really obsessed with, with the FANG. And I don't know if that's
23:41 like a societal thing, or if it's just, those are the companies that we use a lot. And so we're
23:46 excited about them, but yeah, there's so many more data jobs outside of FANG than there are inside of
23:51 FANG, even though there's, there's quite a bit inside of FANG. And oftentimes those roles can be
23:57 much more interesting and you can do a lot bigger of an impact. When, when I was working at the small
24:02 company, VaporSense, I like, I had so much power. I didn't even realize it. I had such a big effect on the
24:08 company. I was presenting to, you know, Fortune 500 companies and what I did really made a difference.
24:14 And when it came to the point where ExxonMobil offered me to go be a data scientist for Exxon,
24:20 I said, Oh, I want to go work for the big company with the nice desk and the nice laptop and, you know,
24:26 try something new. And when I got there, I really, I had some pretty cool opportunities when I was at
24:31 ExxonMobil, but ultimately I left pretty shortly after two years of being there because I just felt like a
24:36 cog in the machine and I didn't feel like I was actually making a difference. And that was really
24:40 important to my work satisfaction of like, is what I'm doing being used? Is it being used to better the
24:45 world? Do I feel valued? And the answer was kind of no for me when I was there. So there's definitely
24:50 a trade-off between the small companies and the big companies, but also to go back to your original
24:54 question, there's so many freaking roles in the data world that you're not even thinking of that. Like,
24:59 I'm not even thinking of, I saw a new one the other day when I was helping one of my students.
25:04 It was like, it wasn't data janitor, but it was something like that where I was like, I don't
25:08 even know what that role is, but there's, there's so many roles. When I was, when I was a data scientist at
25:12 VaporSense, the small company, my actual title was junior chemometrician, which basically means
25:19 you're doing data science with chemistry. When I was at ExxonMobil, when I was first there,
25:24 I was doing data science, but my actual title was optimization engineer. And so there's so many
25:29 titles that we don't even think to search of, or even to look up, but those are all data science
25:33 roles. I was doing machine learning every day in both those roles. And you would maybe never guess
25:37 from those titles. Yeah. You would never guess. No, that's awesome. What machine learning libraries,
25:42 frameworks were you using? At VaporSense, once again, because it's a smaller company,
25:46 I had a lot more say in what I was doing. We were building a bunch of machine, we were building
25:51 classification models to basically to take the data from our sensors and sniff if something was in the
25:56 air. Sometimes that was a yes, no, like, oh yes, there is ammonia in the semiconductor factory and
26:02 that's bad. So that's a yes classification kind of binary, right? Other times it was, what drug is this?
26:09 Is this meth or is this heroin? One of the use cases we had was, this is binary once again, but is this
26:15 recreational marijuana or medicinal marijuana? And can we tell the difference between, between those?
26:20 So we are usually using classification models, usually built in scikit-learn in Python, the majority of the
26:26 time there. When I was at Exxon, we had a lot less say, like the data scientists had a lot less say in
26:32 the decision making process. We were doing a lot of multivariate linear regression with a lot of crazy
26:39 hacks and transformations kind of in the meantime for one of my positions there. And then the other time,
26:44 the other position I did there, we were doing a lot of AutoML using PyCaret and letting it kind of
26:50 decide what type of models to do. So. Okay. The unsupervised learning type stuff, huh?
26:54 It was awesome. It was really fun to, to, I love PyCaret because it's like, okay, go make 25 models and
27:00 tell me which one's the best. It's like, takes, makes my job easy, I guess.
27:03 We're going to be creative with sheer numbers. That's how we're going to come up with a solution.
27:08 Got it. Exactly.
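To give a feel for the "make a bunch of models and tell me which one is best" workflow, here is a hedged sketch that uses plain scikit-learn rather than PyCaret. The data is synthetic (a stand-in for sensor readings, where label 1 means "compound present"), and the three candidate models are an arbitrary choice for illustration, not the ones used at VaporSense or ExxonMobil.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data: 8 "sensor channels" per sample.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=0
)

# A small, arbitrary set of candidate models to compare.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k_nearest_neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier()
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score every candidate with 5-fold cross-validated accuracy.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
print("best:", best_name)
```

AutoML tools automate this loop over many more model types and settings. The sketch only shows the underlying idea of fitting several candidates and ranking them on held-out data.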
27:09 Well, Diego is asking like, what are some of the common stats methods as in mathematical type stuff
27:16 you would use? So one of the things I know that some people getting into programming think is you've
27:22 got to be really good at math to be a programmer. I think you've got to be really good at logical
27:26 thinking, but you need almost zero math to be like a web developer. You know, we're talking percents
27:33 for CSS, incrementing numbers from one to two to two to three for IDs and stuff like that. But for
27:40 data science, maybe there's a little bit more like, where do you see that kind of background?
27:45 I like what you said, you have to think logically, but maybe the math isn't as important. And I think
27:49 it's actually somewhat similar in data science. I will say you probably need a little bit more math
27:54 than a web developer, but I think it's a lot less than most people think. And it's probably less
27:59 about being able to do the math and maybe more about understanding the mathematical concepts.
28:04 And what I mean by that is a lot of, a lot of, so I also have a master's degree in data analytics.
28:10 A lot of master's degrees in data science and data analytics will say you need calculus and linear
28:15 algebra as kind of a background for your math. And that kind of stops people. I don't want to do any
28:19 calculus. I don't want to do any linear algebra. And while both those concepts do exist in data
28:24 science principles, the majority of the time, the computer, Python is doing the math. You just have
28:30 to be able to interpret the results of the math and kind of know what different directions, like this is
28:36 going down, an optimization problem, you know, okay, that's the derivative, you know, getting closer to
28:40 zero. Like it's really less about knowing how to do the math by hand and more just understanding what the
28:46 math the computer is actually doing. So I think it's actually a lot easier than most people say.
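The point about the derivative "getting closer to zero" in an optimization problem can be seen in a few lines of plain Python. This is a minimal gradient descent sketch on a made-up one-parameter loss; the function, starting point, and learning rate are all assumptions chosen for illustration.

```python
def loss(w):
    """Hypothetical one-parameter loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def derivative(w):
    """Analytic derivative of the loss above."""
    return 2.0 * (w - 3.0)


w = 0.0               # arbitrary starting guess
learning_rate = 0.1   # step size, chosen by hand
history = []

for step in range(50):
    grad = derivative(w)
    history.append((step, w, grad))
    w -= learning_rate * grad  # move against the slope

# Show how the derivative shrinks toward zero as w approaches the minimum.
for step, w_at_step, grad in history[::10]:
    print(f"step {step:2d}  w = {w_at_step:.4f}  derivative = {grad:.4f}")
print(f"final w = {w:.4f}")
```

Reading the printed derivative column is the kind of interpretation described here: the number heading toward zero signals that the optimizer is settling near a minimum, with no hand calculus required.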
28:50 That being said, knowing how to do a derivative or taking an integral, those concepts, I think is
28:55 probably underlying pretty important. But other than that, like a lot of the times I'm doing linear
29:00 regression because it's, it's awesome. It gets the job done. A lot of the time I'm doing hypothesis
29:06 testing and statistics, which you have to look like at a P score, nothing all that crazy. At Exxon,
29:12 I had to do a lot of linear programming, but that's honestly, that's like the exception versus the rule.
29:17 There's not a whole lot of linear programming for most data science, most data scientists. So
29:22 I really don't think the math is, is all that hard. Now, of course, that's coming from someone who
29:27 got a chemical engineering degree, who had to take all the calculus, all the linear algebra. So I did go
29:32 through those courses. I haven't really done it from scratch from like a lot of my students are teachers,
29:37 for example, who never took those courses in college. So I can't speak from that perspective,
29:41 but a lot of my students are able to figure it out at the end of the day and transfer. So it happens.
29:45 Yeah, yeah, for sure. I think there, you make a good point. I think it's about knowing,
29:50 okay, this formula or this algorithm or this test means this thing. It applies in this situation.
29:56 It doesn't apply in that situation. Here's what you're trying to get from it, right? Like,
30:00 I know I need to do a fast Fourier transform. So, and this is what it tells me when I get out the other
30:06 side. But do I need to be able to sit down and recreate the integral and the calculus behind it
30:14 and do that on like a home, like as a homework example, like, give me a function and I'll do the
30:18 Fourier transform and I'll actually do the symbolic integration. Like, no, you probably don't need that,
30:23 right? But you need to know, I do the Fourier transform in this situation and this is why. And then I just say,
30:30 call the function, do it, right? And interpret the results. Really, that's what being a data
30:35 scientist is all about is, yeah, what does the business use case, what's the desired business
30:40 use case? How do I relate that use case to the data? What technique can I use to get the outcome
30:45 that I need? Computer, go do it. Interpret results, present to stakeholders. That's a data scientist,
30:52 right? I think one of the challenges with that is going to be, not that it's not good, but I think
30:57 it's going to be challenging because how do you learn when to use a certain statistical test or when to do
31:04 some kind of funky transformation, like a Fourier transform without more traditional mathematical
31:10 backgrounds? And all the academics will not just go, oh, we're just going to give you like
31:14 five minute overview and they'll help you understand. They're like, nope, we're going to start with this
31:19 axiom or this theorem from differential equations. I'm going to work up. You're like, no, no, no, no,
31:24 no, I don't need that. I don't, I'm not on a four-year plan. I'm on a four-week plan. How do I,
31:30 how do I get value from a couple of the mathematical things without being sucked into like, yeah, now I'm in
31:36 differential equations at Harvard online and I don't understand how I got there.
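The Fourier transform example from a moment ago ("call the function, do it, and interpret the results") can be sketched without any symbolic integration. To stay dependency-free, this uses a naive discrete Fourier transform written with the standard library as a stand-in for a real FFT routine. The signal, sample rate, and function name are assumptions for illustration.

```python
import cmath
import math

SAMPLE_RATE = 100  # samples per second (assumed)
N = 100            # one second of data

# Hypothetical signal: a strong 5 Hz tone plus a weaker 12 Hz tone.
signal = [
    1.0 * math.sin(2 * math.pi * 5 * n / SAMPLE_RATE)
    + 0.4 * math.sin(2 * math.pi * 12 * n / SAMPLE_RATE)
    for n in range(N)
]


def dft_magnitudes(samples):
    """Naive O(N^2) discrete Fourier transform.

    Returns the magnitude of each frequency bin from 0 up to N/2.
    An FFT computes the same values, only much faster.
    """
    n_samples = len(samples)
    return [
        abs(sum(
            x * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, x in enumerate(samples)
        ))
        for k in range(n_samples // 2 + 1)
    ]


magnitudes = dft_magnitudes(signal)
bin_width_hz = SAMPLE_RATE / N  # each bin covers this many hertz

# Interpretation: the largest bins are the dominant frequencies.
ranked = sorted(range(len(magnitudes)), key=magnitudes.__getitem__, reverse=True)
top_two_hz = [k * bin_width_hz for k in ranked[:2]]
print("dominant frequencies (Hz):", top_two_hz)
```

Knowing when this tool applies (a signal with repeating components) and how to read the output (which bins are large) is the skill being described, not deriving the transform by hand.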
31:39 It's such a big problem and I'm so glad you brought this up and I'll be vulnerable because yeah, I felt
31:45 the same, the same way. And I was like, there has to be a better way. And so about, what was it? Three
31:49 years ago now, two and a half years ago, three years ago, I said, oh my gosh, I'm going to solve this
31:53 problem and I'm going to start my own data science bootcamp. And so I spent about six months making the
31:58 curriculum, making all the videos. I opened it up. I got some students in there and I ran it for about
32:03 six months and I looked at the results and man, we weren't getting anyone into data science jobs.
32:08 And I thought, ah, what the heck am I doing wrong? I had this brilliant idea of like, we're going to be
32:12 less theory, more project, more hands-on. And I realized, man, the truth is people just learn better
32:19 at work. That's where you learn that whole technique that you just like, how does someone learn that?
32:24 The answer is by getting experience and learning it at work. And when I looked back and I said, okay,
32:28 well, we have had students get jobs. What jobs did they get? And it turns out most of them were
32:33 getting like business intelligence engineer jobs or data analysts or financial
32:38 analyst jobs that were a little bit below a data scientist job. And I realized, oh man,
32:43 if we can just help people go from zero to one and get their foot in the door, they can go from one to
32:49 five much quicker at work because work is just, I don't know, it's this magical place, right? Like,
32:54 like you said, they, whatever you were working at earlier, and they're like, hey, can you do this
32:58 Kubernetes thing? They just kind of throw you in the fire and you're like, figure it out. And that's
33:03 somehow you do, I don't know what it is about work, but you figure it out and that's where you learn.
33:07 So that's kind of what I've, why I changed my curriculum to be more focused on, you know, okay,
33:11 maybe people aren't going to become data scientists, but can we get them to zero to one quickly?
33:15 And then they can get paid to learn the rest of the data science stuff when they're actually in that
33:19 first position. How much do you know about what you actually want to do in the industry
33:23 before you've done it as well? Right? Like, you're like, oh, I thought everybody said machine
33:28 learning was awesome. And I've used ChatGPT and I loved it, but it turns out actually I like APIs
33:34 better, but I've never had a chance to build an API. So until I started, I didn't even learn that, one,
33:39 it was a thing, or two, that it was cool, or vice versa, right? Whatever. But until you get kind of in,
33:45 you don't even know, like, actually this part is where I really am enjoying it. And so just getting that
33:50 first step, that's a big deal. A hundred percent. You don't know what you don't know until you know
33:55 it. That's why, I mean, really when it comes to, if we like, we go back to just SQL or just Python,
34:00 you could spend, I tell people this, if you tried to master Python before you applied to a job,
34:06 you'd be like 80 years old before you ever applied to a job. Same with SQL, same with machine learning.
34:12 The cool thing about data is we're never going to know it all. And so just learn the bare minimum to get
34:17 your foot in the door. And then you have this place where you're going to get paid to learn what
34:21 you want to learn. Eventually, if you learn, oh, I love APIs. I promise you that there's a company out
34:26 there that will hire you and you can learn APIs on the job. Like that's going to happen. But that first
34:31 step is... No, true. There's a company out there that doesn't know it needs APIs, but you could help them.
34:36 And you know, they don't have huge expectations because this is the thing they just learned they needed.
34:40 Right. A hundred percent. Yeah. It's wild, right?
34:42 This portion of Talk Python to Me is brought to you by Posit, the makers of Shiny, formerly RStudio,
34:50 and especially Shiny for Python. Let me ask you a question. Are you building awesome things? Of
34:56 course you are. You're a developer or data scientist. That's what we do. And you should check out Posit
35:00 Connect. Posit Connect is a way for you to publish, share, and deploy all the data products that you're
35:06 building using Python. People ask me the same question all the time. Michael, I have some cool
35:12 data science project or notebook that I built. How do I share it with my users, stakeholders, teammates?
35:18 Do I need to learn FastAPI or Flask or maybe Vue or ReactJS? Hold on now. Those are cool technologies,
35:25 and I'm sure you'd benefit from them, but maybe stay focused on the data project. Let Posit Connect handle
35:30 that side of things. With Posit Connect, you can rapidly and securely deploy the things you build in Python.
35:35 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, reports, dashboards, and APIs.
35:42 Posit Connect supports all of them. And Posit Connect comes with all the bells and whistles to satisfy
35:48 IT and other enterprise requirements. Make deployment the easiest step in your workflow with Posit Connect.
35:54 For a limited time, you can try Posit Connect for free for three months by going to talkpython.fm/posit.
36:01 That's talkpython.fm/posit. The link is in your podcast player show notes. Thank you to the team at
36:08 Posit for supporting Talk Python.
36:13 Let's talk about some career advice. I mean, I know you talked about being
36:18 connected on LinkedIn pretty well and certainly having some kind of social network is important. And they
36:23 maybe, it's not that you would call it not social, but a real world network of actual human beings that
36:29 you're, you know, physically know somehow. What's that? I don't know what that is.
36:32 I know. Like, we gave that up back in 2020, I thought.
36:35 Yeah. Anyway, like, there was some stat that I saw somewhere that,
36:38 you know, over half of the jobs are filled before it even becomes a job posting, right? Maybe
36:44 some of the best ones is like, hey, who knows somebody who can do this? We need some, like your
36:49 data science example, data scientist example. They quit like, oh, we need somebody. Does anybody know
36:54 good data science? I don't want to just go put it out on the open job market and have to have a hundred
36:58 interviews and who knows what I'm going to get. Like, if you can recommend somebody, let's start there,
37:03 right? So being in that group to be recommended, it's important.
37:07 It's the key. There was a really interesting survey done on LinkedIn and they said,
37:12 it was kind of, it was done by the same person, Jordan Nelson, by the way, he said, "How do you
37:16 approach getting a job?" And then the next day he said, "How did you get your last job?" And 80% of
37:22 people, they use what I call the spray and pray method, which basically means you go and you apply to as
37:27 many jobs as you possibly can and hope for the best. Cross your fingers. That was 80% of what
37:32 people were doing. And then on the next poll, the next day, it was a total, I think of what, 70% were
37:38 either headhunted, recruited or referred. And so it's like the Pareto principle here where, you know,
37:43 80% of the effort is only getting you 20% of the results. And really 20% of the effort gets 80% of the
37:50 results. So it's okay. We know networking and getting recruited is really important, but how do we do it?
37:55 It's easier said than done. And like you said- In the industry, how do I make friends who are,
37:59 right? It's like, well, my neighbors don't do it. So I guess I'm out.
38:02 That's the tricky thing is, is yeah, if you're not in the industry yet, how do you get recruited
38:06 into it or how do you know someone? And what I've come to learn is it actually doesn't even matter.
38:11 So like, for instance, let's take, let's take your neighbor, right? Your neighbor is probably not a data
38:16 scientist. Maybe you're lucky and they are, and they can refer you to a company. But what's really cool is
38:20 I've learned that companies really come to trust their employees and their employees'
38:24 recommendations. And so even if your neighbor, let's say is a web developer, or maybe even less
38:30 technical, let's just say your neighbor is in finance, right? And if there's an opening,
38:35 like a data science opening at that company, a lot of the times they will actually take their
38:39 employee referrals much more seriously than any sort of cold application that they get.
38:44 And so a lot of the times I've had students who just know someone that works at the company,
38:49 they saw a job opening pop up. They're quickly, they message their friends. Hey,
38:53 do you know a recruiter or a hiring manager I could talk to more about this role? Could you do
38:57 an internal referral for me? And they were able to land jobs that they probably wouldn't have.
39:01 No, they definitely wouldn't have without that internal referral. So it is tricky. It's the old
39:06 cliche. It's not, it's not what you know, it's who you know.
39:08 I think there's still plenty of ways, COVID notwithstanding. I think that these days,
39:12 there's plenty of ways to get those connections, right? But maybe people don't know, like meetup.com
39:18 is really good. If you live in a non-tiny city, there's many, many things going on that around data
39:25 science, around Python, around other data engineering, whatever, right? You could go to those things.
39:31 They're typically even free. Often they are free with food. They even feed you, right? And make connections,
39:37 or regional conferences or national conferences, right? Like we probably, many people have heard
39:42 of PyCon, right? There's US PyCon, there's EuroPython, and then there's, but that's, those are the ones
39:48 that are often talked about, but there's 10, 20 little smaller regional ones in the US and many more that
39:54 I'm not aware of throughout the world. Probably one of those within driving distance, right? That you could
39:59 go to make connections and just also kind of take the temperature of actually what, what you see on the
40:05 internet versus what you see and actually talking to real people. So I'd also say, just get out there.
40:10 A hundred percent. Those places have the people who probably want to hire you because they're local,
40:17 right? Which is one thing that's, that's trouble on LinkedIn. I'm, I'm big on networking on LinkedIn,
40:21 but a lot of the times you're going to be networking with people who in all likelihood might never have a
40:26 role that's even open to you. But the people that you're like, for instance, we have,
40:30 I'm in Utah and we have Silicon Slopes that has like a tech meetup. We have a local Python
40:36 meetup chapter. We have the big data and developers conference that that's free every year with tons
40:41 of food. And the people who go there are people from companies around there that have the openings
40:47 that you're trying to find. And they want to hire people like you who are in the area. So at least
40:51 you can maybe come to the office once a week or maybe once a month or whatever. Right. And so really,
40:55 like you said, going to those meetups, it's tough because networking is always difficult,
41:00 either online or in person, but at least in those situations, you know, Hey, these are people that
41:05 are tied to real companies that exist around me that do make data hires. So I have a chance.
41:10 Definitely a much higher chance than just shooting out a resume. All right. Well, let's see.
41:13 We talked about job hunting already. What about like applications and resumes? What are your thoughts on
41:20 that? I think once again, with the applications, the more targeted that you can make it, the better,
41:26 right? So if you can really hone in on, I really want this job, I'm going to cold message five people
41:32 at this company and see if I can get that internal referral one way or another, make a real connection
41:37 with them. I think that's really key. And then with resumes, resumes are more of an art than they are a
41:42 science. I feel like they are so difficult to figure out. And these ATSs that are trying to match you
41:49 and see if you're a good fit. I've tried a lot of them and a lot of them suck. Whoever's the data
41:53 scientist behind those, we need to have a conversation with them because it's, it's a little tricky
41:57 sometimes. But one of the coolest concepts I've been introduced to recently, and I have a whole
42:02 episode on my podcast about it is A/B testing your resume. And basically the idea is a resume's job
42:10 is just to get you a screener interview or like a beginner interview, basically. Right. That's all an
42:15 interview. Like no one's seeing a resume and then hiring you. They're always going to interview.
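One hedged way to think about A/B testing a resume is as a two-variant experiment where the outcome is "got a screener interview or not". The sketch below is not from the episode: the counts are invented, and Fisher's exact test is just one reasonable choice for comparing two small batches. It uses only the standard library.

```python
from math import comb


def fisher_exact_two_sided(a_success, a_total, b_success, b_total):
    """Two-sided Fisher exact p-value for two success/failure batches."""
    total_success = a_success + b_success
    n = a_total + b_total

    def prob(k):
        # Probability that variant A gets exactly k of the successes
        # if both variants were equally good (hypergeometric).
        return (
            comb(a_total, k) * comb(b_total, total_success - k)
            / comb(n, total_success)
        )

    lowest = max(0, total_success - b_total)
    highest = min(a_total, total_success)
    observed = prob(a_success)
    # Sum every outcome that is no more likely than the one observed.
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Invented small batches: resume A got 1 interview from 10 applications,
# resume B got 3 interviews from 10 applications.
p_small = fisher_exact_two_sided(1, 10, 3, 10)

# The same rates with ten times as many applications.
p_large = fisher_exact_two_sided(10, 100, 30, 100)

print(f"10 applications each:  p = {p_small:.3f}")
print(f"100 applications each: p = {p_large:.5f}")
```

With these made-up numbers, the small batches cannot distinguish the two versions from luck, while the larger ones can. Treat small-batch results as a hint about which direction to keep tweaking, not as proof.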
42:19 So if you think about it, a resume's job, the only job it has is to convince someone to get on the phone
42:25 and talk to you. And it's just a piece of paper. And guess what? You can put whatever you want on
42:29 that piece of paper. Now I'm not saying to lie, but I'm just saying you could theoretically make a
42:35 perfect resume for whatever job you're trying to go for and send it out there and see what happens.
42:39 Right. But I'm not saying to do that. I'm not saying to lie. My point in saying this is that the resume
42:43 is just to get you the interview. And if you're not getting interviews, something's probably wrong
42:48 with your resume. And so, you know, tweak something, apply to 10 more jobs, see what happens. Tweak
42:54 something, apply to 10 more jobs, see what happens. Until you finally have the right combination of skills,
42:59 of experiences, of different keywords. Because a lot of the time you're just trying to beat the ATS. And
43:04 that's the sad part about it is it's like, how do I prove to this random computer algorithm that they should
43:10 talk to me on the phone? That's a hard game to beat. And there's a whole bunch of advice
43:14 from all these different people. What I've come to learn is it's different for every company. It's
43:18 different for every person. It's kind of a numbers game till you get lucky and you figure it
43:22 out. That's good advice. I guess two thoughts. One is, I know this isn't speaking specifically to any
43:28 one person, but in general, women wait until they match all the requirements of a position where a guy's like,
43:34 I know three of those things. I'm taking a flyer. I'm sending it. I would just like to encourage the
43:41 women out there to just send it as well. I 100% agree with that. And I think if you reach 60% of
43:47 the requirements, I think you have a chance. Like it's a lot of the times those are wish lists and not
43:52 actual requirements. And depending on, are you local to the area? Do you have a domain experience
43:58 in this company? Like there's lots of other factors. What about contributing to open source
44:02 or having GitHub repos that can be like projects that you can show off or what's your advice there?
44:08 I'm a huge proponent of projects in the portfolio. I think if you don't have experience with something,
44:13 you create your own by building a project. And if you can do that with open source,
44:18 I think you should totally do that because I've benefited so much from open source. I have not
44:23 given back as much as I should to open source development and projects. I definitely should do that.
44:29 But if you can find a project that you're passionate about that you can help with, I think you should
44:33 totally do that. Even if it's not open source and you're just building a project to showcase your skills,
44:38 I'm all about that. I think you can do projects that are super fun, maybe that are good for your
44:43 community or good for your life. I'm a huge fan of personal projects. I've put a Fitbit on
44:48 my dog before and looked at her steps. I've found the healthiest meal at McDonald's. I've looked at,
44:54 like visualized my weight over time and tried to create like different, like forecasting models and
44:59 stuff like that. There's so much data in our lives that you can use to make really cool projects.
45:03 Oh, absolutely. You talked about, okay, you get your first job and that's where you kind of really
45:08 learn. But if you don't have your first job, you can effectively simulate that. Say, I would have gone
45:14 on to a job and been given a project to analyze something. I'm just interested in this thing. I've got
45:18 two hours a day until I get a job that I can be inspired about this and just get going on it. Maybe
45:24 create a website and publish your results and it can draw more people in to actually see that, right?
45:29 And start to appreciate it. They could even ask like, all right, who's behind this cool project?
45:34 Maybe I want them to come work for me. Little did they know you're doing all this work because you got
45:39 some spare time and you're trying to build up your experience and a self-guided study, right?
45:43 Yeah. If you can build a cool project and flip the job hunt where you're not applying for jobs, but jobs
45:49 start to apply for you, you're in such a good position and doing really cool projects can help you get there.
45:54 Now it's hard to do cool projects. It's hard to publish projects, which is one of the things
45:59 that people really struggle with. For all you Python listeners out there, let me just tell you,
46:04 Streamlit is absolutely amazing because it makes the deployment process so easy. It's free. It's a
46:12 little tricky to deploy at first, but compared to what you used to have to do it back in the day,
46:17 I'm saying back in the day, like four years ago, basically. But it was really hard to deploy
46:23 something where you could send someone a URL. Hey, check out my web application, machine learning
46:27 application. Streamlit is such a cool app that makes it so easy and so intuitive to make these
46:33 cool little apps that you could just put on your resume, put on your portfolio, send to recruiters. I'm
46:38 such a fan of the Streamlit app. I love it.
46:40 Yeah, it's super cool. There's a couple of those and Streamlit is definitely one of the really nice
46:44 ones there. There's also some hosting behind Streamlit as well these days, right? You don't even
46:51 have to set up a server or anything and just create it and put it up there.
46:54 That's what I'm saying. Back in the day, I used Dash a lot and I'm still a big fan of Dash. Dash
46:59 is more customizable than Streamlit and can do quite a bit more, but it's a lot more work to deploy it.
47:06 It's more like programming.
47:07 Yeah, it is more programming.
47:09 Programming the UI rather than just the behind the scenes.
47:12 Yeah.
47:12 You have to do both and you have to know a little bit about systems and data engineering and stuff
47:18 like that versus Streamlit kind of takes that, abstracts that away. But yeah, back in the day,
47:22 I used to make Dash web applications and deploy them on Heroku back when they had a free tier of
47:27 hosting and they've taken that away. So I don't even know what the go-to free hosting platform is
47:32 nowadays. I just, I moved most of my things to Streamlit and it's so nice.
47:35 Yeah. We got Shiny for Python now, which is also nice.
47:38 I haven't checked that out. How is it?
47:40 I haven't done too much with it either, but Joe and the team over there are doing pretty cool stuff,
47:45 like adding more dynamic interactive stuff to Jupyter, like running inside Jupyter and things. Yeah,
47:50 pretty cool.
47:51 I'll have to check it out.
47:52 I think they also do a bunch of hosting stuff over there as well, is why it came to mind.
47:56 What other advice you got for folks out there?
47:58 So, AI. Not studying AI or learning to use AI or machine learning, but is there a benefit to trying
48:05 to use ChatGPT to help you get this job or is there a danger? I'm thinking, for example, have ChatGPT
48:12 write me an awesome resume and then the tools are like, well, we've detected this is AI generated and
48:18 it's out. You know what I mean? What do you see happening there?
48:21 A lot of people see AI as like an all or nothing tool as in it's either you, the human doing the
48:29 work or it's the AI doing the work. But whenever, I don't know about you, but whenever I'm using
48:33 ChatGPT for anything, it's very rare that it's copy and paste for me, or at least that it's not iterative, where I'm
48:40 doing multiple prompts, prompt after prompt after prompt, trying to tweak it exactly what I want.
48:45 And so the way I look at ChatGPT and other gen AI that will be coming out, that's only inevitable,
48:50 is instead of looking at does this replace me? Does this, like for instance, am I going to build
48:55 my whole resume using ChatGPT? Can ChatGPT build, you know, take a data scientist's job and build the
49:02 whole model for them? I like to see it more as like a hammer. It's like a tool for the data scientist or
49:07 a tool for the job searcher to use in conjunction with your screwdriver or anything else. It's like
49:13 something to be wielded by a human, not a replacement for the human, if that makes sense.
49:18 You know, it's really good for stuff like, hey, I know a regular expression will do this.
49:22 Yeah.
49:23 The last time I studied, I completely forgot what this is about. And I know it's gnarly,
49:27 but if I just ask, here's an example, here's what I want. Boom. And traditionally what you would end up
49:32 doing is you'd be on Stack Overflow. Yeah.
49:35 You'd be all over the internet. You'd be trying to piece it together from external information anyway.
49:38 And so code is something that's a little bit more in the wheelhouse of the generative AI,
49:44 because it can't really make it up as much. I know it could like do something insecure and you didn't
49:50 know it was or whatever, but it's not like asking for legal advice where it makes up cases that didn't
49:55 exist. Like it gives you code. You put it in the runtime or the compiler and it runs or it doesn't.
50:00 And the output comes like you did.
50:01 Yeah. It works or not.
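As one illustration of the "it runs or it doesn't" point, here is a small sketch of a regular expression you might ask for and then verify yourself against cases where you already know the answer. The `parse_line` function and the assumed `MM:SS text` line format are hypothetical choices for the example.

```python
import re

# Assumed line format: minutes, colon, two-digit seconds, whitespace, text.
TIMESTAMP_LINE = re.compile(r"^(\d{1,3}):([0-5]\d)\s+(.*)$")


def parse_line(line):
    """Return (seconds_from_start, text), or None if the line does not match."""
    match = TIMESTAMP_LINE.match(line)
    if match is None:
        return None
    minutes, seconds, text = match.groups()
    return int(minutes) * 60 + int(seconds), text


# Checking a generated pattern against known inputs before trusting it.
assert parse_line("45:24 create a website") == (2724, "create a website")
assert parse_line("no timestamp here") is None
```

The assertions play the role of the runtime check: if the pattern is wrong, the script fails immediately. That kind of check catches wrong output, though not the security issues mentioned above.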
50:02 Yeah. So it's pretty, pretty effective for that. But yeah, for resumes, I would be more like,
50:08 let me ask it. What are the in demand things? And if I know these three skills, what other skills should
50:14 I know to get a, you could sort of use it in an explorative way to then come up with what you
50:20 might write for yourself, right? Something like this.
50:22 I find it really useful for brainstorming like action verbs on your resume bullets. Like I think
50:27 it's really good at that. What's 10 different ways to say lead. So I don't say lead five times on my
50:33 resume and I use some different action bullets. I think it's great at that. I personally, it's pretty
50:38 rare that I start any Python code from scratch nowadays. I'm either starting hopefully from a
50:44 template that I've already written, or I'm starting from a ChatGPT. Like this is what I kind of want
50:49 to accomplish, right? Like the outline for it. Like one of the things I hate doing is I make a lot of
50:55 Streamlit apps. I probably make a Streamlit app a month right now. And I hate starting from scratch
50:59 with Streamlit. It's super easy to start from scratch, but I'll say, Hey, ChatGPT, I want to
51:03 build a Streamlit app. This is like the component I want here. This is the component I want here.
51:06 This is the component I want here. And it's almost like a warmup for me as a programmer. And it will
51:12 create something that works. It's not what I want. And I spend the next five hours trying to make it
51:18 what I want, you know, without ChatGPT, but it kind of gives me a warm start to my programming process.
51:23 So I really like it. I think it's something that everyone should use. And I think if you're thinking
51:29 about getting into any sort of programming, you know, whether it's data science or web development,
51:34 I think you should be a little bit less worried about it taking your job and job security. I think
51:40 you should almost be more excited that, wow, the bar has never been lower to break into tech.
51:45 Like this is a step up gift from the programming gods that I get to use to break into tech.
51:51 Another thing to keep in mind is I imagine a lot of people listening to this podcast are not just
51:56 starting a college program, right? They're coming from possibly other experiences, other specialties.
52:03 You know, what's really good for job security, knowing the intersection of two things, the
52:07 intersection of chemistry and programming, the intersection of geology and programming for Exxon,
52:13 potentially, right? Like those things take you from a pool of a thousand to a pool of tens,
52:19 tens, right? And so what's awesome about that is it means two things. You don't throw away,
52:24 if you got a degree in something else like biology or whatever, you don't throw away like,
52:28 well, that was wasted four years. That's out. And it slices the pool of people who could apply for
52:33 certain jobs way, way smaller, right? Sounds like you agree.
52:36 Oh, a thousand percent. I'll just tell a quick little anecdote. When I was at ExxonMobil,
52:41 there's a lot of things I did not like at ExxonMobil, but this is something I really liked. It's about
52:46 once a quarter, they would do a crowdsourced data science competition for the whole organization,
52:51 like around the entire world. And they would say, this is a business problem we're trying to solve,
52:56 you know, and at Exxon, we have data scientists all over the world and like all sorts of different
53:00 teams and things like that. So I like did not know all the data scientists at Exxon. And they'd say,
53:05 this is the problem we're facing. Here's the data go, right? And I loved participating in these. It was
53:10 like right up my, my wheelhouse of like, I really enjoy exploration and all this stuff.
53:15 At the time I was getting my master's degree, but I didn't have my master's degree.
53:18 And I was competing against, so I'm just a chemical engineering grad, right? And I'm competing against
53:24 people with PhDs in computer science and in data science and all these like people who have way
53:29 more experience than me. And I actually won a few of these competitions. Thank you. I appreciate it.
53:35 And it's not because I was a better programmer or a better data scientist. It's because I majored in
53:40 chemical engineering and I knew the business problem, the domain extremely well. And I kind
53:46 of knew the programming and the data science stuff, but the combination of them made me very valuable.
53:51 Like one of the best examples I have is we're looking at crude oil properties. And I remember
53:56 like there was a forum where you'd like ask your questions. And one of the, one of the data scientists
54:01 asked, Hey, is sulfur bad? There's lots of sulfur in this. Is it bad? And like to a chemical engineer,
54:06 that's like the most obvious thing. No, you, yes. Sulfur is very bad in crude oil. That's very,
54:11 no, no, that's like such a fundamental thing to me and to him or her. That was like groundbreaking.
54:17 And so, yeah, your domain can become your superpower in your career.
54:20 Yeah. And it makes it way harder for ChatGPT and other types of tools to just automate you out of a
54:26 job because you bring in all these skills together, which is awesome. But it also makes it easier for
54:31 you to get the job. It makes it easier for you to continue your momentum of whatever you've been up
54:35 to. It's just, it's good all around. Yeah. I think it's more fun too, because once again,
54:39 like when I was trying to decide if I should study computer science, I was like, man, I don't really want
54:45 to build an Excel workbook for building an Excel workbook's sake. That's still true for me today.
54:52 I don't want to do data science for data science's sake. I only like machine learning or data science
54:57 when I'm doing it to solve a really fun problem I'm passionate about. That's where it's more fun.
55:01 So if you can be excited about the domain and excited about the algorithms, I think that's a
55:05 great place to be. Absolutely agree. All right. We're getting short on time,
55:09 but maybe tell us a bit about your Data Career Jumpstart. You've referred to it a couple of times.
55:14 Yeah. I have a company called Data Career Jumpstart. I just try to do a lot of education. So the
55:18 education happens on LinkedIn, happens on YouTube. And I actually forgot to mention this at the
55:23 beginning, but I have my own podcast called the Data Career Podcast, where I help people land their
55:28 first data job. We're about at a hundred episodes. So not quite the groundwork that you've put in.
55:33 That's still a ton. That's awesome.
55:35 Yeah, we're getting there. And then, yeah, I also have a bootcamp where I try to affordably help
55:40 people land their first data analyst position by teaching them the skills, the networking and the
55:46 project and portfolio building that they need to do something.
55:50 Like the long version of this show.
55:51 Yeah. Basically. Well, yeah. Just take what we talked about today, expand on it,
55:56 make it like 350 unique lessons. And that's exactly what it is.
56:00 Yeah. Very cool. All right. Well, we're about out of time. So maybe just, Avery, a final call to action,
56:06 people maybe are inspired. I see Dave go out in the audience. That's an awesome talk. Very much so.
56:11 What's next. It's easy to be inspired, but you got to take action.
56:15 Yeah. I love that. I think it's always fun to listen to podcasts, but you probably benefit way
56:21 more from the action you take after a podcast. So for you guys who are maybe interested in a data
56:26 analytics or a data science career, explore that. If you're like, yes, I'm in, make a plan, make a
56:32 roadmap. If you need help, I have a webinar that will help you make a roadmap. What skills should you
56:36 learn? How should you be networking and stuff like that? But really probably if you're just getting
56:40 started trying to figure out what skills you should learn, like what are the top skills that you should
56:44 be learning and then learning those skills and then not only learning those skills, but taking action on that
56:48 learning and building some sort of project like we talked about that you could put on a portfolio,
56:52 make a Streamlit app or something like that. That's probably the best action you could possibly take.
56:57 If you need any ideas, advice, feel free to check out my website, datacareerjumpstart.com or the
57:02 podcast, the Data Career Podcast. Hopefully there are lots of free resources there for you guys to check out.
57:07 If you've never seen Streamlit before, I have some YouTube videos about Streamlit that you guys can check out, but I love it. Just take action somehow, do something.
57:13 That's one of the huge, huge differentiators is like, you might be inspired, but you just got
57:18 to start taking those steps and it becomes a snowball. So thanks for sharing all your experience and your
57:23 advice. Hopefully some people out there are taking action and yeah, I'll put everything we talked about
57:28 in the show notes, of course. So thanks for being here, Avery.
57:30 Yeah. Thank you. Thanks for having me. I appreciate it.
57:32 You bet. Bye all.
57:33 This has been another episode of Talk Python to Me.
57:37 Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the
57:41 show. Take some stress out of your life. Get notified immediately about errors and performance
57:47 issues in your web or mobile applications with Sentry. Just visit talkpython.fm/sentry and get
57:54 started for free and be sure to use the promo code talkpython, all one word. This episode is sponsored
58:00 by Posit Connect from the makers of Shiny. Publish, share and deploy all of your data projects that
58:05 you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, reports, dashboards and APIs.
58:14 Posit Connect supports all of them. Try Posit Connect for free by going to talkpython.fm/posit.
58:21 P-O-S-I-T.
58:22 Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python.
58:28 Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all,
58:34 there's not a subscription in sight. Check it out for yourself at training.talkpython.fm.
58:38 Be sure to subscribe to the show, open your favorite podcast app, and search for Python.
58:43 We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play,
58:50 and the direct RSS feed at /rss on talkpython.fm. We're live streaming most of our recordings these
58:57 days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe
59:01 to our YouTube channel at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much
59:08 for listening. I really appreciate it. Now get out there and write some Python code.
59:20 I'll see you next time.