Monitor performance issues & errors in your code

#446: Python in Excel Transcript

Recorded on Thursday, Nov 16, 2023.

00:00 Why is Python so popular?

00:01 There's plenty of room for debate on this, but one solid reason is it's easy to adopt, easy to use, and caters to people who are not quite developers, data scientists, but need to get some computing done.

00:13 Do you know where the largest untapped set of those people hang out?

00:17 Excel.

00:18 That's why it's super exciting that Python is now going to be built directly into Excel.

00:23 Just go into a cell and type equals py, and you're off writing full Python 3 code that is backed by a lightweight Anaconda distribution of Python.

00:32 We have Dr. Sarah Kaiser here to give us the rundown on Python in Excel.

00:37 This is "Talk Python to Me," episode 445 recorded live and on location at Microsoft Ignite in Seattle, November 16th, 2023.

00:46 (upbeat music)

00:51 Welcome to "Talk Python to Me," a weekly podcast on Python.

01:04 This is your host, Michael Kennedy.

01:05 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython, both on fosstodon.org.

01:13 Keep up with the show and listen to over seven years of past episodes at talkpython.fm.

01:18 We've started streaming most of our episodes live on YouTube.

01:22 Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.

01:30 This episode is sponsored by Posit Connect from the makers of Shiny.

01:34 Publish, share, and deploy all of your data projects that you're creating using Python.

01:38 Streamlet, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards, and APIs.

01:45 Posit Connect supports all of them.

01:47 Try Posit Connect for free by going to talkpython.fm/posit, P-O-S-I-T.

01:54 And it's brought to you by the PyBytes Developer Mindset Program.

01:57 PyBytes' core mission is to help you break the vicious cycle of tutorial paralysis through developing real-world applications.

02:04 The PyBytes Developer Mindset Program will help you build the confidence you need to become a highly effective developer.

02:10 The link is in your podcast player show notes.

02:13 Sarah, welcome to Talk Python to Me.

02:14 - Hey, how's it going?

02:15 - Hey, it's going super well, super well, here at Ignite for one more time to talk about cool stuff with Python and things going on.

02:25 And this time it's gonna be Excel and Python.

02:28 It's like Reese's Pieces, you got your chocolate and my peanut butter.

02:32 Like, what's going on here?

02:33 I mean, the cats and dogs, are they living together?

02:36 What's happening?

02:37 - The old school and the new school.

02:39 - That's right.

02:40 The open source and the not.

02:42 - Yeah, yeah.

02:43 My background is I've been a data scientist for a long time.

02:46 I did my PhD in quantum computing in like experimental physics stuff.

02:50 So trying, actually my first programming language was Mathematica and that was actually what got me hooked on kind of the notebook sort of interface.

02:57 - Yeah, okay.

02:58 Well, normally people say their first programming language when that's the background is MATLAB.

03:03 - Yeah, yeah, yeah.

03:04 - But I would say Mathematica is like the proper mathematician's programming space because it does symbolic math, right?

03:12 - Which is so remarkable when you see it go.

03:15 - Yeah, like that was basically like I could visualize, the thing was being able to do easy visualizations.

03:21 So actually after undergrad, I worked at Mathematica for a while on their visualizations team.

03:26 - Really?

03:27 - Yeah.

03:28 - What technologies was that in?

03:28 - I was actually just writing Mathematica in Mathematica.

03:31 Thankfully I didn't have to touch any of the optimization stuff underneath, but designing new like gauges and plotting built-ins and stuff like that.

03:39 I really enjoy visualizing and understanding data that way.

03:43 And so when I got to grad school, realized that none of our instruments talk together.

03:49 Like everything was an entirely separate, archaic, non-documented protocol.

03:54 And we were having to somehow like either sit by hand and match up all of these CSV dumps from the different tools and stuff like that.

04:02 So basically I started learning Python to make our lab work.

04:06 So initially it was like using a serial and C types libraries to actually interface with these APIs.

04:12 But I realized someone once showed me a Jupyter Notebooks and I was like, okay, this is how we're, this is the way.

04:19 - This is Mathematica for Python.

04:21 - Exactly.

04:21 So I have been using Jupyter Notebooks almost since they came out.

04:26 And really enjoyed that both as a learning tool, a teaching tool and a kind of reproducible science tool.

04:34 - Yeah, it's just taken the world by storm, really.

04:36 They were here at Microsoft Ignite, the keynote, they went, they had like basically one code oriented demo in the two and a half hour keynote.

04:44 And it was, now we have a Jupyter Notebook with some Python code that's gonna do some machine learning and you know, back to cats and dogs.

04:53 - Yeah, yeah, yeah.

04:54 - That's incredible.

04:55 So yeah, very cool.

04:57 These days, what are you doing?

04:58 - I am a developer advocate at Microsoft on the Python team, focusing on data science and machine learning, the sorts of Python tools.

05:08 So I get to do all the fun stuff with Pandas, NumPy, SciPy, all of those things.

05:13 And that's kind of why when I heard about the Python and Excel preview, I was like, "Oh, I have done many Pandas import from Excel sort of things." And was kind of curious just to see, 'cause initially I was like, "I'm pretty cool with my Python.

05:31 I'm not sure why I wanna go that way." But I said to say, I think I'm convinced now.

05:36 - Are you?

05:37 Okay.

05:38 So that's kind of a different way than a Jupyter notebook.

05:41 It's for pluses and minuses.

05:42 We'll talk about some fun ones of those as we go.

05:45 Who are you evangelizing Python on the behalf of Microsoft too?

05:50 So are you talking to like companies that do .NET and saying, "Hey, you should do Python." Or are you talking to companies that love Python and saying like, "Hey, you should consider Azure or our offering." What does your world look like?

06:02 What do you do today?

06:03 - Yeah, yeah.

06:03 A lot of it is doing testing.

06:04 I work a lot with the VS Code team and the Jupyter, specifically like the Jupyter features in VS Code to make sure all the notebooks work, to make sure packages with kind of weird output visualizations, are they rendering properly?

06:19 Generally, I think I work with a lot of the actual Python open source maintainer communities in the kind of scientific Python space.

06:27 So like with the NumFocus supported projects and really just try to kind of understand what are they running into.

06:34 And as we're definitely seeing both with all of the LLM and machine learning stuff, like data is getting so big that I definitely can't open it in one Excel sheet.

06:44 (laughs)

06:45 So the data is basically, it's all native on the cloud.

06:50 And so there's, I think a lot of kind of the academic space that I come from of like, that isn't really part of the curriculum or what's taught, or people kind of have to learn it ad hoc.

07:01 And so basically, I kind of like to think of myself as someone who gets to go do all that learning about kind of how does science and data interact with the cloud, and in this case, specifically Azure, and just try and help build out scenarios, like build sample apps, things like that, that can make it so they don't have to waste half their PhD time trying to understand cloud DevOps stuff.

07:26 - Yeah, it sounds a little bit like, you know, a lot of people who are just brought into a lab and said, here, you're gonna learn Python, and then you can go do your actual research.

07:35 They often sort of skimp out on like some of the very basic software practices, like what is Git, what is DevOps, how do I run this?

07:44 Like those kinds of things, right?

07:46 So maybe you--

07:47 - I've run a lot of workshops.

07:48 - Have you?

07:49 - Basically, in my postdocs, I got really tired of my teams and groups that I was working with not knowing.

07:57 So I basically made a standard two-day workshop that was like all of the computer science skills you missed in your physics degree.

08:04 We did have, at one point, data that just existed only on a jump drive, and that jump drive walked away, and that does not feel super good from a research standpoint.

08:13 - I can imagine.

08:14 - So, yeah, let's just say we had a review of that and instituted new policies.

08:20 - We've decided backups are good.

08:23 - Yep.

08:24 - Yeah, it's a bit of a diversion, but I worked on some math research before I graduated and got my bachelor's, and it was just in C++, but we had these large Silicon Graphics computers, like big mainframe thing, and we had Onyx workstations that went into it.

08:41 We came in one morning, and nobody could log in.

08:44 The computers wouldn't respond or turn on.

08:46 It's like a quarter million dollar computer that got used, and it won't turn on.

08:50 We're like, oh boy.

08:51 (laughing)

08:52 What had happened is one of the grad students, non-developer types just doing their research, but had a problem with their program, so they started logging to see what was going on, and they left it running overnight.

09:04 It filled the disk to the very last byte, and apparently once the last byte was gone, the whole mainframe couldn't operate and just nobody could even figure out what was wrong with it forever, and eventually, yeah.

09:17 So, you know, practices.

09:18 - Yeah, we also had a five million dollar microscope get bricked because a younger grad student didn't understand that Windows XP Service Pack Zero should never be plugged into the internet.

09:31 (laughing)

09:33 - So no firewall, none of that.

09:35 - Oh, no, no, no.

09:36 - Why would you have that?

09:37 - They were just like, well, it has an ethernet port, and the company said they can do support over TeamViewer or something like that, so they were like--

09:43 - TeamViewer, that's never gonna go wrong.

09:45 - So we walked in, and we had to yank it out of the wall, and we had the police there the next day because it had already been doing a legal financial transaction.

09:53 (laughing)

09:54 - It's in about 30 seconds.

09:56 - You mean a microscope.

09:57 You're here to arrest our microscope, okay.

09:59 - Yeah, I was like, it's over there.

10:00 (laughing)

10:01 - They're like, well, that's not a computer.

10:03 It sort of is.

10:04 All right, another interesting thing before we dive into our main topic here is I'm not sure if this is real or not.

10:10 Did it happen or did it not, this quantum work that you're doing?

10:14 - Yeah, so my PhD was on quantum cryptography hardware, so both there exist commercial devices, so part of it was finding side channels in the physical devices that you could buy off the shelf, and then the other part was designing satellite optics for doing ground-to-satellite-based key exchange.

10:35 So yeah, that was a lot of what I was doing, like prototyping things in the lab that would look like a satellite, and we'd, you know, satellite on one table, sender on the other, and we slowly worked out to, we did a bunch of tests in an airplane 'cause turns out putting things on a satellite is kind of expensive, and--

10:52 - What if we just flew really high in a 747?

10:55 - Yeah, and weather balloons turn out to be not super predictable in the path once you launch them, and so if you're trying to aim telescopes, yeah, it kind of ended up being messy, but it was pretty cool.

11:06 We got to go, I got to help with the design process for these single-photon kind of collection systems that went from literally, I'm just putting a couple lenses on the table and running Python stepper motor things in my Jupyter notebook to something that was actually automatically tracking and working from the ground to an airplane.

11:25 Hopefully, eventually a satellite, but that's kind of out of my hands at this point.

11:30 - Yeah, space takes a long time to get to, a lot of planning, a lot of, we worked on it for eight years, and then we had our launch window missed by a week.

11:37 - Oof, yeah. - You know, or something like that, right?

11:39 - Yeah, that's rough, and as a grad student, you gotta get out of there at some point.

11:42 - Yeah, you're like, I gotta get a degree, I need to eat.

11:45 - Yeah.

11:46 - So let's get going.

11:48 Cool, well, that sounds like a really fun project.

11:49 You must have learned a lot.

11:51 - Yeah.

11:52 - Whenever you're talking to hardware, I think that's a special kind of programming.

11:55 It's not just, I run the program, and then it just goes a little bit like, there's stuff it touches.

12:00 - Yeah, like you don't get to necessarily do your own architectural design.

12:04 You have to kind of first observe and usually reverse engineer what the original device engineers came up with and then figure out how is this, you know, this device is streaming-based, this device, how am I gonna like, these are entirely different application models.

12:21 - You really, you chose the serial port, okay.

12:23 All right then, let's go with that.

12:26 Cool, all right, now before we get into Excel in Python, let's maybe talk about working with Excel from Python, like the reverse, right?

12:35 So you talked a lot about notebooks, and many people go from kind of notebook into CSV into Excel, and there's some common tools for doing that, like even built into Pandas, for example.

12:48 It could be Wilson's what they can do there before they have to necessarily decide, like I need to run it inside of Excel.

12:53 I think there is still a special value to having structured code.

12:57 - Oh yeah, for sure.

12:58 - That you could have, I mean, notebooks is like kind of a fuzzy structured code.

13:02 It's not like a proper PY file because you can run the stuff still out of order, but at least within the cell it runs within order.

13:10 You know, it's like one dimensional out of order rather than like Excel, which is two dimensional potentially, or three if you're taking to other workbooks.

13:17 But anyway, what can people do if they're not just to talk to Excel now?

13:21 - Right, I think you kind of alluded to it, but the main way there are a couple of packages, one is actually built into Pandas, but I think it's a dependency on OpenPyXL.

13:31 But yeah, that basically, I've been playing around with that recently to try and see if I can get support for the new Python-based cells in that parser.

13:42 Turns out parsing XML is not fun and not something I like doing.

13:47 - I've spent the last two days parsing Excel or XML.

13:51 Oh my gosh, it's not fun.

13:53 - In particular, standardized document specs, 'cause that's basically, so there are wonderful packages out there already that do this for you.

14:01 And so what you would do is you'd basically just hand your Excel files, CSV, whatever, to these packages and they would return them as basically whatever you like.

14:09 Usually I go to Pandas tables 'cause, or data frames 'cause why not?

14:14 But yeah, so there's basically any format that you could pretty much want.

14:18 There's probably already really nice tooling to import that and you can just go right on your merry way using your Python.

14:25 - Yeah, you can use the XLSX writer if you wanna create Excel versus read it, right?

14:32 - Yeah, it's really cool.

14:33 You can even make plots.

14:36 You can, from Python, make a plot in the Excel worksheet so that when somebody opens it, they see the plot made in Excel, kinda crazy.

14:44 - This portion of Talk Python to Me is brought to you by Posit, the makers of Shiny, formerly RStudio, and especially Shiny for Python.

14:54 Let me ask you a question.

14:55 Are you building awesome things?

14:57 Of course you are.

14:58 You're a developer or a data scientist.

14:59 That's what we do.

15:00 And you should check out Posit Connect.

15:03 Posit Connect is a way for you to publish, share, and deploy all the data products that you're building using Python.

15:10 People ask me the same question all the time.

15:12 "Michael, I have some cool data science project or notebook that I built.

15:16 How do I share it with my users, stakeholders, teammates?

15:19 Do I need to learn FastAPI or Flask or maybe Vue or React.js?" Hold on now.

15:25 Those are cool technologies, and I'm sure you'd benefit from them, but maybe stay focused on the data project.

15:30 Let Posit Connect handle that side of things.

15:33 With Posit Connect, you can rapidly and securely deploy the things you build in Python.

15:37 Streamlet, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Ports, Dashboards, and APIs.

15:44 Posit Connect supports all of them.

15:46 And Posit Connect comes with all the bells and whistles to satisfy IT and other enterprise requirements.

15:52 Make deployment the easiest step in your workflow with Posit Connect.

15:56 For a limited time, you can try Posit Connect for free for three months by going to talkpython.fm/posit.

16:02 That's talkpython.fm/POSIT.

16:06 The link is in your podcast player show notes.

16:08 Thank you to the team at Posit for supporting Talk Python.

16:11 That's been around for a long time.

16:14 That's not like an announcement, right?

16:17 People have been, there's been different ways, and they've come in and out of support for different Python versions and stuff.

16:23 So, but there are ways to kind of do that.

16:26 And I see that a lot of people are like, "I'm working in notebooks, but the final consumer of this information is not a data scientist or a Python person.

16:34 They live in Excel and they live in Outlook.

16:37 The only thing I can give them is Excel spreadsheets or a PDF." - Yep, that is pretty much always the output of my notebooks is I either export it, like the whole notebook as a PDF, or yeah, basically organizing things such that I can use Pandas or something to export it to an Excel workbook.

16:55 But it's kind of a pain then because like you have your, there's that translation step that you kind of have to keep going back and forth.

17:02 If you're dialoguing, you know, somebody says, "Oh, but what happens if you do this?" And it's like, "Okay, I gotta go rerun and remake a new PDF for you." Could you just like do it? - I just need a quick change.

17:12 Like, could you just incorporate that like, "Yeah, it's not the same change.

17:16 We gotta re..." - Ideally, we all have nice DevOps pipelines where, you know, you just, you commit a change to source and it re-renders all the notebooks.

17:23 That's in a perfect world. - Of course, that's how it works.

17:25 - Yes, obviously, that's how I always work.

17:27 (laughing)

17:28 - It is.

17:28 So I guess, you know, before we get into the details of how this works, like, why?

17:34 Right, 'cause we just laid out, it is possible to like go to notebooks, have Excel kind of as an output and a save as, in a sense.

17:41 - The biggest thing for me is like, well, it's not intended, of course, as any sort of replacement.

17:47 It really is, as you were kind of mentioning before, it's like a different front end sort of thing.

17:51 Like we view, there's Python scripts, there's Jupyter notebooks, which is kind of like an alternative front end for writing that Python code.

17:59 I kind of think of using Excel and Python like this as kind of using a different kind of data first.

18:06 Like whenever we're thinking about data, it's usually in some form of grid or table or something like that.

18:12 So like being able to work with the data kind of visually in that sort of UI is really, really interesting.

18:19 And kind of to the point that you just made about, like the end consumer might only know how to use Excel and doesn't know anything about Python or doesn't know anything about running DevOps pipeline.

18:30 - I didn't mean that comment either to be disparaging because there are people that do incredible stuff with Excel.

18:36 Maybe they shouldn't, but they can and they do.

18:38 - I have in kind of delving, trying to understand what are like the weirdest things Excel can do is I've been trying to make demos, like people do some weird like competitions with Excel.

18:48 Like there's the painting in Excel.

18:50 - Is there a flight simulator?

18:51 - There's a flight simulator, like you can, you know, with macros, it's very scary.

18:56 So I, yeah, absolutely.

18:58 Like I am not an Excel pro.

19:00 I am a Python developer who uses Excel when prompted basically.

19:06 - Yeah, well, there's no question, absolutely no debate about it, that the upper bound of what you can do with Python is greater than the upper bound of what you can do with Excel.

19:15 But so many people just have, they have tabular data and they need reports and understanding and change this, what happens there and yeah.

19:22 - And I think the biggest value, like kind of having played with this now for a little over half a year is collaboration.

19:31 You don't have that kind of, you don't have to have that iteration loop between, okay, here we talked about it or, you know, you came and sat at my computer while I showed you, you know, my Jupyter Notebook so we can make changes like that.

19:41 All of the things like having multiple people in the Excel doc at once, you know, you can just send the share link and they can join and they can run things in the notebook.

19:49 They don't have to install or set up Python.

19:52 And you can just, you can leave comments like you would in Word docs.

19:55 So like basically that whole kind of review and collaboration workflow just turns into something that people are already pretty familiar with, with like Word docs and stuff like that, which is a thing that Jupyter Notebooks generally don't have.

20:08 So I think that's kind of the, one of the most interesting differentiators between them.

20:13 - Yeah, I just, like an hour and a half ago, released the episode on Notebook 7 and JupyterLab 4, which comes with the real-time collaboration and stuff.

20:24 - Which is really exciting.

20:25 - Yeah, it's quite exciting.

20:26 So it's on the horizon.

20:27 - Yeah, for sure.

20:28 - If it hasn't been like a common thing, but even so, again, a lot of these people are not gonna be in the notebook to collaborate with you.

20:35 - Right, right.

20:36 - It's fine.

20:37 And you played with this for a while.

20:38 How much of this is going to be a gateway experience to doing more like, ah, I couldn't remember that Excel formula, but I asked Copilot and it told me to do this Python thing and it totally made more sense to me.

20:51 Maybe I should just learn a little more Python and kind of grow beyond Excel.

20:55 Like, what do you think of this as a first step for many people who go, I'm not a programmer.

20:59 No way, I'm not a geek.

21:00 No, but then they are.

21:02 - Yeah, they totally are.

21:03 Like, if you're playing around with a bunch of numbers and making fancy spreadsheets and stuff, I'm sorry.

21:07 You're nerdy about data.

21:10 And so, yeah, I think it's a really, 'cause there are lots of really cool things you can do with Excel, but as you say, like some of the plotting features and stuff, you can't do, like, you can maybe do, but it'd be really hard.

21:22 And some, it might just be straight up impossible to make, you know, like some of the matplotlib, multi-paneled aligned plots with like interesting statistical plotting things that aren't just your standard bar and chart sorts of stuff.

21:34 So like, if you're trying to do some of those statistical modeling things, or the other big thing is like the Python and Excel, one of the, it has PyTorch and stuff like that.

21:44 So if you wanna do just kind of like entry-level machine learning sorts of stuff, and you don't wanna have to like set up your big data source and SQL servers, you know, whatever, what have you to work with it.

21:55 If you just got your spreadsheet of like your home sensor data or something like that, you can actually just have it make a regression model and you don't have to like spin up a whole thing, you know, in environments and stuff like that.

22:07 - When you predict the battery in your gate will be dead next week, you're gonna need to take an action on this.

22:12 - Which will be great when then I dismiss the notification and still don't remember to do it.

22:16 - Why can't we leave the house?

22:17 Well, remember that notification said you're not gonna be able to open the gate in a week.

22:21 That was a week ago.

22:23 Sounds familiar.

22:23 I know we were talking earlier that you have some cool home assistant automation and stuff.

22:27 - Yeah, yeah, yeah.

22:28 - Awesome, and I have a gate that has a dead battery.

22:30 So let's just jump into it.

22:32 I mean, let's talk about what is Python and Excel.

22:35 Like how do people use it?

22:36 What can it do?

22:37 What can't it do?

22:38 - In short, basically having the Python and Excel feature means that you can, when you're in a cell, if you type the equal sign, which is usually how you start a function in Excel.

22:48 - I type equal some parenthesis and then off you go, something like that.

22:50 - Exactly. - Drag it around.

22:52 - So you type equals and then you type py, open parentheses, and that basically turns the cell now into a Python cell.

23:00 So it'll have a little visual change.

23:03 It'll turn green on the left, and there's also a keyboard shortcut to do this.

23:07 - It should put the Python logo.

23:08 It should be like a blue, yellow.

23:10 - Yeah, green is kind of the Excel color.

23:13 I don't know.

23:14 Anyway, we can file feedback on this.

23:18 - That's community feedback from the Python people.

23:20 We need the Python logo in there.

23:22 Let's make it happen.

23:23 - The idea is that basically you can change a cell in the Excel notebook into one that basically is kind of like a Jupyter cell.

23:30 You can type Python code, you can import modules, you can write functions, you can print things.

23:36 Like basically anything that would be valid in a Jupyter notebook cell, maybe not necessarily the magic commands, but like-

23:44 - Sure.

23:45 You can get to the shell as a hacker, I just do.

23:48 (both laughing)

23:49 - Did actually try a lot of that.

23:50 So I should say, so once you've typed in your code there and you run the cell with Control + Enter, it's not Shift + Enter because that's a very established keyboard shortcut in Excel, which moves you around in the grid.

24:03 So pro tip, remember it's Control + Enter.

24:06 But what happens then is basically Excel connects to a container instance in Azure, which has a standardized Anaconda provided environment.

24:16 So it's an Anaconda environment that has a fixed declarative set of packages, which is basically kind of what 80 to 90% of projects use.

24:27 We can talk about like which ones that are, which ones are in that environment in a second.

24:32 But your code then runs in this, it referred to as like an iron box in this container instance.

24:39 So it does not-

24:40 - Doesn't know who you are.

24:42 - It doesn't know, well, I mean, you are, and it's basically running a Jupyter kernel in that box.

24:48 So like the kind of communication, if you have multiple cells in your sheet that have Python code in it, every time you make a change, just like an Excel would normally kind of like recalculate everything in the sheet, it will basically ping that iron box and as if it was kind of like a Jupyter notebook.

25:04 - That's the thing that surprised me most is that it doesn't just run locally because you could embed the Python runtime into Excel, without any trouble.

25:13 - Yes.

25:14 - And in terms of making the good path work.

25:17 - Yes.

25:18 But I think part of the, part of the design philosophy with this as a feature to Excel is this is an enterprise product that enterprises want to use and IT admins want to feel good about, what security permissions, things, where data can come from, like all the permission compliance good stuff.

25:39 - To go, it does what?

25:40 Yeah, no.

25:41 No, we're not doing that.

25:42 And we just recovered from the ransomware thing.

25:44 So we're definitely not, we're done.

25:46 So what are the restrictions?

25:47 It has no network access.

25:49 - Yeah, so that iron box does not talk to, the only thing it can talk to is your, you know, instance of Excel.

25:55 - It can receive commands, but it doesn't.

25:57 - It does not touch the internet otherwise.

25:58 - Interesting.

25:59 So that automatically changes maybe what some people think like, for example, pandas read CSV, give it a URL and say, I want this, the third table, like it's not gonna do those types of things or you can't do request.get against some external data.

26:14 - Which initially I was like really sad about.

26:18 I understand the security things, but was a bit sad.

26:20 But then some of my Excel friends showed me, 'cause this is also kind of a challenge in Excel too.

26:26 Like, you know, how do you get data?

26:28 People have things in SQL servers, people, or databases more generally.

26:33 - Every minute I want to have it refreshed.

26:34 - Yeah, like that kind of, it's not concurrency, but like keeping everything up to date is a really big headache.

26:41 And the thing is Excel is already very good at addressing this.

26:44 So there's something called Power Query, which is another feature of Excel.

26:48 And that's basically, so the scenarios like you described, like if there's a CSV somewhere in a GitHub repo that I want to pull in, you can use the Power Query feature, can go fetch that and bring that into your Excel file that then you can use with Python and Excel.

27:02 So it kind of separates that task of getting the data to work with, kind of pries that out of the Python piece, but puts it in much safer hands.

27:13 (laughing)

27:13 - Sure.

27:14 - So secure hands, I guess, if that makes sense.

27:16 - Yeah, restricted hands.

27:17 Okay, it doesn't have access to your account, it doesn't have a token to your account, like your Office 365 or whatever.

27:23 Yeah, it's pretty locked down.

27:25 You can't pip install things, it comes pre-can.

27:29 It does what it says on the tin and like, okay, these are the things I can do.

27:32 - Yeah, and that again, sounds kind of sad, but honestly, like I've been trying to make a bunch of samples and stuff and I haven't run into too many, like it's got NumPy, Pandas, SciPy, Seaborn, AstroPy.

27:47 - Yeah, AstroPy is pretty awesome.

27:50 It's got Matplotlib, PyTorch, as you said, PyWavelets, how's that for a throwback to my mathematical days?

27:57 That's some like fast 4-8 transforms but way more complicated.

28:00 Scikit-learn, SciPy, SymPy.

28:03 - Yep, yep, yep.

28:04 - There's your Matlab right there, or your Mathematica right there.

28:06 - And yeah, and others, right?

28:08 I'll link to like the options.

28:09 This portion of Talk Python to Me is brought to you by the PyBytes, Python Developer Mindset Program.

28:18 It's run by my two friends and frequent guests, Bob Delderbos and Julian Sequeira.

28:23 And instead of me telling you about it, let's hear them describe their program.

28:28 - 2024 is unfolding and with it comes a world of opportunities in the Python landscape.

28:33 Opportunities for your Python career and journey can come from anywhere and at any time.

28:39 Are you ready to seize them when they do?

28:41 Don't let this be the year when opportunities pass you by.

28:45 - Our Python coaching is designed to prepare you for the unexpected, to equip you with the skills and confidence needed to tackle any Python challenge.

28:55 Whether it's a dream job, a complex project, or a new venture, you need to be ready for it.

29:01 With PyBytes coaching, 2024 won't just be another year.

29:06 Check out our PDM program and take the first step towards a year filled with Python success.

29:11 Don't just dream about being a developer, achieve it with PyBytes.

29:16 - Apply for the Python Developer Mindset today.

29:19 It's quick and free to apply.

29:21 The link is in your podcast player show notes.

29:24 Thanks to PyBytes for sponsoring the show.

29:26 - So yeah, like it really does cover most of like, obviously there are still plenty of cases where you will just want to use Python and that's okay.

29:35 Like that's, this is in no way, shape or form is this supposed to like take over your entire Python workflow.

29:41 It's really, you know, the, I still am doing my Python development, but basically if there are people who I know I want to like, especially communicate what I'm doing to, like who that is their language, I now have a tool that I can kind of put the things that I would normally have in my development environment in a tool that they can access and interact with too.

30:03 That, you know, also doesn't feel super bad to me.

30:05 (laughing)

30:07 - What's the alternative?

30:08 It's a right like Excel functions and VBA macro.

30:12 - Yeah, no I'm not doing that.

30:12 - Yeah, it's a pretty massive upgrade.

30:14 And I do think it's gonna be a bit of a gateway like, you know what, this Python stuff's not that complicated.

30:20 It's kind of cool.

30:21 Maybe, what do you do with those notebooks again?

30:23 I think that'd be cool.

30:24 I think that'll happen.

30:25 When I think about writing Python, I'll use something like PyCharm, like really cool code, like analysis.

30:32 And then I'll think about it like with rough, maybe running as like an integration and auto-complete.

30:37 And there might even be like an AI magic, or maybe this is happening in VS Code in a real similar way.

30:42 Then when I think about writing Excel formulas, it's like one line across.

30:45 These are not the same.

30:47 So what is it like to write Python in Excel?

30:50 - Yeah, it definitely, I do generally try to keep, like when I write note, like in Jupyter notebooks, I try to keep myself small, not just have giant things.

31:00 But yeah, you're totally right.

31:02 Like even just from the view, you can't really drag it much bigger than a couple lines.

31:07 - You can expand it so it's at least multi-line.

31:08 - Yeah, you can see multi-line and that's fine.

31:11 And it has syntax highlighting and it does have like tab completes.

31:14 It's got like the--

31:16 - Does it do like the language server?

31:17 - Language server, yep, yep, yep.

31:18 - So yeah, that's actually pretty good.

31:19 - As I think was in the keynote stuff today, there's copilot in Excel.

31:24 And so like we'll see, I think some of these AI features also kind of help speed up some of those things.

31:31 So like if you're like, I just, I want a pandas table here and to convert this part, the column that's all dictionaries to like their own columns and stuff like that, it'll just write it for you.

31:43 And then you don't have to, that editor, editing experience is definitely not the same as working in my beloved VS Code.

31:51 - Well, if the keynote here at this conference was a drinking game and AI was a word, you wouldn't have lasted 15 minutes and it was two and a half hour.

31:59 I mean, there was so much AI this and AI that and the AI is talking to the AI and here's the, and it's quite something.

32:06 But they did show a really cool thing of, like I've got some Python code in here and I've got some related data and just asking like, all right, help me graph this in Python.

32:16 And it would show you like a Seaborne graph or a matplotlib or something like that, which is pretty excellent, honestly.

32:21 - Yeah, and then in this case here, then it would be in a format that you could share with somebody that they could then, Excel has what if scenario features where it makes it really easy to be like, well, here's the forecast or whatever, but what if this changed?

32:34 - What if interest rates go up?

32:36 - Yeah, and you have like these little dropdowns, like all of those actually work.

32:40 And so like I can use cells that are like dropdown, data validated dropdown cells, and that just feeds as, feeds into my Python code and changes, like the theme I'm using for my matplotlib.

32:53 - Matplotlib plots or something like that.

32:54 - Go a little meta for a moment.

32:56 Could I use pandas or XLS, XSLX, writer, whatever the extension is, to write an Excel workbook spreadsheet that itself does, has Python in it?

33:08 - In principle, yes.

33:10 - Are there just formulas?

33:11 Are there just strings that go into formulas?

33:12 - Yes, and that's precisely what I've been working on recently is to try and figure out how to integrate into some of those, the existing Python packages.

33:21 'Cause basically what you need to do is make sure all of that data is inserted into the XML document model in the right way and stuff.

33:29 Like it's pretty cool.

33:31 Like you can take an Excel, you know, an Excel notebook if you just change the extension to zip, then you get to see all the XML that's inside of it.

33:38 And you can just see, - A red pill.

33:40 - You can just see the sheets that are like, here's the Python functions, and you can just see the strings and stuff like that.

33:45 So I don't think there's anything blocking that.

33:47 And that's something that I wanna be able to help contribute if I can to the community so that you could actually programmatically write your Excel sheet with Python from Python.

33:57 - And so things that are not allowed, like screen scraping or talking to a database, you could kind of burn that into Excel as like fixed data, but then keep rebuilding that workbook or something like that.

34:07 - Yeah, you could do that.

34:09 - All right, integration between Excel and Python.

34:11 In Excel, I can say I wanna take the sum and I select the big section.

34:14 And it's like A1 : A20, and that sums those up.

34:19 But in Python data science land, I don't even know what that is.

34:24 So how do I make, obviously the point having Python there is to access chunks of data, process it, do other things to it, turn it into graphs.

34:31 What's the interop story?

34:32 - Part of the Python and Excel feature is that in that con environment, there is a module called XL.

34:39 The abbreviation as it's loaded is X, the letter X and then the letter L.

34:43 And basically that allows you to access all kinds of data and properties of the worksheet in Python.

34:51 So that's basically, whether anything from like the cell styling to, you can literally just put XL parentheses, and then in a string, the XL selection, A1 colon whatever selections, and it will just then is already available in Python.

35:07 So there's kind of this interop package that allows you to transition data models between what XLthinks of as data and what Python thinks of as data.

35:17 - Yeah, and basically that comes out as a pandas data frame, right?

35:20 - Yeah, in general, once you run a Python, or a Python and Excel cell that has Python code in it, you run it.

35:28 There's a lot of cells, sheets, code.

35:31 After you run it, you have the option of two different output formats actually.

35:35 So you can, by default, I think it outputs as a Python object.

35:40 If it recognizes that it's a string or a list, it might just display that, or you can kind of intentionally cast it to an Excel data type.

35:49 And again, if it can automagically understand that, oh, this dictionary should just be in a list or something like that.

35:56 And a lot of that functionality is kind of provided by the collaboration with Anaconda.

36:01 So you'll see like, if you hover over, if you have it return, like a Python object, if you hover over it, it'll show you like, you know, if it's a class, it might have some, show you the properties, and then you could drill in and say, turn the output into displaying this property, but it shows you a little pop-up with the Anaconda logo corner and stuff.

36:19 - Yeah, that's some interesting branding there.

36:22 I did notice that as well.

36:23 In Jupyter, I can have a cell, I can say like, X equals something, and then some other point in time, I can say X squared, whatever.

36:32 That assumes I'm running them top to bottom, or at least those two cells, the one that defines the X is run before the one that uses the X, which makes notebooks a little bit non-deterministic, and if you don't just go run all cells.

36:47 But Excel takes it to a whole another level, right?

36:49 This refers to that, which points over there, and then that goes up, and then it goes back down, and it's all over.

36:53 How do we make sense of like, the connections?

36:55 - Yeah, it is something that I've like, always is very impressive about Excel, 'cause I'll like, copy and paste chunks of things, and somehow all the references stay correct, you know?

37:03 Like, that's not what would happen if I refactored my Python code.

37:07 By default, the execution order goes left to right, and then down, so that's something I gotta, I generally just kind of by convention, use single columns.

37:17 - We're just going down.

37:18 - And left to right on the number of worksheets, so it evaluates everything in the first, like if you have tabs at the bottom for separate sheets.

37:24 - Yeah, you can go cross sheet as well.

37:26 - Oh yeah, yeah, and so like, honestly, one of the things, kind of habits I've developed working with this is, the first sheet in like, the top left corner, I put any sort of like, inits or things that like, things I want to have defined everywhere, I just put in literally the first cell that it could ever possibly evaluate, and then everything else from there, will be what it's going to be.

37:46 But, that's how I make sure those things, and actually when it spins up, there is a predefined kind of startup definition block, so it will automatically import pandas as PD, numpy as NP, like, so you don't have to, like, a lot of that kind of standard boilerplate, you don't have to, but if I'm doing a notebook where I'm doing a lot of--

38:05 - Like scikit-learn or something.

38:06 - Yeah, yeah, I just always want to have that there, I just got to copy pasta for that.

38:10 - Yeah, make an incredibly thin column, that if you were to expand it, that's where you're right there.

38:17 - That's actually, I've done that.

38:18 - Yeah, awesome.

38:19 - But, just like in a Jupyter notebook though, that kernel, like, if you can define variables like that X, and you can use that anywhere else, and then you can also just put it in a cell, and you can have Excel refer to it, and it will also know what data is there.

38:32 - Right, you might have to do the Py or the XL to like, transform it, otherwise then you just use it, yeah, yeah, awesome.

38:39 So, you said this runs in Azure, and it's some locked down container, but what version of Python, like, what OS is the container running, what's the story, what do you know about where this actually runs, like, to the extent that people will care?

38:54 - Yeah, yeah, so, I don't know the specific version, I'm guessing it's probably 3.11, but the idea is that it is a well-known and highly descriptive and fixed container descriptor, it's running, I believe it's just kind of running on a standard Ubuntu, like, Docker, it's an Azure container service, so.

39:15 - Somebody could do a sys.platform, and get that and get it out, right?

39:19 - Interestingly, yeah, you can try some of that, but how they've kind of helped lock down some of that is there's actually hooks, 'cause I was actually trying to do this, I was trying to exfiltrate and/or escape the box, but they actually--

39:33 - You can't contain me.

39:34 (laughing)

39:34 - They have hooks in the Python interpreter that basically can catch anything before it touches anything outside of Python, and so it's, I mean, I am not a security expert, I am not a security expert, but any naive sort of like, ha ha ha, I'm gonna import sys and manipulate permissions.

39:51 - OS.run, open pip install.

39:55 - Exactly, so all of that is pretty safe.

39:58 It's gonna be a modern version of Python, it's gonna have everything up to date, and from a supply chain standpoint, it's all provided by Anaconda, which they have assertions about code origin and stuff like that.

40:10 So really it should be a, the most best definition of a reproducible environment that will literally, so like even in perpetuity, like if you hand the worksheet to somebody else and you 10 years later come back to it, assuming the earth hasn't blown up, presumably it'll run in a container of exactly the same description so you don't have to worry about, oh shoot, I wrote this in a version of Python that now the package was built with Python 2.7.

40:36 - Can you believe I wrote it in Python 3 and it's just like, we're on Python 14, it doesn't work.

40:41 - So many things are still in 2.7.

40:43 - Yeah, I know, it's wrong.

40:45 Yeah, like so the statement that I saw on sort of the stability says, "Existing workbooks will still calculate "against the version of the environment "the workbook was created on, "and users can be prompted to upgrade "if they want like new run times, "new Python versions or whatever." So basically, there's no requirements.txt or pyproject.toml pinning your versions, but at the time of creation, it kind of snapshots effectively a pin virtual environment.

41:10 - Yeah, and as I mentioned, there is kind of like a fixed startup script for it that does those kind of suggested imports and settings for things, which you can actually view in Excel, it shows you and it's read only right now, but that might be a thing that you might be able to edit at the beginning.

41:28 So if there are the things in that one cell that I always put at the beginning, maybe I can change that for my notebooks, sorts of things.

41:34 - Yeah, so it's still in preview.

41:37 It's not like if I go to my Excel that I've got installed on my Mac and just try to type this, it's not gonna like it.

41:44 - It's not gonna work because it's a Mac.

41:45 - How do you get it?

41:46 - Yes, so right now, it's in the Windows Insider program.

41:50 So you have to have a M365 sort of subscription for Excel.

41:55 You have to enroll in the Insiders program, which you can do if you go to your account settings in the app right now, like, because it is a gradual rollout.

42:05 I think the initial public preview was only in September.

42:08 - Yeah, it's pretty new.

42:09 - Yeah, it's quite new.

42:10 Like pretty much every week when I start working on this, I'm like, all right, what, you know, does this look the same as it did last week?

42:18 So, but right now it's only available for Windows machines.

42:20 It is fully intended to be available on all platforms, including the web-based version, which will be super cool.

42:27 But yeah, right now, Windows initially and the feature itself is, will have some sort of cost, what that is and how much and when is TBD, just because this is a very--

42:40 - Is it extra, who knows?

42:41 - Yeah, exactly.

42:43 - Yeah, awesome.

42:44 Well, really cool to get a look at this.

42:45 I think for me personally, I think the coolest thing about it is that it's gonna expose more people to Python, give them a little bit of confidence and then they can go, only 20 libraries?

42:55 I heard, oh, over on pypi.org, we're just under 500,000.

43:00 Maybe I could expand my world a bit, right?

43:02 That's awesome.

43:03 - Yeah, and you don't have to like, all right, download a version of Python or download a version of Anaconda.

43:08 Now make sure you have all the things installed.

43:11 Make sure you have permissions, things on path.

43:13 Like all of that just goes away.

43:14 - Create a virtual environment, activate it.

43:16 Yeah, all this stuff, yeah.

43:17 - It's just already handled for you and you can get right to the fun stuff.

43:20 - Yeah, that's super cool.

43:21 All right, well, I always ask at the end of the show for some PyPI package that you think is cool, you've come across, like anything that you're like, oh, that's interesting people should know about.

43:31 It doesn't have to be popular.

43:33 - Oh no, no, no.

43:33 There's one I can't think of the name, generates a bunch of fake data for you to use.

43:39 - Okay, there's, is it Mockaroo?

43:41 And Mockaroo is the website.

43:42 - Yeah.

43:43 - There's Faker.

43:44 - Faker is good, but it's like hypothesis.

43:47 - Okay.

43:47 - Come up with a different one though.

43:49 - Well, one that is near and dear to my heart is actually called PyQIR.

43:55 So one of the, I mentioned the quantum computing stuff before, one of the open source communities I'm still really engaged with is actually an open source consortium to basically create a LLVM style machine agnostic, language agnostic representation for quantum programs.

44:12 - Wow, okay.

44:12 - Which is really cool and like coming, most of programming right now has kind of, was born in Python.

44:18 Like Python was the host for all of this, but as it's growing up and kind of leaving the nest as it were from Python, basically we need ways to interop like this huge existing tool ecosystem with actual hardware and hardware that supports things that basically the Python like mental model doesn't really match up.

44:37 - Right.

44:38 - And that's why you kind of need domain specific languages for quantum computing.

44:41 But this PyQIR package gives you kind of an easy way to connect and like bootstrap that interop from your quantum programs that you've written in Python to this intermediate representation, QIR, quantum intermediate representation, which is actually LLVM.

45:00 So you can, you don't have to figure out how to write and export LLVM files.

45:05 - Yeah, awesome.

45:06 - Which is both very fun and informative and you can do it through Python and Rust.

45:12 The Py03 Rust crate is my favorite.

45:14 - Yeah.

45:15 - Yeah, so that's, if you're at all interested in quantum computing stuff and you have a Python skill set or interest, that would definitely be a package I'd say go check out.

45:24 - Excellent, all right, well, final call to action, people wanna get started with this stuff, what do they do?

45:28 - Yeah, definitely.

45:29 There is a great blog post with the original announcement that's got all of the detailed steps for how you find and enroll in the insiders programs.

45:39 But other than that, there is a GitHub repo.

45:41 If you do get a chance to play with this and find all the things that you hate about it and wanna come yell at people about, I field a lot of those questions and enjoy it.

45:51 So yeah, you can just come file issues on the GitHub repo and like if something isn't working the way you thought or you find a bug, that's kind of the best way that the team is very active.

46:01 And so that's a great way to kind of get some good interaction with the folks who are building this in real time.

46:09 - Very cool.

46:10 All right, well, Sarah, thanks for being here.

46:11 - Yeah, absolutely, thanks so much for having me.

46:13 - Yeah, you bet.

46:13 Cheers. - Cheers.

46:14 - This has been another episode of Talk Python to Me.

46:18 Thank you to our sponsors.

46:19 Be sure to check out what they're offering.

46:20 It really helps support the show.

46:23 This episode is sponsored by Posit Connect from the makers of Shiny.

46:27 Publish, share and deploy all of your data projects that you're creating using Python.

46:31 Streamlet, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards and APIs.

46:38 Posit Connect supports all of them.

46:40 Try Posit Connect for free by going to talkpython.fm/posit, P-O-S-I-T.

46:47 Are you ready to level up your Python career?

46:50 And could you use a little bit of personal and individualized guidance to do so?

46:55 Check out the PyBites Python Developer Mindset Program.

46:59 Want to level up your Python?

47:00 We have one of the largest catalogs of Python video courses over at Talk Python.

47:04 Our content ranges from true beginners to deeply advanced topics like memory and async.

47:09 And best of all, there's not a subscription in sight.

47:12 Check it out for yourself at training.talkpython.fm.

47:15 Be sure to subscribe to the show, open your favorite podcast app and search for Python.

47:20 We should be right at the top.

47:21 You can also find the iTunes feed at /iTunes, the Google Play feed at /play and the direct RSS feed at /rss on talkpython.fm.

47:30 We're live streaming most of our recordings these days.

47:33 If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

47:42 This is your host, Michael Kennedy.

47:43 Thanks so much for listening.

47:44 I really appreciate it.

47:45 Now get out there and write some Python code.

47:48 (upbeat music)

48:06 you

Back to show page
Talk Python's Mastodon Michael Kennedy's Mastodon