#445: Inside Azure Data Centers with Mark Russinovich Transcript

Recorded on Thursday, Nov 16, 2023.

00:00 When you run your code in the cloud, how much do you know about where it runs?

00:03 I mean, the hardware it runs on and the data center it runs in.

00:08 There are just a couple of hyperscale cloud providers in the world.

00:11 This episode is a unique chance to get a deep look inside one of them, Microsoft Azure.

00:16 Azure is composed of over 200 physical data centers with hundreds of thousands of servers in each one of those.

00:24 A look at how code runs on them is fascinating.

00:27 Our guide for this journey will be Mark Russinovich.

00:30 Mark is the CTO of Microsoft Azure and a technical fellow, Microsoft's senior most technical position.

00:36 He's also a bit of a programming hero of mine.

00:39 Even if you don't host your code in the cloud, I think you'll enjoy this conversation.

00:43 Let's dive in.

00:44 This is Talk Python to Me, episode 445, recorded on site at Microsoft Ignite in Seattle, November 16th, 2023.

00:53 Welcome to Talk Python to Me, a weekly podcast on Python.

01:10 This is your host, Michael Kennedy.

01:12 Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython, both on fosstodon.org.

01:20 Keep up with the show and listen to over seven years of past episodes at talkpython.fm.

01:24 We've started streaming most of our episodes live on YouTube.

01:28 Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.

01:37 This episode is sponsored by Posit Connect from the makers of Shiny.

01:41 Publish, share, and deploy all of your data projects that you're creating using Python.

01:46 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Reports, Dashboards, and APIs.

01:52 Posit Connect supports all of them.

01:54 Try Posit Connect for free by going to talkpython.fm/Posit.

01:58 P-O-S-I-T.

02:00 And it's brought to you by the PyBites Developer Mindset Program.

02:03 PyBites' core mission is to help you break the vicious cycle of tutorial paralysis through developing real-world applications.

02:10 The PyBites Developer Mindset Program will help you build the confidence you need to become a highly effective developer.

02:16 Check it out at talkpython.fm/PDM.

02:21 Mark, welcome to Talk Python to Me.

02:22 Thanks.

02:23 Thanks, Michael.

02:24 Yeah, it's fantastic to have you here.

02:25 I've been a fan of your work for a really long time.

02:28 We're going to have a really cool look inside of Azure.

02:31 And there's not very many hyperscale clouds in the world.

02:34 You can probably count them on your hands, right?

02:36 And so I think as developers, Python developers generally, we'll be really interested to just kind of get a sense of when we run on the cloud, what exactly does that mean?

02:45 Because it's been a journey.

02:47 Yeah, for sure.

02:48 Before we dive into that, though, and some other cool things you're up to, tell people a bit about yourself.

02:53 Sure.

02:54 I'm a CTO and technical fellow at Microsoft, CTO of Azure.

02:57 I've been in Azure since 2010.

02:59 Prior to that, I was in Windows.

03:00 I joined Microsoft in 2006 when it acquired my software company and my freeware website, Winternals and Sysinternals, respectively.

03:07 And since 2010, I've effectively been in the same role the entire time, which is overseeing technical strategy and architecture for the Azure platform.

03:16 And the scale of that is quite something.

03:18 So it'll be great to get into that.

03:20 That's awesome.

03:21 You first came on my radar, probably in the late 90s, early aughts, through the Sysinternals thing, not through Microsoft.

03:29 And you brought that up.

03:30 So tell people a bit about Sysinternals.

03:32 It was like, if you wanted to see what your app is doing on Windows, go to Sysinternals, right?

03:38 Tell us about that.

03:38 Sysinternals grew out of my just love of understanding the way things work.

03:42 And I was doing a lot of work on Windows internals actually in my PhD program where I was trying to figure out how to get operating systems to be able to save their state and then come back in case of a failure.

03:55 So I learned the internals of Windows 3.1 and then Windows 95 and then Windows NT and started to think about cool ways that I could understand the way things worked underneath the hood.

04:06 So actually, the first Sysinternals tool was something called Ctrl2Cap, which swaps the Caps Lock key and, you know, the Control key, because I came from a Unix background.

04:17 Who needs caps lock these days?

04:18 Yeah, exactly.

04:19 No one should need that.

04:19 Yeah, I'm not yelling at people very much.

04:21 So the second tool that I wrote was actually called NTFSDOS, to bring NTFS to DOS.

04:28 But the native Windows NT tools that Bryce Cogswell, who I met in grad school, and I co-wrote together were Regmon and Filemon.

04:36 They were like the originals.

04:37 And Regmon allowed you to watch registry activity, Filemon, file system activity.

04:41 We later merged them into Process Monitor after we joined Microsoft.

04:45 But we decided to make those tools available for free.

04:51 So we started the ntinternals.com website, which then Microsoft's lawyer said, don't use NT.

04:56 So we switched over to renaming it Sysinternals.

04:59 And then Bryce was like, hey, some of the tools that we've made, we should sell.

05:03 So we wrote a tool that would allow you to mount a dead NT system through a serial cable as if it was a local drive on the recovery system.

05:10 And he said, if we make a read-write version, we should sell it.

05:13 So he went and set up a credit card account, you know, for e-commerce.

05:20 Authorized.net or something like that.

05:21 It was like weird APIs.

05:23 Yeah.

05:23 And we started selling the software, and that grew into what became Winternals, a commercial software company.

05:29 But Sysinternals and Winternals, like I said, were both acquired at the time.

05:33 I joined Microsoft in 2006.

05:36 But Bryce and I continued to work on Sysinternals.

05:39 He worked on them until he retired from Microsoft and retired just in general four years later.

05:44 And then I've continued.

05:45 I now have a couple of people, three people, working on the Sysinternals engineering system, keeping the tools healthy and the build pipelines working, and adding features to them.

05:56 And I still code on them and still add features.

05:58 Like just did a ZoomIt release a few days ago.

06:01 I added blur and highlight to ZoomIt.

06:04 Yeah, people are doing presentations on Windows.

06:06 With ZoomIt, you can say, let me quickly draw on the screen, not PowerPoint, but just whatever happens to be on your screen, which is really nice.

06:12 I love my Macs these days, but boy, I wish ZoomIt existed on the Mac.

06:16 Well, actually, people have asked for ZoomIt for Mac and I'd like to make a ZoomIt for Mac.

06:21 And now with Copilot, I wonder how good it is at writing Mac apps because I don't want to spend all the time to learn how to write Mac apps just to write ZoomIt.

06:30 But if Copilot can help, maybe it's, you know, something that I can do in my spare time.

06:35 I don't know if it'll do it, but it'll get you close.

06:37 It's crazy how these LLMs are writing code for us these days.

06:40 And we're going to talk a bit about maybe how some of those run.

06:42 Yeah.

06:43 And so on.

06:43 I mean, Azure is doing tons of stuff with large language models.

06:48 And you all have some, you know, we're here at the Microsoft Ignite conference.

06:51 You've got some big announcements.

06:52 But yeah, I was such a fan of Sysmon, and I still use that for all sorts of things.

06:59 So super cool.

07:00 Now, before we jump in, I kind of want to talk about some of your big announcements.

07:04 Yeah.

07:04 Because they really caught me off guard here.

07:07 I'm like, yes, this is exciting.

07:08 But maybe since we're going to talk a decent amount about Azure, the internals of hardware, how our code runs, just give us a quick history of Azure.

07:15 You know, when did this whole thing get started?

07:17 Azure started right as I joined Microsoft in 2006.

07:20 There was a group of people, including Dave Cutler, one of the people that I've looked up to because Dave was the original architect behind the VMS operating system.

07:29 And then Windows NT, which is now underlying Windows.

07:31 He and some other people got together at the suggestion of Ray Ozzie.

07:37 This is back when services was a big thing.

07:41 And Ray sent a memo to the company kind of echoing Bill Gates' internet memo saying it's software and services now.

07:48 And they said, how can we build a data center scale type platform to make it easier for Microsoft to develop services?

07:55 And so this was called Project Red Dog, which was incubating for a while.

07:59 And then in 2008, they publicly launched it, because Steve said, we need to make this available to third parties as well, as Windows Azure.

08:06 And I'm sorry, 2008, they announced the preview of it.

08:09 2010, it commercially launched publicly in February.

08:13 And I joined in July.

08:14 And a few years later, with the rise of open source software and so many enterprise customers wanting to have Linux, we rebranded it Microsoft Azure.

08:24 And we also...

08:25 Can I run Linux on Windows Azure?

08:27 I don't know if that makes any sense.

08:28 Yeah.

08:28 And one of the first things I worked on with Corey Sanders came from being asked, hey, we've got platform as a service.

08:36 We have this thing called cloud services, this model for how you write apps.

08:40 But our enterprise customers were saying, I can't move my existing IT stuff to Azure because it just needs VMs.

08:48 And so the first thing we did was, hey, we should get IaaS capability in Azure.

08:53 And so in 2012, we launched the preview of IaaS for Azure.

08:57 And that's really when the business started to take off because enterprises then could, with minimal effort, start to move.

09:02 Oh, well, that's like doing what we do in the...

09:04 Yeah.

09:05 In their own data center.

09:06 Our data center, but in your data center.

09:07 Yeah.

09:07 Yeah, exactly.

09:08 Now no one even thinks about it.

09:09 Exactly.

09:10 So IaaS has continued to evolve.

09:12 PaaS has continued to evolve.

09:13 Cloud services was designed in a world without containers.

09:16 Now we've got containerization, the rise of Kubernetes, and then application models on top of containers.

09:22 And so Azure has evolved.

09:23 And actually, I think, led some of that evolution of cloud native computing up into containers and abstractions.

09:30 But it's been a long, long journey towards that.

09:32 I mean, I think one of the things is I've always believed that ultimately cloud should be about making it easy for developers to say, here's what I want.

09:39 And then the cloud takes care of the rest.

09:41 And we're moving towards it relentlessly, that time when you'll really be able to do that.

09:47 Yeah, so you don't have to know DevOps.

09:49 You don't have to know distributed architectures.

09:52 You just give you guys a good idea.

09:54 Yeah, which is beautiful.

09:55 It's beautiful.

09:56 Now, real quickly, just give us a scale.

09:59 Like, think of how many data centers, how many servers, how many miles of fiber.

10:04 Yeah.

10:05 It's kind of astonishing.

10:06 It is pretty flabbergasting.

10:07 And the numbers continue to grow exponentially.

10:11 I'll just give you, because I remember when I first started in Azure, I was asked to give a talk at the Azure All Hands about architecture and some of the announcements we had coming.

10:20 And the All Hands was two rooms with the partition removed in our on-campus meeting center.

10:30 We totaled about 500 people.

10:32 That was all of the Azure team in 2010.

10:34 And really, nobody outside the Azure team knew anything about Azure.

10:38 Yeah.

10:38 It was kind of a secret even inside, right?

10:41 Yeah.

10:41 So effectively, at least half the people in the world that knew anything about Azure were in those two rooms.

10:47 And today, you know, Scott Guthrie's organization, Cloud and AI, all of it's working on Azure.

10:52 And that's tens of thousands of people.

10:54 So a good percentage, a majority even, of the company is working directly on things that come under the Azure umbrella.

11:04 So it's come a long way from that perspective.

11:06 And you talked about physical scale.

11:08 Back then, when we originally launched Azure in two regions, it was like 40,000 servers, like 20,000 in one, 20,000 in the other.

11:16 That's still a lot of servers.

11:17 Yeah.

11:17 Yeah.

11:18 But, you know, it was like, that is kind of cloud scale back then.

11:22 Now we are at millions of servers.

11:26 And when it comes to data centers, we've got 60 regions around the world, 60 plus regions.

11:31 And each of those consists of one, or in many cases multiple, data centers.

11:37 And we're still building out.

11:39 We're launching, like, two data centers every week, I think, is the number.

11:43 Wow.

11:43 That's crazy.

11:44 And these could be slotted into one of these regions or it could be.

11:47 Yeah.

11:48 A new region.

11:48 Totally new.

11:48 Yeah.

11:49 Yeah.

11:49 Uh-huh.

11:49 Incredible.

11:50 And so the big announcement that I wanted to ask you about, just before we run out of time,

11:54 and then we'll dive into some of that, that sort of how does your code run story is Azure Cobalt.

11:59 Yeah.

12:00 That's a new processor you guys announced.

12:02 And, you know, listeners know, I'm a big fan of Apple Silicon and how it sort of changed the computing landscape for power and speed on like little laptops and stuff.

12:10 And this is kind of that idea, but for the data center, right?

12:13 It is.

12:14 Tell us about that.

12:14 It is that idea.

12:15 Yeah.

12:15 I think having a processor that can be designed really to our specifications.

12:20 If you take a look at Intel and AMD processors, they're fantastic processors.

12:24 They're very versatile.

12:25 They're taking requirements from lots of different sources.

12:28 And so we're just, we're a voice.

12:31 We're a significant voice when it comes to saying we'd like your processors to do these things.

12:35 When we have our own, we've got the ability to just decide unilaterally what we'd like it to do based off of what we see and can vertically integrate it into our systems.

12:44 We can put it on SoCs and integrate it with memory and GPUs.

12:48 And so that is kind of the reason that we've done that verticalization for processors.

12:54 That's not to say that the other processors aren't going to be significant.

12:58 It's going to be probably a blend of these.

13:00 It's going to be a blend.

13:01 They'll have different capabilities that ours won't have.

13:03 There are customers that want the specific features that they've got or performance speeds and feeds that they've got because they're not all going to look the same.

13:10 And so I think it's just better optionality for everybody.

13:14 Well, I can tell you as somebody who tries to run Linux on that thing.

13:19 Yeah.

13:19 It's hit and miss if there's even an ARM version available.

13:22 Right.

13:23 More often than not, there's not.

13:24 And so there's certainly not going to be an insane rush to just drop everything, because there's a lot of code that's written for...

13:30 Exactly.

13:31 Yeah.

13:31 Yeah.

13:31 And optimized.

13:32 That's the other thing too.

13:33 Yeah.

13:34 Yeah, for sure.

13:35 But yeah, so on my list here of things I was going to ask you is, well, what about ARM in the data center?

13:39 Well, that is ARM.

13:40 I know, exactly.

13:41 I'm like, well, okay, so you guys beat me to the punch.

13:43 Exactly.

13:44 This portion of Talk Python to Me is brought to you by Posit, the makers of Shiny, formerly RStudio, and especially Shiny for Python.

13:53 Let me ask you a question.

13:55 Are you building awesome things?

13:56 Of course you are.

13:57 You're a developer or a data scientist.

13:59 That's what we do.

14:00 And you should check out Posit Connect.

14:02 Posit Connect is a way for you to publish, share, and deploy all the data products that you're building using Python.

14:09 People ask me the same question all the time.

14:12 Michael, I have some cool data science project or notebook that I built.

14:15 How do I share it with my users, stakeholders, teammates?

14:18 Do I need to learn FastAPI or Flask or maybe Vue or React.js?

14:23 Hold on now.

14:24 Those are cool technologies, and I'm sure you'd benefit from them, but maybe stay focused on the data project.

14:29 Let Posit Connect handle that side of things.

14:32 With Posit Connect, you can rapidly and securely deploy the things you build in Python.

14:43 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards, and APIs.

14:43 Posit Connect supports all of them.

14:45 And Posit Connect comes with all the bells and whistles to satisfy IT and other enterprise requirements.

14:51 Make deployment the easiest step in your workflow with Posit Connect.

14:55 For a limited time, you can try Posit Connect for free for three months by going to talkpython.fm/posit.

15:02 That's talkpython.fm/P-O-S-I-T.

15:05 The link is in your podcast player show notes.

15:07 Thank you to the team at Posit for supporting Talk Python.

15:12 Awesome.

15:12 Now, one of the things I wanted to kind of maybe have you go through for listeners that I think is just super interesting is sort of the evolution of the hardware where our code runs throughout the data center.

15:23 You've talked in some of your talks about the data center generations, and what is that, like six or seven, maybe eight different variations.

15:31 I'll kind of give you some prompts from them and all.

15:33 But one of the things that I was thinking about when I was looking at this is, you know, do you have a bunch of small servers, or do you partition up really large servers, right?

15:43 What's the right flow for that?

15:45 So one of the things that you've seen since the start of cloud, back when we launched Azure, there was one server type.

15:53 Yeah.

15:53 And we had different virtual machines offerings, but they were just all different sizes that could fit on that one server.

16:00 It was a 32 core Dell Optron with, I think, 32 gig of RAM.

16:04 Yeah.

16:04 And so that was the server back then.

16:06 What we've seen is more workloads come to the cloud that have different requirements.

16:10 Some require large memory.

16:12 Some require more compute.

16:13 Some require GPUs.

16:15 Some require InfiniBand backend networking for high performance computing.

16:19 And so there's been a drastic diversification of the server hardware in Azure that's being offered and current at any one point in time.

16:29 And I think you'll continue to see that.

16:31 So the old, you know, it's just a pizza box.

16:34 It's a low end commodity server.

16:37 That's kind of the cloud vision back in 2010.

16:40 Now it's the cloud contains specialized servers for specific applications.

16:45 And when it comes to large servers back in 2014, we started to introduce very large servers.

16:51 The kind that, you know, people that were cloud purists back in 2010 would have been like, no, don't allow that.

16:57 It's all about the cheap.

16:58 It's all about the cheap and scale out.

17:00 That is, scale-up servers for SAP workloads, in-memory database workloads.

17:05 So we introduced a machine that we nicknamed Godzilla, which had 512 gig of RAM in 2014, which was like an astonishing number.

17:15 And we've continued, as SAP workloads have gotten bigger and more has migrated to the cloud, to create bigger and bigger machines.

17:24 In fact, I'm showing here at Ignite the latest generation of the SAP scale up machines that we're offering.

17:29 They're not yet public, but I'm going to show a demo of one of them, which I'm nicknaming Super Mega Godzilla Beast,

17:38 because we've gone through so many iterations.

17:41 So this one, Super, is the new one, yeah.

17:44 You're running low on adjectives here.

17:46 And I don't know what I'll come up with next, but anyway, we're at Super Mega Godzilla Beast as the current generation, which has 1,792 cores.

17:53 Wow.

17:54 And 32 terabytes of RAM, 32 terabytes of RAM.

17:58 Incredible.

17:59 So do you do things like pin VMs to certain cores so that they get better cache hits and stuff like that, rather than let it just kind of mosh around?

18:09 That's especially important with NUMA architectures where you've got memory that has different latencies to different sockets.

18:15 You want to have the VM that's using certain cores on a socket have memory that is close to that socket.

18:23 And that's just part of Hyper-V scheduling, doing that kind of assignment, which we have under the hood.

18:29 And again, it's like the control plane at the very top says, launch a virtual machine of this size, of this SKU.

18:35 Then there's a resource manager, the Azure allocator that goes and figures out this is the best server to put that on.

18:41 It has enough space and will minimize fragmentation, and places it on there.

18:47 And then Hyper-V underneath is saying, okay, these are the cores to assign it to.

18:51 Here's the RAM to give it.
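To make that placement flow concrete, here's a minimal sketch of the idea in Python. The class names, fields, and numbers are hypothetical illustrations, not Azure's actual allocator API: a best-fit pass picks the NUMA node that leaves the least slack, which tends to minimize fragmentation, and keeping the cores and RAM on one node is the pinning Mark describes.

```python
from dataclasses import dataclass


@dataclass
class NumaNode:
    free_cores: int
    free_ram_gb: int


@dataclass
class Server:
    name: str
    nodes: list  # NUMA nodes (sockets) on this server


def place_vm(fleet, cores, ram_gb):
    """Best-fit placement: choose the NUMA node with the least leftover
    capacity, which tends to minimize fragmentation across the fleet."""
    best = None  # (slack, server, node)
    for server in fleet:
        for node in server.nodes:
            # Prefer fitting the whole VM on one NUMA node so its memory
            # stays local to the socket running its cores.
            if node.free_cores >= cores and node.free_ram_gb >= ram_gb:
                slack = (node.free_cores - cores) + (node.free_ram_gb - ram_gb)
                if best is None or slack < best[0]:
                    best = (slack, server, node)
    if best is None:
        raise RuntimeError("no server can host this VM size")
    _, server, node = best
    node.free_cores -= cores    # "pin" the cores on that socket
    node.free_ram_gb -= ram_gb  # and take the local memory with them
    return server.name


fleet = [
    Server("rack1-srv01", [NumaNode(24, 192), NumaNode(24, 192)]),
    Server("rack1-srv02", [NumaNode(8, 64), NumaNode(24, 192)]),
]
print(place_vm(fleet, cores=8, ram_gb=64))  # tightest fit: rack1-srv02
```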

18:52 Excellent.

18:53 And how much of that can you ask for?

18:55 You can ask for the whole machine.

18:56 You can, for a full machine; there are full-server virtual machine sizes.

19:01 Wow.

19:02 Okay.

19:03 There's probably not too many of those in use, but some people are using them.

19:06 Yeah.

19:06 Like on the SAP ones, because they're designed for SAP.

19:09 I think for the current generations, we offer just two sizes, like either half of it or the whole thing.

19:16 Incredible.

19:17 Wow.

19:18 Okay.

19:18 How much of a chunk of a rack does that take?

19:22 It's basically the whole rack.

19:24 Yeah.

19:24 Pretty much top to bottom.

19:25 Yeah.

19:26 It's like a 10 kilowatt server.

19:27 Yeah.

19:28 A little power plant on the side there.

19:30 As you kind of talk through the history of sort of how your code ran, it was more colo, as you said, like the smaller ones.

19:38 And then as you got bigger and bigger on some of this, you started working on things like, well, how do we let the servers run hotter and have the air cool them rather than more actively cooled?

19:49 And then it even gets almost to, just let big chunks of it fail.

19:55 And then when enough of it has failed, take them out.

19:57 You want to kind of talk about some of that?

19:59 We're still, yeah, good question.

20:01 Because we're still exploring this space, toward higher efficiency, lower energy consumption, more sustainability.

20:07 One of the experiments that came out of Microsoft Research was Project Natick, which is taking a rack of servers and putting it in a container that has nitrogen gas in it.

20:20 So it's an inert gas, and dropping it onto the ocean floor and letting it be cooled ambiently by the water there.

20:26 Not water on the inside, but the outside of the container.

20:29 Yeah.

20:29 So giant heat sink.

20:31 And there were potential benefits of that.

20:33 And it's still something that might get revived at some point.

20:36 But what we found coming out of that was, if the parts are in an inert environment, they have one-eighth the failure rate of ones that are in an air environment,

20:46 with particulate matter and corrosive materials in the air.

20:50 So we started exploring liquid cooling, both for that, as well as potential energy savings and more sustainable cooling than air-cooled.

20:59 We've explored two-phase liquid immersion.

21:01 We had a pilot running.

21:03 There are some regulations that have changed around the kinds of fluids, which have made us take a look in a different direction.

21:09 So we use.

21:10 Is that kind of like the coolant, what you would get in like an air conditioner, or some of the stuff they'd replace?

21:14 They're called forever chemicals or materials.

21:17 The ones we're using actually aren't, but the regulation is a little broad.

21:22 And so we're just steering clear, and it might be revisited at some point, but we also have been exploring kind of traditional liquid cooling, called cold plate.

21:32 And some people listening, probably like me, are gamers and have liquid-cooled GPUs or CPUs at home in their gaming rigs, which allow them to be overclocked.

21:42 And it's the same thing we're doing in our data centers.

21:44 In fact, one of the things Satya showed in the keynote was something called Sidekick, which is a cabinet that allows us to take cold-plate liquid cooling into an existing air-cooled data center, where the Maia 100 AI accelerators are in the cabinet sitting right next to it.

22:02 And the cooling pipes are going into the Maia cabinet to cool the accelerators themselves.

22:09 And so that is.

22:10 So they mount like some big metal plate, and then the metal plate is liquid-cooled.

22:14 Yeah, it's effectively that there's a plate on top of the processor and then liquid is going through that.

22:20 So I'm going to actually show pictures of the inside of the Maia system tomorrow in my AI innovation closing keynote.

22:27 But I think the takeaway here is that at the scale we're at, and with the efficiency gains that you might get from even a few percentage points, we're exploring everything at the same time, like single-phase liquid immersion cooling, still exploring that.

22:42 And then how to do cold plate more efficiently.

22:44 I'll also be showing something called microfluidics that we're exploring, which is much more efficient than pure liquid cold plate, where cold plate is just putting the plate, like you just said, right on top of the processor.

22:56 And so the water's taking the heat away.

22:59 But if we can put the liquid right into the processor, like...

23:04 Are we talking channels in the processor?

23:06 Channels around the processor.

23:07 Okay.

23:07 Just flow it right on top of it.

23:09 And so that's something we're calling microfluidics.

23:12 And I'll show that and talk a little bit about that tomorrow too.

23:15 Offers much more efficient cooling.

23:17 And it's not prime time yet, but looks incredibly promising.

23:21 That looks awesome.

23:22 This portion of Talk Python to Me is brought to you by the PyBites Python Developer Mindset Program.

23:30 It's run by my two friends and frequent guests, Bob Belderbos and Julian Sequeira.

23:35 And instead of me telling you about it, let's hear them describe their program.

23:39 Happy New Year.

23:40 As we step into 2024, it's time to reflect.

23:44 Think back to last year.

23:45 What did you achieve with Python?

23:46 If you're feeling like you haven't made the progress you wanted and procrastination got the best of you, it's not too late.

23:52 This year can be different.

23:54 This year can be your year of Python mastery.

23:58 At PyBites, we understand the journey of learning Python.

24:01 Our coaching program is tailor-made to help you break through barriers and truly excel.

24:07 Don't let another year slip by with unmet goals.

24:11 Join us at PyBites and let's make 2024 the year you conquer Python.

24:16 Check out PDM today, our flagship coaching program, and let's chat about your Python journey.

24:22 Apply for the Python Developer Mindset today.

24:25 It's quick and free to apply.

24:28 The link is in your podcast player show notes.

24:30 Thanks to PyBites for sponsoring the show.

24:33 One of the things I saw in the opening keynote, I don't know if it fits into what you were just talking about or if it's also another thing where it actually had the whole motherboard submerged and then even just the entire computer is just underwater.

24:46 So that was two-phase liquid immersion cooling, like I mentioned, just dunking the whole thing in the dielectric fluid.

24:51 And you have it boil at a low temperature, I guess, because the phase change is an extremely energy-intense exchange, aka heat exchange.

24:58 Yeah, and it's actually two-phase because of the boiling.

25:02 It phase changes into gas and then condenses again back into liquid.

25:07 So that was the idea behind two phases.

25:10 I see.

25:10 Instead of just running a radiator, you almost condense it back somewhere else and then bring it back around?

25:15 Yeah.

25:15 Okay.

25:16 Wow, cool.
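A rough back-of-the-envelope shows why boiling moves so much more heat than just warming the liquid. The figures below are ballpark textbook numbers for an engineered dielectric immersion fluid, purely illustrative, not the specs of the fluid Azure actually piloted:

```python
# Back-of-the-envelope: why two-phase (boiling) cooling is so effective.
# Hedged, approximate figures for a Novec-class dielectric fluid.
latent_heat_kj_per_kg = 100.0     # heat absorbed vaporizing 1 kg of fluid
specific_heat_kj_per_kg_k = 1.1   # heat to warm 1 kg of the liquid by 1 K
delta_t_k = 10.0                  # plausible single-phase temperature rise

sensible = specific_heat_kj_per_kg_k * delta_t_k  # single-phase heat pickup
phase_change = latent_heat_kj_per_kg              # two-phase heat pickup

print(f"single-phase (10 K rise): {sensible:.0f} kJ per kg of fluid")
print(f"two-phase (vaporize):     {phase_change:.0f} kJ per kg of fluid")
print(f"ratio: {phase_change / sensible:.1f}x")   # roughly 9x per kg
```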

25:17 So if we go and run our Python code, whether it's PaaS or IaaS or whatever, what's the chance

25:22 it's hitting that? Or is this kind of cutting-edge stuff reserved for high-energy AI training?

25:28 Is it more like if we ask ChatGPT, it's liquid cooled?

25:32 Our standard data centers right now are air-cooled.

25:35 So it's air-cooled servers.

25:37 This Maia part is liquid cooled.

25:39 So in our first summer, we've got some of our own workloads now starting to leverage Maia.

25:44 Do you put on your own workloads first just in case?

25:47 Well, it's just to see it.

25:48 It's just we're proving it out.

25:49 Shake it out, yeah.

25:50 Yeah, it's kind of piloting it, yeah.

25:51 You're not offering that up to the big customers just right away.

25:55 And so Maia, I don't know how many people know about this either.

25:59 This is one of the GPU training systems you guys have.

26:04 I mean, for those who don't know, OpenAI and ChatGPT run on Azure, which probably takes a couple of cores, a couple of GPUs to make happen.

26:12 Yeah.

26:12 I want to talk about that.

26:13 Yeah. So right now our fleet, the large-scale AI supercomputing fleet, is made up of NVIDIA parts.

26:21 So the previous generation was, well, the original generation that we trained GPT-3 on with OpenAI was NVIDIA V100s.

26:29 Then we introduced A100s, which is what GPT-4 was trained on.

26:32 And these are graphics cards, like 4080s or something, but specifically for AI, right?

26:38 Yeah.

26:38 Okay.

26:38 That's right.

26:39 And then the current generation of supercomputer we're building for OpenAI training, their next generation of their model, that's NVIDIA H100 GPUs.

26:47 Then Maia is our own custom AI accelerator.

26:50 So it's not a GPU.

26:51 You know, one of the aspects of NVIDIA's parts has been their GPU base.

26:56 So they also can do texture mapping, for example.

26:59 But you don't need that if you're just doing pure AI workloads.

27:03 So-

27:04 Back to that specialization, right?

27:05 Yeah, exactly.

27:05 Like, so if you could build it just for the one thing, maybe you'd build it slightly differently.

27:09 That's right.

27:09 Okay.

27:10 So Maia is designed purely for matrix operations, in fact, low-precision matrix operations, used for AI training and inference.

27:18 And so that is the specialized part that we've created, called Maia 100, the first generation of that.

27:25 Well, if I think of like some of the stuff presented at the opening keynote and stuff here, I think the word AI was said a record number of times, right?

27:35 Yeah, I don't think there was a topic there.

27:36 Oh my goodness.

27:37 I wasn't a part of.

27:38 And so how much is this changing things for you guys?

27:40 Like 12 months ago or something, ChatGPT appeared on the scene and-

27:45 I mean, it's changing.

27:46 It's literally changing everything.

27:48 You know, Jensen was saying this is the biggest thing since the internet.

27:52 Yeah.

27:52 Jensen being the CEO of-

27:53 Jensen, yeah.

27:54 Yeah, yeah.

27:54 Who was on stage with Satya at the keynote.

27:56 It is changing everything.

27:59 It's changing not just the product offerings.

28:01 So the way that we, you know, integrate AI into the products using Copilot, it's changing the way we develop the products as well, and the way that we run our systems inside already.

28:11 So for example, incident management, we've got Copilot built, you know, our own Copilot internally built into that.

28:16 So somebody that's responding to an issue in our production systems can say, okay, so what's going on?

28:22 What's the, what happened with this?

28:24 Yeah.

28:25 Show me the graph of this.

28:26 You know, just be able to use human language to get caught up in what's going on.

28:30 Yeah.

28:30 People tell me it's just statistics, just prediction.

28:32 It doesn't feel like prediction.

28:35 You know, the people that say that, I think, are missing the scale of the statistics.

28:41 And we're probably predicting a little bit too, like thinking about what you're going to say next.

28:46 Exactly.

28:46 That's what we are, we're statistical.

28:48 Yeah.

28:48 Yeah.

28:49 And so it's just, once you get statistics at a large enough scale, you start to see something that looks like what we call intelligence.

28:58 Yeah.

28:59 Yeah.

29:00 It's, it's really incredible.
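As a toy illustration of "statistics that predict what comes next," here's a minimal bigram model in Python. It just counts which word follows which and predicts the highest-count continuation; the corpus is made up, and real LLMs work over vastly richer context with billions of parameters, but the intuition is the same:

```python
from collections import Counter, defaultdict

corpus = ("the cloud runs code the cloud runs services "
          "the data center runs the cloud").split()

# Bigram statistics: how often does each word follow each word?
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict(word: str) -> str:
    # "Prediction" here is just the highest-count continuation.
    return following[word].most_common(1)[0][0]

print(predict("the"))   # 'cloud' -- seen 3 times after 'the', vs 'data' once
print(predict("runs"))  # 'code' -- three continuations tie, first one wins
```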

29:01 I'm starting to use it to just write my git commit logs for me, you know, push a button and it says, oh, you added error handling to the background task.

29:09 So in case this fails, you'll be more resilient and it'll keep running.

29:13 I'm like, that's better than I could have done it.

29:14 Just thought that might crash, you know?

29:17 And you just push the button and it's just, it's magic.

29:20 It's magical.

29:21 Yeah.

29:21 It really is.

29:22 I mean, it's called Copilot for a reason, because we're not at the point yet where you can just let it do what it does autonomously.

29:28 You need to check its work.

29:29 Like you need to look at it and say, oops, you know, that time it screwed it up.

29:33 It didn't get it quite right.

29:33 Or I need to add more context to this than it had, or extract it.

29:38 So, but as far as accelerating work, it's just a game changer.

29:42 Yeah, it really, really is.

29:44 So before we run out of time, I want to ask you just a couple more things, a bit of a diversion.

29:48 So Python, we saw Python appear in the keynote.

29:52 They were showing off, I can't remember who it was.

29:54 It wasn't Satya, it was whoever followed him.

29:56 They're like, we want to show off our new sharing of the insanely large GPUs for machine learning.

30:01 Let's just pull up some Python and a Jupyter notebook and I'll just check that out.

30:05 And you're like, wait, where are we again?

30:06 Really interesting.

30:08 So, you know, you said you're using a little bit of Python yourself.

30:11 Like what's Python look like in your world?

30:13 Well, so the reason that I'm using Python is I took a sabbatical this summer.

30:16 And so I just, I was like, I'm going to do some AI research.

30:19 So I got connected with an AI researcher.

30:21 In fact, I'm going to talk about this at my keynote tomorrow.

30:23 some of the work that came out of it.

30:24 Obviously, AI is completely Python these days.

30:28 Yeah.

30:29 Almost entirely.

30:29 Yeah.

30:30 So I spent the whole summer, and I still am spending my time, in Python,

30:35 Jupyter notebooks, and then Python scripts when you want to do a run for

30:39 a final result.

30:40 So I hadn't really used Python before, other than in passing.

30:44 I mean, it's a very easy language.

30:45 It's very easy to pick up.

30:46 Yeah.

30:47 There's a t-shirt that says, I learned Python.

30:49 It was a good weekend.

30:50 Yeah.

30:50 It's a bit of a joke, but it's a good joke.

30:51 Yeah.

30:52 And I think that's what makes it so powerful is that it's so easy to pick up,

30:57 but what's made it even easier for me to pick it up.

30:59 I'd say that I'm a mediocre Python programmer, but I'm using Copilot.

31:04 And that's made me an expert Python coder.

31:07 How do I do this?

31:08 Yeah.

31:09 And it's like, I've never, you know, had to go to Stack Overflow with a question.

31:13 I don't go to Stack Overflow for questions.

31:15 I haven't had to get a book on Python.

31:17 I basically just either ask Copilot explicitly, like, how do I do this or write me this, or I put it in the function or in the comment and it

31:26 gets it done for me.

31:27 And occasionally I'll have to go hand-edit it and figure out what's going

31:31 on.

31:32 But for the most part, it is writing almost all my code.

31:35 And so my goal is, how can I just have it write everything for me?

31:39 So that has kind of become the way that I program in Python.
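For anyone who hasn't tried that comment-driven workflow, it looks roughly like this: you write the comment or the signature, and Copilot proposes a body. The function below is a hand-written sketch of the kind of completion you might get, with a made-up CSV schema, not actual Copilot output:

```python
import csv
from collections import defaultdict

# Prompt-style comment: "read a CSV of server temperature readings and
# return the names of servers whose average temp exceeds a threshold."
def hot_servers(path: str, threshold: float = 75.0) -> list:
    readings = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):  # assumes columns: server, temp_c
            readings[row["server"]].append(float(row["temp_c"]))
    return [name for name, temps in readings.items()
            if sum(temps) / len(temps) > threshold]
```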

31:41 And I think Python and AI, and the knowledge Copilot has of Python, because OpenAI, obviously

31:48 for their own purposes, has made GPT-4, and GPT-3.5 before it, really know Python.

31:55 I hadn't really thought of that connection, but of course they wanted to answer Python

31:59 questions, I'm sure.

31:59 For themselves.

32:00 Yeah.

32:00 So I think when it comes to seeing what AI can do for programming, Python is at the forefront

32:07 of that.

32:08 What was your impression of it?

32:09 I mean, I'm sure you've probably seen it before, but like, what's your impression of working in

32:12 it coming from a curly brace semicolon type language like C++ or something?

32:17 Because they drop a bunch of parentheses, they have these tab space, these four spaces rules

32:22 and stuff.

32:22 Well, it's, you know, the YAML versus JSON.

32:25 It is kind of a debate, isn't it?

32:27 But I mean, I've gotten used to it.

32:30 It's not a big deal.

32:31 And I find it's less verbose than C.

32:35 There's less typing.

32:36 Less symbol noise.

32:38 You can just kind of get the essence.

32:39 Yeah.

32:39 Yeah.

32:39 Yeah.

32:40 I kind of had that experience as well, coming from a C based language.

32:42 I'm like, wow, this is really weird.

32:43 And then after I went back to C#, I'm like, but this is also weird.

32:47 And I kind of like the clarity over here.

32:49 So now what do I do with life?

32:51 And then you go back and like semicolons are annoying now.

32:54 Yes, exactly.

32:55 I like, I thought they were needed.

32:56 They're not needed.

32:56 What's going on?

32:57 Another thing that I think, you know, maybe people really enjoy hearing a bit about, and

33:02 I'm a big fan of, you wrote a three part series of novels about computer hackers called Zero

33:09 Day.

33:09 Really good.

33:10 I read all of them back when they came out and so much of this like computer mystery stuff

33:16 is like, oh, they're using VB6.

33:18 I'm going to get their IP address.

33:19 You're like, wait, what?

33:20 I mean, those words are meaningful, but the sense is not right.

33:23 And you know, your books are like a lot of sort of spy stuff, but also a lot of really

33:28 cool, legit, reasonably possible computer stuff.

33:31 Yeah.

33:32 Tell people a quick bit about that if they want to catch up.

33:34 I love cyber. I loved thrillers growing up, techno thrillers.

33:37 I read The Andromeda Strain when I was like in seventh grade and I was like, this book is so cool

33:42 because it's like, I'm learning science plus it's, you know, it's really exciting.

33:48 So I've always wanted to write one.

33:49 And then coming into the late 1990s, when you started to see some of these large

33:56 scale viruses, I was just thinking, this is such a powerful weapon for somebody to cause

34:01 destruction.

34:02 And then 9-11 happened.

34:03 I'm like, all right, logical next step is leveraging a cyber weapon to do something with the same

34:09 goals.

34:09 And so that's what led me to write Zero Day, which is taking that idea of using a cyber

34:16 weapon for terrorism.

34:17 Then I was like, oh, that book was really well received.

34:20 I had a lot of fun doing it.

34:21 So let me write the next one.

34:22 And I wanted to continue in this theme with the same characters, Daryl Haugen and

34:26 Jeff Aiken, and say, what's something else?

34:29 What's another cyber security angle that I can take a look at in the second one.

34:34 So the second one was state-sponsored cyber espionage.

34:37 And I actually, the ironic thing is I'd already had Iran in the story.

34:41 I'd already had China in the story.

34:42 I had people trying to figure out how to get Iran a nuclear weapon.

34:46 And then Stuxnet happened right when I was still writing the book.

34:49 And I'm like, okay, this is like a small part of my plot line.

34:52 Yeah.

34:53 Line it up for you.

34:54 So I had to change the book a little to acknowledge Stuxnet happening.

34:57 And then the third one was about insider threats, which I think is one of the toughest

35:01 threats to deal with.

35:02 In this case, it was a long-range plot from some people that wanted to compromise a

35:07 stock exchange, kind of a mixture of high-frequency trading and insider threat with

35:12 cybersecurity systems. That was the third one, called Rogue Code.

35:15 Yeah.

35:16 So they were all really good.

35:17 I really enjoyed it.

35:18 Were you a fan of Mr. Robot?

35:19 Did you ever watch that series?

35:21 I did.

35:21 I love that series.

35:22 Yeah.

35:22 Oh my gosh.

35:23 Again, it's, it seemed pretty plausible.

35:25 Yeah.

35:26 I really liked that.

35:27 Imagine a lot of people out there listening have seen, seen Mr. Robot as well.

35:30 If they want, you know, that kind of idea, but in just a series, they can binge.

35:34 Yeah.

35:35 Cool.

35:35 Maybe we should wrap up our chat here, but just a quick look at some of the future

35:39 things you talked about, like rapidly deploying some of these data centers and some of these

35:44 Ballard systems.

35:46 Maybe just give us a sense of like, even, like disaggregated rack architecture.

35:50 Do you have, instead of having a GPU alongside a server, like a rack of GPUs and then optical

35:57 connections to a rack of servers?

35:58 Like give us a sense of some of the stuff where it's going.

36:01 Yeah.

36:01 So that's some of the stuff that we've been exploring.

36:03 Like I mentioned, we're taking a look at lots of different ways to re-architect the data

36:08 center to be more efficient.

36:09 And one of the ways that you get efficiency, and reduced fragmentation, is by having

36:14 larger pools to allocate resources from.

36:16 If you think about allocating a virtual machine on a server, how much RAM can you give it at

36:20 most?

36:20 Well, as much as is sitting on the server. How many GPUs can you attach to it?

36:24 Well, at most as many as are attached to that server.

36:26 All right.

36:26 How many PCI slots?

36:27 Yeah.

36:27 Yeah.

36:28 But if you think about, I've got a large resource pool, it's a whole group of GPUs,

36:34 then I can give it as many GPUs as, say, 50.

36:37 Yeah.

36:37 Ask for 50.

36:38 Ask for 50.

36:39 Exactly.

36:39 The benefits of pooling for resource allocation are that you reduce fragmentation and you

36:44 get more flexibility.

36:45 So we've been trying to explore how we can do this with rack-scale disaggregation, of saying

36:49 there's a whole bunch of SSDs at the top of the rack, then there's a bunch of GPUs, and

36:54 then there's a bunch of CPU cores.

36:56 And let's just compose the system dynamically.
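A tiny sketch of that fragmentation argument, with made-up numbers: a request for 50 GPUs fails under per-server allocation even though the fleet has 64 free, while a rack-level pool can compose it.

```python
# Per-server allocation: a VM only gets GPUs physically inside its host.
servers = [{"name": f"srv{i}", "free_gpus": 8} for i in range(8)]  # 64 free

def allocate_on_server(want: int):
    for s in servers:
        if s["free_gpus"] >= want:
            s["free_gpus"] -= want
            return s["name"]
    return None  # stranded capacity: no single host has 50 GPUs

print(allocate_on_server(50))  # None -- fragmentation in action

# Disaggregated allocation: GPUs form one rack-level pool.
rack_pool = {"free_gpus": 64}

def allocate_from_pool(want: int):
    if rack_pool["free_gpus"] >= want:
        rack_pool["free_gpus"] -= want
        return want
    return None

print(allocate_from_pool(50))  # 50 -- the pool composes the system
```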

36:58 There's a bunch of challenges from a resiliency perspective.

37:02 Like how do you prevent one failure of the GPU part of the system from bringing down the whole

37:07 rack?

37:07 For example, there's latency and bandwidth challenges.

37:11 Like, when you're sitting there on the PCI bus, you get a whole bunch of bandwidth

37:15 and you get very low latency.

37:16 If you're going across the rack, you might have the same bandwidth.

37:20 You might have lower bandwidth just because you can't deliver that much bandwidth out of

37:24 the GPUs.

37:25 Right.

37:25 All the systems are optimized to make assumptions about these numbers.

37:29 Exactly.

37:29 And your latency is going to be higher.

37:31 And so some workloads can't tolerate their latency.

37:33 So we've been exploring disaggregated memory, disaggregated GPUs.

37:36 I've shown demos of both of them.

37:38 We're still exploring those.

37:40 We're not, you know, it's not ready for production.

37:42 Yeah.

37:42 Disaggregated memory or GPUs?

37:44 Yeah.

37:45 I would guess memory, but I have zero experience.

37:48 Memory is challenging because there are certain GPU workloads that aren't so latency sensitive,

37:53 like AI training.

37:54 Sure.

37:54 Like a batch job sort of thing.

37:56 Yeah.

37:57 But when it comes to memory, you almost always see the latency.

38:02 And so what we think we can do is get remote memory down to NUMA, you know, speaking of

38:07 non-uniform memory architecture, latency down to that level.

38:10 And a lot of applications can tolerate that.

38:13 Okay.

38:14 And so we have a memory tiering where you've got close memory that's on the system and then

38:18 remote memory, which is like NUMA latency.

38:19 Kind of like an L2 cache, but like a bigger idea of it.
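A quick blended-latency model shows why NUMA-class remote memory can be tolerable. The nanosecond figures are illustrative ballparks, not measured Azure numbers: as long as most accesses hit the close tier, the average stays near local DRAM.

```python
# Blended memory latency under tiering (illustrative numbers only).
LOCAL_NS = 100    # close, on-server DRAM
REMOTE_NS = 200   # pooled memory at roughly NUMA-class latency

def avg_latency_ns(local_hit_fraction: float) -> float:
    return local_hit_fraction * LOCAL_NS + (1 - local_hit_fraction) * REMOTE_NS

for frac in (0.95, 0.80, 0.50):
    print(f"{frac:.0%} local hits -> {avg_latency_ns(frac):.0f} ns average")
# 95% local hits -> 105 ns: close enough for many applications to tolerate.
```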

38:22 Very, very cool.

38:23 I think because you're doing such neat stuff.

38:26 And when you see these hyperscale clouds, I think a lot of what people see is the insane

38:33 dashboard of choices.

38:34 Like, do I do?

38:35 Yeah.

38:35 Do I do routing?

38:36 Do I do firewalls?

38:37 Do I do VPCs?

38:39 Do I do, like, PaaS?

38:41 IaaS, what do I do?

38:42 But oftentimes I don't really think about like, well, you're getting a slice of this giant

38:46 server and, you know, maybe someday it'll live under the ocean or whatever.

38:50 Right.

38:50 So it was really cool to.

38:51 Yeah.

38:52 And I think what you're seeing is the cloud, it started with a few basic building blocks.

38:56 And then we started to explore lots of different directions of creating lots of different PaaS

39:01 services, and PaaS services for compute, and then different data offerings.

39:05 I think the space, and again, coming to the workload, you get this.

39:09 Is it high?

39:10 Do you need key value store?

39:11 Do you need a vectorized database?

39:13 Do you need, and do you need any of those to be extreme performance?

39:18 Because then if you need extreme performance, go for the design for purpose vectorized database.

39:23 If you want key value with vectorization, but it's okay if the vectorization isn't the fastest

39:30 possible, you know, you can go use this other offering.

39:32 So that's why the list of options has continued to expand is just because every workload says,

39:39 I need this.

39:40 And that's the most important thing to me.

39:41 And the other one says, no, I need this.

39:43 That's the most important thing.

39:43 And others are like, I don't care.

39:45 Well, as it becomes the mainframe of the world, right?

39:49 There's a lot of different types of apps running on it.

39:51 Yeah.

39:51 Yeah.

39:52 Awesome.

39:52 All right, Mark.

39:53 Final call to action.

39:54 People maybe want to learn more about some of the stuff we saw here, see some pictures,

39:58 but maybe also just do more with Azure.

39:59 What do you say?

40:00 So a couple of things.

40:01 One is, I've been doing a series of Azure innovation talks at Build and Ignite sessions.

40:06 So go back to the last Build and you'll see the most recent one of those.

40:09 And then at this Ignite, I'm doing one that's just looking at AI-related innovation.

40:13 So that's on Friday, tomorrow here at Ignite, and it'll be available on demand.

40:19 So that's awesome.

40:20 Yeah.

40:20 I'll grab the links to some of those and put them in the show notes for people.

40:23 Excellent.

40:23 Well, thanks so much for being on the show.

40:24 All right.

40:25 Thanks for having me.

40:25 Yeah.

40:26 This has been another episode of Talk Python to Me.

40:30 Thank you to our sponsors.

40:32 Be sure to check out what they're offering.

40:33 It really helps support the show.

40:35 This episode is sponsored by Posit Connect from the makers of Shiny. Publish, share, and deploy

40:41 all of your data projects that you're creating using Python.

40:44 Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards, and APIs.

40:50 Posit Connect supports all of them.

40:52 Try Posit Connect for free by going to talkpython.fm/posit, P-O-S-I-T.

40:58 Are you ready to level up your Python career?

41:01 And could you use a little bit of personal and individualized guidance to do so?

41:07 Check out the PyBites Python Developer Mindset program at talkpython.fm/PDM.

41:13 Want to level up your Python?

41:15 We have one of the largest catalogs of Python video courses over at Talk Python.

41:19 Our content ranges from true beginners to deeply advanced topics like memory and async.

41:24 And best of all, there's not a subscription in sight.

41:27 Check it out for yourself at training.talkpython.fm.

41:30 Be sure to subscribe to the show, open your favorite podcast app, and search for Python.

41:35 We should be right at the top.

41:36 You can also find the iTunes feed at /itunes, the Google Play feed at /play,

41:41 and the direct RSS feed at /rss on talkpython.fm.

41:45 We're live streaming most of our recordings these days.

41:48 If you want to be part of the show and have your comments featured on the air,

41:52 be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

41:57 This is your host, Michael Kennedy.

41:58 Thanks so much for listening.

42:00 I really appreciate it.

42:01 Now get out there and write some Python code.

42:03 I'll see you next time.
