#445: Inside Azure Data Centers with Mark Russinovich Transcript
00:00 When you run your code in the cloud, how much do you know about where it runs?
00:04 I mean the hardware it runs on and the data center it runs in. There are just a couple of hyperscale cloud providers in the world. This episode is a unique chance to get a deep look inside one of them, Microsoft Azure. Azure comprises over 200 physical data centers, with hundreds of thousands of servers in each one of those. A look at how code runs on them is fascinating. Our guide for this journey will be Mark Russinovich. Mark is the CTO of Microsoft Azure and a Technical Fellow, Microsoft's senior-most technical position. He's also a bit of a programming hero of mine. Even if you don't host your code in the cloud, I think you'll enjoy this conversation. Let's dive in. This is Talk Python to Me episode 445, recorded on-site at Microsoft Ignite in Seattle, November 16th, 2023.
00:55 [Music]
01:06 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host Michael Kennedy. Follow me on Mastodon where I'm @mkennedy and follow the podcast using @talkpython, both on fosstodon.org. Keep up with the show and listen to over seven years of past episodes at talkpython.fm. We've started streaming most of our episodes live on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode. This episode is sponsored by Posit Connect from the makers of Shiny. Publish, share, and deploy all of your data projects that you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Reports, Dashboards, and APIs. Posit Connect supports all of them. Try Posit Connect for free by going to talkpython.fm/posit. P-O-S-I-T. And it's brought to you by the PyBites Developer Mindset Program. PyBites' core mission is to help you break the vicious cycle of tutorial paralysis through developing real-world applications. The PyBites Developer Mindset Program will help you build the confidence you need to become a highly effective developer. Check it out at talkpython.fm/pdm. Mark, welcome to Talk Python. Thanks. Thanks, Michael.
02:24 Yeah, it's fantastic to have you here. I've been a fan of your work for a really long time. We're gonna have a really cool look inside of Azure and there's not very many hyperscale clouds in the world. You can probably count them on your hands, right?
02:36 And so I think as developers, Python developers generally, we'll be really interested to just kind of get a sense of, when we run on the cloud, what exactly does that mean? Because it's been a journey. Yeah, for sure. Before we dive into that though, and some other cool things you're up to, tell people a quick bit about yourself. Sure. I'm CTO and Technical Fellow at Microsoft, CTO of Azure. I've been in Azure since 2010. Prior to that, I was in Windows. I joined Microsoft in 2006 when it acquired my software company and my freeware website, Winternals and Sysinternals, respectively. And since 2010, I've effectively been in the same role the entire time, which is overseeing technical strategy and architecture for the Azure platform. And the scale of that is quite something. So it'll be great to get into that. That's awesome.
03:21 You first came on my radar probably in the late 90s, early aughts, through the Sysinternals thing, not through Microsoft. You brought that up. So tell people a bit about Sysinternals. It was like, if you wanted to see what your app is doing on Windows, you'd go to Sysinternals, right? Tell us about that.
03:38 Sysinternals grew out of my just love of understanding the way things work. I was doing a lot of work on Windows internals actually in my PhD program, where I was trying to figure out how to get operating systems to be able to save their state and then come back in case of a failure. So I learned the internals of Windows 3.1 and then Windows 95 and then Windows NT, and started to think about cool ways that I could understand the way things worked underneath the hood. So actually the first Sysinternals tool was something called Ctrl2Cap, which swaps the caps lock key and, you know, the control key, because I came from a Unix background. And who needs caps lock? Yeah, exactly. No one needs that.
04:19 I'm not yelling at people very much. So the second tool that I wrote was actually called NTFSDOS, to bring NTFS to DOS. But the native Windows NT tools that Bryce Cogswell, who I met in grad school, and I co-wrote together were RegMon and FileMon. They were like the originals. RegMon allowed you to watch registry activity, FileMon file system activity. We later merged them into Process Monitor after we joined Microsoft. But we decided to make those tools available for free, and so we started the NTinternals.com website, which then Microsoft's lawyers said don't use NT, so we switched over to renaming it Sysinternals. And then Bryce was like, hey, some of the tools that we've made we should sell. So we wrote a tool that would allow you to mount a dead NT system through a serial cable as if it was a local drive on the recovery system. And he said, if we make a read/write version, we should sell it. So he went and set up a credit card account on e-commerce.net or something like that, and we started selling the software, and that grew into what became Winternals, a commercial software company. But Sysinternals and Winternals, like I said, were both acquired at the time I joined Microsoft in 2006. But Bryce and I continued to work on Sysinternals. He worked on them until he retired from Microsoft, and retired just in general, four years later. And then I've continued. I now have a couple of people, three people, working on Sysinternals engineering. One of those tools, ZoomIt, lets you zoom in and draw on the screen, not PowerPoint, but just whatever happens to be on your screen, which is really nice. I love my Macs these days, but boy, I wish ZoomIt existed on the Mac. Actually, people have asked for ZoomIt for Mac, and I'd like to make a ZoomIt for Mac. And now with Copilot, I wonder how good it is at writing Mac apps, because I don't want to spend all the time to learn how to write a Mac app just to write ZoomIt. But if Copilot can help, maybe it's, you know, something that I can do in my spare time.
06:35 I don't know if it'll do it, but it'll get you close. It's crazy how these LLMs are writing code for us these days. And we're gonna talk a bit about maybe how some of those run. Yeah. And, I mean, Azure is doing tons of stuff with large language models, and we're here at the Microsoft Ignite conference, you've got some big announcements. But yeah, I'm still such a fan of Sysmon, and I use that for all sorts of things. So, super cool.
07:00 Now, before we jump in, I kind of want to talk about some of your big announcements.
07:04 Yeah. Because they really caught me off guard here. I'm like, yes, this is exciting.
07:08 But maybe since we're gonna talk a decent amount about Azure, the internals of hardware, how our code runs, just give us a quick history of Azure. You know, when did this whole thing get started? Yeah, Azure started right as I joined Microsoft in 2006. There was a group of people, including Dave Cutler, one of the people that I've looked up to, because Dave was the original architect behind the VMS operating system and then Windows NT, which is now underlying Windows. He and some other people started it at the suggestion of Ray Ozzie. This is back when services were a big thing. And Ray sent a memo to the company, kind of echoing Bill Gates's internet memo, saying it's software and services now. And they said, how can we build a data-center-scale platform to make it easier for Microsoft to develop services? And so this was called Project Red Dog, which was incubating for a while. And then in 2008, they launched it publicly as Windows Azure, because Steve Ballmer said we need to make this available to third parties as well. And, sorry, in 2008 they announced the preview of it. In 2010, it commercially launched publicly in February, and I joined in July. And a few years later, with the rise of open source software and so many enterprise customers wanting to have Linux, we re-branded it Microsoft Azure. And we also... Can I run Linux on Windows Azure? I don't know if that makes any sense. Yeah, and one of the first things I worked on, with Corey Sanders, came from being asked, hey, we've got platform as a service, we have this thing called cloud services, this model for how you write apps, but our enterprise customers were saying, I can't move my existing IT stuff to Azure, because it just needs VMs. And so the first thing we did was, hey, we should get IaaS capability in Azure. And so in 2012, we launched the preview of IaaS for Azure. And that's really when the business started to take off, because enterprises then could, with minimal effort, start to move. Well, that's like doing what we do in our data center, but in your data center.
09:07 Yeah, exactly. Now no one even thinks about it. Exactly. So IaaS has continued to evolve, PaaS has continued to evolve. Cloud services was designed in a world without containers. Now we've got containerization, the rise of Kubernetes, and then application models on top of containers. And so Azure's evolved, and actually, I think, led some of that evolution of cloud-native computing, up into containers and abstractions. But it's been a long, long journey towards that. I mean, I think one of the things is, I've always believed that ultimately cloud should be about making it easy for developers to say, here's what I want, and then the cloud takes care of the rest. And we're relentlessly moving towards
09:43 that time when you'll really be able to do that. Yeah, so you don't have to know DevOps, you don't have to know distributed architectures, you just give you guys the code. Yeah, which is beautiful. It's beautiful. Now, real quickly, just give us a sense of the scale. Like, how many data centers, how many servers, how many miles of fiber? It's kind of astonishing. It is pretty flabbergasting. And the numbers continue to grow exponentially. I'll just give you an example, because I remember when I first started in Azure, I was asked to give a talk at the Azure All Hands about architecture and some of the announcements we had coming. And the All Hands was two rooms with the partition removed in our on-campus conference room meeting center. A total of about 500 people. That was all of the Azure team in 2010. And really nobody outside the Azure team knew anything about Azure. Really, the world didn't know about Azure.
10:40 Inside, right? Yeah, inside. So effectively, at least half the people in the world that knew anything about Azure were in those two rooms. And today, you know, Scott Guthrie's organization, Cloud and AI, all of it's working on Azure. And that's tens of thousands of people. So a good percentage, a majority percentage even, of the company is working directly on things that come under the Azure umbrella. So it's come a long way from that perspective. And you talked about physical scale. Back then, when we originally launched Azure in two regions, it was like 40,000 servers. Like 20,000 in one, 20,000 in the other. That's a lot of servers. Yeah, but back then, that was kind of cloud scale. Now, we are at millions of servers. And when it comes to data centers, we've got 60-plus regions around the world. And each of those consists of one, and in many cases multiple, data centers. And we're still building out. We're launching, like, two data centers every week, I think, is the number. Wow, that's crazy. And these could be slotted into one of these regions, or it could be, yeah, a new region. Yeah, yeah. Incredible, incredible. And so the big announcement that I wanted to ask you about, before we then dive into some of that, that sort of how-does-your-code-run story, is Azure Cobalt.
11:59 Yeah, that's a new processor you guys announced. And you know, listeners, I'm a big fan of Apple Silicon and how it sort of changed the computing landscape for power and speed on little laptops and stuff. And this is kind of that idea, but for the data center, right? Tell us about that. It is that idea. I think there's real value in having a processor that can be designed to our specifications. If you take a look at Intel and AMD processors, they're fantastic processors. They're very versatile. They're taking requirements from lots of different sources. And so we're just, we're a voice. We're a significant voice when it comes to saying we'd like your processors to do these things. When we have our own, we've got the ability to just decide unilaterally what we'd like it to do, based off of what we see, and vertically integrate it into our systems.
12:44 We can put it on SSEs and integrate it with memory and GPUs. And so that is kind of the reason that we've done that verticalization for processors.
12:55 That's not to say that the other processors aren't going to be significant.
12:59 It's going to be probably a blend of the offerings.
13:01 They'll have different capabilities that ours won't have. There are customers that want the specific features that they've got or performance speeds and feeds that they've got because they're not all going to look the same. And so I think it's just better optionality for everybody. Well I can tell you as somebody who tries to run Linux on that thing, it's hit and miss if there's even an ARM version available. More often than not, there's not. And so there's certainly not going to be an insane rush to just drop everything because there's a lot of code that's written for x86. And optimized. That's the other thing too. Yeah, for sure. So on my list here of things I was going to ask you is, well what about ARM in the data center? Well that is ARM.
13:40 Exactly. I'm like, well okay, so you guys beat me to the punch.
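If you're curious which architecture your own Python process actually lands on, here's a minimal sketch using only the standard library; note the exact strings platform.machine() returns vary by operating system:

```python
# Minimal check of the CPU architecture this Python process runs on --
# useful when a cloud VM might be ARM-based rather than x86_64.
import platform

arch = platform.machine()  # e.g. "x86_64", "AMD64", "aarch64", or "arm64"
if arch.lower() in ("aarch64", "arm64"):
    print("Running on ARM: check that your dependencies ship ARM builds.")
else:
    print(f"Running on {arch}.")
```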
13:44 This portion of Talk Python to Me is brought to you by Posit, the makers of Shiny, formerly RStudio, and especially Shiny for Python. Let me ask you a question. Are you building awesome things? Of course you are. You're a developer or data scientist. That's what we do. And you should check out Posit Connect. Posit Connect is a way for you to publish, share, and deploy all the data products that you're building using Python. People ask me the same question all the time.
14:12 Michael, I have some cool data science project or notebook that I built. How do I share it with my users, stakeholders, teammates? Do I need to learn FastAPI or Flask or maybe Vue or React.js? Hold on now. Those are cool technologies, and I'm sure you'd benefit from them, but maybe stay focused on the data project? Let Posit Connect handle that side of things. With Posit Connect you can rapidly and securely deploy the things you build in Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards, and APIs. Posit Connect supports all of them. And Posit Connect comes with all the bells and whistles to satisfy IT and other enterprise requirements. Make deployment the easiest step in your workflow with Posit Connect. For a limited time you can try Posit Connect for free for three months by going to talkpython.fm/posit. That's talkpython.fm/posit. The link is in your podcast player show notes. Thank you to the team at Posit for supporting Talk Python. Awesome. Now, one of the things I wanted to maybe have you go through for listeners, that I think is just super interesting, is the evolution of the hardware where our code runs throughout the data center. You've talked in some of your talks about the data center generations, and there are, what, six or seven, maybe eight different variations. I'll kind of give you some prompts from them, but one of the things that I was thinking about when I was looking at this is, you know, do you have a bunch of small servers, or do you partition up really large servers, right? What's the right flow for that?
15:46 So one of the things that you've seen since the start of cloud: back when we launched Azure, there was one server type. Yeah. We had different virtual machine offerings, but they were just all different sizes that could fit on that one server. It was a 32-core Dell Opteron with, I think, 32 gig of RAM. Yeah.
16:04 And so that was the server back then. What we've seen is more workloads come to the cloud that have different requirements. Some require large memory, some require more compute, some require GPUs, some require InfiniBand back-end networking for high-performance computing. And so there's been a drastic diversification of the server hardware in Azure that's being offered and current at any one point in time. And I think you'll continue to see that. So the old, you know, it's-just-a-pizza-box, low-end commodity server, that's the cloud vision back in 2010. Now the cloud contains specialized servers for specific applications.
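To see that diversification for yourself, here's a hedged sketch using the azure-mgmt-compute SDK to list the VM sizes offered in a single region; the region name and the environment variable are placeholders for your own setup:

```python
# Sketch: enumerate the VM sizes Azure offers in one region.
# Assumes the azure-identity and azure-mgmt-compute packages are
# installed and AZURE_SUBSCRIPTION_ID is set in the environment.
import os
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

client = ComputeManagementClient(
    DefaultAzureCredential(),
    os.environ["AZURE_SUBSCRIPTION_ID"],
)

# Hundreds of sizes come back -- general purpose, memory optimized,
# GPU, HPC -- which is the diversification described above.
for size in client.virtual_machine_sizes.list(location="westus2"):
    print(f"{size.name}: {size.number_of_cores} cores, "
          f"{size.memory_in_mb / 1024:.0f} GiB RAM")
```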
And when it comes to large servers, back in 2014 we started to introduce very large servers. The kind that, you know, people that were cloud purists back in 2010 would have been like, no, don't allow that. It's all about the cheap. It's all about the cheap and scale-out. And these are scale-up servers for SAP workloads, in-memory database workloads.
17:05 So we introduced a machine that we nicknamed Godzilla, which had 512 gig of RAM in 2014, which was like an astonishing number. And as SAP workloads have gotten bigger and bigger, and more has migrated to the cloud, we've continued to create bigger and bigger and bigger machines. In fact, I'm showing here at Ignite the latest generation of the SAP scale-up machines that we're offering. They're not yet public, but I'm going to show a demo of one of them, which I'm nicknaming Super Mega Godzilla Beast, because we've gone through so many iterations. So this one is the new... Yeah, you're running low on adjectives. Yeah, and I don't know what I'll come up with next. But anyway, we're at Super Mega Godzilla Beast as the current generation, which has 1,790 cores. Wow. And 32 terabytes of RAM. 32 terabytes of RAM. Incredible. So do you do things to, like, pin VMs to certain cores so that they get better cache hits and stuff like that, rather than just kind of moshing around? That's especially important with NUMA architectures, where you've got memory that has different latencies to different sockets. You want to have the VM that's using certain cores on a socket have memory that is close to that socket. And that's just part of Hyper-V scheduling, doing that kind of assignment, which we have under the hood. And again, it's like the control plane at the very top says, launch a virtual machine of this size, of this SKU. Then there's a resource manager, the Azure allocator, that goes and figures out, this is the best server to put that on; it has enough space and will minimize fragmentation; and it places it on there. And then Hyper-V underneath is saying, okay, these are the cores to assign it to, here's the RAM to give it. Excellent. And how much of that can you ask for? You can ask for the whole machine. You can. For a full machine, you know, full-server virtual machine sizes. Wow, okay, there's probably not too many of those in use. Yeah. But some people are using them. Yeah. Like on the SAP ones, because they're designed for SAP. For the current generations, I think we offer just two sizes, like either half of it or the whole thing. Incredible. Wow, okay, how much of a chunk of a rack does that take? It's basically the whole rack. Yeah, pretty much. Top to bottom. Yeah, it's like a 10-kilowatt server. Yeah, a little power plant on the side there.
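To make that two-layer placement flow concrete, here's a toy Python sketch, emphatically not Azure's real allocator: a fleet-level best-fit choice that minimizes leftover capacity (reducing fragmentation), then a host-level step that keeps a VM's cores and memory together on one NUMA node:

```python
# Toy model of the placement layers described above -- NOT the real
# Azure allocator. Best-fit server selection reduces fragmentation;
# single-NUMA-node assignment keeps a VM's memory close to its cores.
from dataclasses import dataclass

@dataclass
class NumaNode:
    free_cores: int
    free_gb: int

@dataclass
class Server:
    name: str
    nodes: list

    def fits(self, cores, gb):
        # Require one NUMA node with room, so memory stays local.
        return any(n.free_cores >= cores and n.free_gb >= gb for n in self.nodes)

    def leftover_cores(self, cores):
        return sum(n.free_cores for n in self.nodes) - cores

def place(fleet, cores, gb):
    # Best fit: the feasible server with the least leftover capacity.
    candidates = [s for s in fleet if s.fits(cores, gb)]
    if not candidates:
        return None  # no server can host this VM
    server = min(candidates, key=lambda s: s.leftover_cores(cores))
    node = next(n for n in server.nodes
                if n.free_cores >= cores and n.free_gb >= gb)
    node.free_cores -= cores
    node.free_gb -= gb
    return server.name

fleet = [
    Server("rack1-blade3", [NumaNode(16, 128), NumaNode(16, 128)]),
    Server("rack2-blade7", [NumaNode(8, 64), NumaNode(8, 64)]),
]
print(place(fleet, cores=8, gb=64))  # -> rack2-blade7, the tightest fit
```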
As you kind of talk through the history of how our code ran, it was more colo, as you said, like smaller servers, and then as you got bigger and bigger on some of this, you started working on things like, well, how do we let the servers run hotter and have the air cool them, rather than being more actively cooled? And then it even gets to almost, just remove big chunks of it and let them fail, and then when enough of it has failed, take them out. Do you want to kind of talk about that? Yeah, good question, because we're still exploring this space, towards higher efficiency, lower energy consumption, more sustainability. One of the experiments that came out of Microsoft Research was Project Natick, which was taking a rack of servers, putting it in a container that has nitrogen gas in it, so it's an inert gas, and dropping it onto the ocean floor and letting it be cooled ambiently through the water there. Not water on the inside, but the outside. Yeah, so a giant heat sink. And there were potential benefits to that, and it's still something that might get revived at some point, but what we found coming out of that was, if the parts are in an inert environment, they have one-eighth the failure rate of ones that are in an air environment, with particulate matter and corrosive materials in the air. So we started exploring liquid cooling, both for that as well as for potential energy savings and more sustainable cooling than air-cooled. We explored two-phase liquid immersion; we had a pilot running; there's some regulation that's changed around the kinds of fluids that made us take a look in a different direction. Is that kind of like the chlorofluorocarbons you would get in, like, an air conditioner, or some of the stuff they...? They're called the forever chemicals, or materials. The ones we were using actually aren't, but the regulation is a little broad, and so we're just steering clear, and it might be revisited at some point. But we have also been exploring kind of traditional liquid cooling, cold plate cooling. And some people listening, probably like me, are gamers and have liquid-cooled GPUs or CPUs in their gaming rigs at home, which allows them to get overclocked, and it's the same thing we're doing in our data centers. In fact, one of the things Satya showed in the keynote was something called Sidekick, which is a cabinet that allows us to take cold-plate liquid cooling into an existing air-cooled data center, where the Maia 100 AI accelerators are in the cabinet sitting right next to it, and the cooling pipes are going into the Maia cabinet to cool the accelerators themselves. Do they mount, like, some big metal plate, and then the metal plate is liquid cooled, or something like that? Yeah, it's effectively that. There's a plate on top of the processor, and then liquid is going through that. So I'm actually going to show pictures of the inside of the Maia system tomorrow in my AI innovation closing keynote. But I think the takeaway here is that at the scale we're at, and with the efficiency gains that you might get from even a few percentage points, we're exploring everything at the same time. Like single-phase liquid immersion cooling, still exploring that, and then how to do cold plate more efficiently.
22:44 I'll also be showing something called microfluidics that we're exploring, which is much more efficient than just pure liquid cold plate. A cold plate is just putting the plate, like you just said, right on top of the processor, and so the water's taking the heat away. But if we can put the liquid right onto the processor... Are we talking channels in the processor? Around the processor. Okay.
23:08 Just flow it right on top of it. And so that's something we're calling microfluidics, and I'll show that and talk a little bit about it tomorrow too. It offers much more efficient cooling, and it's not prime time yet, but it looks incredibly promising. That looks awesome.
23:23 This portion of Talk Python to Me is brought to you by the PyBites Python Developer Mindset Program. It's run by my two friends and frequent guests, Bob Belderbos and Julian Sequeira, and instead of me telling you about it, let's hear them describe their program. Happy New Year! As we step into 2024, it's time to reflect. Think back to last year: what did you achieve with Python? If you're feeling like you haven't made the progress you wanted and procrastination got the best of you, it's not too late. This year can be different. This year can be your year of Python mastery. At PyBites we understand the journey of learning Python. Our coaching program is tailor-made to help you break through barriers and truly excel. Don't let another year slip by with unmet goals.
24:11 Join us at PyBites and let's make 2024 the year you conquer Python. Check out PDM Today, our flagship coaching program, and let's chat about your Python journey.
24:22 Apply for the Python Developer Mindset today. It's quick and free to apply. The link is in your podcast player show notes. Thanks to PyBites for sponsoring the show. One of the things I saw in the opening keynote, I don't know if it fits into what you were just talking about or if it's another of those things, was they actually had the whole motherboard submerged, and even just the entire computer is just underwater. So that was two-phase liquid immersion cooling, like I mentioned. Yeah. Just dunking the whole thing in the dielectric fluid. And you had it boil at a low temperature, I guess, because the phase change is, like, an extremely energy-intense heat exchange. Yeah, and it's actually called two-phase because of the boiling: it phase-changes into gas and then condenses again back into liquid. So that was the idea behind two phases. I see. Instead of just running a radiator, you almost condense it back somewhere else and then bring it back around. Yeah. Okay. Cool.
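For a sense of why boiling beats plain warming as a heat exchange, here's a back-of-the-envelope comparison; the fluid properties are assumed ballpark figures for an engineered dielectric coolant, not numbers from the episode:

```python
# Back-of-the-envelope: why two-phase (boiling) cooling moves so much
# heat. Fluid properties are assumed ballpark values for an engineered
# dielectric coolant -- not figures from the episode.
LATENT_HEAT_J_PER_KG = 140_000    # energy absorbed vaporizing 1 kg
SPECIFIC_HEAT_J_PER_KG_K = 1_100  # energy warming 1 kg of liquid by 1 K

mass_kg = 1.0
delta_t_k = 10.0  # warm the liquid 10 K without boiling it

sensible = mass_kg * SPECIFIC_HEAT_J_PER_KG_K * delta_t_k
phase_change = mass_kg * LATENT_HEAT_J_PER_KG

print(f"Warming 1 kg by 10 K absorbs ~{sensible / 1000:.0f} kJ")
print(f"Boiling 1 kg absorbs ~{phase_change / 1000:.0f} kJ, "
      f"about {phase_change / sensible:.0f}x more")
```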
So if we go and run our Python code, whether it's PaaS or IaaS or whatever, what's the chance it's hitting that, or is this kind of cutting-edge stuff reserved for high-energy AI training? Is it more like, if we ask ChatGPT, it's liquid cooled? Our standard data centers right now are air-cooled. So it's air-cooled servers. This Maia part is liquid cooled. So with that, you know, we've got some of our own workloads now starting to leverage Maia. Do you put your own workloads on first, just in case? Yeah.
25:47 Well, it's just to, like, shake it out. Yeah. Yeah. Piloting it. Yeah. You're not offering that up to the big customers right away. And so Maia, I don't know how many people know about this either. This is one of the GPU training systems you guys have. I mean, for those who don't know, OpenAI and ChatGPT run on Azure, which probably takes a couple of cores, a couple of GPUs, to make happen. Yeah. I want to talk about that. Yeah. So right now our fleet, our large-scale AI supercomputing fleet, is made up of NVIDIA parts. The original generation, that we trained GPT-3 on with OpenAI, was NVIDIA V100s, and then we introduced A100s, which is what GPT-4 was trained on.
26:32 And these are graphics cards, like 4080s or something, but specifically for AI, right? Yeah. Okay. That's right. And then the current generation of supercomputer we're building for OpenAI, for training the next generation of their model, that's NVIDIA H100 GPUs. Then Maia is our own custom AI accelerator. So it's not a GPU.
26:51 You know, one of the aspects of NVIDIA's parts has been their GPU base. So they can also do texture mapping, for example. But you don't need that if you're just doing pure AI workloads. So back to that specialization, right?
27:05 So if you could build it just for the one thing, maybe you build it slightly differently. That's right. So Maia is designed purely for matrix operations, in fact low-precision matrix operations, used for AI training and inference. And so that is the specialized part that we've created, called Maia 100, the first generation of that.
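To illustrate what low-precision matrix math saves, and what it gives up, here's a small NumPy sketch comparing the same multiply at float32 and float16; it's only an analogy for what a dedicated accelerator does in hardware, and the sizes are arbitrary:

```python
# The trade-off behind low-precision matrix operations: half the
# memory (and, on real accelerators, far more throughput) in exchange
# for a small numerical error that AI workloads usually tolerate.
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((512, 512)).astype(np.float32)
b = rng.standard_normal((512, 512)).astype(np.float32)

full = a @ b  # float32 reference result
low = (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float32)

rel_err = np.abs(full - low).max() / np.abs(full).max()
print(f"fp16 uses {a.astype(np.float16).nbytes // 1024} KiB per matrix "
      f"vs {a.nbytes // 1024} KiB for fp32")
print(f"Worst-case relative error: {rel_err:.4f}")
```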
Well, if I think of some of the stuff presented at the opening keynote and here, I think the word AI was said a record number of times. Yeah, I don't think there was a topic that AI wasn't a part of.
27:37 And so how much is this changing things for you guys? Like, 12 months ago or something, ChatGPT appeared on the scene. I mean, it's literally changing everything. You know, Jensen was saying this is the biggest thing since the internet. Yeah, Jensen being the CEO of NVIDIA. Yeah, yeah, who was on stage with Satya at the keynote. It is changing everything. It's changing not just the product offerings.
28:01 So the way that we, you know, integrate AI into the products using Copilot. It's changing the way we develop the products as well, and the way that we run our systems internally already. So, for example, incident management. We've got Copilot, you know, our own Copilot, built into that internally. So somebody that's responding to an issue in our production systems can say, okay, so what's going on? What happened with this? Show me the graph of this. You know, just be able to use human language to get caught up on what's going on.
28:29 People tell me it's just statistics, just prediction. It doesn't feel like prediction. You know, the people that say that, I think, are missing the scale of the statistics. And we're probably predicting a little bit ourselves.
28:44 Thinking about what you're gonna say next. That's what we are, we're statistical.
28:48 And so it's just, once you get statistics at a large enough scale, you start to see something that looks like what we call intelligence. Yeah, yeah, it's really incredible. I'm starting to use it to just write my git commit logs for me. You know, push a button and it says, oh, you added error handling to the background task, so in case this fails, you'll be more resilient and it'll keep running. Like, that's better than I could have written.
29:15 But then, you know, you just push the button and it's just magic. It's magical. Yeah, it really is. I mean, it's called Copilot for a reason, because we're not at the point yet where you can just let it do what it does autonomously. You need to check its work. Like, you need to look at it and say, oops, you know, that time it screwed up, it didn't get it quite right, or I need to add more context to this than it had. But as far as accelerating work, it's just a game-changer. Yeah, it really, really is.
29:44 So before we run out of time, I want to ask you just a couple more things, a bit of a diversion. So, Python. We saw Python appear in the keynote. They were showing off, I can't remember who, it wasn't Satya, it was whoever followed him. They said, look, we want to show off our new sharing of the insanely large GPUs for machine learning.
30:01 Let's pull up some Python and a Jupyter notebook and I'll just check that out.
30:05 And you're like, where are we again? Yeah, really interesting. So, you know, you said you're using a little bit of Python yourself. What does Python look like in your world? Well, the reason that I'm using Python is I took a sabbatical this summer, and I was like, I'm gonna do some AI research. So I got connected with an AI researcher. In fact, I'm gonna talk about some of the work that came out of it at my keynote tomorrow. Obviously, AI is completely Python these days. Yeah, almost entirely. So I spent the whole summer, and I still am spending my time, in Python Jupyter notebooks, and then Python scripts when you want to do a run for a final result. I hadn't really used Python before, other than in passing. I mean, it's a language that's very easy to pick up. Yeah, there's a t-shirt that says, I learned Python, it was a good weekend. Yeah, it's a bit of a joke, but it's a good joke, you know. Yeah, and I think that's what makes it so powerful, that it's so easy to pick up.
30:56 But what's made it even easier for me to pick up: I'd say that I'm a mediocre Python programmer, but I'm using Copilot, and that's made me an expert Python coder. How do I do this? Yeah, and it's like, I don't go to Stack Overflow for questions. I haven't had to get a book on Python. I basically just either ask Copilot explicitly, like, how do I do this, or write me this, or I put it in the function or in the comment, and it gets it done for me. And occasionally I'll have to go hand-edit it and figure out what's going on, but for the most part, it is writing almost all my code. And so my goal is, how can I just have it write everything for me? So that has kind of become the way that I program in Python. And I think about Python and AI, and the knowledge Copilot has of Python, because OpenAI, obviously for their own purposes, has made GPT-4, and GPT-3.5 before it, really know Python. I hadn't really thought of that connection, but of course they wanted it to answer Python questions, I'm sure. For themselves. So I think when it comes to seeing what AI can do for programming, Python is at the forefront of that. What was your impression of it? I mean, I'm sure you'd probably seen it before, but what's your impression working in it, coming from a curly-brace, semicolon type language like C++ or something? Like, this is weird, they drop a bunch of parentheses, they have these tabs, these four-spaces rules. Well, it's, you know, the YAML-versus-JSON thing. But I mean, I've gotten used to it; it's not a big deal. I find it's less verbose than C. There's less typing. Less symbol noise. You get the essence. Yeah. Yeah. Yeah, I kind of had that experience as well, coming from a C-based language, like, wow, this is really weird, and then after I went back to C#, I'm like, but this is also weird, and I kind of like this clarity over here, so now what do I do with life? And then you go back, and, like, semicolons are annoying now. Yes, exactly. Like, I thought they were needed; they're not needed; what's going on? Another thing that I think, you know, maybe people will really enjoy hearing a bit about, and I'm a big fan of: you wrote a three-part series of novels about computer hackers, called Zero Day. Mm-hmm. Really good. I read all of them back when they came out, and so much of this, like, computer mystery stuff is like, oh, they're using VB6, I'm gonna get their IP address. You're like, wait, what? Those words are meaningful, but the sense is not, right? And, you know, your books are, like, a lot of sort of spy stuff, but also a lot of really cool, legit, reasonably possible computer stuff. Yeah, tell people a quick bit about that. I love cyber, I loved thrillers growing up, techno-thrillers. I read The Andromeda Strain when I was, like, in seventh grade. I was like, this book is so cool, because I'm learning science, plus, yeah, it's really exciting. So I've always wanted to write one, and then coming into the late 1990s, when you started to see some of these large-scale viruses, I was just thinking, this is such a powerful weapon for somebody to cause destruction. And then 9/11 happened, and I'm like, all right, the logical next step is leveraging a cyber weapon to do something with the same goals. And so that's what led me to write Zero Day, which takes that idea of using a cyber weapon for terrorism.
Then I was like, oh, that book was really well received, I had a lot of fun doing it, so let me write the next one. And I wanted to continue in this theme with the same characters, Daryl Haugen and Jeff Aiken, and say, what's another cybersecurity angle that I can take a look at in the second one? So the second one was state-sponsored cyber espionage. And actually, the ironic thing is, I'd already had Iran in the story, I'd already had China in the story, I had people trying to figure out how to get Iran a nuclear weapon, and then Stuxnet happened right when I was still writing the book. I'm like, okay, this is like a stolen part of my plot line. So I had to change the book a little to acknowledge Stuxnet happening. And then the third one was about insider threats, which I think is one of the toughest threats to deal with. In this case, it was a long-range plot from some people that wanted to compromise the stock exchange, and kind of a mixture of high-frequency trading and insider threat with cybersecurity systems was the third one, called Rogue Code. Mm-hmm, yeah, so they were all really good, I really enjoyed them. Were you a fan of Mr. Robot, did you ever watch that series? I did, I loved that series. Yeah, oh my gosh, again, it seemed pretty plausible. Yeah, and I imagine a lot of people out there have seen Mr. Robot as well, if they want, you know, that kind of idea, but in a series they can binge. Yeah, cool. Maybe we should wrap up our chat here, but just a quick look at some of the future things you've talked about, like rapidly deploying some of these data centers, and some of these ballast systems, and even, like, disaggregated rack architecture. Do you have, instead of a GPU alongside a server, like, a rack of GPUs and then optical connections to a rack of servers? Give us a sense of where some of this stuff is going. Yeah, so that's some of the stuff that we're exploring. Like I mentioned, we're taking a look at lots of different ways to re-architect the data center to be more efficient, and one of the ways that you get efficiency, and reduced fragmentation, is by having larger pools to allocate resources from. If you think about allocating a virtual machine on a server, how much RAM can you give it at most? Well, as much as is sitting on the server.
36:22 How many GPUs can you attach to it? Well, as many as are attached to that server.
36:26 Right, how many PCIe slots have you got? Yeah. So, but if you think about, I've got a large resource pool, a whole group of GPUs, then I'd be able to give it as many GPUs as you want. 50? Yeah, ask for 50. Exactly. The benefits of pooling for resource allocation are that you reduce fragmentation and you get more flexibility. So we've been trying to explore how we can do this, from rack-scale disaggregation: saying there's a whole bunch of SSDs at the top of the rack, then there's a bunch of GPUs, and then there's a bunch of CPU cores, and let's just compose the system dynamically.
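Here's a toy illustration, with made-up numbers, of the fragmentation point: sixteen free GPUs stranded four per server can't satisfy a single twelve-GPU request, while the same sixteen in one rack-level pool can:

```python
# Stranded vs. pooled resources (made-up numbers, not Azure data).
# 16 free GPUs either way -- only the pooled layout satisfies a
# single 12-GPU request.
servers = [4, 4, 4, 4]   # free GPUs stranded on individual servers
pool = sum(servers)      # the same 16 GPUs in one rack-level pool

request = 12  # GPUs wanted by one VM

per_server_ok = any(free >= request for free in servers)
pooled_ok = pool >= request

print(f"Per-server allocation satisfies {request} GPUs: {per_server_ok}")  # False
print(f"Pooled allocation satisfies {request} GPUs: {pooled_ok}")          # True
```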
There's a bunch of challenges from a resiliency perspective, like how do you prevent one failure in the GPU part of the system from bringing down the whole rack, for example. And there's latency and bandwidth challenges. When you're sitting there on the PCIe bus, you get a whole bunch of bandwidth and you get very low latency.
37:16 If you're going across the rack, you might have the same bandwidth, you might have lower bandwidth, just because you can't deliver that much bandwidth out of the GPUs. Right, all the systems are optimized to make assumptions about these numbers. Exactly, and your latency is gonna be higher, and some workloads can't tolerate that latency. So we've been exploring disaggregated memory, disaggregated GPUs. I've shown demos of both of them. We're still exploring those; we're not, you know, it's not ready for production. Which is harder?
37:42 Disaggregated memory or GPUs? Yeah, I would guess memory, but I have zero experience. Memory is challenging, because there are certain GPU workloads that aren't so latency sensitive, like AI training. Sure, like a batch job sort of thing.
37:57 Yeah, but when it comes to memory, you almost always see the latency. And so what we think we can do is get remote memory down to NUMA latency, you know, speaking of non-uniform memory architecture, down to that level, and a lot of applications can tolerate that. Okay. And so we have memory tiering, where you've got close memory that's on the system, and then remote memory, which is like NUMA latency. Kind of like an L2 cache, but a bigger idea of it.
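A sketch of that tiering idea; the latency numbers are illustrative orders of magnitude, not measured Azure figures:

```python
# Memory tiering sketch: hot pages live in close (local) memory, cold
# pages spill to remote, NUMA-like-latency memory. Latencies are
# illustrative orders of magnitude, not measured Azure figures.
TIERS_NS = {
    "local DRAM":  100,  # same socket
    "remote pool": 400,  # roughly a NUMA hop away, across the rack
}

def effective_latency_ns(hot_fraction):
    """Average access latency when hot_fraction of accesses hit local DRAM."""
    return (hot_fraction * TIERS_NS["local DRAM"]
            + (1 - hot_fraction) * TIERS_NS["remote pool"])

for hot in (0.95, 0.80, 0.50):
    print(f"{hot:.0%} local hits -> ~{effective_latency_ns(hot):.0f} ns average")
```

The better the tiering layer is at keeping hot pages local, the closer the blend gets to plain local DRAM, which is why a lot of applications can tolerate it.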
Very, very cool. I think you guys are doing such neat stuff. And when you see these hyperscale clouds, I think a lot of what people see is the insane dashboard of choices. Like, do I do routing, do I do firewalls, do I do VPCs, do I do PaaS, IaaS, what do I do? But oftentimes they don't really think about, well, you're getting a slice of this giant server, and, you know, maybe someday it'll live under the ocean or whatever, right? So it's really cool, yeah. What you're seeing is, the cloud started with a few basic building blocks, and then we started to explore lots of different directions, creating lots of different PaaS services for compute, and then different data offerings. I think in this space, again, it comes down to the workload. Do you need a key-value store, do you need a vectorized database, and do you need any of those to be extreme performance? Because if you need extreme performance, go for the designed-for-purpose vectorized database. If you want key-value with vectorization, but it's okay if the vectorization isn't the fastest possible, you know, you can go use this other offering. So that's why the list of options continues to expand: just because every workload says, I need this, and that's the most important thing to me, and everyone else says, no, I need this, that's the most important thing, and others are like, I don't care. Well, as it becomes the mainframe of the world, right, there's a lot of different types of apps running on it. Yeah, yeah. Awesome. All right, Mark, final call to action. People maybe want to learn more about some of the stuff we saw here, see some pictures, or maybe also just do more with Azure. What do you say? So, a couple things. One is, I've been doing a series of Azure innovation talks at Build and Ignite sessions, so go back to the last Build and you'll see the most recent one of those. And then at this Ignite I'm doing one that's just looking at AI-related innovation, so that's on Friday, tomorrow, here at Ignite, and it'll be available on demand. That's awesome. Yeah, I'll grab the links to some of those and put them in the show notes for people. Excellent. Well, thanks so much for being on the show. All right, thanks for having me. This has been another episode of Talk Python to Me. Thank you to our sponsors; be sure to check out what they're offering, it really helps support the show. This episode is sponsored by Posit Connect from the makers of Shiny. Publish, share, and deploy all of your data projects that you're creating using Python. Streamlit, Dash, Shiny, Bokeh, FastAPI, Flask, Quarto, Reports, Dashboards, and APIs. Posit Connect supports all of them. Try Posit Connect for free by going to talkpython.fm/posit. P-O-S-I-T. Are you ready to level up your Python career, and could you use a little bit of personal and individualized guidance to do so? Check out the PyBites Python Developer Mindset Program at talkpython.fm/pdm. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics
like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm. Be sure to subscribe to the show: open your favorite podcast app and search for Python; we should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code. [MUSIC]