
#320: Python in the Electrical Energy Sector Transcript

Recorded on Sunday, Jun 6, 2021.

00:00 In this episode, we cover how Python is being used to understand the electrical markets and grid in Australia. Our guest, Jack Simpson, has used Python to uncover a bunch of interesting developments as the country has adopted more and more solar energy. We round out the episode looking at some of the best practices for high-performance large data processing in pandas and beyond. In addition to that, we also spend some time on how Jack used Python and OpenCV computer vision to automate the study of massive bee colonies and behaviors. Spoiler alert: that involved gluing Wingdings fonts on the backs of bees. This is Talk Python To Me, Episode 320, recorded June 6, 2021.

00:52 Welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm '@mkennedy', and keep up with the show and listen to past episodes at 'talkpython.fm'. And follow the show on Twitter via '@talkpython'. This episode is brought to you by 'Square' and 'Linode', and the transcripts were provided by 'AssemblyAI'. Please check out what all three of them are offering. It really helps support the show. Jack, welcome to Talk Python To Me.

01:21 Thank you, Michael, it's great to meet you after hearing your voice for so many years.

01:25 It's so great to have you on the show. It's always fun to have people who are listeners but have interesting stories to tell come on the show. And you know, you definitely have some interesting stories about the energy grid and doing data science around really important stuff like keeping the lights on in Australia. Absolutely. Yeah, I'm definitely looking forward to diving into that stuff. It's gonna be a lot of fun, I think. Absolutely. Before we get to it, though, let's start with your story. How did you get into programming and what brought you to Python?

01:51 Yeah, well, I guess I have a very strange background. I actually started off at university enrolling in journalism and politics, right out of the GFC. I had never programmed before, and I didn't even realize I was interested in it. And my lecturers kept telling me how many journalists were losing their jobs through the financial crisis. And so I actually dropped out and was trying to consider what I wanted to do. And I always had a passion for biology and science. And my hobby was, I was actually a beekeeper. I had six hives at home. I really loved that.

02:27 Oh, amazing. Are these like honey bee type of bees? Or what kind are they?

02:30 Honey bees, yes, absolutely. The ones that sting. I actually also had a couple of hives of Australian native bees, 'Tetragonula carbonaria'. I'm not sure if you've heard of them before, but they almost look like tiny little flies. They're stingless bees, and they will actually make their hives out of the resin of trees. And they will build their brood in this beautiful kind of spiral pattern going up through the hive. And so I was just really interested in, I guess, all things bee and insect related at the time. And so I actually started blogging about bees and beekeeping. And that was actually my introduction to code, because I think I had a website on Blogger. And one day, I suddenly thought, well, I'd love to actually be able to make my own website. How do I do that? And so I started learning HTML and JavaScript. And it was literally just so I could talk about my bees. I had no interest in programming.

03:26 Amazing. Well, I think so many people get into programming that way, who don't necessarily feel like "my goal is to go be a programmer", but they just really have something they're into. And programming is almost in the way, right? It's just like something you've got to figure out so that you can actually get to the thing that you actually like. But then a lot of people find out, well, hey, this is actually kind of cool. And what else can I do now that I know this? Right?

03:46 Absolutely. That was really what made me change my degree. So initially, I was going to do a pure biology degree. And so I decided I would do biology and web development. And as I kind of went along with the degree, I learned, you know, PHP, Perl, and Python, and suddenly I realized that the programming skills I was picking up could actually help me with working with scientific data, because we'd kind of hit this point where there's just so much genomic data. That was at the time I was doing undergrad, but one of the things I've really noticed is that most people these days who enroll in a biology PhD join the lab, and it's almost like, right, you're learning Python, or you're learning R; there's no other way you're working with this data. And so suddenly, I kind of hit this point where it was like, wow, these kind of technical skills were letting me do things and be useful in ways that I never thought, and it let me answer research questions that I was really fascinated by. And that was my motivation to actually go and do a PhD, and then try and take those skills further. What was your PhD in? So it was in computational biology. It was trying to develop software to automate the analysis of honeybee behavior in the hive. So the thing that was interesting was it was both a physical setup and the code as well. So the physical side was actually, how do we set up a beehive in a building with kind of like a glass window in it so that I can film them with an infrared camera in the dark? And how do I put little tags with patterns on them on the backs of the bees, which I can then use Python and machine learning to identify and track over the course of several weeks? And that kind of process ended up being much harder than I had anticipated. Because when I started out, I read a couple of papers by some computer scientists who mentioned that they'd printed out some card tags. And they said that they filmed bees for a couple of hours, got the data, and did an analysis. And I thought, great, I'm going to do that, problem solved. It was only later that I realized the reason that they only filmed them for a couple of hours was because that was how long it took the bees to chew the cardboard off each other in the hive.

06:02 No! Did they help each other, like, hey, I've got this thing on my back, get it off me?

06:05 Yes, yes, they actually did. And that was the thing I would find time and time again: I would come up with a material, and I would try and stick it on the back of the bees, and you would see their friends effectively come over and start trying to pry it off them in the hive. And so it was actually a process to find something that didn't, I guess, trigger them, so to speak. And one really immensely frustrating experience I had when I was doing these experiments was I thought I had found the perfect fabric and the perfect glue to put them on the bees. And I'd spent hours tagging hundreds of them. I put them into the hive, and then I would come back a few hours later and all my tags had disappeared. And I couldn't understand why. And I kept doing it. And then at one point, I thought, you know what, I'm going to put a bucket outside the hive entrance just to see what happens, and I'm going to watch in the dark. And what actually happened was the bees didn't like the smell of the glue. So they were actually physically grabbing bees that I'd tagged, dragging them to the entrance, and flinging them out of the hive. And because these bees were juvenile bees, too young to fly yet, the ants were actually dragging them away. So I thought my tags were dropping off or being pulled off. But actually, my poor bees were getting eaten by the ants, because they couldn't fly away. My goodness. Right. So I guess it's another example as well: you know, when you've got missing data, understand the process. Sometimes the process that made that data missing is significant.

07:33 No, I would have never guessed that's really pretty insane, actually.

07:37 And so the solution for dealing with this was I would actually go through the process of tagging the bees, then I would put them in this heated incubator on a frame of honey for several hours, until all of the smell had kind of faded away. And then I could introduce them to the hive and they would be accepted.

07:55 I see. Okay, so basically wait till that dried. That's really awesome.

08:00 Yeah, well, wait until, yeah, all the fumes were gone. And then they would be accepted, and then it would work. Because I think this was the real challenge of my project: we weren't just interested in showing that we could write software that could track the bees. We had a specific application, which was to look at the social development of these bees over several weeks. So we needed a kind of experimental setup, and the code to support it, that would let us look at these extended periods of behavior.

08:27 Do they have different markings based on, like, their age or their role in the colony? Or were they all tagged the same, and you just said, well, they kind of move around like this? Or did you group them or something?

08:40 What I would often do is I would use a laser engraver to burn patterns into the fabric that I would put on them. And so each bee had a unique pattern that I could use to identify it.

08:50 Like a QR code on the bee?

08:52 Oh, kind of like that. But no, in fact, I think if you scroll through the website to the bottom of the page, there are some little patterns. These were some initial prototypes at the bottom. Just scroll up a little bit more. The last image. Yep, that one. I was literally using the Wingdings font to try out different patterns

09:09 Wingdings. Okay.

09:11 on the bees, because the idea was to have a relatively inexpensive 4K camera that could pick up the different patterns. Of course, if you had a really expensive high-resolution camera, then you could do more with QR codes, for instance. And what I would do is I would do these experiments where the bees that I would introduce would all be juvenile, except I would also mark the queen so I could know how they were interacting with her. Half the juvenile bees I would introduce into the hive would receive a label that I could reference later on, and the other half would receive a different label that I knew about. And the reason I did this was so I could have control and treatment groups in my experiment, because I would do these experiments where I would treat the bees with caffeine to see how it would actually affect their social development in the hive. I guess to give a little bit more context to that, I'll dig a little bit into the way that bees develop. You can think of a worker bee in the hive, like the pictures I have on my site: the jobs that a bee does over its lifetime are influenced by how old it is. So these juvenile bees, when first introduced into the colony, would really just have quite menial roles. They'd do little cleaning tasks around the hive; they wouldn't do much. Then when they're a little bit older, they would start nursing other juvenile bees. And then the eldest bees are the ones that you actually see out and about, flying and collecting nectar and pollen. So those are actually the eldest of the bees in the colony, typically. And so I wanted to see how caffeine would affect that kind of behavioral process in the juvenile bees.

10:53 How interesting. Just briefly, what did you find that caffeine does to bees?

10:54 One of the things I found was it effectively meant that bees sped up how quickly they adjusted to the rhythms of the colony. So for context, if you're a juvenile bee in the hive, you don't really care about circadian rhythms or day-night cycles, because you're in a hive and it's completely dark all the time. And so what we found, which we hadn't seen before, was that these juvenile bees, even though they weren't exposed to the light and the outside, would actually pick up these circadian rhythms by interacting with the older bees that were coming back. It was effectively like a socially acquired circadian rhythm. And so what we found was that bees that were treated with caffeine effectively picked up this rhythm more quickly than bees that weren't, and kind of progressed in their roles in the colony more quickly as well. Okay, yeah. So that was that, and I had a few other areas. But yeah, to be honest, a lot of the work was really just making the software and the bees all play nice together. I will say one of the things that is quite nice about the energy sector is I don't have to deal with... I guess I can deal with machines, which are a little bit

12:12 Less frustrating at times? More reliable, more predictable, for sure. This portion of Talk Python To Me is brought to you by 'Square'. Payment acceptance can be one of the most painful parts of building a web app for a business. When implementing checkout, you want it to be simple to build, secure, and slick to use. Square's new Web Payments SDK raises the bar in the payment acceptance developer experience and provides a best-in-class interface for merchants and buyers. With it, you can build a customized, branded payment experience and never miss a sale. Deliver a highly responsive payments flow across web and mobile that integrates with credit cards and debit cards, digital wallets like Apple Pay and Google Pay, ACH bank payments, and even gift cards. For more complex transactions, follow-up actions by the customer can include completing a payment authentication step, filling in a credit line application form, or doing background risk checks on the buyer's device. And developers don't even need to know if the payment method requires validation. Square hides the complexity from the seller and guides the buyer through the necessary steps. Getting started with the new Web Payments SDK is easy. Simply include the Web Payments SDK JavaScript, flag an element on the page where you want the payment form to appear, and then attach hooks for your custom behavior. Learn more about integrating with Square's Web Payments SDK at 'talkpython.fm/square', or just click the link in your podcast player show notes. That's 'talkpython.fm/square'. Before we move on to the energy sector, just give us a quick overview of, like, the software that you used. Was Python part of this role here?

13:50 Yes, absolutely. So I used a mix of Python and OpenCV for a lot of the image processing, and of course, TensorFlow and Keras for training my neural network to identify the different types. And that actually ended up being quite an interesting process, building up that data set and improving it over time. Because one of the things I found when I started trying to train on that data set was I thought, okay, I can take my patterns, film them, add a little bit of noise and rotation, and then that's my kind of starter, you know, machine learning model. The problem was that when you put the tag on the bee, the way that they kind of walk around the hive, you'll see different kinds of angles. They kind of have this little wobble walk as they go around, so it introduces this level of distortion to the tag. And then you could also have other situations where bees would walk over each other, so there'd be occluded tags as well. So one of the things I ended up having to do was introduce a class to my model that was literally just, like, the "I don't know what this is" class. And effectively, the idea was, I'm going to see this bee, and I'm going to have multiple attempts to classify this bee as it's walking around. So I want to only attempt a classification when I'm seeing enough of the tag and I'm confident enough in that to attempt it. And so that was one of the techniques I found that helped improve the classification. And really, it ended up just becoming a process where I had a bit of a pipeline that would go through and extract tags, and it would use the model at the current iteration to label them. I would then go in and manually review it, figure out where it stuffed up and where it was doing well, and then use that corrected data set to retrain the model, and then see how well that iteration did. And it became kind of like, almost like a semi-supervised problem to an extent when I was building it out. At a certain point, it became just as good as me at doing these classifications, and then, effectively, it was fully automated as well. Yeah, I think I ended up labeling about seven or eight hundred thousand images as part of doing this. And my wife, who was a PhD student working in genetics at the time, was helping me in her spare time. So she does not look favorably upon that project.

16:24 She probably doesn't love Wingdings fonts. Maybe a bee comes by and she's like, not you again.
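
The "I don't know what this is" class and the multiple classification attempts per bee described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from Jack's project: the label names, the `classify_track` helper, and both thresholds are hypothetical, and the per-frame predictions are assumed to come from some already-trained model.

```python
from collections import Counter

UNKNOWN = "unknown"  # hypothetical label for the "I don't know what this is" class


def classify_track(frame_predictions, min_confidence=0.9, min_votes=3):
    """Combine per-frame (label, confidence) guesses for one tracked bee.

    Frames predicted as UNKNOWN, or predicted with low confidence, are
    ignored. A tag is only assigned once enough confident frames agree.
    """
    votes = Counter(
        label
        for label, confidence in frame_predictions
        if label != UNKNOWN and confidence >= min_confidence
    )
    if not votes:
        return UNKNOWN
    label, count = votes.most_common(1)[0]
    return label if count >= min_votes else UNKNOWN


# Made-up predictions for a single bee across five video frames.
track = [
    ("tag_17", 0.97),
    (UNKNOWN, 0.99),   # tag occluded by another bee
    ("tag_17", 0.95),
    ("tag_42", 0.55),  # distorted view, too uncertain to count
    ("tag_17", 0.93),
]
print(classify_track(track))
```

The design choice here is that declining to answer is cheap, because the same bee will be seen again in later frames, while a wrong answer pollutes the behavioral data.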

16:28 Yeah, absolutely. But I'd say, yeah, so Python and OpenCV were big ones. And then the other tool I was using a lot of was Python's Cython library. For certain parts that I wanted to run really efficiently, I wrote those in C++ and then used Cython to expose some of those methods to Python, and that worked amazingly well. It was so impressive how you could pass a Python list to my C++ class, and it would interpret that as a vector, and then it would pass back the information as well. I think this is the reason I'm such a fan of Python: just how well it lets me do so many different things that I'm working on.

17:02 That's a really interesting point. You know, a lot of people talk about, well, Python is slow for this, or it's slow for that. And yet, here are all these really intensive computational things that Python seems to be the preferred language for. And I think this is one of the hidden secrets that's not apparent as people come into the ecosystem, right? Obviously, people who have been here for a long time kind of know that story. But you know, there are all sorts of people coming into the Python world, drawn in a little bit like you. You talked about how you started out in biology, not necessarily to be in software development specifically, but then you kind of got sucked into it, right? Absolutely. Yeah. I think all of the conversations around the performance of Python are super interesting. It's like, oh, it's really slow, except for in this case, where it's like as fast as C++. Wait a minute, is it slow? Or is it fast? Well, it's both, right? It varies. But you can bring in these extra, like, turbo boosts, right, like Cython, or do your work in NumPy rather than in straight lists, and stuff like that.

18:08 Absolutely. Initially, when I started off my PhD, I actually wrote an initial prototype version of it all in C++, using OpenCV's C++ library and the deep learning library called 'Caffe', which was a bit of a thing back in the day, and the process for dealing with data, and even just converting data between... like, I think the best thing about Python is the fact that NumPy arrays are just understood by all the scientific libraries, whereas sometimes with other languages, it can be painful moving data between different libraries and tools.

18:45 Oh, interesting. Yeah, you're right about that. Yeah.

18:47 So like, I remember at one point during my PhD, with that initial C++ version, I had like a page of code to convert between an OpenCV matrix and a Caffe, I think, blob, and it was a page of code that I was terrified of breaking, because I didn't understand how it worked. Whereas with Python, it was like, I can move between, you know, scikit-learn, pandas, and all these other libraries. It's all kind of got that common foundation that makes me really efficient, and that I understand really well.

19:15 That's a really interesting insight, that there's this sort of common data structure across the libraries. Because you're right, I remember in C++ and other languages like C# and whatnot, this one will take something like this, and you've got to reorder the data and reformat it to pass it over. And if you have to do that back and forth, it completely slows things down. All sorts of stuff. Yeah, very interesting.

19:37 In a way as well, I really loved that the Python stack has let me do things during my PhD and then post PhD as well. Just the skills that I developed in analytics here, I've gone on to be able to use in so many different places. For instance, one of the pieces of analysis I did was I used Python's NetworkX library to look at the social interactions between the queen and worker bees. And I would build out these network graphs that would explore the number of interactions and the length of time of those interactions between the queen and the worker bee. And actually, recording independent interactions became important, because sometimes the queen would literally fall asleep behind another worker, and it would look like she loves that worker, but she was just resting for, like, over an hour or two. What I've actually found is that those skills for working with data and with network analysis carried over: when I was working in consulting, I would use NetworkX to analyze the corporate structure of organizations that we were doing an overview for. And then more recently, I've done work in the energy sector looking at building out networks of power stations as well. And so I think that's one of the things I love about this area: you have this kind of transferable skill set, where you're more limited by what you can think of using it for than by what you can actually do with it.
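
The interaction network described above, with both the number of interactions and their total duration recorded per pair, might look something like the following sketch. It uses only the standard library so it runs anywhere; the bee names, the observation tuples, and the `build_interaction_graph` helper are all made up for illustration and are not taken from the actual study.

```python
from collections import defaultdict

# Hypothetical observations: (bee_a, bee_b, seconds_in_contact)
interactions = [
    ("queen", "worker_03", 12.0),
    ("queen", "worker_07", 4.5),
    ("worker_03", "queen", 8.0),
    ("queen", "worker_11", 5400.0),  # e.g. the queen resting next to a worker
]


def build_interaction_graph(records):
    """Aggregate contacts into undirected edges with a count and total duration."""
    edges = defaultdict(lambda: {"count": 0, "seconds": 0.0})
    for a, b, seconds in records:
        key = tuple(sorted((a, b)))  # undirected: (a, b) and (b, a) share an edge
        edges[key]["count"] += 1
        edges[key]["seconds"] += seconds
    return dict(edges)


graph = build_interaction_graph(interactions)
for (a, b), stats in sorted(graph.items()):
    print(a, b, stats["count"], stats["seconds"])
```

Keeping the count separate from the duration is what lets a single ninety-minute rest stand out from many short contacts. Each aggregated edge could then be loaded into NetworkX with `G.add_edge(a, b, count=..., seconds=...)` for graph analysis.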

21:02 Yeah, absolutely. And I think for a lot of people, if they're out there listening, and they're doing, you know, academic type stuff, or working in one area, but maybe that's not the area they necessarily want to stay in, a lot of these skills are super transferable. One of the things that's blown my mind as I've spent more and more time in the software industry: I remember I was doing professional training, and I spent one week at a stock brokerage in New York City teaching programming. And then two weeks later, I was at, like, an Air Force base, working with some of the engineers there. The stuff that those two groups need to know, it sounds like it's entirely different worlds, right? But it's like 90% identical. It's just a little bit of, what do you do with that once you know it? Like, what's the secret sauce on top of it that puts it together? But yeah, it sounds like you kind of got that skill in your research.

21:52 Absolutely. And I think this is one of the things I've noticed: some PhDs can struggle to transition into industry. And often it's because there are people on the industry side who don't really understand how those skills can help them. But at the same time, I think it's actually a skill to be able to explain how you can link what you already know and what you're capable of to solving their kind of business problems. And in fact, when I went into management consulting and did some work for some of the partners, it took me a little while to figure out that they weren't that interested in, you know, the code I was writing, or even some of the raw data. But if I could figure out a way to link that to the business problem that we were trying to solve, then they were interested. Being able to kind of communicate and act as, like, a bridge between those was something I didn't realize was a skill, but it is hugely valuable in organizations, I've really noticed.

22:47 Yeah, absolutely. All right. One final question about your research before we get into the energy sector: what years did you do that work?

22:54 Oh, so 2014 to 2017.

22:57 Yeah, and that's not that long ago. And yet, the machine learning story has probably progressed really quite a bit, with deep learning, transfer learning, all sorts of stuff going on, the different use of GPUs and tensor compute units and whatnot. What would it look like now if you were doing it, versus then? What would be different?

23:17 I think one of the big differences was really that TensorFlow only came out towards the second half of my PhD. So I think that was a difference: having more accessible machine learning libraries and tools really made a big difference. The other one was, you know, now, if you started your PhD and you were doing image analysis, it's going to be deep learning. Whereas when I started, I was actually pointed in the direction of, oh, go check out, you know, support vector machines, try out a random forest, try out a whole bunch of different feature engineering and machine learning techniques. And so I spent a lot of time kind of moving around between those before I got in touch with a researcher in the computer science department, because I was in the biology department doing this work. I had a chat with him, and he literally looked at what I was doing and said, use deep learning. And he said, go check out these libraries, but this is what you need to do it. And I think, yeah, in a way, the libraries and the understanding about how you would solve this problem now are a lot further along, and probably would have shortcut a lot of my initial frustration.

24:34 Yeah, probably, but think of all the lessons you've learned with those nights of it not working, and whatnot. Right. One other thing really quickly: I love to look at this graph here, the Stack Overflow Trends, and I'll link to this in the show notes. There was an article back in 2017 by Stack Overflow, by their data science team, called 'The Incredible Growth of Python', and they predicted, oh, Python's gonna overtake some of these languages, and you're not going to believe it, it's going to be more popular than JavaScript, more popular than Java. Like, no way, there's got to be something wrong with the data. And obviously, here we are now in 2021, where I think they underestimated it. Honestly, I don't have the exact picture in my mind, but I'm pretty sure they underestimated the last couple of years, which is pretty interesting. But that's not what I want to talk about. What I want to talk about is that in 2012, you know, Python had been around for, at that time, 25 years or something. It was well known. It was a fairly popular language, but it was kind of just steady state. And then it's like somebody just lit the afterburner on that language, and it just started going up and up. Right around that time is when you got into Python as well, more or less, right? Absolutely. I feel like so many people came from these non-traditional programming spaces. I mean, still interested in programming, but not like a CS degree type of programming, and it just brought so much diversity in terms of the problems being solved. And I think this graph is exactly what's happening here. It sounds like you're part of making that curve go up there.

26:07 Yes. Yeah, I guess so. And I think, for me as well, it was when pandas came out, I think around 2012, for working with, you know, data frames as objects. I used R, and I really liked that kind of data frame feature in R initially, and it was a little bit frustrating before pandas was a thing, having to deal with, you know, CSV files and having to treat them as lists and indexing. So when pandas became a thing, that was almost one of the big reasons I pushed into using Python for so much. And I feel like I've been using pandas for, I guess, eight or nine years, and I'm pretty sure on the project I'm on currently, I've learned a few extra things about the library just in the last couple of weeks.

26:47 Yeah, it's crazy how that works, right? Like, I've been doing this forever. How did I not know about this part of it? Right. Absolutely. Amazing. Amazing. All right. Well, super cool project you had there. Let's talk about energy. So you work for the Australian Energy Market Commission? Yes. Yeah, so what do you do there?

27:03 You can think of them as the rule maker for the energy market. We don't run the energy market. That's the Australian Energy Market Operator. But the commission effectively passes the legislation that determines how people have to act within the energy market. The reason I really joined the organization was because when I was working in consulting, I started doing work in the energy sector. I'd do work for, you know, energy retailers, the people that you pay for your electricity, and I'd do work for some industrial companies. And one of the things I found was, when I bumped into the wholesale energy data, it was always like this, what was it, you know, the City of Gold in some way. It was immense amounts of reasonably well structured and cleaned data, where the limitation wasn't, you know, the data or cleaning it; the limitation was understanding the domain well enough to do interesting things with it. Right. Okay. And so that really became my obsession: to learn as much as I could, so I could actually do more and more interesting things with the data. And so the reason I joined the AEMC was because it's one of the most amazing workplaces in terms of capability. Everyone there is so passionate and incredible at what they do. And so just being around these people and learning from them is an experience in itself.

28:25 Yeah, fantastic. It sounds super interesting. It sounds like things like your network experience, yeah, there are probably a lot of networks and energy and suppliers and whatnot there that might go together. Absolutely. Yeah. Basically, there's kind of a market where they set the price of energy, and then generators, like private companies that, you know, have power plants and solar farms and whatnot, can decide whether or not they want to participate at that very moment in the grid, or how does it work?

28:54 Yeah, absolutely. So yeah, this is one of the fascinating things about the wholesale energy market. You can almost think that every five minutes, the market operator is effectively running an auction, where all the power stations on the east coast of Australia are submitting bids for how much they are willing to sell different volumes of electricity at. So for instance, a wind farm might say that they will sell this volume of power quite cheaply, whereas a gas generator that has quite a high cost of fuel will set a higher price. And the market operator will take all of these bids, and it knows the locations of these generators, it knows the capabilities of the transmission lines and the network, and it will run this linear optimization to figure out, okay, what is the cheapest mix of generators that I should dispatch to satisfy demand, while still making sure that the network is secure?
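
A heavily simplified sketch of the auction idea described above: sort the bids from cheapest to most expensive and fill demand in that order. This is not the market operator's actual algorithm, which Jack describes as a linear optimization that also accounts for generator locations, transmission limits, and network security. The generator names, prices, volumes, the `dispatch` helper, and the choice to report the last accepted bid's price as the clearing price are all assumptions made for illustration.

```python
def dispatch(bids, demand_mw):
    """Fill demand from the cheapest bid upward.

    bids is a list of (generator, price_per_mwh, volume_mw) tuples.
    Returns (schedule, clearing_price), where schedule maps each
    dispatched generator to the megawatts taken from it.
    """
    schedule = {}
    remaining = demand_mw
    clearing_price = None
    for name, price, volume_mw in sorted(bids, key=lambda bid: bid[1]):
        if remaining <= 0:
            break
        taken = min(volume_mw, remaining)
        schedule[name] = taken
        remaining -= taken
        clearing_price = price  # price of the last (marginal) bid dispatched
    if remaining > 0:
        raise ValueError("bids cannot cover demand")
    return schedule, clearing_price


# Made-up bids for one five-minute interval.
bids = [
    ("gas_peaker", 150.0, 300),
    ("wind_farm", 5.0, 200),
    ("coal_unit", 40.0, 500),
]
schedule, price = dispatch(bids, demand_mw=600)
print(schedule, price)
```

In this toy interval the wind farm is dispatched in full, the coal unit covers the rest, and the expensive gas peaker is not needed. A real dispatch engine can deviate from this simple ordering whenever a network constraint prevents the cheapest generator from delivering its power.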

29:49 Okay. So it's like trying to optimize certain goals. Like, we are going to need however much energy in the grid at this very moment, and these people are willing to supply it at this price. So, you know, who do we take however much energy from until we get both enough people that are willing to participate from a financial perspective, and then what people also need?

30:08 Yes, absolutely. And that's the thing that's so fascinating about this market is that at all times, supply and demand

30:15 have to be matched very, very carefully, because it'll break the grid. If there's too much, it's probably worse than too little. Because you just get a brownout, right, but too much could destroy things, right?

30:26 Yeah, you don't want too much. If you have too much, then you need generators to start to try and reduce their output. And at the same time, if you have too little, then it can also create problems. In fact, the grid has to be kept at such a precise level of balance that if you have too much or too little for too long, it will damage the machines that are connected to it. And to protect themselves, you will actually see them start to disconnect. And it can actually create these kind of cascading problems. We actually had a fascinating example recently in Queensland, where a coal turbine blew up, and it then tripped a whole bunch of other coal power stations that then couldn't then stop creating load. And so you suddenly had this situation where you had all this demand for electricity, and suddenly they just lost all of this generation ability. And what actually happened is the system just started disconnecting. What caused the blackout was this automated system that, in a fraction of a second, just started disconnecting load, or demand, to try and balance it as quickly as possible to try and arrest the problem. And one of the things I've actually been doing has been looking at this on a four-second basis: the events that happened on this day, and how different units responded to these events. It's amazing. It's almost like, with the energy sector and the market, there's the physical infrastructure of making everything work and all that amazing engineering. And then there's the financial market, with the bids and everything like that, that's built on top of it. And the market and the bids are fascinating. But at the end of the day, everything has to bow to the engineering. It has to work.

32:11 It has to work. Yeah. It's all just gonna come apart. You ever seen an AR out in the live stream asks: does the AEMC do anything with Energy Web?

32:21 I'm not sure if I've come across that before, but I'd be interested in looking into it. Yeah.

32:25 And then also: it sounds like what you're describing are DERMS. I'm not sure how you pronounce it.

32:31 That might be an acronym from the US energy markets. Everyone has learned kind of different acronyms.

32:40 Oh, yeah, that makes it easy right? To not even be consistent.

32:43 If you go to the market operator's website, they have a glossary page, where you can just scroll for days through all the acronyms that are used in the sector.

32:51 Like an acronym thesaurus: we call it this, what do they call it? Exactly, exactly. This portion of Talk Python to Me is sponsored by Linode. Visit 'talkpython.fm/linode' to see why Linode has been voted the top infrastructure as a service provider by both G2 and TrustRadius. From their award-winning support, which is offered 24/7, 365 to every level of user, to the ease of use and setup, it's clear why developers have been trusting Linode for projects both big and small since 2003. Deploy your entire application stack with Linode's one-click app marketplace, or build it all from scratch and manage everything yourself with supported centralized tools like Terraform. Linode offers the best price and performance value for all compute instances, including GPUs, as well as block storage, Kubernetes, and their upcoming Bare Metal release. Linode makes cloud computing fast, simple and affordable, allowing you to focus on your projects, not your infrastructure. Visit 'talkpython.fm/linode' and sign up with your Google account, your GitHub account or your email address, and you'll get $100 in credit. That's 'talkpython.fm/linode'. Or just click the link in your podcast player show notes. And thank them for supporting Talk Python. So we have a graph here, this picture on the screen, where energy prices went negative, actually. And so yes, this is where people are willing to pay you to take energy that they've generated. Like, that sounds completely insane.

34:21 Yeah, I know. It sounds weird. So to explain this figure: what's been happening this year in South Australia is that the wholesale price of electricity has averaged around negative $20 during the middle of the day, pretty much consistently. And the way this works is that the generators submit bids for how much they're willing to sell their electricity for, and effectively, when they run the optimization, the price of the bid that satisfies demand is the price that everyone gets paid. So what a lot of generators will do is bid in quite cheaply, at negative prices, to ensure that they get dispatched. But if everyone bids in at negative prices, then everyone gets the negative price. And so what we've actually been seeing is, because there's now so much generation in the middle of the day, you're ending up with these really fascinating market events like, yeah, these negative prices, where literally you can get paid to consume electricity as a consumer.
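Editor's sketch: the pricing rule described above (the bid that satisfies demand sets the price everyone is paid) can be shown in a few lines. The bid stacks below are made-up numbers chosen only to show how a surplus of negative midday bids drags the clearing price below zero; the function name `clearing_price` is hypothetical.

```python
def clearing_price(bids, demand_mw):
    """Return the price of the marginal bid, i.e. the bid that satisfies demand.

    `bids` is a list of (price_per_mwh, volume_mw) tuples. Assumption: a single
    region with no network constraints.
    """
    supplied = 0.0
    for price, volume in sorted(bids):
        supplied += volume
        if supplied >= demand_mw:
            return price
    raise ValueError("not enough offered volume to meet demand")


# Hypothetical midday stack: lots of generation bidding negative to stay dispatched.
midday_bids = [(-50.0, 400.0), (-20.0, 500.0), (-10.0, 300.0), (60.0, 600.0)]
# Hypothetical evening stack: far less cheap volume is on offer.
evening_bids = [(-50.0, 100.0), (40.0, 500.0), (60.0, 600.0)]

print(clearing_price(midday_bids, 800.0))   # marginal bid is negative
print(clearing_price(evening_bids, 800.0))  # marginal bid is positive
```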

35:19 That sounds pretty good. Yeah, I get it nice and chilly in here. I'll be fine. One of the drivers of this it sounds to me like is solar energy in Australia? Right?

35:28 Yes, yes, we now have so much rooftop solar. I can't remember the exact percentage, but a significant percentage of Australian households now have solar panels, because the cost has come down so much. And so a lot of our work has involved looking at how that is impacting the grid. Because if you imagine, historically, the energy market was a process where, you know, the market operator could instruct generators to turn on or turn off. And now we're in a world where there's so much of this kind of small-scale solar that you can't tell what to do. How do you factor that into balancing supply and demand in the grid? Yeah,

36:06 well, it definitely sounds like some interesting Python must be at play there. So give us an overview of the kinds of tools you're using and the types of problems you're solving.

36:15 Yeah, sure. In the solar space, we've been using a Python software package called SAM, the System Advisor Model, which is actually released by the National Renewable Energy Laboratory in the States. What it lets you do is, if you provide solar irradiance data and data from weather stations, you can use it to simulate the generation of rooftop solar in different areas around the country on a granular, half-hourly basis over the course of the year. And they've got a Python library that lets me kind of call and run this tool. So I can simulate different PV system sizes, and different locations and angles, and all sorts, all around the country. I can effectively simulate hundreds and hundreds of different PV systems. And if I combine that with how much the household is consuming, and what the actual cost of electricity was in those half-hour intervals, you can certainly build up a picture of the economic effect of different PV panels for different households around the country.
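Editor's sketch: the last step described above, combining simulated generation with household consumption and prices per half hour, might look roughly like this. It does not call SAM or PySAM; the generation figures stand in for simulation output, and the split into an import price and an export (feed-in) price is an assumption made for illustration.

```python
import numpy as np


def pv_bill_saving(generation_kwh, consumption_kwh, import_price, export_price):
    """Estimate the bill saving from a PV system over a set of half-hour intervals.

    Assumptions (not from the interview): energy used on site avoids buying at
    `import_price`, and any surplus is exported and paid `export_price`.
    Prices are in $/kWh and may be scalars or per-interval arrays.
    """
    generation = np.asarray(generation_kwh, dtype=float)
    consumption = np.asarray(consumption_kwh, dtype=float)
    self_consumed = np.minimum(generation, consumption)
    exported = generation - self_consumed
    return float(np.sum(self_consumed * import_price + exported * export_price))


# Four hypothetical half-hour intervals for one household.
generation = [0.0, 1.0, 2.0, 0.5]   # simulated PV output, kWh
consumption = [0.5, 0.5, 1.0, 1.0]  # metered household use, kWh
saving = pv_bill_saving(generation, consumption, import_price=0.30, export_price=0.05)
print(round(saving, 3))
```

Running the same function over many simulated systems and households is what builds up the country-wide picture.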

37:23 Yeah, how interesting is this the right thing I pulled up here, this PySAM,

23:17 Yes. And I will say, for your US listeners, the laboratory releases all of the data for the US mainland in a format that's ready for you guys to go. Part of my project was actually trying to turn the Australian data into a format that this program could understand. And that in itself was an interesting exercise in data cleaning and manipulation, because, for instance, all of the irradiance data for the country came as tens of thousands of these text files that were just these kind of grids, which pretty much said: each value represents a five by five kilometer grid cell on the Australian mainland, and it starts at this coordinate. So I had to pretty much try and convert this text file into a map, and then convert that into a format so I could know where the household fell in that as well. Oh, wow, how interesting.
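Editor's sketch: one way to turn a gridded text file into something you can query by location. The actual file layout is not described in the interview, so everything here is an assumption: the grid is whitespace-separated, rows run north to south, the stated coordinate is the north-west corner, and cells are 0.05 degrees wide (roughly 5 km).

```python
import io

import numpy as np

# Hypothetical miniature grid; real files would be far larger.
GRID_TEXT = """\
10 11 12 13
20 21 22 23
30 31 32 33
"""
NW_LAT, NW_LON = -10.0, 112.0  # assumed coordinates of the top-left corner
CELL_DEG = 0.05                # assumed cell size in degrees


def load_grid(text):
    """Parse the whitespace-separated grid into a 2-D NumPy array."""
    return np.loadtxt(io.StringIO(text))


def lookup(grid, lat, lon):
    """Return the grid value for the cell containing (lat, lon)."""
    row = int((NW_LAT - lat) // CELL_DEG)  # rows count southwards from the top
    col = int((lon - NW_LON) // CELL_DEG)  # columns count eastwards from the left
    if not (0 <= row < grid.shape[0] and 0 <= col < grid.shape[1]):
        raise ValueError("location falls outside the grid")
    return grid[row, col]


grid = load_grid(GRID_TEXT)
print(lookup(grid, lat=-10.07, lon=112.12))
```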

38:18 Yeah, you're not normally taking a bunch of text files and turning, like, piecing those together into a map. But I guess you do. Eugene, who was on the show a little while ago about the life lessons from machine learning, had brought up an interesting quote, something to the effect of: data cleaning is not the grunt work, it is the work. And that's so much of this, right? Like, it's getting everything right, making sure it's correct, converting it, formatting it. And then you feed it off to the magic library and get the answer, right? Yep.

38:48 Absolutely. And I think, yeah, the lessons that you learn from working with and cleaning the data often help inform your analysis later on. For instance, one of the things I was doing recently was trying to correct for errors in this really large data set measuring output from these power stations. One of the pieces of advice I received was that if I see data where the generation value from the power station does not change, say it effectively says this power station is generating 100 megawatts, and that value doesn't change for at least a minute, that means there's an error in the data collection. And so as I was, you know, cleaning up the data, I implemented that, and I started looking for outliers. And I actually discovered that, for some solar farms, if I used this metric that I implemented to pick out the bad data, it was actually removing cases where the solar farm was deliberately keeping its output perfectly level to match an instruction from the market operator. So I think this is a case where they were actually forgoing additional generation to be more predictable. And I would have missed this whole interesting power station behavior if I, you know, wasn't thinking about what the implications were of these different cleaning techniques that I was doing.
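Editor's sketch: the "value doesn't change for at least a minute" rule can be expressed without a loop in pandas. This is a guess at one possible implementation, not Jack's actual code. With four-second data, one minute is 15 consecutive samples. The function only flags rows rather than dropping them, since, as noted above, a flat line can be real behavior.

```python
import pandas as pd


def flag_flatlines(output_mw: pd.Series, min_run: int = 15) -> pd.Series:
    """Flag samples that belong to a run of `min_run` or more identical values."""
    # A new run starts whenever the value differs from the previous sample.
    run_id = output_mw.ne(output_mw.shift()).cumsum()
    # Length of the run each sample belongs to.
    run_length = output_mw.groupby(run_id).transform("size")
    return run_length >= min_run


# Hypothetical four-second readings: a 15-sample flat line, then normal movement.
output = pd.Series([100.0] * 15 + [101.0, 102.5] + [50.0] * 3)
flags = flag_flatlines(output)
print(flags.sum(), "of", len(output), "samples flagged for review")
```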

40:16 Okay. Yeah, 'cause maybe that advice comes from, I don't know, a gas power plant or a coal plant where they have to fluctuate for, you know, whatever reason, right? And in this new world, the situation changed and the assumptions didn't, right?

40:32 Absolutely, absolutely. I think, yeah, in a way, part of the reason I've always gravitated towards combining the programming and the analytics with deep domain expertise is that I really love, when I'm working with a data set and I see something weird, that I can go: that's wrong, I can remove that. Or: that looks weird, I'm going to investigate this, because I think that's interesting. And one of the things I found in consulting was that the projects where I didn't understand the data or the industry as well, where I was brought into the team to provide, you know, the analytics capability, but I was effectively turning the understanding of others into code, I always found them a little bit less satisfying from a personal perspective. I felt like I was a vehicle for other people to turn their thoughts into code. Whereas I really like that, if I understand the domain, then suddenly I can investigate and understand the area that I'm working in.

41:36 It becomes a puzzle, not just, I don't know, get information from these people, apply it to the data, see what comes out. Yeah, for sure. You know, someone asked me recently, they were looking to hire somebody. I don't know if it was exactly in the data science world, but it's close enough. They were asking something to the effect of: should I go and try to find a computer science type of person who I can then teach the subject matter to, and kind of get them up to speed there, because we need good programmers? Or should I find some people who really understand what we're doing, and then try to teach them Python? Yeah, what would you say to that? I have a thought on it, but I'd love to hear yours.

42:16 I think, to a certain extent, you want them to have, I guess, the passion for learning about the domain. And obviously, if they understand the domain, that's really valuable. But you probably want them to be exposed at least a little bit to some programming concepts, for them to know that they like it. In fact, I remember when I had a chat with my former boss, who hired me into my current role, he said that his hiring philosophy is that he looks for people with interesting backgrounds. You know, my background, he saw computational biology, and, you know, a lot of people would be like, oh, how does that apply to the energy sector?

42:53 Yeah, that doesn't. That's not for us. That's totally different

42:55 Right, exactly. But he said, you know, for him, that's an interesting story, and he could see how those skills can generalize to different areas. And then it's more about: are you passionate about the thing you're working on as well? So I think, for people with domain expertise, learning Python can be like adding a bit of a superpower to your, you know, your skills and domain skills as well. But I also think that you wouldn't want to, say, hire someone who had good domain expertise into the team to be a programmer if they'd never programmed before and they hated programming.

43:30 Yeah, I think the assumption was that they had a little bit of programming experience, or they were super interested in it, but maybe not all the way there. I think the subject matter expertise is really valuable. I think these days there are so many amazing libraries in Python, and it's so accessible, that it is really important to understand, like, deeply what's happening. But you should probably also have one or two people who have true software engineering experience, like: hey, has anybody told anyone around here about Git? We need to be using source control. And what about continuous integration? And have you heard of testing? Right? Those kinds of things matter. But I think also having this deep understanding of the domain really matters. Absolutely, I agree. Cool, cool, cool. All right. So, I mean, you talked about this System Advisor Model. Are there other things like, say, in the astronomy space, there's Astropy, and all the astronomers say: this is the library. You talked about pandas and NumPy and whatnot already, but is there something like that, or a couple of libraries like that, in the energy space?

44:20 The closest, I would probably say, is the Pyomo optimization library, which I think Clark mentioned in a previous interview.

44:37 yeah, yeah. We had Clark, come on and talk about that. And he was doing really cool stuff.

44:43 Yeah, I'm going to set up a chat with our team and Clark. That's the plan at a later date. Because, yeah, it was very interesting what he was able to do during his master's with linear optimization. And so I think, yeah, there may be libraries out there that I haven't come across yet at this point, but I've really found that the whole Python stack of pandas, Pyomo for optimization, and even things like, have you come across a library called GeoPandas at all? Yes. Which adds a spatial element to data frames. I use that for a lot of analysis.

45:20 Yeah, GeoPandas sounds cool. I haven't done anything with it. But I would love an opportunity to do something fun with GeoPandas.

45:24 When I was working in consulting, I used that library once for looking at data from the Australian Bureau of Statistics. And then suddenly I was in demand for every proposal, making these heat maps of the country. I was suddenly just making heat maps everywhere. Yeah.

45:42 It's phenomenal. That was: how do we make these graphs? Give it to Jack, he'll build it for you.

45:47 Exactly. But GeoPandas, if you know a bit about using pandas and data frames for working with data sets, it's pretty much like using a pandas data frame, but it just adds a whole bunch of capability for working with spatial data sets and creating beautiful figures as well. It's amazing. Yeah,

46:04 it sounds super cool. Super cool. Yeah, it works with Shapely, and it sounds like it would work really well with your 10,000 text files, even for linking some of these things up. Yeah. Alexander, out in the live stream, coming back with one quick thought, says: I wish people learned at least some programming; making custom software to cover simple cases is definitely tiring, and most of the time it's just a simple script. Yeah, I mean, kind of the Automate the Boring Stuff level could take a lot of people a long ways, for sure.

46:31 Yes, at that level, absolutely. In a way, for a lot of the little things that I would go around and be useful for when I was working in consulting, if people had a little bit of programming background, then, yeah, they almost wouldn't need my input, because they understood the area better than me. If they had a little bit of Python and knew how to link up some data sets, they could just automate so much of the tedious things in their lives.

46:57 You know, one thing I heard a lot in that sort of realm was: if you automate all these things, you're going to take our jobs away. What are we going to do? Like, this painful, tedious, manual stuff that should be automated, that's our job. Because for a lot of the people that I had worked with for a while, that's what they did. And they were legitimately a little concerned that if we wrote software that would do those things automatically, well, then what would they do? And I saw, year after year, we would write that software, and they would say: thank goodness, we don't have to do this again. And they would just solve more problems, take on more data. Like, they would just do more. And almost never did it result in: well, we don't need these people anymore. It just meant they got to do more interesting stuff at a bigger scale.

47:37 Absolutely. This is what I find: when I work with new data sets or problems, once you've kind of, you know, solved the problem, you understand that data, you've fixed the issues with it, suddenly having that kind of foundation and curated data set lets you actually build on it and do more interesting things going forward. It's not like, you know, you've done that and you never need to do that again. Yeah. And that's what drew me to the energy sector, because it was like, the more I worked with these data sets, the more I understood, and the more interesting questions I could answer, which is really satisfying.

48:07 Yeah. And the more things that are batch processes can become almost real time. And it's really changing. So speaking of data, it sounds like you guys work with a ton of data over there. Give us a sense of the scale.

48:19 Yeah. So I guess the most standard data set is a database that has pretty much everything going on in terms of dispatch on a five-minute basis. For most of your uses, if you just want to see what a power station is doing, what it's bidding, you can use that data. It's large: pulling out some of these data sets, you know, it's 100 or 200 million rows looking at certain parts of it. But the thing I'm working on at the moment almost makes this look small. This is kind of that same data from that same database, but it's on a four-second basis. So a single month of data is about 750 million rows, and it all gets released as thousands of zipped folders containing CSVs, one CSV every half hour. Oh my goodness. So is there like a big process that just goes along, unzips it, grabs it, inserts it into some database or something along those lines? I think that was how it gets shared in some format. So this is how I was given the data on my current project. And it's so big, I can't actually unzip it on the machine. So I have to use Python to kind of spin up a number of separate processes that will work through the different folders. Python has a library called zipfile, so it will unzip the folder in memory, read in the CSVs and process them. And then it will eventually concatenate it all back into a cleaned data frame that I can work with going forward. And so I'm trying to use those sets of tools to then turn this into a more compressed, clean data set that I can work with going forward.

49:53 So does that fit in memory, or do you have to only pull out slices sort of dynamically with the zip file processing?
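Editor's sketch: reading the CSVs inside a zip archive without ever extracting them to disk, using the standard library's `zipfile` module together with pandas. The column names and the tiny in-memory archive built by `make_demo_zip` are hypothetical stand-ins for the real half-hourly files; the separate worker processes mentioned above are left out to keep the example short.

```python
import io
import zipfile

import pandas as pd


def read_zipped_csvs(zip_bytes: bytes) -> pd.DataFrame:
    """Read every CSV inside a zip archive straight from memory into one frame."""
    frames = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".csv"):
                # archive.open gives a file-like object, so nothing touches disk.
                with archive.open(name) as handle:
                    frames.append(pd.read_csv(handle))
    return pd.concat(frames, ignore_index=True)


def make_demo_zip() -> bytes:
    """Build a small zip in memory so the example is self-contained."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("0000.csv", "unit,mw\nA,10.0\nB,20.0\n")
        archive.writestr("0030.csv", "unit,mw\nA,11.0\nB,19.5\n")
    return buffer.getvalue()


df = read_zipped_csvs(make_demo_zip())
print(df)
```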

50:00 Yeah, so I can fit about a month in memory on my machine. We do have some large servers, and so I will transition to processing this in parallel on the servers, where we should get an even better speedup. But yeah, at the moment, I really just kind of look at things on a monthly basis. There's a ton of processing I have to do with this four-second interval data. What I do is break the data up into five-minute intervals, and then I can see what the generator was doing on a four-second basis, and I can see what its target was. So when they run their optimization, it will say: we know you're sitting at this point here, and you have to ramp up to hit this target here at the end of this five-minute period. And so I can use this data to tell how well the generators are actually hitting their targets, and how well they're following instructions. But the funny thing is, you know, you think of four-second data as being such a short interval of time. But if you look at some of the big batteries in the grid, that data is actually too slow for some of them, because batteries can actually turn on, inject power and turn off again, and I can miss it in the four-second data. It's amazing. Some of these big grid-scale batteries, like the big Tesla battery in South Australia, Hornsdale, they're amazing feats of engineering that you really appreciate when you realize you're missing things at a four-second interval.

51:25 They break your sensors and things like that. Yeah, Australia is really well known for having some of these big batteries in the energy sector. I think for some reason Tesla seemed to partner up with you guys to build these.
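Editor's sketch: the target-tracking check Jack describes a moment earlier (break four-second data into five-minute intervals, then compare where the generator ended up with its dispatch target) could be written along these lines in pandas. The column names, the unit "A" and the numbers are all hypothetical, and using the last sample in each interval as the comparison point is an assumption.

```python
import pandas as pd


def target_error(scada: pd.DataFrame, targets: pd.DataFrame) -> pd.DataFrame:
    """Compare each unit's output at the end of a five-minute interval with its target.

    Assumed columns: scada has unit, timestamp, mw (four-second readings);
    targets has unit, interval_end, target_mw.
    """
    scada = scada.copy()
    # Label every sample with the end of the five-minute interval it falls in.
    scada["interval_end"] = scada["timestamp"].dt.ceil("5min")
    last = (
        scada.sort_values("timestamp")
        .groupby(["unit", "interval_end"], as_index=False)
        .last()
    )
    merged = last.merge(targets, on=["unit", "interval_end"])
    merged["error_mw"] = merged["mw"] - merged["target_mw"]
    return merged[["unit", "interval_end", "mw", "target_mw", "error_mw"]]


# One hypothetical unit ramping from 100 MW towards a 140 MW target.
times = pd.date_range("2021-06-01 00:00:04", periods=75, freq=pd.Timedelta(seconds=4))
scada = pd.DataFrame(
    {"unit": "A", "timestamp": times, "mw": [100.0 + 0.5 * i for i in range(75)]}
)
targets = pd.DataFrame(
    {"unit": ["A"], "interval_end": [times[-1]], "target_mw": [140.0]}
)
result = target_error(scada, targets)
print(result)
```

A negative `error_mw` here means the unit finished the interval below its target.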

51:37 Yeah, the plan is to roll out a bunch more of these batteries around the grid, as well. And they're just really impressive at you know how I mentioned the whole challenge of constantly balancing supply and demand. And really, that's what batteries are so good at doing is

51:53 right, their response time is almost instant. Yeah. So you could just, they could take it in exactly, they could eat the energy, or they could initially fill like immediately fill the gap right for a while.

52:04 Exactly, and sometimes with some of this data, you can actually see the battery will receive an instruction, it will quickly turn on it will discharge some power. And then a couple of seconds later, it will actually then because there's too much power in the grid, the battery will actually then suck up some of that power and recharge and do the opposite effect. It's just amazing. Oh, fantastic,

52:23 I would love to dive into that. But let's say I'm super fascinated with batteries and their potential. Talk Python to Me is partially supported by our training courses. When you need to learn something new, whether it's foundational Python, advanced topics like async, or web apps and web APIs, be sure to check out our over 200 hours of courses at Talk Python. And if your company is considering how they'll get up to speed on Python, please recommend they give our content a look. Thanks. With all of this data, you said that you had basically learned some good advice, like certain things you can just easily do in pandas and NumPy on small data sets, maybe not so much on large data sets like that. Give us some of the things that you found to be useful, tips and tricks.

52:10 Yeah, absolutely. So there's a concept called vectorization. I'm not sure if you've come across it, but it's effectively: how can you apply an operation to a whole column, so you're not writing a manual loop or, you know, using conditionals? For instance, if I try to multiply a column with millions of rows by a number, it's really, really fast, because that's all kind of, you know, optimized under the hood. When I'm working with smaller data sets, you can get away with doing some manual loops yourself, or using pandas to group by a column. For instance, I would often say: this is the identifier for a power station, I want you to group by this identifier column and then sum up. And even those things start to become too slow once your data is at this scale. So the real trick, I find, is: how do you find ways where you can apply some operation, a calculation, to the whole column? But the tricky part with that starts to be: what happens if you want to do conditional calculations? One of the things I find is, sometimes I'll want to see, on a four-second basis, how much the generation of a power station is changing. And pandas has a calculation that lets you effectively calculate the difference from the value that came before, really, really efficiently. But because, you know, I've got all these different generators and intervals all in the same data frame, I don't want to consider the first value in a five-minute interval, because that's affected by, you know, a different time interval as well. So what you can do is, NumPy has this great functionality called where, or select, where you can pretty much pass it a condition that turns out to be true or false across the whole data set, and it will then replace the value with something else really efficiently.
So what I can do is run my calculation for the whole column, and then I can use NumPy's where to replace the first value in each five-minute interval with a missing value. And that pretty much does things in, you know, a few seconds that would have taken, I don't even know how long the other way. Hours, at least. It's amazing.
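Editor's sketch: the pattern described above, a whole-column difference followed by `numpy.where` to blank out the first value of each block, might look like this. The data frame is a toy example with hypothetical column names, and it assumes rows are already sorted by unit and then by time.

```python
import numpy as np
import pandas as pd


def interval_changes(df: pd.DataFrame) -> np.ndarray:
    """Change in output versus the previous sample, blanked at block boundaries."""
    # Vectorized difference across the whole column, ignoring boundaries for now.
    change = df["mw"].diff()
    # True on the first row of every (unit, interval) block.
    is_first = (df["unit"] != df["unit"].shift()) | (
        df["interval"] != df["interval"].shift()
    )
    # Replace those first rows with a missing value in one pass.
    return np.where(is_first, np.nan, change)


df = pd.DataFrame(
    {
        "unit": ["A"] * 4 + ["B"] * 4,
        "interval": [1, 1, 2, 2] * 2,
        "mw": [100.0, 102.0, 110.0, 109.0, 50.0, 50.5, 40.0, 42.0],
    }
)
df["mw_change"] = interval_changes(df)
print(df)
```

No Python-level loop runs over the rows here, which is what keeps this approach fast on very large frames.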

55:25 Yeah, I think in that whole computational space with pandas and with NumPy, you know, again, in Python we speak about Pythonic code, right? You would use a for...in loop instead of trying to index into things, and so on. And then there's a whole special flavor of that in the pandas world, right? And a lot of it almost has the guidance of: if you're doing a for loop, you're doing it wrong, right? Like, there should be some sort of vector operation or something passed into pandas, or something along those lines, right? Absolutely.

55:55 It's almost its own type of problem solving in a way, because it's like: how can I apply this calculation to everything in a column, but also, in these cases, do something else? That's really the problem solving, in a lot of ways. Yeah, yeah.

56:12 It's a lot more set-based, almost like databases. Yeah. What about things like threading or multiprocessing or stuff like that? Like, have you tried to scale out some of the things that you're doing in that way?

56:22 So yeah, we have a server that has about 60 cores and about 700 gigs of RAM on it. So that's the plan: I can ship my things over there once this project.

56:32 60 cores? That's pretty awesome, actually. Yeah,

56:35 we've got a couple of them, which is very useful for the energy modeling that we do. And usually what I am doing is kind of a mix of, yes, using Python's multiprocessing library. Usually, what I'm doing is just processing a whole heap of data frames in parallel, and then concatenating them back into a single data frame once they're processed. That workflow seems to work pretty well for one of the requirements that I have on the project. Because you can get each subset data frame to do its own computation in parallel, right? Yes, yep. Yeah, exactly. And the other thing that can also benefit is that usually, as part of the cleaning process, I'm kind of subsetting the data as well. So while the data is starting off, you know, very large, I'm figuring out which parts of it I need and cleaning it, and so the data frame I end up concatenating back together can be a more manageable size as well.
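Editor's sketch: the "process many data frames in parallel, then concatenate" workflow using the standard library's `multiprocessing.Pool`. The cleaning step `clean_chunk` and its column names are hypothetical placeholders. The demo call uses a single worker so it runs anywhere; in a real script you would pass a larger `workers` value and make the call from under an `if __name__ == "__main__":` guard.

```python
from multiprocessing import Pool

import pandas as pd


def clean_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: drop missing readings, keep only needed columns."""
    return chunk.loc[chunk["mw"].notna(), ["unit", "mw"]]


def process_all(chunks, workers=4):
    """Clean each chunk (in parallel when workers > 1) and stitch the results together."""
    if workers <= 1:
        cleaned = [clean_chunk(chunk) for chunk in chunks]
    else:
        # Each worker process receives whole data frames and returns the cleaned subset.
        with Pool(processes=workers) as pool:
            cleaned = pool.map(clean_chunk, chunks)
    return pd.concat(cleaned, ignore_index=True)


chunks = [
    pd.DataFrame({"unit": ["A", "B"], "mw": [10.0, None], "flag": [0, 1]}),
    pd.DataFrame({"unit": ["A", "B"], "mw": [11.0, 19.5], "flag": [0, 0]}),
]
result = process_all(chunks, workers=1)
print(result)
```

Because each chunk is subset before it is returned, the concatenated frame ends up smaller than the raw input, which matches the point made above about the final data frame being a more manageable size.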

57:36 right. Right. Interesting. Have you looked at Dask for any of this?

57:39 Yes, I did some work on the server with it for different projects. And I think, once I've kind of built up the more curated version of this four-second interval data, Dask will probably be what I'll use on the server for working with the whole data set in the future.

57:58 Yeah, it sounds like it might really be. I mean, I haven't actually tried to apply it to as much data as you've got there. But from its functionality, it sounds like it really might be the thing to do, because your description of breaking things into mini data frames, having them run, and then bringing it back together, that sounds to me like what Dask is for, right?

58:17 Yeah. And the brilliance as well, I think, of the Dask project is how they were able to kind of emulate the Python, sorry, the pandas way of doing things, which is great, because, you know, it's nice not to have to relearn too many things. Yeah, and be efficient from the beginning. Also, I should probably just flag as well that with these data sets that I'm working with, the four-second interval data and then the actual database on the five-minute basis, one of the things that got me so into the energy sector, and that's, I think, unique in a way about the Australian energy market, although I stand to be corrected, is that this is all public data. If you want, you can go and download all of this data from the market operator's website, which is, you know, an amazing amount of openness, to be able to go in and look at what these power stations are doing and what prices they're getting on such a granular basis. That's really what's fascinated me.

59:02 Yeah, that's super cool. You know, if people are out there trying to do research, working on a thesis or something like that, they can just grab this data. And, like you said, it's real. And a lot of people have both physical machine reasons and financial reasons to keep it accurate, right?

59:20 Yes, absolutely. And really, I guess it comes back to that original point, where the limitation for this data isn't how clean it is or the amount of data; it's just understanding the processes well enough to know what you can do with it.

59:33 Right. A little bit like that example you had about the solar farms versus coal fired coal generation. Right. Exactly. Exactly. Yeah. No, that I mean, different things. Yeah. Very cool. All right. Other advice or interesting things going on before we were getting short on time, but anything else you want to throw out there work with all this

59:51 data? One of the areas I'm quite interested in at the moment is Python's Numba library. Numba, yeah. I think it's able to compile Python code under the hood to get really high performance, which can be perfect for my usage. And the reason I'm rather interested in it is that if I can vectorize a calculation, you know, I'm applying it to the whole column. But if, for instance, I'm trying to do some calculation where I want to loop through it, because my current calculation depends on the state of a previous calculation, then that can be a limitation of this kind of vectorization approach. Whereas... Right, right,

01:00:29 you can't just apply it to the set and go, hey, every row here, just look back at yourself a little bit in that place and then do the thing. It's got some dependencies on what happened before, right? And I have no idea how you would fix that. Maybe it's possible, but it's very tricky, right? And so Numba does that? Okay,

01:00:47 so Numba is effectively a way to write Python loops that run as fast as C, but for numeric calculations. So this is why I'm very interested in this for some of the areas that are a little bit trickier for me to vectorize. I'm using data frames, and I'm very interested in the Numba library as a solution for some of those challenges. Now, so Numba makes Python

01:01:09 code fast. It says it's an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. So I haven't used it, but it sounds like it really knows about NumPy in addition, right? Because you can use Cython, but Cython doesn't necessarily know anything about NumPy, for example, right? Yes,

01:01:26 yeah, you're exactly right. In a way, Numba has kind of replaced what I would have used Cython for in the past, at least for my application. And what I can do, say, for instance, if I have a NumPy array and I want to loop through it, and there's some calculation I want to do, and then my next calculation depends on that previous calculation going forward: this lets me do that, and I'm writing Python code. I don't have to write C or C++. You can see you're literally adding decorators, just like, you know, how Flask has that lovely kind of decorator syntax. It's almost like that, to an extent. Yeah, exactly. You

01:02:02 say @numba.jit. And then, do you want this to run in parallel or not? Yeah, sure, why not?
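The kind of loop described above, where each step depends on the result of the previous one, might look something like the following. This is only a minimal sketch: the battery-style `state_of_charge` function, its `flows` and `capacity` parameters, and the sample numbers are hypothetical illustrations, not code from the episode. It uses Numba's `njit` decorator (shorthand for `jit` in nopython mode) and falls back to plain Python if Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (slowly) without Numba installed.
    def njit(func):
        return func


@njit
def state_of_charge(flows, capacity):
    """Track a running level that is clipped to the range [0, capacity].

    Each value depends on the previous (clipped) value, so a plain
    cumulative sum over the column would give the wrong answer.
    """
    soc = np.empty(flows.shape[0])
    level = 0.0
    for i in range(flows.shape[0]):
        # The current state depends on the state from the previous step.
        level = min(max(level + flows[i], 0.0), capacity)
        soc[i] = level
    return soc


# Hypothetical charge (+) and discharge (-) amounts per interval.
flows = np.array([3.0, 4.0, -2.0, 5.0, -10.0, 1.0])
soc = state_of_charge(flows, 8.0)
print(soc)
```

The loop body is ordinary Python; the decorator is the only Numba-specific part, which is the appeal described in the conversation.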

01:02:07 Exactly. And it even lets you run your calculations in parallel as well; you can see that little parallel=True. And it's actually true parallelism without the GIL, as

01:02:17 well, which is amazing. This is super interesting. I knew that it was a compiler along the lines of Cython, but I didn't realize that it had this special integration with NumPy. That's very nice.
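The parallel option mentioned here could be sketched roughly as follows, assuming the rows can be processed independently of one another. The `row_means` function and the sample array are hypothetical; `parallel=True` and `prange` are the Numba features being illustrated, and the fallback simply runs the same loop serially if Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback so the sketch still runs serially without Numba installed.
    prange = range

    def njit(**kwargs):
        return lambda func: func


@njit(parallel=True)
def row_means(values):
    """Compute the mean of each row, spreading rows across threads."""
    n_rows, n_cols = values.shape
    out = np.empty(n_rows)
    # prange marks the outer loop as safe to split across threads,
    # because each iteration writes only to its own slot in `out`.
    for i in prange(n_rows):
        total = 0.0
        for j in range(n_cols):
            total += values[i, j]
        out[i] = total / n_cols
    return out


values = np.arange(12, dtype=np.float64).reshape(3, 4)
means = row_means(values)
print(means)
```

Note that this only suits loops whose iterations do not depend on each other; a calculation that carries state from one step to the next has to stay a serial loop.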

01:02:27 Yeah, I really recommend it. I think, for most people, pandas is probably what you need. But if you're running into these types of problems, then I think Numba would be a solution before you'd necessarily have to look to a different programming language. Yeah, absolutely.

01:02:45 And like, this is how I started off our conversation together. The performance side of Python is super interesting, because it's like, oh, it's not really fast enough, until you apply this decorator, and all of a sudden it's just as fast. And it's amazing. There are just all these little edge cases that are super neat. Cool.

01:03:01 I look back at some of the code I wrote during my PhD, and I cringe at some of the, you know, bad performance practices I probably had. But I think this comes back to your point about having people in the team who are a bit more experienced in this area. Because if you can have people who understand this and point team members to these tools for optimizing their code, then I think that can deal with a lot of the issues for people who may not be as experienced with writing Python.

01:03:27 Yeah, that's super advice. And then if you've got data that, like the other side of your story, is maybe too big to fit in RAM, or you want more sort of automatic parallelism across, say, a large data frame, then Dask is definitely a good thing to look at. Absolutely. All right, Jack, this has been super interesting. I think we're getting a little long on time. I don't want to take up your entire day, so I'll have to call it; a whole other show on that bit. But before we get out of here, of course, you have to answer the two final questions, as always. So if you're gonna write some code, or work on some of these projects, what editor do you use?

01:04:02 VS Code, especially since they added Jupyter Notebooks into the editor as well.

01:04:07 Yeah, it's really interesting to see what both VS Code and PyCharm are doing to try to either just bring notebooks into the space or come up with a more native alternative, right? Like, well, here's a cell and it's separated by this special comment, but you can still run it notebook style, and it feels like a text file you're working with. It's an exciting time for that stuff. Yeah. And then you've already given a shout out to a couple of different projects.

01:04:34 But any notable PyPI packages you want to tell people about? Yeah, I mean, Numba is the one I'm fascinated with in terms of learning more about at the moment. But in terms of getting things done, just pandas and GeoPandas and the kind of scientific stack. It's amazing, the code that people have written that makes me so effective, just by being able to use their libraries. It's phenomenal. Phenomenal. Yeah,

01:05:00 are you a Jupyter or JupyterLab person, or something else?

01:05:04 I use mainly Jupyter, mainly because, at least at the time when I tried out JupyterLab, I couldn't collapse some of the outputs from the cells. They may have fixed that, but I'll take a peek at JupyterLab every couple of months or so and see what's new. All right, fantastic. All right, well, final call to action.

01:05:21 People are interested; maybe they work in energy somewhere else in the world, or they're trying to do research with it. What do you tell them?

01:05:21 If you're interested, get in touch with me. We've

01:05:29 run a few meetups for energy modelers in Sydney. So if you're ever interested in getting in touch to chat about some of the data in Australia or some of the work we're doing, feel free. Is that Zoomable these days, or is it in person? The plan is for the meetup to be set up as a Zoom. So that's the plan. Yeah. Fantastic.

01:05:47 Well, Jack, it's been great to have you here. Thanks so much for sharing what you're up to. Thanks so much, Michael. It was great meeting you. Thanks for having me. Yeah, keep the lights on Down Under. Thank you. Bye, bye. This has been another episode of Talk Python to Me. Our guest on this episode was Jack Simpson, and it's been brought to you by 'Square', 'Linode' and 'AssemblyAI'. With 'Square', your web app can easily take payments: seamlessly accept debit and credit cards as well as digital wallet payments. Get started building your own online payment form in three steps with Square's Python SDK at 'talkpython.fm/square'. Simplify your infrastructure and cut your cloud bills in half with Linode's Linux virtual machines. Develop, deploy and scale your modern applications faster and easier. Visit 'talkpython.fm/linode' and click the Create free account button to get started. Transcripts for this and all of our episodes are brought to you by 'AssemblyAI'. Do you need a great automatic speech-to-text API? Get human-level accuracy in just a few lines of code. Visit 'talkpython.fm/assemblyai'. Want to level up your Python? We have one of the largest catalogues of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at 'training.talkpython.fm'. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at '/iTunes', the Google Play feed at '/play' and the direct RSS feed at '/RSS' on talkpython.fm. We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at 'talkpython.fm/youtube'. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it.
Now get out there and write some Python code.
