
#398: Imaging Black Holes with Python Transcript

Recorded on Monday, Dec 12, 2022.

00:00 The iconic, first-ever image of a black hole was recently released. It took over a decade of work and is a major achievement for astronomy that broadens our understanding of the universe for all of us. Would it surprise you to know that Python played a major part in this discovery? Of course it did. And Dr. Sara Issaoun is here to give us the full story. This is Talk Python to Me, episode 398, recorded December 12, 2022.

00:38 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy. Follow me on Mastodon, where I'm @mkennedy, and follow the podcast using @talkpython, both on fosstodon.org. Be careful with impersonating accounts on other instances; there are many. Keep up with the show and listen to over seven years of past episodes at talkpython.fm. We've started streaming most of our episodes live on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.

01:11 This episode is brought to you by Cox Automotive. Use your technical skills to transform the way the world buys, sells, and owns cars at talkpython.fm/cox. And by Sentry. Don't let those errors go unnoticed. Use Sentry. Get started at talkpython.fm/sentry.

01:29 Sara, welcome to Talk Python to Me. Thank you so much. Thank you for having me. Yeah, I'm very excited to have you on the show. Ever since I saw that first picture of the black hole and heard the news, I didn't know, but I thought, oh, there has to be some Python going on here. Oh, yeah, definitely. We got a lot of traction from our matplotlib plots that seemed

01:50 to catch the eye of many developers. I see that it is Python. I see it, there it is. I know that. That's right. Yeah. Yeah, absolutely. So there are just many fascinating aspects of this: the amount of data, the different data sources, the Earth-scale telescope that you all created, the processing, even the pipeline with Python, and just seeing stuff 55 million light years away at all, which is dark, is kind of crazy. But before we get to all those fun topics, let's just hear your story. How did you get into programming and into the Python side of these things? And how did you get going with black holes too? Just to introduce myself: hello, everyone. My name is Sara. I'm currently a member of the Event Horizon Telescope project, the team that makes images of black holes. I've been interested in astronomy since I was a little kid. When I was eight years old, in third grade, I got to build a cardboard solar system. I'm sure everybody has had similar experiences. That was really my first introduction to a world outside of our planet. And I was just so fascinated by planets that I binged through all of the astronomy books I could find at my local library and fell in love with astronomy. And so I ended up studying the physical sciences. I did a bachelor's in physics. And because I was doing a bachelor's in physics, I didn't have astronomy courses as part of my program. So that was pretty disappointing to me, because I really wanted to do astronomy, but I couldn't quite do that yet. So as an undergrad, I decided to do a summer project in astronomy. So I emailed a professor, or a number of professors, working in astronomy in the Netherlands. And this professor responded a few minutes later to my email, which I sent during the Christmas holidays. So I was very surprised that he'd respond. And it turned out, he said, it was unusual for him to get emails from students wanting to do work in the summer.
So he's like, oh, come on by, I'll tell you about my project. And he told me he was one of the big leads of this amazing adventure to take an image of a black hole. And I thought that was the coolest thing I'd ever heard. And I jumped right in as an undergrad, and then continued my master's and PhD with the project, and I've been here ever since. And so, Python: I got introduced to Python in undergrad for programming simple data analysis, although most of my undergrad was spent on C programming and MATLAB, because it was very much on the physics side, and this is mostly what physicists use. But then when I transitioned into astronomy programming in my master's and PhD, it was really almost all Python. And so it was kind of a learn-by-doing training that I had to do. I didn't take any Python classes, really; I just learned as I went. Data processing and analysis ended up playing a big part in the EHT results, and Python really plays a big part in most astronomy data analysis lately. It seems like it's just taken over that world recently, at least for the analysis side, right? Maybe not running on devices, right? I don't think there's any Python running on, say, JWST.

05:00 But maybe some of the earthbound telescopes have some in play? I don't know. Yeah, a lot of that data processing and analysis is done in Python. A lot of visualizations are also done in Python. So it's become a very versatile and flexible tool for us to quickly make figures for papers. As many people have noticed in our papers, matplotlib is a major tool that we use to display our data. Exactly. I think it's just completely taken over that space, which is really neat. And I think the Python world is much better for having all of these scientists and folks with other backgrounds using it for whatever they're interested in, right? It's not just, well, here's how Instagram uses it, and here's how this other API company uses it, right? There's a much more diverse mosaic of interests and things people want from the language. Absolutely. I think one of the big strengths that we drew from Python was really the open source community. There are many open source codes available that we were able to make use of in astronomy that were not necessarily designed for astronomy or data science purposes, but that just happened to be really useful for what we were trying to do with the data. And so it ended up saving us an enormous amount of time. And so whenever we speak about our project, I try to make a big point of saying that we do rely a lot on the open source development community, which I think is a part of the developer community that is often overlooked, and that it's really important to support open source work, because a lot of it is done by students and postdocs in the sciences who just happen to need code to do a certain thing. And they make it available, and it turns out to have many different uses in other people's science. And so we really build on the shoulders of all these people, and it saves so much time. Yeah, like matplotlib or pandas; a lot of these are absolutely general purpose, right? Yeah.
So you gave a really great presentation, a keynote at PyCon 2022. And I'll be sure to link to that, and we'll pull some ideas out of there. And before we get too far into the programming and Python side of things, give us maybe just a little bit of the history of black holes. You know, a lot of this starts with Einstein, right? And yeah, as soon as you see that dip, well, if there's enough gravity here, what if you put in more and more? Like, how steep can that curve get until stuff falls into it, and it gets steeper, right? Give us the history. It starts, I think, in the 1700s. Even before Einstein, there was this reverend, Reverend Michell, who thought about this idea: when you have something trying to leave the Earth's gravity, it has a specific escape speed in order to leave the Earth's gravity. And no matter what the size and weight of the object trying to leave, it always has the same escape speed, because that depends on the gravity of the Earth. So he thought, what would happen if you made the Earth so heavy, so compact, that the escape speed would become higher than the speed of light? And so he made this thought experiment, and he found out that with the escape speed being higher than the speed of light, this star ends up being dark, because all of the light that is on its surface cannot escape, so you can't see it anymore. If it doesn't get to your eye, you cannot see it. So you end up with a dark star. And that was the thought experiment that led to the idea of a black hole; this dark star is actually a black hole. So even in the 1700s, you know, philosophers, even religious people like Reverend Michell, thought about these ideas, about what it could look like if you had such an object. But at that point, well, this was just a thought experiment.
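As a rough illustration of the escape-speed thought experiment just described, here is a minimal Python sketch. It uses the standard Newtonian formula v = sqrt(2GM/r) and solves it for the radius at which v equals the speed of light. The function names and rounded constants are illustrative choices and were not mentioned in the episode.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 299_792_458.0    # speed of light, m/s
EARTH_MASS = 5.972e24    # kg (rounded)
EARTH_RADIUS = 6.371e6   # m (mean radius, rounded)


def escape_speed(mass_kg, radius_m):
    """Newtonian escape speed from the surface of a body.

    Note that the mass of the escaping object does not appear at all.
    """
    return math.sqrt(2 * G * mass_kg / radius_m)


def dark_star_radius(mass_kg):
    """Radius at which the escape speed equals the speed of light."""
    return 2 * G * mass_kg / C**2


print(escape_speed(EARTH_MASS, EARTH_RADIUS))  # roughly 11.2 km/s
print(dark_star_radius(EARTH_MASS))            # roughly 9 millimetres
```

In other words, the Earth would have to be squeezed to under a centimetre in radius before light could no longer leave its surface. The same expression, 2GM/c^2, is what general relativity later gives for the radius of the event horizon of a non-rotating black hole.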
And then when Einstein came along, he tried to describe space and time and gravity in a single unifying theory, the theory of general relativity. That was in 1915, when he showed that light was actually affected by gravity, so that light would bend around heavy objects because of this bend in spacetime. And then a year later, Schwarzschild found the first real solution to Einstein's equations, which ended up in what we call the singularity, which showed that there's this object that is a hole in spacetime. And this object has a horizon, which we call the event horizon. And the size of the horizon, the radius, is now called the Schwarzschild radius, after Karl Schwarzschild. And it showed that anything that goes beyond the horizon can never come back, because the escape speed at the horizon is the speed of light. So anything that goes beyond would need an escape speed higher than the speed of light. So light is the fastest thing there is, and anything inside can circle around and somehow still move, but it's still trapped within that space. Exactly. Once you cross that point, you can never come back. And at that time, people thought a lot about the theory of Einstein and this mathematical object that we call the singularity. At that time, they

10:00 thought, okay, black holes are mathematical objects, but there's no way they could be real, right? So people thought, okay, mathematically that's so interesting, but these things can't exist in the universe. And then in the 1930s, Chandrasekhar, a famous Indian astronomer, thought about the mass of stars. So, you know, stars, when they die, they become white dwarfs; they end up as these dead stars. But then he thought about massive stars, stars that are much heavier. It turns out, he calculated, that white dwarfs have a maximum mass. So there are certain classes of heavier stars that could never end up as a white dwarf, because they would collapse to something more massive than a white dwarf can be. And it turns out what they collapse into is a black hole. And this was the very first evidence that black holes could exist in reality, in the universe, that they're an inevitable endpoint for heavy stars. Since then, people have been thinking about where to find them. And people have been detecting these powerful streams of gas coming from the centers of galaxies that could only be powered by a black hole. And the galaxy M87, which is 55 million light years away from us, has one of these big streams. And that was the very first stream of gas from the center of a galaxy ever observed by humanity, in 1918. They didn't know that it was going to be a black hole at the time. But then there was more and more evidence that the only thing that could really power this really powerful process had to be a black hole. So then we started the story of trying to understand what they would look like from an observer's point of view. How big does the black hole have to be for us to be able to see it from Earth? How big a telescope do we have to design to see it? And it turns out there are only two black holes in the sky that are reachable with a telescope the size of the Earth, which is the maximum size that we could go.
And one is the black hole in M87, which we imaged in 2019. And then the second one is our very own black hole in our Milky Way, Sagittarius A*, which we imaged in 2022. Just last May? Oh, do you have pictures published of that? Yeah, that's right. It came out on May 12, 2022. Yeah, just two weeks after my PyCon talk.

12:09 Yeah.

12:11 It was frustrating not to be able to say anything. So I kept telling people, you know, oh, please watch our news, we have exciting stuff coming. But

12:21 we have more pictures. I'm sure. I know, I've seen them. Yeah. It's also worth pointing out that this whole project that you were on is a long-running project. Yes. When did it start? Yeah, the technology part started in the late '90s. And then actually starting to add more and more telescopes started in 2007. So it was about a 10-year process between the first telescopes that joined and the 2017 campaign, in which we finally had enough telescopes to take the data to make images. So about 10 years of building, and we're still building; we're still adding more telescopes. We've added three more since then. Yeah. So this is called the Event Horizon Telescope, and it's a planet-scale telescope. I mean, I just saw a bunch of news and cool pictures from JWST, the James Webb Space Telescope. Yeah, it's huge. But it's only, like, how many meters across? 30 feet? Some number of meters? 12 meters? It's not planet-sized. No. How do you build a planet-sized one? We use a very clever trick that is called interferometry, which actually won the Nobel Prize back in the '70s for Martin Ryle, who designed this technique. In fact, it would be great to have a telescope the size of the Earth, to build a giant Death Star in space that we could look with. It would save us so much trouble. But it turns out convincing funding agencies to give us that kind of money wasn't really possible. So we had to use this clever technique of interferometry. And what we do with our technique is we sync up telescopes that are at strategic locations around the Earth. They have to observe at the exact same time, and they're timed accurately by atomic clocks that are so precise they lose about a second every million years. And so these clocks, they're like a mini fridge. Yeah. Wow. Okay, yes, they're really good. They're very expensive, so we have to take really good care of them.
So there's one at each of our telescopes that time-tags the data as it arrives at each site. And then we record that data onto hard drives. And then we ship those hard drives to a central computer, in which all the data from all the telescopes are combined and aligned together to synthesize a virtual telescope the size of the Earth. So that's how we do it. That telescope is not physically the size of the Earth, but we synthesize it using telescopes that span the size of the Earth. Right. So kind of like little different sensors at, yeah, exactly, different locations, but you have to keep them in sync. That's the big problem, right? Yeah. Timing is everything for what we do. Yeah, yeah. And you think about how far off can you be? You're talking the speed of light; in a little microsecond you cover

15:00 a lot of distance in that time at the speed of light, right? That's right, yeah. We have to be extremely accurate, because any shift in time between signals means that we're no longer in the reference frame of the black hole. Everything that we see has to be in the reference frame of the black hole, so it all has to be exactly aligned at the same time. And there are many, many different aspects of the instrumentation of each telescope that need to be measured very accurately and corrected at the time of combining the signals. And then we also go through a very lengthy data processing step in which we correct all the additional effects. We also have to deal with the Earth's atmosphere, because we observe at such a high frequency in the radio, which is 230 gigahertz. We're dealing with very high frequency radio waves, and they're very affected by the atmosphere. So water vapor and instability in the atmosphere can scramble and destroy our signal. So we have to correct it after the fact: model the atmosphere at each site, remove it, and then average our data down and build signal-to-noise. And then finally, at the end of all that processing, is where imaging comes in. And then we can combine the data with our imaging software and reconstruct images that best fit the data. Yeah, you had a really cool visualization showing how, obviously, the Earth is curved. I liked how you tried to reinforce that. Yes, we really know that the Earth is round. Otherwise, it would make our job very weird.
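To put rough numbers on the timing discussion above, the sketch below works out how far light travels in one microsecond and how many cycles of a 230 GHz radio wave pass in that time. Only the 230 GHz observing frequency comes from the conversation; the function names are illustrative and are not taken from any EHT software.

```python
C = 299_792_458.0     # speed of light, m/s
OBS_FREQ_HZ = 230e9   # observing frequency mentioned in the episode


def light_travel_distance(seconds):
    """Distance light covers in the given time, in metres."""
    return C * seconds


def wavelength(freq_hz):
    """Wavelength of a wave at the given frequency, in metres."""
    return C / freq_hz


def timing_error_in_cycles(seconds, freq_hz):
    """How many full wave cycles pass during a timing offset."""
    return seconds * freq_hz


one_microsecond = 1e-6
print(light_travel_distance(one_microsecond))                # about 300 m
print(wavelength(OBS_FREQ_HZ))                               # about 1.3 mm
print(timing_error_in_cycles(one_microsecond, OBS_FREQ_HZ))  # about 230,000 cycles
```

A one-microsecond offset therefore corresponds to hundreds of metres of path and hundreds of thousands of wave cycles, which gives a feel for why the signals have to be aligned so precisely before they can be combined.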

16:27 This portion of Talk Python to Me is brought to you by Cox Automotive. Cox Automotive isn't a car company. It's a technology company that's transforming the automotive industry. The team at Cox Automotive understands the future of car buying and ownership, and they're looking for software developers, data engineers, scrum masters, and other experts to help make that happen. If you're interested in innovating with brands like Kelley Blue Book, Autotrader, and others, then you should check out Cox Automotive. Just visit talkpython.fm/cox to find out more. Thank you to Cox Automotive for sponsoring the show.

17:06 Yeah, we definitely have to take into account the curvature of the Earth because of this delay. Two telescopes far enough apart have to deal with the Earth's curvature, because then the signal doesn't arrive at the same time at each place. It arrives down here earlier than it arrives here. And so we have to take into account this curvature difference to make the signal be at the exact same time. And we do that at the supercomputer stage. Yeah. So maybe let's follow the data for a minute, because I think that's pretty interesting. So it starts by things hitting the different telescopes. How many did you say were involved? Yeah, eight telescopes at six different locations. Yeah, you've got these eight different telescopes, and they're all at different points on the Earth, so the signal is hitting them at different times. It's not a lot of time difference, but it's enough to make a difference in the light waves, or the radio waves. But then the next step gets pretty interesting, because it's not like we're using fiber or a gigabit connection to upload the data. You record it to a huge block of hard drives, and you physically send the hard drives, right? Yes, that's right. Yeah, they're basically just standard helium drives that we use, just stuck together, the kinds that are in laptops. And then we record our data onto the hard drives. They're split into random folders with some names that I can't decipher, but people who work at the supercomputers can. And then we pack them up, we drive them down each mountain, and then ship them, usually via FedEx or UPS, to our two central correlation facilities where we combine the data. One is at MIT Haystack in Boston, right here. And then one is in Bonn, Germany, at the Max Planck Institute for Radio Astronomy. So they get half the data and they get half the data. Yeah, there's one guy who used to be a police officer, who is now in charge of the logistics of this, right? Right.
Yeah, I like telling the story of Don. So Don Sosa used to be a police officer for the state of Massachusetts. And then, for the last 20 years, he has been dealing with the shipments for MIT Haystack Observatory. And all of the data goes through him, including the ones that go to Bonn. He's been doing this for years, and he is our main shipping guy for the EHT. And he's never lost a single package. That's incredible. Yeah, there was just one, I think, funny mishap that happened once, where we had a package incoming from the South Pole that was meant to be our South Pole equipment. And then he opened it and it was just a bunch of fabric that had been misdelivered. So that was a little bit of an issue. I don't remember what kind it was; I think it was just regular fabric. But then luckily, with the help of the shipping company, the mishap was found, and the fabric went to the rightful owner, and the equipment got returned to us.

19:52 Yeah, like they didn't get what they expected and we definitely didn't. That's right. How bad would it have been if you'd lost it?

20:00 Would it have been catastrophic, or would it have been just...? So it wasn't data; it was just a piece of equipment. It would have been expensive to get another one and go through the tests, and it would have probably delayed a couple of things. But other than that, it would not have been catastrophic. Yeah. The other thing I think is interesting is you said that you color-code the hard drives, with green being ready to record and empty, and red being full, and they've got to be processed. Yeah, we have little round stickers on our modules that hold the hard drives, and they're color-coded. Yeah, yeah. Yeah. Why not, right? You want to look over and see, like, oh, that pile is going out, not coming in.

20:37 And often at the sites, we have spare modules, just in case we have issues recording onto certain ones. So it's good to know which ones we've recorded on and which ones are spare. Yeah. So there's a lot of data here, which is why... Yeah,

20:53 yeah, if you had a couple of gigabytes, nobody wants to ship hard drives, right? They can break or get lost, and it's relatively slow, unless you have petabytes of data. And then all of a sudden, you realize this is actually a problem. I've been just doing some backups of all the podcast episodes and stuff that I did. And they're like 20 gigs per episode, and there's hundreds, 700 episodes. I've been uploading for days, for four or five days. I'm only through one of the podcasts, for part of the time. I'm like, this is insane. And I appreciate just how hard it is to send this much data. So when you're talking petabytes, and these are remote locations as well, often, right? So maybe not the best connection? Yeah, I think it would be pretty much infeasible to send data from our sites. The South Pole especially is our most remote location. And during the South Pole winter, the internet is not very good; we have to rely on some satellite connections that happen twice a day. And so the connections are only fast enough that it would take about 40 years to get all the data from the South Pole that we take during our campaign. So it's really just not feasible. Our founding director always says that you cannot beat the bandwidth of a 747 filled with hard drives. Because shipping is always on the backs. Yeah, that's incredible. But apparently true. Yeah. So we have a picture, right? We have this amazing picture of the black hole. We'll talk about the story of how we get it, but maybe you could just describe it for people. I'm sure everyone listening has seen it, but it's maybe been a couple of years. Yeah. So this is a picture of the M87 black hole, what we call the first image of a black hole. We released this picture in April 2019, so three years ago now. And this picture comes from our 2017 observing campaign. And it is of the central black hole inside the galaxy M87, which is 55 million light years away from us.
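The shipping-versus-uploading comparison above comes down to simple arithmetic, sketched here in Python. Only the 40-year figure comes from the conversation. The 1 PB data volume and the two-week shipping time are made-up placeholders chosen purely for illustration, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_DAY = 24 * 3600


def transfer_time_years(n_bytes, link_bits_per_second):
    """Years needed to push n_bytes through a link of the given speed."""
    return (n_bytes * 8) / link_bits_per_second / SECONDS_PER_YEAR


def effective_bandwidth(n_bytes, transit_seconds):
    """Average bits per second achieved by physically moving the data."""
    return (n_bytes * 8) / transit_seconds


# Hypothetical numbers, NOT from the episode: 1 PB of data, 14 days in transit.
data_bytes = 1e15
shipping_seconds = 14 * SECONDS_PER_DAY

# Link speed that would make a 1 PB transfer take 40 years (a few Mbit/s).
slow_link = (data_bytes * 8) / (40 * SECONDS_PER_YEAR)
print(slow_link / 1e6, "Mbit/s")

# Shipping the same drives in two weeks averages several Gbit/s.
print(effective_bandwidth(data_bytes, shipping_seconds) / 1e9, "Gbit/s")
```

Under these assumed numbers, a box of drives in transit for two weeks moves data roughly a thousand times faster than the slow link, which is the point of the line about a 747 filled with hard drives.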
So the light you're looking at in this picture is 55 million years old, which is pretty cool. And when it left the black hole, the first mammals were starting to appear on planet Earth. Wow. That's amazing. That's pretty cool. Yeah. So what we're seeing in the middle is this dark patch, which we call the black hole shadow. And there's this big ring of light around it. And the dark patch is where the black hole is, inside. And the reason why it's dark is because inside there's the black hole event horizon, which is actually a little bit smaller than the shadow. And the path of the light rays around the event horizon, which makes the darkness from which the light can no longer escape, makes it puff up. So the light rays graze a bigger area, which is why the shadow is actually bigger than the event horizon. And then this ring of light around it is very, very hot gas. So we're talking billions of degrees, hot gas swirling around the black hole, in either the accretion disk or the bottom of the famous jet. And this gas is swirling around so much that it's glowing, and it emits light in the radio, which are those radio waves that we observe. And so this is light that hasn't crossed the event horizon. It's not close enough to be pulled into the darkness and not be able to come out, but it's just close enough to show us this shadow. It illuminates the darkness. So there's all this mass that has been attracted, but it's whipping around, kind of like a bowl, right? The water is not just shooting through the hole; it's spinning. But it's spinning near the speed of light, and there's lots of friction and stuff, which makes it glow, right? That's right. Yeah. And what about the jet? Does the jet come from this accretion disk? Or does it somehow come out of the black hole? Or where does it come from? I mean, it can't, right? Yeah, it doesn't come out of the black hole, but it does come from pretty close to the black hole.
So this image in particular didn't teach us too much about the jet, other than that we can't see it in our image, because at the time we didn't have enough telescopes to see the jet; it's very, very faint at the frequency we're looking at. But our polarization image, which came out in 2021, showed a more spiral structure on top that traces the magnetic field around the black hole, and it showed us that the magnetic

25:00 fields around the black hole are in a spiral shape that twists enough, and are ordered enough, to be able to launch the jet. So they're coming from the rotation of the black hole, which makes the spacetime twist and the magnetic fields twist. And that is launching the jet. Okay, amazing. I was just thinking last night, getting ready for this and doing some of the research: you know, the black hole, light and all that stuff goes into it, all this matter, and it's completely dark. But inside, it must have an insane amount of light and energy somehow in there, right? Like, if you could get into it, it must be really bright. If none of the light ever created can ever leave, it must just be building up in there. It's hard to tell. It would be hard to explain what you would see inside. Once you fall in, the spacetime is getting stretched like crazy; the further in you go, the more stretched it gets, because it inevitably should end up in the singularity, in which all of space and time is compressed into a single point. So everything is shifted and spaghettified. And so even the light ahead of you, you wouldn't be able to see it, because it can never travel back in the direction where you are. So wherever you're falling, whatever is in front of you, you cannot see, because it cannot go out. It can only go in. And so, yeah. Maybe more like time stops and there's nothing, you know? Maybe it's not that. Yeah, if you look back behind you at the horizon, at the outside, you would see the entire history of the universe. Yeah, it's a very mind-boggling place. And I think people do get very creative about what could happen inside a black hole, what you would see. There are a lot of thought experiments around it.
But obviously, because of the physical boundary that is the event horizon, we can never know what is beyond. I often get this question in talks, like, well, what you show us is just the shadow of the black hole, it's not the actual black hole itself. Which I think is not a true question, because I don't think this statement is true: this is everything we can see of the black hole. It is the black hole in the only way we can see it. Whatever is inside, we can never see. So this is a picture of a black hole. This is it. This is as far as you can go. You cannot go further. I'm imagining the audience going, whoa, that is weird. It definitely is weird. I think we get a lot of very emotional reactions whenever we talk about the whole aspect of looking at this image and what it represents: this end of space and time, this point of no return, and this whole world of what is beyond it, what is outside of it. Very strange. All right, so back to following the data along. We've got petabytes of data collected onto these hard drives on site at these six different locations, and they get shipped to the two places, Cambridge in the US and Bonn, Germany. And then what happens? Yeah, then once all of the disks arrive, they get plugged into readout machines that read out all the data from all the different sites. And then we have supercomputers that align the data. It's a very complicated machine, but it takes into account the Earth's curvature, a model of the Earth, the positions of the telescopes, and different instrumentation effects that we measure at the sites, which then get put into this big model. And then the signal between each pair of telescopes is aligned precisely to maximize the detection.
And then once all of them are aligned... so what we record at the site is all of the signal the telescope receives, which includes signal from the black hole, but also the atmosphere, the surroundings, the air, the instrument and stuff. But because the only thing that the telescopes have in common is the black hole signal, the rest of the data doesn't, as we call it, correlate. So it is not seen in common. So at that point, a lot of the data that's noise we can get rid of, because we only care about what the telescopes see in common, which is what comes from space. And so then we can reduce the data from petabytes to the hundreds-of-gigabytes level, the terabyte-to-gigabyte level. And then at that point, after the correlation, the data are ready to move on to the data calibration stage. So we end up with data files per observing day of about maybe 100 gigabytes, 300 gigabytes. Kind of manageable for software, right? Yeah. You-could-transfer-it-over-the-internet sort of levels of data. Kind of, yeah. So usually we can put it on a common archive. That's where we have stored the correlated data. And then the data teams go in and get the data and then process them with their software. And we use cloud computing facilities to do data processing as well. It really speeds things up. I bet. These are like big supercomputers in those two locations, not like AWS or Azure; they're there at the locations. That's right. Yeah, yeah. They're basically computer clusters. Are these shared computers? Are they dedicated especially for this job, or are they more general? They're dedicated especially for dealing with

30:00 radio interferometry data. They're not necessarily used all the time by us; they're also used a lot by geodesy experiments. So geodesy is the science of studying tectonic plates on Earth. So they basically do the exact opposite of us: they use telescopes at different locations on Earth, looking at a common source that doesn't change. And that way, they can track the movement of the locations of the telescopes on Earth. And they do these observations many, many times, and that's where they all get combined. And in fact, geodesy is how they get the model of the Earth that we use in our experiment to correct for this movement of the telescopes. So there's a very nice cycle that happens at the same location. Yeah, yeah, that is a symbiotic cycle. That's great.
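The correlation idea described earlier in this segment (each telescope records mostly its own local noise, and only the sky signal is common to both) can be illustrated with a toy NumPy simulation. This is a minimal sketch under simplifying assumptions: real-valued samples, a whole-sample delay, and made-up signal levels. It is not how the actual correlators are implemented, and all names here are hypothetical.

```python
import numpy as np


def simulate_station_pair(n_samples, true_delay, signal_amplitude, rng):
    """Two stations see the same faint sky signal, offset by true_delay
    samples, each buried in its own independent local noise."""
    sky = signal_amplitude * rng.standard_normal(n_samples + true_delay)
    station_a = sky[true_delay:] + rng.standard_normal(n_samples)
    station_b = sky[:n_samples] + rng.standard_normal(n_samples)
    return station_a, station_b


def cross_correlate(a, b, max_lag):
    """Mean of a[i] * b[i + lag] for each trial lag in [-max_lag, max_lag]."""
    n = len(a)
    lags = np.arange(-max_lag, max_lag + 1)
    values = []
    for lag in lags:
        if lag >= 0:
            values.append(np.mean(a[: n - lag] * b[lag:]))
        else:
            values.append(np.mean(a[-lag:] * b[: n + lag]))
    return lags, np.array(values)


rng = np.random.default_rng(0)
a, b = simulate_station_pair(
    n_samples=200_000, true_delay=7, signal_amplitude=0.3, rng=rng
)
lags, values = cross_correlate(a, b, max_lag=50)
best_lag = lags[np.argmax(values)]
print(best_lag)  # the lag with the strongest correlation
```

The independent noise at the two stations averages toward zero in the product, while the common signal adds up once the trial lag matches the true delay. That is the sense in which everything that is not seen in common can be discarded after correlation.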

30:45 This portion of Talk Python To Me is brought to you by Sentry. How would you like to remove a little stress from your life? Do you worry that users may be encountering errors, slowdowns or crashes with your app right now? Would you even know it until they sent you that support email? How much better would it be to have the error or performance details immediately sent to you, including the call stack and values of local variables, and the active user recorded in the report? With Sentry, this is not only possible, it's simple. In fact, we use Sentry on all the Talk Python web properties. We've actually fixed a bug triggered by a user and had the upgrade ready to roll out as we got the support email. And that was a great email to write back: Hey, we already saw your error and have already rolled out the fix. Imagine their surprise. Surprise and delight your users. Create your Sentry account at talkpython.fm/sentry. If you sign up with the code talkpython, all one word, it's good for two free months of Sentry's business plan, which will give you up to 20 times as many monthly events as well as other features. Create better software, delight your users and support the podcast. Visit talkpython.fm/sentry and use the coupon code talkpython.

32:00 Is this calibration stage where Python starts to make its appearance? Right, yeah. So our data processing software is, I think, mostly built in, let's see, Fortran and C code, for more of the kind of heavier data crunching parts. And then there are a number of kind of post-processing tasks that do use Python. Especially, we have a kind of suite of post-processing tasks that make use of pandas a lot, because it offers a data structure that allows us to organize data corrections across, you know, kind of different columns of types of data. And that's been very helpful for us. You also get a lot of matrix multiplication, vector math? Yeah.
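As a rough illustration of why a column-oriented table suits this kind of post-processing, here is a minimal pandas sketch. The column names, station codes and gain values are all invented for the example and are not taken from the EHT pipeline; it only shows how per-station corrections can be looked up and applied to per-baseline rows.

```python
import pandas as pd

# Hypothetical visibility table: one row per (time, baseline) measurement.
vis = pd.DataFrame({
    "time": [0, 0, 1, 1],
    "station_1": ["AA", "AA", "AA", "AA"],
    "station_2": ["LM", "SM", "LM", "SM"],
    "amplitude": [2.0, 1.0, 2.2, 0.9],
})

# Hypothetical per-station amplitude gains (made-up numbers).
gains = pd.DataFrame({
    "station": ["AA", "LM", "SM"],
    "gain": [1.0, 0.5, 0.8],
})


def apply_gains(vis: pd.DataFrame, gains: pd.DataFrame) -> pd.DataFrame:
    """Divide each baseline amplitude by the product of its two station gains."""
    g = gains.set_index("station")["gain"]  # Series: station code -> gain
    out = vis.copy()
    out["gain_product"] = out["station_1"].map(g) * out["station_2"].map(g)
    out["amplitude_cal"] = out["amplitude"] / out["gain_product"]
    return out


calibrated = apply_gains(vis, gains)
print(calibrated[["time", "station_1", "station_2", "amplitude_cal"]])
```

Keeping the corrections in their own table and mapping them onto the data rows means each correction stays inspectable as a column, which is the kind of organization the answer above describes.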

32:44 Yeah, it's very helpful. And it has made our data processing a lot cleaner to do, thanks to that data structure. Now, this cycle, where you take the data, you factor out the atmosphere and other interference, and then you add in all the good parts of the signal, the positive parts of the signal from the different telescopes, you analyze it. You said that this cycle goes around a couple of times and takes like a year and a half or something, right? Yeah. So from the time we took the data, which was in April 2017, to the time we released the image in April 2019, about one and a half years of that time was spent on the data processing and the feedback loop between our correlation centers and the data processing team. I was part of the data processing team. We were quite a small team, working kind of round the clock on a pretty fast turnaround, just because we had kind of a lot of data issues identified at each site that needed to be run down, and then going back upstream, and then back down. And we also had a feedback cycle with the imaging teams, which looked at early versions of the data and then discovered issues, and then again, going back upstream. So there was a long process of really trying to fix all of the instrumentation parts that would contaminate the black hole data. And then eventually, we finally had a data set mature enough that would allow for actual imaging. And then once that was done, we were dealing with data files in the kind of 100 megabyte sizes. After data processing, we could really average our data down. So we're dealing with complex numbers. The data we get from the black hole are complex visibilities; they have an amplitude and a phase. Like 7 + 2i, complex in that sense? Exactly. Yes, yeah. And so because we have complex numbers, if we try to average numbers in which the phase parts don't align, we end up destroying our signal, because it's destructive interference.
So we needed to align the phases in order to average down our signal, to have constructive interference and build up signal. And that's really the key part of the data processing. That's what we do in order for the imaging teams to have the best data possible, and they ended up with kind of 100 megabytes of data. Then we move on to our imaging software. A quick question from the audience. What sort
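The effect of phase on averaging can be seen with a few lines of standard-library Python. This is a toy sketch with made-up numbers: the amplitude is held at 1.0 and the phase drift is known exactly, whereas in real calibration the drift has to be estimated from the data.

```python
import cmath
import math


def average(values):
    """Plain vector average of a list of complex numbers."""
    return sum(values) / len(values)


# Toy visibilities: constant amplitude 1.0, with a phase that drifts
# through one full turn over the averaging window (made-up numbers).
n = 8
phases = [2 * math.pi * k / n for k in range(n)]
raw = [cmath.rect(1.0, p) for p in phases]

# Averaging without aligning the phases: the vectors cancel out,
# so the averaged amplitude is essentially zero.
print(abs(average(raw)))

# Remove the (here exactly known) phase drift first, then average:
# the vectors add up and the amplitude is preserved.
aligned = [v * cmath.exp(-1j * p) for v, p in zip(raw, phases)]
print(abs(average(aligned)))
```

The first average collapses because the unit vectors point in evenly spread directions; the second keeps the full amplitude because every vector has been rotated back onto the same direction before summing.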

35:00 of wait times do you experience when you're iterating? Like at this step, when you're sort of working in, say, a Jupyter notebook, and you press Run, how long until you see the results? I guess, what's the experience of working like this? Is it a lot of batch processing and you walk away for lunch? Or is it small enough that you're getting immediate responses? Yeah. So I think outside of the Python packages is where we run most of the big data crunching. I think those parts would take a couple of days to run through our calibration pipeline, and the Python parts would take maybe a couple of hours. Then when we're running diagnostic Jupyter notebooks, it would not take long, because at that time we're basically in the 100 megabyte parts, just doing data checks, for which we make a lot of use of Jupyter notebooks, in fact, and matplotlib, to plot through our entire data steps and cross-check every part. We have all these diagnostic notebooks that look at our data. And those don't take much time to generate. I would say maybe 20 minutes would generate kind of a single suite from a Jupyter notebook. Sure. Very interesting. So most of the work is done by this supercomputer, in C and Fortran, and then you kind of are working with the raw answers in just numerical form, but you want them visually, or you want them correlated, or things like that? Yeah. When we're really looking to diagnose our data, to find instrumentation errors, then visual representations are extremely useful. And we've made a lot of use of kind of Jupyter notebooks, especially in our data processing checks, to make sure everything looks good at each step of the pipeline. One more audience question, then I'll move on. Mellie asks, do you do logging or any staging to keep track of the stages they're at and not in a loop?
Yeah, so each data pipeline has multiple stages, in which we have kind of checkpoints at each stage, where we have a suite of kind of Jupyter notebooks or a suite of plots generated, so that if there's an issue, we can go back to each stage and look at the outputs and make sure that we can identify where the issue has happened or where the failure has occurred. We have more on this also of every process that comes in. We also do very strict version control of each kind of software version. Every time we run something, we do have a kind of version-controlled package of all the software within one single pipeline run that we keep track of as well. That's really interesting, because you want to have reproducibility, right? Absolutely. And if maybe something minor changed about the way pandas does floating point operations, I'm just making this up, but you know, theoretically, over versions, it gives you a slightly different answer, right? So you want to have a really clear snapshot of the entire system. Right? That's right. And this was also very important for us when we released the papers for this image. We also released the version-controlled pipelines, both on the data side and on the imaging side, that contained, you know, the exact version, in kind of container form, of the software that we used and the scripts that we used to generate these exact results. Oh, awesome. So this won the Nobel Prize, right? It didn't.
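The idea of snapshotting the software used in a pipeline run can be sketched with the standard library alone. This is only an illustration of recording versions next to a run's outputs, not the mechanism the EHT pipelines use (the answer above mentions containers and version-controlled scripts); the function name and the package list are assumptions made for the example.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_versions(packages):
    """Return a dict describing the interpreter, platform and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


# Hypothetical package list; a real run would list whatever it imports.
snap = snapshot_versions(["numpy", "pandas", "matplotlib"])
print(json.dumps(snap, indent=2))
```

Writing a record like this alongside each stage's outputs makes it possible to tell later whether two runs that disagree were produced by the same software.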

38:11 No. I'm sorry, I don't know why I thought it did. No, we won the Breakthrough Prize in Physics in 2019. It won the hearts and the minds, though. You did point out in your talk how much this made a splash among people who are not scientists. Yeah, that was something that was very shocking for us. Because when we were building up the project, obviously, we had talked about the project in various media forums, and it was of interest in the astronomy community. So we knew there was some traction about the project, that people would be interested in the results. But we did not expect the magnitude at which this image would really go into pop culture, especially the fact that the next day after this release, it was on front pages of newspapers around the world. That's just really unusual for science results, especially around these times when, you know, most have

39:03 plenty of other stuff to occupy them. So it's definitely unusual. And you know, it ended up kind of trending number one on Twitter, which is, again, really surprising for science news. We did not have a trending campaign, really. We didn't prepare a hashtag. We were not really social media savvy people. So it was not planned. It just kind of organically happened. And then there were so many memes and references. And we were kind of referenced on talk shows and documentaries and news segments, so it was really beyond anything we had expected. And there was a lot of demand also to hear from us, not just from astronomy or physics, you know, academics, but also a lot of demand from planetariums and science camps and even museums. We did a bunch of talks at museums, which was really enjoyable and kind of reached a different audience. And that was all really cool. I think it was a really nice experience to kind of reach out to the community and make sure that people are part of it,

40:00 because these days, results happen in very big teams. And not just that: our teams are funded by taxpayer dollars. They build on resources from various communities, including, you know, the developer community in programming, and the instrumentation community, and, you know, all the giants in physics that came before us. So it's really not this kind of personal victory for a member of our team. It's kind of a global effort. And we want to make sure that we send a signal that teamwork is really a global thing, and that big science happens in big teams, and that everybody's playing a part in this in some way. Are you familiar with the Mars rover GitHub... not the rover itself, but the Mars badge, I guess it's called? No, I don't think so. People got this, the people who worked on it. Let me see if I can show you the badge. So for people who worked on the Mars lander, they got a special badge. I don't know where it is. Oh, really? I feel like there should be one of these as well. That would be cool. You're right. For the people that worked on this project. Yeah, like a little black hole badge. So I don't have it, but like a little badge. Like, here's the Arctic contributor one. There's a Mars rover one, but I can't get it up here. Let's make a black hole badge. Because so many of these open source projects are indirectly used, contributors to them could have a black hole badge. That would be a great idea. I would be happy to reach out to people and try to see if I can make that happen. That is really awesome. Yeah, indeed. I think, yeah, I don't know the username, and other people can just search for it. But anyway, I think that would be really cool. I think there should be a badge. But let's go and talk matplotlib for a minute. So here you are presenting this keynote. And you have, my favorite. Yeah, tell us about this picture that you've got here. And then we can maybe dive into how you know the picture that we see is actually the picture.
Yeah, this is my favorite plot ever, which is kind of underwhelming when you show it to most people. But people who are radio astronomers like me would be extremely excited seeing this. And this is really the first kind of depiction of M87 I saw after we finished the data processing pipeline. This is before anyone made any image, by the way. It was in May 2018, when we actually finished the very first run of the full pipeline in data calibration. And this was the endpoint, this plot. And what this shows is, on the y axis, there's flux density, which is basically the brightness of signal seen by pairs of telescopes. And then on the x axis, there is baseline length, which tells us kind of the separation between pairs of telescopes. So telescopes close together see a lot of signal in common, and telescopes far apart see kind of smaller signal in common, and depending on their location, they probe different sections of the image. And so what we see here is this kind of, how would I say, this kind of bump, a double bump, in the plot. And what this double bump actually tells us is, if you look at the dashed line, this is what we would expect if the image on the sky was a uniform ring. Uniform as in the brightness along the ring is the same everywhere. So it's just a closed ring, and the brightness is the same everywhere. And so it turned out that this bump structure resembled closely a uniform ring. And this was already really exciting, because we were already trying to imagine what the ring looks like and how big this thing is. This plot also tells us how big it is: it tells us this ring was actually 42 microarcseconds wide, without any image. And what was also really cool is that if you look at the first dip location, you will see that the data actually split into two levels. There is a lower level and an upper level.
And these correspond to two different directions on the sky that the stations are looking at. The lower level are looking at the source in the east-west direction, so across, and they're seeing a kind of dip there. The other ones are looking at it in the north-south direction, and they're seeing more signal. And what this difference tells us is that the ring is not uniform, but it's actually asymmetrical. In the north-south direction, there's one part, either north or south, that is brighter than the other side. If you look at our final image, it is brighter on the bottom than it is at the top. Whereas if you were to slice across, you will see it's the same. Right here, to slice across, it's the same. But if you were to slice like this, you have less signal up here and more signal down here. Already, the plot told us so much about what we're looking at. Yeah, yeah. So before you had the picture, you're like, oh, we think we're onto something here, right? Yes, it was super exciting. But then we were very careful, because we thought, okay, there's a lot we can tell from this plot, but let's not try to get ahead of ourselves. Because there are a lot of types of images that could be tweaked to mimic this kind of structure. And so that was kind of part of the next stages, in which we wanted to split our big imaging team into four independent teams. And each team worked independently for about seven weeks on analyzing the data by any means
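The dashed "uniform ring" curve has a simple closed form that can be sketched in a few lines. For an idealized, infinitely thin ring of angular diameter d and total flux F, the visibility amplitude at baseline length u (in wavelengths) is |F · J0(π d u)|, where J0 is the zeroth-order Bessel function. The sketch below uses only the standard library and evaluates J0 from its integral form; the thin-ring idealization, the unit total flux and the sample baselines are assumptions for illustration, and the real ring has finite thickness, so the measured curve differs in detail.

```python
import math

MICROARCSEC_TO_RAD = math.pi / (180 * 3600 * 1e6)
FIRST_J0_ZERO = 2.404825557695773  # first zero of the Bessel function J0


def bessel_j0(x, steps=2000):
    """J0(x) = (1/pi) * integral from 0 to pi of cos(x sin t) dt, by the midpoint rule."""
    h = math.pi / steps
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(steps)) / steps


def ring_amplitude(baseline_glambda, diameter_uas, total_flux=1.0):
    """Visibility amplitude of a thin uniform ring; baseline in units of 1e9 wavelengths."""
    d = diameter_uas * MICROARCSEC_TO_RAD
    u = baseline_glambda * 1e9
    return total_flux * abs(bessel_j0(math.pi * d * u))


def first_null_glambda(diameter_uas):
    """Baseline length (in 1e9 wavelengths) where the thin-ring amplitude first reaches zero."""
    d = diameter_uas * MICROARCSEC_TO_RAD
    return FIRST_J0_ZERO / (math.pi * d) / 1e9


# Amplitude versus baseline length for a 42 microarcsecond thin ring.
for b in (0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0):
    print(f"{b:4.1f} Glambda  |V| = {ring_amplitude(b, 42.0):.3f}")
print(f"first null near {first_null_glambda(42.0):.2f} Glambda")
```

Because the position of the first dip scales inversely with the ring diameter, a plot of amplitude against baseline length already constrains the size of the ring before any image is made, which is the point made in the answer above.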

45:00 they wanted, with any software they wanted, any decision they wanted. And we came together after seven weeks at a workshop to compare the images. And each team submitted a single image. And it turns out that the images between the four teams were very, very similar to each other. And so that was really exciting. And everybody got a ring of the same size, 42 microarcseconds, and brighter on the bottom, even though there were kind of some small differences across the images that came from the user-based choices in the software. But the main structure was there for everyone. And I think that was one of my favorite days in the collaboration, because we had so much fun that day. It was such a big relief, because we were using so many different techniques, kinds of imaging software, that we weren't sure we would all get the same answer. So we were really relieved that we did. And then, you know, at the end of the day, we all went drinking at a bar, and it turned out it was karaoke night. So we all sang karaoke together. It was such a lovely day. Were there any space-themed songs, like Hello, Major Tom, or any of these things? We went for fun. Ah, there you go. Yeah.

46:06 That was a massive. Yeah, well, we changed the lyrics to Black hole shadow. If you're taking petabytes of data and running them through a bunch of algorithms until they come down to megabytes, or you said the final picture is actually like less? Yeah, a megabyte, yeah. Then, you know, is what you're seeing actually what you're seeing, or is that artifacts that have built up along the way, right? So this must have been really comforting, to try these different algorithms and go, they all come up with the same picture. Yeah, I think we were confident about what the data were telling us. But we wanted to make sure that we understood how our software was responding to the data. So, you know, from a kind of objective perspective, when we looked at the data plots, we knew what they were telling us. But we wanted to make sure that we looked through every avenue of possible ways of creating images that look similar to the data. And in fact, it's really very hard to create images that don't look like what we have. And so no matter how much we tweaked our software, we would always end up with something very similar to the final image. But we still wanted to put our software through this test, into a kind of process in which we would test it. And in the end, the final stage ended up being trying to find what is the best kind of combination of our software choices that gave us the best image, because we generated so many images with user-based choices. And they ended up pretty similar to each other, with the same base structure. But how do you pick one image? So that was kind of the dilemma we had at the end: we didn't know how to pick a single image. So we decided to create these kind of fake data sets that mimic our data, in which we knew the real truth images, and they looked like rings or disks or double sources.
And then we made our software go through thousands of combinations of these parameters, and kind of tested how well they could reproduce all four of them, all four images, and then we ranked them. And then number one was the best image for each software. And then the final image, you know, the famous image of a black hole, is actually just the average of the three software images, the three best images from the software. And this is also only one image of the black hole. This is only April 11. We observed M87 for four days in 2017. So we have four images of this black hole from 2017. And they all look the same. And they all went through separate observing days, kind of separate processing, and always gave the same answer. Y'all are pretty sure that this is the picture. This is the radio astronomy picture. Yeah, this is the picture at 230 gigahertz. Yeah, exactly. There are also pictures of that general space with Hubble and some other things. Yeah. Yes, this is a really fun part of our campaign also, because obviously, the purpose of our experiment is to learn about black holes. I mean, these objects are super fascinating. They're kind of mind boggling, in how they exist and what they do to galactic environments. And these jets are enormous, and they pierce through entire galaxies. And yet they come from a very tiny part of the galaxy itself. And so in really understanding black holes, it's obviously really useful to look at one up close, but we need to combine that knowledge with how it looks across all colors, all wavelengths. And so we had a big multi-wavelength campaign at the same time as these observations, across lots of different instruments. So we had instruments in the high energy, in X-ray, infrared, and optical, and also other radio wavelengths. So we had a lot of partnerships with other observatories to make that happen. And we learned a great deal about the black hole.
And it's through that process. Yeah, it seems that black holes, supermassive black holes, are a fundamental part of galaxies almost, right? Absolutely. They live in the centers of most galaxies, and M87 is one of the special ones, because it has this big jet coming out of it. So it's very nice to get a look at one that has a jet up close. And actually our
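A toy version of testing software choices against synthetic data with a known truth might look like the following. Everything here is a stand-in chosen for illustration: the tiny ring and disk images, the noise level, the `toy_reconstruct` function (box smoothing plus a floor) and the mean-squared-error score are assumptions, not the imaging codes, synthetic data sets or metrics the teams used.

```python
import itertools
import random


def make_truth(kind, n=16):
    """Return a tiny n-by-n truth image: a ring or a filled disk."""
    c = (n - 1) / 2
    img = []
    for y in range(n):
        row = []
        for x in range(n):
            r = ((x - c) ** 2 + (y - c) ** 2) ** 0.5
            inside = 4 <= r <= 6 if kind == "ring" else r <= 5
            row.append(1.0 if inside else 0.0)
        img.append(row)
    return img


def add_noise(img, sigma, rng):
    """Add independent Gaussian noise to every pixel."""
    return [[v + rng.gauss(0, sigma) for v in row] for row in img]


def smooth_once(img):
    """3x3 box average, using only the neighbours that exist at the edges."""
    n = len(img)
    out = []
    for y in range(n):
        row = []
        for x in range(n):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(n, y + 2))
                    for i in range(max(0, x - 1), min(n, x + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def toy_reconstruct(data, smooth, floor):
    """Stand-in for an imaging code: smooth `smooth` times, zero pixels below `floor`."""
    img = data
    for _ in range(smooth):
        img = smooth_once(img)
    return [[v if v >= floor else 0.0 for v in row] for row in img]


def mse(a, b):
    """Mean squared pixel difference between two images of equal size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


rng = random.Random(0)  # fixed seed so the survey is repeatable
truths = {kind: make_truth(kind) for kind in ("ring", "disk")}
datasets = {kind: add_noise(img, 0.3, rng) for kind, img in truths.items()}

# Score every parameter combination by its total error over all truth images.
scores = []
for smooth, floor in itertools.product((0, 1, 2), (0.0, 0.2, 0.4)):
    total = sum(mse(toy_reconstruct(datasets[k], smooth, floor), truths[k]) for k in truths)
    scores.append((total, smooth, floor))

ranking = sorted(scores)  # lowest total error first
for total, smooth, floor in ranking[:3]:
    print(f"smooth={smooth} floor={floor:.1f} total_error={total:.4f}")
```

Scoring each combination against several different truth shapes, rather than only rings, is what keeps the selected settings from simply favouring the answer one hopes to see.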

50:00 Milky Way black hole, Sagittarius A*, is one of the very common, boring ones. And so the fact that we can have both of them and look at both of them is kind of special, because it gives us kind of two very different classes of black holes that end up looking very much the same. Sagittarius A*'s image looks a lot like M87. And this is even though they're totally different black holes. They differ in mass by a factor of about 1500. And our Milky Way is a spiral galaxy, right, but the M87 galaxy is elliptical, and it's much older and much bigger than our Milky Way. And M87 has a jet, and Sagittarius A* doesn't seem to have one. So they are two very different black holes. And yet, once you get up close, the effect of gravity is so strong that it's really all you see. And it's so cool that they look the same. It is amazing. I guess it's worth just pointing out just how large the M87 black hole is. Oh, yes, the shadow of the M87 black hole could fit our entire solar system very easily. You know, the orbit of Pluto would be about in the middle, in the kind of mid-range; it would fit inside the shadow. And Voyager 1, which is the furthest human-made object, would be just at the edge of the shadow right now. As in, anything that were there would get sucked back in. But if it were there, it would be at the edge. Yeah, so also 6.5 billion solar masses. That's right. Yeah, really large. That's a lot. It's actually one of the heaviest black holes in our universe. I think the current heaviest is about 10, I think 10 billion solar masses. Maybe a little more.

51:38 Yeah, that was one of the larger ones. Yeah. All right. Well, we're getting a little short on time. So let me close this out with a bit of a future-looking question. And I'll give it to EcoBlue from the audience. So what's next for EHT? I'll ask it that way. Yeah. So since our 2017 campaign, we've kept observing. We observed in 2018, where we added one telescope and increased our bandwidth more, so we can have more sensitivity and cooler observations of magnetic fields. We also observed in 2021, which was a kind of mid-COVID observation, which was a very strange one. And then in 2022, we added two new stations after that, so we're now up to 11 with EHT, and we're trying to expand our array to have more telescopes. So there's currently an effort called the Next Generation Event Horizon Telescope. So the EHT so far only borrowed telescopes that just happened to be at the right locations, that were actually built for completely different science programs, just radio telescopes that were able to observe at the frequency we wanted and were at the right place. So we didn't have much control over the locations of the telescopes. And so in order to see more of our image, for example, to see the jet in M87, we need a lot more telescopes. So the next generation EHT plans to have more control over where telescopes are going to be put. So we're planning to get smaller dishes, because we have the sensitivity now with the current dishes, at the locations where we need them, where we have holes in our virtual mirror, to recover this structure. And eventually we'd like to observe more often. So now we only observe two weeks a year; we want to be able to observe every two weeks or every month. And eventually we could make movies of M87, and also, if we have enough telescopes, we will be able to make movies of Sagittarius A* as well.
And that will really show us the connection between the black hole, the gas swirling around it, its jet, and the magnetic field. I think we could learn so much about black holes. You'd see so precisely the speed of the, yeah, the accretion disk and how it evolves. Yeah. And that's about a 10 year timescale. Then in the far future, in kind of 20 to 30 years, members of our team are currently actively thinking about putting satellites into space, since we're running out of Earth's diameter. To make bigger telescopes, there are only two ways to make your telescope bigger: either make the separation bigger, or go to higher frequency, so we have higher resolution. And because of the Earth's atmosphere and the size of the Earth, we're limited in both. But by going into space, we'd go above the atmosphere, so you can go to higher frequency. And you can have larger distances. So by putting satellites into space, we could have enough resolution to maybe see up to 10 or even 50 more shadows of black holes, which would be really cool. And then we'd have amazing, sharp images of the two that we know and love, to really test theories of gravity, which could be terribly exciting. But of course, the technology to do that is not there yet, but it is moving forward. And it's something we're thinking actively about, getting there. Now, the thing you pointed out was that all the results you got from this so far still back up the general theory of relativity, right? That's right. Yeah, Einstein seems to be still correct. Wow, cool. The shadows are still very circular, which is what Einstein's theory predicts, but we need much sharper images to see bigger differences. We were able to rule out some theories, but there are still some theories that survived. So we need to get

55:00 to sharper and sharper images and a stronger and stronger test. I think we're moving in that direction. Our first image has really just opened this new laboratory of images of black holes. So it's gonna be an exciting new field. Yeah, this is possible now. Let's build on it, right? Yeah, absolutely. Awesome. All right. Well, I think we'll leave it there for the black holes. But congratulations to you. And I know that also to all the other people; often you point out that you're just the representative of this great big team. So yeah. I only play a small part. It's really a big team effort and a community effort. And everybody is part of it. Yeah, indeed. Now, before you get out of here, let me ask you the final two questions I always ask. If you're gonna write some Python code, what editor do you use these days? What editor? Oh, my God. Super boring things. deatta. Oh, perfect. Awesome. Yeah, very boring. If there's any Python package or library you want to give a shout out to, maybe one that was central to this work? Now, you already mentioned pandas and matplotlib. Anything else that was really useful here? Yeah, there was a really cool nested sampling algorithm that was developed by a Harvard grad student, called dynesty, that was super useful for our modeling work and really kicked off a lot of that analysis. So I'd like to shout that out, too. And it's open source. Yeah, I'll put a link to it in the show notes for people. Alright. Well, Sara, thank you so much for coming on the show and sharing your work. As you saw, it's been an inspiration to everyone around the world. So thank you. Thank you for having me. You bet.

56:25 This has been another episode of Talk Python To Me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show. Join Cox Automotive and use your technical skills to transform the way the world buys, sells and owns cars. Find an exciting position that's right for you at talkpython.fm/cox. Take some stress out of your life. Get notified immediately about errors and performance issues in your web or mobile applications with Sentry. Just visit talkpython.fm/sentry and get started for free. And be sure to use the promo code talkpython, all one word.

57:02 Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play and the direct RSS feed at /rss on talkpython.fm. We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
