
#385: Higher level Python asyncio with AnyIO Transcript

Recorded on Thursday, Sep 29, 2022.

00:00 Do you love Python's async and await, but feel that you could use more flexibility or higher-order constructs, like running a group of tasks and child tasks as a single operation, streaming data between tasks, combining tasks with multiprocessing or threads, or even async file support? You should check out AnyIO. On this episode, we have Alex Grönholm, the creator of AnyIO, here to give us the whole story. This is Talk Python to Me, episode 385, recorded September 29, 2022.

00:42 Welcome to Talk Python to Me, a weekly podcast on Python. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy, keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython. We've started streaming most of our episodes live on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode. This episode of Talk Python to Me is brought to you by Compiler from Red Hat, an original podcast. Listen to an episode of their show as they demystify the tech industry over at talkpython.fm/compiler. It's also brought to you by us over at Talk Python Training, where we have over 240 hours of Python courses. Please visit talkpython.fm and click on Courses in the navbar.

01:32 Transcripts for this and all of our episodes are brought to you by AssemblyAI. Do you need a great automatic speech-to-text API? Get human-level accuracy in just a few lines of code. Visit talkpython.fm/assemblyai. Alex, welcome to Talk Python to Me.

01:47 Thank you.

01:48 Yeah, it's fantastic to have you here. You have so many cool open source projects out there. We're here to talk about AnyIO, but I've actually covered several of them on Python Bytes, the other podcast that I run. We talked about sqlacodegen and typeguard, and I didn't associate those with you specifically. And now back over to AnyIO. So yeah, a lot of cool projects you've got going on there.

02:10 Yeah, too many, actually. I managed to hand over a couple of them to other people, cbor2 and sphinx-autodoc-typehints, because I'm really stretched thin at the moment. I barely have time for all of the projects that I'm maintaining.

02:26 I can imagine.

02:28 How do you juggle all that?

02:30 I'm sure you have a full time job and you have all these different projects. Right. How do you prioritize that?

02:35 Yes, basically I get the equivalent of writer's block from time to time. When that happens, I either don't try to code at all, or I just switch to another project.

02:47 Yeah, true. If you're talking about type hints versus async programming, if you're stuck on one, you're probably not stuck on the other, right? Yeah. Interesting. Well, we're going to have a lot of fun talking about all of them. I doubt there's going to be any writer's block or speaker's block here. Podcaster's block. It'll be good. We'll have a good time chatting about it and sharing with everyone. Before we get to that, though, let's hear your story. How did you get into programming and Python?

03:09 I got into programming at the age of eight. It was on an MSX-compatible machine. I started with BASIC, as so many others did. I did some simple text-based games at first, just playing around. At some point I got a Commodore 128, and I did some simple graphical demos with it. Then I got an Amiga 500.

03:38 Oh, yeah, the Amigas were cool. They were special.

03:41 Yeah. I dabbled in AMOS Basic, then other kinds of tools also. I don't really remember that much of it. Then at some point I did something with a Mac; this was a Mac Classic. There was this tool called HyperCard. It's a precursor to Flash, basically. I did some things with it, simple games and whatnot. Skipping forward a bit, I got into PC programming, like C++, mostly C. Then I think it was in the latter half of 2005. No, actually it was much earlier.

04:22 I started with Perl. Hated it.

04:25 Then I think the next step was in 2005 when I got to learn PHP. Hated that too, and kept searching. Then finally in 2007, I got to know Python, and that was love at first sight, really. At that point it was, I think, Python 2.5. And of course, I stuck with it. I did some Java professionally for a while, but I never really got to love it. It had this corporate industrial feeling to it.

04:59 So, I'm not dressed up enough to program in Java today. Let me go get my tie. I'll be right back.

05:04 Yeah, Python was really cool when I started learning it. I made my first practical application within 30 minutes of starting to learn. It's really staggeringly easy to learn. That's one thing I love about it.

05:20 It really is. It's one of the few languages you can be really successful with while having only a partial understanding of what's going on. Right. You don't even have to know what a class is or what modules are. You can just write a few functions in a file and you're good to go.

05:34 It's almost like it's English.

05:36 Yeah, very cool. Raul, out in the audience, says third time's the charm. The third language, you found the one you like there. Excellent. And how about now? What are you doing these days?

05:44 I've been working for several years on the project.

05:49 A very complicated project. This is always the hard part, describing it. It's a sort of working wellness application. I'm part of a bigger team. I'm the lead backend developer. It collects IoT data and visualizes it, and it provides all sorts of peripheral services around it. This is the first time I've really had to spread my wings with databases.

06:15 Oh, yeah. Okay. What technologies are you using there?

06:17 On the back end, we use TimescaleDB, which is a PostgreSQL extension.

06:22 Okay.

06:23 This is for storing the time series data. Then on the back end, we use my framework called Asphalt. I don't know if you've encountered that one. I think it's really cool, but it's not in wide use.

06:38 It's not a web framework per se. It's more like a generic framework where you can compose applications from a mix of premade components and custom made components.

06:49 What was it called?

06:50 Asphalt.

06:51 Asphalt, like the road? Yes.

06:55 No. I did a search, and I just found a snake. A python thing.

07:02 The snake on some asphalt road? Yeah. You got to be careful here. Okay. Is it a little bit like Flask, or what makes it special?

07:10 Well, Flask is a web framework. This is a generic framework. You can build any kinds of applications with this. It doesn't have to be involved with web.

07:17 I see.

07:19 HTTP. So it could be UDP, or it could be just...

07:23 It doesn't even have to do networking at all. You can build command line tools with it. You can just have this mix of components and the YAML configuration to give settings to them all. I think this really would require a whole different session.

07:39 It does sound like it would be a whole different session, but this is news to me and very interesting.

07:44 Yeah, I haven't really either touched much. I'm working on version five at the moment, which does incorporate AnyIO support, and it brings the tech up to date with the current standards.

07:57 Okay. Yeah. This looks very async-based. It's an asyncio-based microframework for network-oriented applications. It says it's built upon uvloop, which is how all the good Python async things seem to be backed these days. So it has a lot of modern Python features. It's got asyncio, it's got uvloop, it's got type hints, those sorts of things. When you started in Python in 2007, none of those existed.

08:25 How do you see the recent changes to Python in the last five years or so?

08:29 I would say that Python has been developing at an incredible speed. I really love it. So much useful stuff is coming out with every release.

08:37 I agree. Yeah.

08:39 Basically from 3.5 to 3.8 or so, there were just so many amazing features that came out, and now we're seeing these libraries built upon them. Right, right. Well, let's transition over to our main topic, which is what I reached out to you for, not realizing that the other two interesting projects I already gave a shout-out to are also yours. We'll get to those if we've got time. So with Python 3.4, asyncio was introduced, the actual framework that supported this, and then it really came into its own in Python 3.5, when the async and await keywords were added to the language and Python came out of the box with great support for async programming. But then there are these other libraries that developed on top of that to make certain use cases easier or add new capabilities. And AnyIO falls into that realm, right? Yeah.

09:34 So before we talk about AnyIO, we should talk about Trio. Have you heard about Trio?

09:39 Yes, I have heard about Trio. I even had Nathaniel on the show, but it's been a little while.

09:46 It's been a while. That was back in 2018. I talked to Nathaniel Smith, so there's probably quite a few changes since then. Actually, let's talk about trio.

09:54 Yeah, actually, the latest version of Trio was released just yesterday. The thing about AnyIO is that it's an effort to basically bring the Trio features to asyncio land. Trio is fundamentally incompatible with asyncio. There is a compatibility layer called trio-asyncio, but it's far from perfect. So what AnyIO does, really, is allow developers to add these features to their asyncio applications and libraries one by one, without making a commitment. For example, at my work, I use AnyIO for just a handful of tasks. I think we should talk about the features. But one thing I should say, to dispel any confusion, is that Trio and asyncio are both top-level async frameworks, in that they provide an event loop.

10:55 Right.

10:55 And AnyIO does not. So it's kind of a meta Async framework.

11:01 I see. So it builds on top of these different frameworks, right?

11:04 Yeah. So it builds on top of these frameworks and their underlying primitives.

11:09 Yeah. So for people who are not familiar, Trio adds things like this concept of grouped tasks. Normally in asyncio, you start one task, you start another. They're kind of unrelated, even if they're conceptually solving parts of the same problem. With Trio, you can do things like create what's called a nursery, and then you can have them all run, or you could potentially cancel unstarted tasks. And there are other coordination types of operations as well. Right. That's the kind of stuff that Trio adds. Yeah.

11:40 So the point of AnyIO is, as I said, to bring these Trio features to Async IO.

11:47 Right. Because when you do Trio, it's an end to end Async stack, which means things have to be built for Trio. Right. It's like, if I have, let's say, httpx, I don't know how easy it is to integrate those kind of things that are expecting an Async IO event loop over into Trio. I already have my own event loop running. Like, it's hard to coordinate the tasks. Right.

12:10 If you're talking about httpx, it had Trio and asyncio backends. Now it defaults to the AnyIO backend, so it runs by default on both. About AnyIO features: it provides Trio-like task groups on top of asyncio. Here we should mention that Python 3.11 has its own concept of a task group, but the mechanics are quite a bit different. That requires a bit of explaining.

12:40 Yeah. How's it work here?

12:42 The thing is, the asyncio task groups, so not this one but the standard library task groups, which are in Python 3.11, basically just start normal asyncio tasks, and you can cancel individual tasks using the task object that comes out of the create_task method. What sets AnyIO task groups apart from asyncio task groups is the way cancellation is done. Since AnyIO was designed based on Trio, when you do start_soon, it doesn't return any task object that you can cancel. Instead, cancellation is done via so-called cancel scopes, and each task group has its own cancel scope. If you cancel that, you basically cancel all the underlying tasks. But it goes even deeper. This is a bit complicated, so bear with me.

13:40 Cancellation is not done on a per-task basis, but on a per-cancel-scope basis. You can have cancel scopes nested, so that if you start a task and it opens a cancel scope, you can just cancel that scope and it cancels everything up to that point.

13:57 Okay, so like, if I call a task, if I create a task, and then somewhere inside for it to do its job, it also creates a task those can be grouped into the same basic scope. Right. So there's not these children tasks running around.

14:12 Yeah, you don't even have to start another task. If you cancel a cancel scope, then anything you await on is automatically canceled, bam. This is called level cancellation, in contrast to the edge cancellation mechanism employed by asyncio. In edge cancellation, you cancel the task once and a CancelledError is raised in the task. You can ignore it, which is, by the way, a bad thing to do, but then the task won't be canceled again. Usually. There are exceptions, which are a topic of debate in the community. Cancel scopes basically define boundaries for cancellation. So if you cancel a task group's cancel scope, only the tasks started from that task group are canceled. This is a context manager, basically. When all the tasks are done, however they ended, unless some raised exceptions, which is a different situation, so if they were either canceled or successful, then the code just goes forward to the "all tasks finished" part.

15:32 This is really neat. The other thing that's standing out here, as I think about these task groups: for those of you listening, you just create an async with block to create the task group, and then in there you can just say task_group.start_soon and give it a bunch of async functions to start running. One of the things that's cool about this is that it automatically waits for them all to be finished at the end of that context manager.

15:56 The standard library task groups work the same way, actually.

15:58 Okay. And those are in 3.11.

16:00 We'll see how the mechanism will work. There's a new mechanism for cancellation called uncancellation of tasks. It's not really battle-tested. It's something that was added fairly late in the game to 3.11, so it's not yet clear if there are edge cases where it fails totally. This is also a debated topic in the community.

16:25 True. The other thing here that I want to ask you about is that you don't say create_task, you say start_soon. What's this uncertainty about? Tell us about that.

16:38 Okay, so start_soon. It actually does the same thing as create_task, because creating a task doesn't start running it right away. It only starts running, perhaps, on the next iteration of the event loop.

16:51 Right? Or maybe not. Maybe the event loop's all backed up, maybe it takes a while. Right?
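The "soon" part can be seen with plain asyncio. This is a small standard-library sketch (the `child` coroutine and the log messages are invented), assuming the default task factory rather than an eager one:

```python
import asyncio


async def child(log: list) -> None:
    log.append("child ran")


async def main() -> list:
    log: list = []
    task = asyncio.create_task(child(log))  # scheduled, not yet running
    log.append("after create_task")  # runs before the child gets a turn
    await task  # yielding to the event loop lets the child run
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```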

16:55 Yeah. So it's basically the same as loop.call_soon. You schedule a callback. That's all that tasks are: callbacks with bells and whistles. start_soon is modeled on Trio. But I should mention that there's also a method called start, which works a bit differently. This showcases the start method. This is very, very useful, and this feature is not present in the standard library task groups. Basically, it's very useful for starting a background service when you need to know that the task has actually started before you move on.

17:33 So an example that you have in the docs here is you create a task group and the first thing is to start a service that's listening on a port. The next thing is to talk to that service on the port. Right. And if you just say kick them both off, who knows if that thing is actually going to be ready by the time you try to talk to it.

17:48 Exactly. This is something I use in practice all the time.

17:51 And this is different than what you would get just with the asyncio create_task or whatever, right?

17:56 Yeah. And even the new task group feature doesn't have this.

18:03 This portion of Talk Python to Me is sponsored by the Compiler podcast from Red Hat. Just like you, I'm a big fan of podcasts, and I'm happy to share a new one from a highly respected open source company: Compiler, an original podcast from Red Hat. Do you want to stay on top of tech without dedicating tons of time to it? Compiler presents perspectives, topics, and insights from the tech industry, free from jargon and judgment. They want to discover where technology is headed beyond the headlines and create a place for new IT professionals to learn, grow, and thrive. Compiler helps people break through the barriers and challenges, turning code into community at all levels of the enterprise.

18:39 One recent and interesting episode is their "The Great Stack Debate". I love, love, love talking to people about how they architect their code, the tradeoffs and conventions they chose, and the costs, challenges, and smiles that result.

18:52 This Great Stack Debate episode is like that. Check it out and see if software is more like an onion or more like lasagna, or maybe even more complicated than that. It's the first episode in Compiler's series on software stacks. Learn more about Compiler at talkpython.fm/compiler. The link is in your podcast player show notes. And yes, you could just go search for Compiler and subscribe to it, but follow that link and click on your player's icon to add it. That way they know you came from us. Our thanks to the Compiler podcast for keeping this podcast going strong.

19:27 So I guess in the standard library you could start it and then you would have to just wait for it to finish and then you would carry on. But it's like two steps, right?

19:37 The workaround would be to create a future, pass that to the task, and then wait for the future. So it's a bit cumbersome, and then you have to remember to use a try/except in case that task happens to fail. Otherwise you end up waiting on the future forever.
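The workaround described here might look like the following standard-library sketch. The `service` coroutine, the `fail` switch, and the error message are hypothetical; the point is the try/except that forwards a startup failure to the future so the caller does not wait forever.

```python
import asyncio


async def service(started: asyncio.Future, fail: bool) -> None:
    try:
        if fail:
            raise RuntimeError("could not bind port")
        await asyncio.sleep(0.05)  # pretend to set up the service
    except Exception as exc:
        # Without this, the caller would wait on the future forever.
        if not started.done():
            started.set_exception(exc)
        raise
    started.set_result("listening")  # signal that startup succeeded
    await asyncio.sleep(3600)  # keep serving


async def main(fail: bool = False) -> str:
    started = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(service(started, fail))
    try:
        return await started  # wait until the service reports in
    finally:
        task.cancel()
        await asyncio.gather(task, return_exceptions=True)


if __name__ == "__main__":
    print(asyncio.run(main()))
```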

19:56 I really like this idea. Now, the other thing that I don't see in your examples here, where we're creating a task group and starting these tasks and waiting for them to finish, is management of the event loop. If I was doing it with my code, I'd probably have to create a loop and then, you know, call some functions on it. And here you just use AnyIO. Where's the event loop being managed?

20:21 I'm not sure what you mean.

20:23 Well, like a lot of times when you're doing Async stuff you have to go and actually create an Async event loop and then use the loop directly. And you're working with a loop for various things.

20:35 Well, there's that run command at the bottom.

20:38 Yeah. Okay, so basically you just say anyio.run

20:41 Or asyncio.run.

20:43 Okay. Even though I'm using the AnyIO task groups, I can still mix and match this with more standard asyncio event loops.

20:53 That's the premise. So you can just like ease into it.

20:58 So for example, if I have a FastAPI web app and, you know, FastAPI is in charge of managing the event loop and I've got an async API endpoint, I could still go and use an AnyIO task group and get all the benefits in there. Okay, that's beautiful.

21:15 I should mention that FastAPI also depends on AnyIO.

21:18 Oh really? Okay, yes. Interesting. Yeah. I've seen Sebastián Ramírez talking about some little functions that he wrote, and he's like, I would love to see these just get back into AnyIO. I didn't realize that FastAPI itself was using it. Yeah. Okay. So very useful. We've got these task groups. We've got the concept of cancellation. Another one that's not exactly cancellation, but is sort of cancellation, is timeouts. Do you want to talk about timeouts?

21:46 Yeah, I meant to talk about that. As I recall, in Python 3.11 there's a similar construct. I think it was with timeout, or asyncio.timeout, or something similar. I don't remember, really. But what this move_on_after does is create a cancel scope with a timeout. Basically, this is a very, very practical use of cancel scopes. What it does is start a timer, and after 1 second, it cancels this scope, so anything under that gets canceled. In this case, just the sleep call gets canceled, and then the task just keeps going.

22:27 So the way you described it before, it sounds like if there was a bunch of tasks running, if any of them tried to await something, they're also going to get canceled. Is that right?

22:35 In this case, you mean? No, only the part that is within the with block.

22:40 Right, well, that's what I mean. But what if you had done multiple tasks within the move_on_after, right? I tried to talk to the database to insert a record, and I tried to call an API, and the database timed out.

22:50 Within a single move_on_after block, you can only have one thing going on, which is the await here. So even if you start multiple tasks from that task group, they are not enclosed within that cancel scope. I realize that cancel scopes are a complex and difficult concept, and I don't think I can adequately explain them, but I hope that this will shed at least some light on that.

23:16 Yeah, it's a really cool idea, because your code is async, so it's not as bad as if you were to lock up waiting for an API call or something that's going to time out on the network eventually after a really long time. But still, you don't want it to clog up your code, right? You want to just say, sorry, this isn't working.

23:36 Yeah. One place where I often use constructs like this is finalization. When you are closing things up, you can use this move_on_after to set a timeout for closing resources.

23:50 Yeah, that makes sense because you want to be a good citizen in terms of your app and release the resources as soon as possible, like a database connection or a file handle. But if it's not working, I made a try at it after a second, we're done.

24:04 Exactly. And I should also mention that this is where AnyIO's biggest caveat lies: it is in finalization. I often run into problems with cancel scopes. Because the thing with cancel scopes is that when you run code within a cancel scope and that scope gets canceled, then anything awaiting on anything within that cancel scope is always canceled, time after time. So you cannot await on anything as long as you are within the cancel scope. And asyncio code is not expecting that. It might have a finally clause where it does await, say, connection.close(), but that also gets canceled if you are within an AnyIO cancel scope. It's one of the biggest practical issues with AnyIO right now, and we are trying to figure out a solution for that. Just something to keep in mind when you are writing AnyIO stuff.

25:04 Yeah, that is tricky, right? Say I'm going to try to call this web service and await it. If it fails, what you probably want to do internally is close the network connection as well, right? But then you try to await closing the network connection.

25:19 So what happens there? Does it eventually just get cleaned up by the garbage collector, or do you...

25:25 Well, the garbage collector doesn't work that well with async stuff, because destructors could be called in any thread. So you can't rely on that. You can't do any async callbacks in the destructor, so it's a good idea not to try any of that. Instead, just raise a ResourceWarning. If you're writing AnyIO-aware code, you would have either a shielded cancel scope or, better yet, a move_on_after with shield=True.

25:54 What does that do?

25:55 It at least temporarily protects the current task from cancellation.

26:00 So let's say you have move_on_after with, say, five and shield=True. It means that even if the outer cancel scope is cancelled, your actual task will keep running until it exits this cancel scope or the timeout expires. So you have a five second window to close any resources that need closing.

26:22 Got it. You could do that, say, within your exception handler or something, right? Yes.

26:27 Or actually, I think finally block might be the best place to do that. But depending on your use case, of course.

26:34 Of course. Okay, very interesting. So all this stuff about task groups and scheduling tasks and canceling them, that's very Trio-esque, but it's also just a small part of AnyIO. There's a bunch of other features and capabilities here that are probably worth going into, some cool stuff about taking asyncio code and converting it to threads, or converting threads to asyncio, and similarly for subprocesses. But let's maybe just talk real quick about the synchronization primitives. These are things like events and semaphores. Maybe not everyone knows what events and semaphores are in this context. Give us a quick rundown of that.

27:13 Yeah, well, these are pretty much the same as they are in asyncio. Many of them just use the asyncio counterparts straight up. Events are a mechanism for telling another task that something significant happened and it needs to react to it. It's often used to coordinate tasks, so that one thing doesn't happen before something else has happened in another task.

27:38 Yes, there might be two tasks running, and one says, I'm going to wait until this file appears, and the other one's going to eventually create the file, but you don't know the order. So one option is to just do polling, like, well, I'm going to asyncio.sleep for a little while and then see if the file is there, try to access it, and do that over and over. A much more responsive and deterministic way would be to say, I'm going to wait on an event to be set, and the thing that creates the file will create the file and then set the event, which will kind of release that other task to carry on. Right, okay.

28:12 Right. Moving on: semaphores are a mechanism for saying that you have a limited number of something, let's say a connection pool, and you want to specify that whenever some part of the code needs access to this resource, it needs to acquire the semaphore. So you set a limit, and each acquisition of the semaphore decrements the counter. When you hit the limit, it starts blocking until something else releases it.

28:46 Right. So people might be familiar with thread locks or the asyncio equivalents, where you say only one thing can access this at a time, so you don't end up with deadlocks or race conditions and so on. Semaphores are kind of like that, but they allow multiple things to happen. Say maybe your database only allows ten connections, or you don't want it to have more than ten connections. So you could have a semaphore that says it has a limit of ten and you have to acquire it to talk to the database. That doesn't mean it stops multiple things from happening at once. It just doesn't let it become a thousand at once. Right, exactly. I really like this idea. I was showing some people some web scraping work with asyncio where it's like, oh, let's go create a whole bunch of httpx requests, or whatever type of request, something asynchronous talking to some servers to download some code. And if it's a limited set, no big deal, but if you have thousands of URLs to go hit, well then how do you manage not killing your network or overloading that? And the semaphore actually would be perfect. So for people listening, the way that you do it is you create a task group and then you just pass the semaphore to start_soon. That's really clean, and then AnyIO takes care of just making sure it gets access and then runs and then gives it back. How's that work?

30:02 I'm not sure what sort of answer you are expecting. But as I recall, the current implementation is actually using the underlying async library's events.

30:14 Okay.

30:14 So there are actually methods to acquire and release the semaphore. It just implements an async context manager that acquires it at the beginning and releases it at the end. There's an event involved for notifying any waiting task that the semaphore has a slot available.

30:34 It's super clean. And the fact that you don't have to write that code, you just say the Semaphore is associated with this task through your task group. I really like it, actually.

30:42 Semaphores are not associated with a particular task. That's what capacity limiters are for.

30:47 Okay.

30:47 So you can release a semaphore from another task, while capacity limiters are bound to the specific task that you acquired them in.

30:56 Okay.

30:57 Yeah.

30:57 In the semaphore example, how was it being passed? Oh, yeah, it's just being passed as an argument, isn't it, to the task, and it's up to the task to use it. I see. Okay. So this other concept of a capacity limiter sort of does that.

31:11 Yeah. So this is from Trio. It's very similar to a semaphore. You can actually set the borrower, but in most circumstances you want the current task to be the borrower. And limiters are actually used in other parts of AnyIO, as they are in Trio, for example to limit the number of threads that you allocate or the number of subprocesses that you spawn.

31:38 Sure. You don't want too many subprocesses. If you've got 1000 jobs and for each job you just start a subprocess, you're going to have a bad time.

31:45 Right. So, as the documentation says, they are quite like semaphores, but they have additional safeguards.

31:53 Such as?

31:54 Well, they check that the borrower is the same.

31:57 Okay, yeah, that makes sense.

31:58 So by default, they check that the tasks used for acquiring and releasing are the same.

32:04 Yeah. Nice. Okay. Well, I didn't know about capacity limiters. That's fantastic. I love the idea. Okay, let's jump over to, you talked about the threads and the subprocesses. Let's talk about this thread capability that you have here. This is very nice. anyio.to_thread. What is this? Yeah.

32:23 So this is also modeled on Trio. It's basically AnyIO's way of doing worker threads. In asyncio, you have these thread pool executors that do the same as run_sync. asyncio's API is somewhat problematic because you have basically two methods.

32:46 I forget the older one. The newer one is called to_thread. The first one was run_in_executor, or whatever it was. They both have their own issues. to_thread doesn't allow you to specify any thread pool, so it always uses the default thread pool. And there's no way to add that to the API because of the manner in which it was done. Then the older function does have this parameter at the front, but the problem is that it doesn't propagate context variables, unlike the newer to_thread function. Context variables, if you don't know about them, are a fairly recent addition to Python.

33:26 Yeah. What are those?

33:27 They are basically thread locals. Are you familiar with thread locals?

33:31 Let me make a comment for people who don't know. Thread-local variables, which also exist in Python, allow you to say I'm going to have a variable, maybe it's even a global variable, and it's thread local, which means every thread that sees it gets its own copy of the variable, where it points to and what value it has. That way you can initialize something at the start of a thread. If you have multiple threads, they can all have their own copy so they don't have to share it. But that falls down with asyncio event loops, because when you await, all that stuff is running on one thread, just the one that's running the loop. What you're talking about is that equivalent, but for asyncio, right? Yeah.

34:13 So context variables are a much more advanced concept. Basically, as you said, thread locals for async tasks.
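To make the "thread locals for async tasks" idea concrete, here is a minimal sketch using only the standard library's contextvars and asyncio modules. The variable name request_id and the helper functions are made up for this illustration. Both tasks run on the same thread, yet each one sees only the value it set itself.

```python
import asyncio
import contextvars

# Hypothetical variable, named only for this illustration.
request_id = contextvars.ContextVar("request_id", default="unset")


async def handle(name, seen):
    # Each task runs in its own copy of the context, so this set() is
    # invisible to the other task even though both share one thread.
    request_id.set(name)
    await asyncio.sleep(0)  # yield to the event loop so the tasks interleave
    seen[name] = request_id.get()


async def main():
    seen = {}
    # gather() wraps each coroutine in its own task.
    await asyncio.gather(handle("a", seen), handle("b", seen))
    # The parent task never called set(), so it still sees the default.
    return seen, request_id.get()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

A plain threading.local would not work here, since both tasks share the event loop's thread and would overwrite each other's value.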

34:21 That sounds very tricky. I've thought about that. I have no idea how to implement that. So that's pretty cool, I guess. Python right now, yeah.

34:27 The thing is the event loop, when it starts running, when it switches between tasks, it runs that callback within the context that the task is tied to. And when you start to work with worker threads, then you need to see the same context variables in that worker thread. Right. So this has been somewhat of a problem because the older method in asyncio for running worker threads doesn't propagate these variables, but the newer function to_thread does. But then you can't specify which thread pool you want to use.
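The difference between the two asyncio functions can be seen in a short sketch. It uses only the standard library (asyncio.to_thread needs Python 3.9 or later); the variable current_user and the function read_user are hypothetical names for this example, and the behavior described is what a standard CPython build would be expected to show.

```python
import asyncio
import contextvars

# Hypothetical variable, named only for this illustration.
current_user = contextvars.ContextVar("current_user", default="nobody")


def read_user():
    # Plain synchronous function; it runs in a worker thread.
    return current_user.get()


async def main():
    current_user.set("alice")
    loop = asyncio.get_running_loop()
    # Older API: lets you pass an executor (None means the default one),
    # but the function runs without the caller's context.
    via_executor = await loop.run_in_executor(None, read_user)
    # Newer API: copies the caller's context into the worker thread,
    # but always uses the default executor.
    via_to_thread = await asyncio.to_thread(read_user)
    return via_executor, via_to_thread


if __name__ == "__main__":
    print(asyncio.run(main()))
```

On a standard build, the executor call reports the default value while the to_thread call reports the value set by the calling task.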

35:07 Right. Okay. So it's similar to the built in one, but it gives you more capability to determine where you might run it.

35:13 Yeah. When you call run_sync, it allows you to specify a limiter, and by default it uses the default limiter, which has a capacity of 40 threads.

35:23 40?

35:24 Yeah. That seems like a pretty good default. Much more than that and you end up with memory and context switching issues.

35:30 Yeah, it was arbitrary, so I just followed suit.

35:37 Sure. Yeah. So basically if you've got some function that is not async, but you want to be able to await it, you basically run it on a background thread here. You just say anyio.to_thread.run_sync and you give it the function to call, and now you can await it and it runs on a background thread in this thread pool, which is really nice.

35:59 Yeah, it works.

36:00 So one thing that I've noticed a lot in the API here for AnyIO: for the synchronous functions, it's completely obvious why you wouldn't call them directly, but even when creating tasks, what I'm noticing is that even for the async functions, you pass the function name and then the arguments to start_soon, as opposed to calling the function, passing the arguments, and getting a coroutine back. Why does it work that way? It seems like it would make it a little less easy to use things like, say, type hints and autocomplete and the various niceties of calling functions in editors.

36:38 Yeah, there was a good reason for that. I can't remember it offhand, but at the very least it's consistent with the synchronous counterparts, like run_sync.

36:48 Yeah. Cool. Alright, so we have this stuff about threads, and you have to_thread and also from_thread, which is nice. What does from_thread do?

36:57 So when you are in a worker thread and you occasionally need to call something in the event loop thread, then you need to use from_thread.run to run stuff on the event loop thread.

37:08 Right. Because the worker method is not async, otherwise you would just await it. Right. It's a regular function. But if in that regular function you want to be able to await a thing, you can kind of reverse back. That's an interesting bidirectional aspect. All right, subprocesses. We all know about the GIL, how Python doesn't necessarily love doing computational work across processes. Tell us about the subprocess equivalent.

37:34 This is a relatively easy way to both run tasks in a subprocess and open arbitrary executables to run asynchronously. asyncio has similar facilities for running async processes. But these subprocess facilities are not really up to par with, say, multiprocessing, which has some additional niceties like, I think, shared queues and other synchronization primitives. But they are still pretty useful as they are.

38:09 Yeah. So basically you can just say anyio.to_process.run_sync and you give it a function, and now you can await that in a subprocess. Multiprocessing.

38:20 Yeah, there are usual caveats like that because they don't share memory. Then you have to serialize the arguments and that could be a problem in some cases.

38:30 Sure. So basically it pickles the arguments and the return values and sends them over.

38:35 Yeah, it could even be that the arguments are picklable but the return value is not. Which obviously causes some confusion.
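The caveat about return values can be illustrated without starting any subprocess at all. The sketch below uses only the standard library's pickle module as a stand-in for the serialization step; the helper can_cross_process_boundary and the function make_lock are hypothetical names, and this is not how AnyIO itself performs the check.

```python
import pickle
import threading


def can_cross_process_boundary(value):
    # Anything sent to or from a worker process has to survive pickling.
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


def make_lock(count):
    # The argument (an int) pickles fine, but the return value does not.
    return threading.Lock()


if __name__ == "__main__":
    print(can_cross_process_boundary(3))             # argument
    print(can_cross_process_boundary(make_lock(3)))  # return value
```

With a function like make_lock, a call such as await anyio.to_process.run_sync(make_lock, 3) would be expected to get as far as running the function and then fail when the result has to be sent back.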

38:44 Yeah, absolutely. People may hear that pickling is bad and you can have all sorts of challenges like code injection and whatnot from pickling through security stuff. This, I would think is not really subject to that because it's you calling the function directly.

39:01 Right. It's like completely just there's no way.

39:04 To inject anything bad.

39:05 Exactly. It's all the multiprocessing that's handled in it anyway, so it should be okay. Yeah, cool. Another one that is pretty exciting has to do with file support. So we have open in Python, you would say with open(something) as f, but there's no async equivalent, right? Yes.

39:25 These file facilities are really just a convenience that wraps threads around these file objects. In case somebody wants to know, there is no actual async file I/O happening, because that's not really a thing on Linux, and even on Windows it has terrible problems.

39:44 Okay, so what's happening with this? anyio.open_file, it opens a...

39:48 File in a thread, and then this starts an asynchronous context manager that on exit closes the file.

39:55 It also closes the thread, I guess. Right?

39:57 Well, it uses throw away threads, basically from the thread pool.

40:01 All right, I see. So it'll use a thread to open the file and get the file handle and then throw away the thread. And then does it do something similar?

40:09 Oh, return the thread to the pool.

40:10 Yeah, return to the pool. Which is much better than creating it and throwing it away completely.

40:17 Also, that read call is done in the thread.

40:20 Okay. So it's sort of a fancy layer over top of the thread pool. But it's really nice. You write basically exactly the same code that you would write with regular open and a context manager, but the async version. Although I gotta say, the opening part of creating a context manager here, you see async with, which people are probably used to, and then async with await open_file, which is a bit of a mouthful.

40:45 The reason for that is because you can just do await open_file and then go about your business and then manually close the file.

40:53 Got it.

40:53 I'm not sure if there's a more convenient way to do this. I might be open to adding that to any IO. But for the moment, this is how you do it.

41:01 Yeah. You probably wrap it in some kind of class or something that's synchronous, but then has an enter. But yeah, I'm not sure if it's totally worth it, but yeah, that's quite the statement there. And the other area that's interesting about this is if you want to loop over line for line, instead of doing a regular for loop, you can do an Async for line in file and then read it asynchronously line by line. Right? Yeah.

41:25 So the __anext__ method just gets the next line in the worker thread. Yeah.
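The pieces discussed here fit together roughly as in the sketch below, assuming AnyIO is installed. The file name and contents are made up; anyio.open_file, async for over the file, and aclose() are AnyIO's API.

```python
import os
import tempfile

import anyio


async def main():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notes.txt")

        # "async with await": open_file is awaited first, and the returned
        # async file object is then used as an async context manager.
        async with await anyio.open_file(path, "w") as f:
            await f.write("first\nsecond\n")

        # Reading line by line; each line is fetched in a worker thread.
        lines = []
        async with await anyio.open_file(path) as f:
            async for line in f:
                lines.append(line.rstrip("\n"))

        # Without the context manager: open, use, and close manually.
        f = await anyio.open_file(path)
        try:
            contents = await f.read()
        finally:
            await f.aclose()
    return lines, contents


if __name__ == "__main__":
    print(anyio.run(main))
```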

41:31 This is fantastic. So if people are working with files, they definitely can check this out. And also related to that is you have an asynchronous pathlib Path.

41:40 Yeah. That's a fairly recent addition.

41:42 That looks great. So you have, like, an async iterdir, you have an async is_file, an async read_text, all built into the path, which, you know, is like what's built into the regular Path, but asynchronous. Yeah. Quite nice. We're getting kind of short on time here. What else would you like to highlight here that's really important?

41:59 I would like to highlight the streaming framework here because it's one of the unique things in AnyIO.

42:05 Okay, yeah, let's talk about it.

42:06 Trio has its channels and then it has sockets and whatnot, but AnyIO has something that Trio doesn't have. And really, asyncio has some kind of streaming abstraction, but not quite on this level. In AnyIO, we have a streaming abstract base class hierarchy. We have object streams and byte streams. The difference between these is that object streams can carry anything, like integers, strings, any arbitrary objects. Byte streams carry only bytes, and they are modeled according to TCP. With byte streams, you can send a number of bytes or receive bytes, but they might be chunked differently. These are abstract streams. So you can, say, build a library that wraps another stream. Say you build an SSH client that creates a tunnel and it exposes that as a stream. So long as you are able to consume a stream, you don't have to care how the stream works internally or what other streams it wraps. A good example of a stream wrapper is the TLS support. That's right there. So TLS support in AnyIO can wrap any existing stream that gives you bytes, whether it's an object stream or a byte stream. So it does the handshake using the standard library. Actually, the standard library contains a sans-IO protocol for TLS.

43:41 Okay.

43:41 Sans-IO protocol. If you are not aware of it, it's basically a state machine without any actual I/O calls. It's a very easy protocol implementation that lets you add whatever kind of I/O layer you want on top of that. So this is what AnyIO uses to implement TLS. So you have both the listener and the connect and the TLS wrapper. So any kind of TLS, you can even do TLS on top of TLS if you like. Not that that's useful, but you can do it.

44:10 It's super encrypted.
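The sans-IO idea can be seen directly in the standard library's ssl module. The sketch below is only an illustration of the concept, not AnyIO's own code: it starts a client handshake against in-memory buffers, with no socket involved. The function name and the hostname are placeholders.

```python
import ssl


def start_client_handshake(hostname):
    context = ssl.create_default_context()
    incoming = ssl.MemoryBIO()  # bytes received from the peer go in here
    outgoing = ssl.MemoryBIO()  # bytes to send to the peer come out here
    tls = context.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # The state machine has written its first handshake message and
        # now needs the peer's reply before it can continue.
        pass
    # Whatever I/O layer sits on top decides how these bytes get sent.
    return outgoing.read()


if __name__ == "__main__":
    data = start_client_handshake("example.org")
    print(len(data), "bytes ready to send")
```

Because the TLS object never touches a socket, the bytes it produces can be carried over any stream that transports bytes, which is what makes wrapping arbitrary streams possible.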

44:12 Yeah, this is very flexible. We have all sorts of streams, even these unreliable streams which are modeled based on UDP.

44:21 Oh, really?

44:22 Okay, so UDP is implemented by using these unreliable streams. I don't think there are any more unreliable stream implementations besides UDP, but yeah, I'm really proud of this streaming class hierarchy. And it's too bad that there are no cool projects to show off this system. But maybe in the future we will. Another thing that I would like to highlight is this system of typed attributes that are really useful in practice.

44:52 Yeah, before we move on to typed attributes, let me just ask you really quickly. Can I use these streams and these bidirectional streams where you create, like, a send stream and a receive stream? Can we use those across, like, multiprocessing or to coordinate threads? I'm sure we could use it for threads, right?

45:10 Sadly not.

45:12 This memory object stream, in Trio they call it channels. This is one of the most useful pieces of AnyIO. Really, I use this every day at work. So these are basically queues on steroids.

45:25 Yeah, exactly. That's what I was thinking.

45:27 Unlike asyncio queues, you can actually close these. You can clone them. So you can have multiple tasks waiting to receive something, like workers. You can have multiple senders.

45:39 I see like a producer consumer where some things are put in, but there's like five workers who might grab a job and work on it.

45:45 You have five consumers and five producers all talking to each other.

45:48 Yeah.

45:49 And then you can just iterate over them. Also not possible with queues. You can close the streams so that when you iterate on them, if all the other ends are closed, then the iterator just ends.

46:03 Oh, wow. Okay. So if the send stream goes away, then the receive stream is done at the end. Yeah, that's fantastic. Actually, I really like that.

46:12 Yeah, I think this is also coming.

46:14 To go ahead, sorry. Okay.

46:15 I think it is also coming to the standard library.

46:18 Nice. All right, the last thing we have time for here that you wanted to highlight is typed attributes. What's the story of typed attributes?

46:26 If you know about asyncio extras, I think they have them both in protocols and streams. For example, with TLS, you want to get the certificate from the stream. Like if you have negotiated the TLS stream, you want to get the client certificate. Right. So this is a typed way to do it. You can add any arbitrary extra attributes to a stream, just declare them in the extra_attributes property. But the niftiest part here is that it can work across wrapped streams. A very good example is that, say, you have a stream that is based on HTTP, you have an HTTP server and you have access to a stream, let's say WebSockets. Then you want to get the client IP address. Well, usually you may have a front end web server like NGINX at the front. Normally what you get when you ask for an IP address, you actually get the IP address of the server. What you need to do is look at the headers. This is something you can do transparently with these typed attributes. So basically a stream that understands HTTP, you can have that handle the request for the IP address, the remote IP address, and have it look at the headers and look for the forwarded header and return that instead.

47:45 Nice. Let me see if I got this right here. So people are probably familiar with Pydantic, and Pydantic allows you to create a class and it says what types are in the class and the names and so on. And those serialize out a JSON message. It sounds to me like what this is built for is when I'm talking binary messages over a stream, like a TCP stream, I can create a similar class that says, well, I expect a message that is a string and then a float read that out of the stream. Is that right? Yeah.

48:17 A good example here is if you go back to the streams part text streams.

48:22 Okay.

48:23 So this is something that translates between bytes and strings on the fly. So this is a perfect trivial example of a stream wrapper.

48:32 Okay. Yes. You have a text receive stream that will do the byte decoding. Very nice.
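A small sketch of that wrapper in action, assuming AnyIO is installed. It feeds a memory object stream with the two UTF-8 bytes of a single character, deliberately split into separate chunks, and lets TextReceiveStream put them back together. The chunking is contrived for the example.

```python
import anyio
from anyio.streams.text import TextReceiveStream


async def main():
    send, receive = anyio.create_memory_object_stream(max_buffer_size=2)
    # Wrap the bytes-carrying stream; the wrapper yields str instead.
    text_stream = TextReceiveStream(receive)
    async with send, receive:
        # One two-byte UTF-8 character (U+00E5), split across two chunks.
        send.send_nowait(b"\xc3")
        send.send_nowait(b"\xa5")
        # The wrapper keeps reading until it can decode something.
        return await text_stream.receive()


if __name__ == "__main__":
    print(anyio.run(main))
```

The consumer of text_stream never needs to know whether the bytes came from memory, a socket, or a TLS tunnel.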

48:38 And you don't have to care like what's downstream of that. And even if you have three layers on top, you can just still ask for say, client remote IP address. If there's a network stream somewhere downstream, that's the stream that will give you your answer.
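The proxy scenario described earlier could be sketched as follows, assuming AnyIO is installed. TypedAttributeSet, typed_attribute, TypedAttributeProvider, extra_attributes, and extra() are AnyIO names; the attribute set ClientAttribute, both stream classes, the header name, and the addresses are hypothetical and only model the lookup, not real network streams.

```python
from anyio import TypedAttributeProvider, TypedAttributeSet, typed_attribute


class ClientAttribute(TypedAttributeSet):
    # Hypothetical attribute, declared with its type.
    remote_ip: str = typed_attribute()


class FakeSocketStream(TypedAttributeProvider):
    """Stands in for a network stream that knows the peer address."""

    @property
    def extra_attributes(self):
        return {ClientAttribute.remote_ip: lambda: "10.0.0.1"}


class ProxyAwareStream(TypedAttributeProvider):
    """Stands in for an HTTP-aware wrapper sitting behind a reverse proxy."""

    def __init__(self, wrapped, headers):
        self._wrapped = wrapped
        self._headers = headers

    def _remote_ip(self):
        forwarded = self._headers.get("X-Forwarded-For")
        if forwarded:
            return forwarded.split(",")[0].strip()
        # No header: defer to whatever the wrapped stream reports.
        return self._wrapped.extra(ClientAttribute.remote_ip)

    @property
    def extra_attributes(self):
        # Pass through the wrapped stream's attributes, overriding one.
        return {
            **self._wrapped.extra_attributes,
            ClientAttribute.remote_ip: self._remote_ip,
        }


if __name__ == "__main__":
    sock = FakeSocketStream()
    proxied = ProxyAwareStream(sock, {"X-Forwarded-For": "203.0.113.7"})
    print(sock.extra(ClientAttribute.remote_ip))
    print(proxied.extra(ClientAttribute.remote_ip))
```

The caller asks the outermost stream the same question in both cases and does not need to know which layer answers it.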

48:54 Oh, yeah, the stream work here is really nice. There's a lot of things that are nice. The task groups, the coordinating limits, like with the capacity limiter, a lot of cool building blocks on top of it. And also the fact that this runs against or integrates with regular asyncio means you don't have to completely change your whole system in order to use it, right? Yeah, very cool. Alright, well, I think we're about out of time to talk about AnyIO. Do you want to take 30 seconds and just give the elevator pitch for sqlacodegen? This is a really exciting project that you created here.

49:34 Yeah, this is one of those side projects that are on the verge of a major release. So what this does is it connects to an existing database, reflects the schema from that, and then writes model code for you. The next major version even supports data classes and other kinds of formats.

49:57 The SQLModel one is most exciting for me because that'll give you Pydantic models that use SQLModel, which is very exciting. Yeah, so nice. What else? Think if you're a consultant, or you talked about Java earlier, I imagine you've got like a Java code base and you want to do a proof of concept in Python and SQLModel or SQLAlchemy. And somebody says, well, why don't you try building a simple version that talks to our database. And if that thing has like 100 tables and complicated relationships, it's no fun to sit down. A big portion of that project might be just modeling the database. And here I can just say sqlacodegen, connect to Postgres, boom, out come SQLAlchemy classes. That's a huge jumpstart for getting started. Or if you're a consultant jumping into a new project.

50:45 Yeah, exactly. If you have a really large database, this will save your time, at least hours.

50:53 Yes.

50:54 And a lot of frustration. Right. Because with, like, SQLAlchemy, you've got to have the model match the database just right and this will do that for you.

51:03 Okay. Super cool project. Typeguard is another one you have, a super complicated but interesting capability to grab on to.

51:10 Yeah, also one of those that are on the verge of a major release. I certainly have not had enough time to finish the next major version. And then there's of course Python 3.11, which brings a whole bunch of new features that I have not yet been able to incorporate into Typeguard. And sadly, I also have not started using it myself.

51:39 It's a sad story, really.

51:40 Okay.

51:41 But the premise is that you have this pytest plugin, you activate it during the test run, and then in addition to static type checking your application, which you usually do with mypy, Pyright, or what have you, you can also do runtime type checking, because the static tools don't always see the correct types.

52:04 You might not be in control of that. Right. You might write a library. Your library might have types declared on what it's supposed to take. The person consuming your library has no requirement to run mypy. And they have no requirement to make sure what they typed matches what you said you expect. And because they're hints, they're not compiled options. In Python at Runtime, you might get something you don't expect, even though you put a type there.

52:30 Yeah, exactly. So this is how you get the Runtime assurance that you have the right type there.

52:37 Nice. So all you do with this Typeguard library is you put an @typechecked decorator on a function.

52:42 The best way would be to use the import hook. Okay. So there's an import hook that will automatically add these decorators while doing the import, so you don't have to alter your code at all. Interesting, there are some open issues with that import hook. Like somebody reported that this import hook is installed too late, that the modules in question were already imported. So that's something I have yet to fix or find a workaround for. Sure, that's the idea.

53:11 Right. So you can either use the decorator and be somewhat guarded about how you're doing it and only apply it to certain parts, like, say, your public API, or you could say install_import_hook, and then everything that gets imported gets wrapped in the typechecked decorator. And what that does is it looks at the type hints and the declared return value and will raise an exception. If you say that your function takes an integer and it's passed a string, that becomes a runtime error.

53:38 Right. Or you can just issue a warning.

53:41 Sure. The warning may be nice, but I think it's pretty cool you can opt into having Python type hints become enforced, basically. Yeah.
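To show what "type hints become enforced" means in practice, here is a deliberately simplified stand-in written with the standard library only. It is not Typeguard's implementation and handles only plain classes as annotations; the decorator name check_types and the function double are made up. With Typeguard itself you would apply its typechecked decorator instead.

```python
import functools
import inspect
import typing


def check_types(func):
    """Toy runtime checker: only understands plain classes as annotations."""
    hints = typing.get_type_hints(func)
    signature = inspect.signature(func)
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            if signature.parameters[name].kind in variadic:
                continue  # *args and **kwargs are skipped in this sketch
            expected = hints.get(name)
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        result = func(*args, **kwargs)
        expected = hints.get("return")
        if isinstance(expected, type) and not isinstance(result, expected):
            raise TypeError(f"return value must be {expected.__name__}")
        return result

    return wrapper


@check_types
def double(n: int) -> int:
    return n * 2


if __name__ == "__main__":
    print(double(2))
    try:
        double("2")  # a str where an int is declared
    except TypeError as exc:
        print("rejected:", exc)
```

Without the decorator, double("2") would quietly return "22"; with it, the mismatch surfaces at the call site.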

53:50 What I use this for is in Asphalt, when I accept the configuration for a component, I use this decorator, or rather an assert, to check that the types are correct so I don't throw any mysterious type errors or value errors further down the line or even worse at Runtime.

54:11 Sure. Okay. That's cool. There is one of the features of Asphalt or highlights is Runtime type checking for development and testing to fail early when functions are called with incompatible arguments and can be disabled for zero overhead. So it sounds like you're maybe doing an import hook in development mode that's handling all this for you. Is that right?

54:33 Well, actually, in this current version, I'm using the assert. So in case you didn't know, asserts are normally run when you start Python without any switches, but if you run Python without the debug mode, then asserts are not compiled into the byte code. So just by using the switch, you can disable these potentially expensive asserts.

54:54 Yeah. Okay. I didn't know that. I'm familiar with that from C and C# and other compiled languages with their pragmas and that type of thing, but didn't realize that about Python asserts.

55:04 Yeah, there's this one thing that actually if you have this code, if __debug__, and there's a block of code under that, that whole block gets omitted from the compiled code if you run Python with the debug mode disabled.
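Both behaviors can be observed with a short standard-library sketch that runs the same snippet twice, once normally and once with the -O switch, which is the switch that turns debug mode off. The snippet text and the helper name run_snippet are invented for the example.

```python
import subprocess
import sys

SNIPPET = """
if __debug__:
    print("debug block ran")
assert False, "assert ran"
print("reached the end")
"""


def run_snippet(*flags):
    # Run the snippet in a fresh interpreter with the given switches.
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    normal = run_snippet()
    optimized = run_snippet("-O")
    # Normal mode: the debug block runs, then the assert fails.
    print(normal.returncode, normal.stdout.strip())
    # With -O: the debug block and the assert are both compiled out.
    print(optimized.returncode, optimized.stdout.strip())
```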

55:20 Okay. Yeah. Very cool. All right, Alex. Well, those are some cool additional projects. I feel like with sqlacodegen we almost could spend a whole bunch of other time on it. Another one is APScheduler, which again could almost be its own show, but we're out of time for this one. So thank you so much for being here. Now, before you get out of here, I've got the two final questions to ask you. If you're going to write some Python code, what editor do you use?

55:43 I've been looking at the different editors available, and so far, PyCharm wins hands down.

55:50 Right on. I'm with you there.

55:51 So it has so many of these intelligent features and what have you. For example, I use the database features to browse through my database. I use its refactoring features to change my code relatively safely, and its Docker support gives me all the completion. The list goes on and on and on. And most other IDEs are not nearly as sophisticated.

56:17 I agree. Excellent one. Now, notable PyPI package. I mean, we talked about a bunch. You could recommend any of these we talked about, or you can say something else you found interesting.

56:27 Well, I think I already mentioned Trio, but this is a question, really? Maybe Poetry.

56:34 Okay. Yeah. Poetry. Yeah.

56:35 Poetry is something that I use for my application at work. It's the closest thing in Python to, say, Yarn. So I manage the dependencies and lock down the dependencies using Poetry. It's quite handy for that. There are some issues with Poetry, like when I just need to update one dependency and it updates them all, and small issues like that. But other than that, it's great.

57:01 Yes, it looks really great. I know a lot of people are loving Poetry. It's a good recommendation there. Right. Final call to action. People are interested in AnyIO. How do they get started?

57:11 Well, this is somewhat of a tutorial here. I really don't have a really long tutorial, unlike Trio, so I'm heavily leaning on Trio's documentation here. Because AnyIO has such a similar design to Trio, a lot of Trio's manual can be used to draw parallels to AnyIO. You can almost use Trio's documentation and tutorial to learn how AnyIO works.

57:37 Yeah, it's highly inspired. Right.

57:39 Anything else, then you should just come to GitHub. I think there's a "Getting help" link at the bottom.

57:46 Okay.

57:46 Yeah, so there's a Gitter link and I'm usually available there.

57:50 Great, okay. Yeah, very, very nice. And I'm guessing you accept contributions?

57:55 Sure, yeah.

57:55 So, yeah, let's see over here we've got what is that? 33 contributors. So yeah, excellent. If people want to contribute to the project and maybe that's code or maybe even they could put together a tutorial or something like that if they're interested.

58:09 Maybe.

58:10 Yes, perhaps. Okay, excellent. Well, thank you for all the cool libraries and for taking the time to come share them with us.

58:16 Thanks for having me.

58:16 Yeah, you bet. Bye. Thanks everyone for listening.

58:19 Bye.

58:20 This has been another episode of Talk Python to Me. Thank you to our sponsors. Be sure to check out what they're offering, it really helps support the show. Listen to an episode of Compiler, an original podcast from Red Hat. Compiler unravels industry topics, trends, and things you've always wanted to know about tech through interviews with the people who know it best. Subscribe today by following talkpython.fm/compiler. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

59:16 We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
