WEBVTT

00:00:00.001 --> 00:00:04.500
The relatively recent introduction of async and await as keywords in Python has spawned a whole

00:00:04.500 --> 00:00:08.320
area of high-performance, highly scalable frameworks and supporting libraries.

00:00:08.320 --> 00:00:14.040
One such library that has great async building blocks is OmniLib. On this episode, you'll meet

00:00:14.040 --> 00:00:19.740
John Reese. John is the creator of OmniLib, which includes packages such as aioitertools,

00:00:19.740 --> 00:00:26.880
aiomultiprocess, and aiosqlite. Join us as we async all the things. This is Talk Python To Me,

00:00:26.880 --> 00:00:31.520
episode 304, recorded February 16th, 2021.

00:00:31.520 --> 00:00:49.960
Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the ecosystem,

00:00:49.960 --> 00:00:54.420
and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I'm at

00:00:54.420 --> 00:00:59.800
@mkennedy, and keep up with the show and listen to past episodes at talkpython.fm, and follow the

00:00:59.800 --> 00:01:05.600
show on Twitter via @talkpython. This episode is brought to you by Linode and Talk Python Training.

00:01:05.600 --> 00:01:09.220
Please check out the offers during their segments. It really helps support the show.

00:01:09.220 --> 00:01:11.900
John, welcome to Talk Python To Me.

00:01:11.900 --> 00:01:13.500
Howdy. It's good to be here.

00:01:13.500 --> 00:01:17.920
Yeah, it's great to have you here as well. It's going to be a lot of fun to talk to you about

00:01:17.920 --> 00:01:23.320
async stuff. I think we both share a lot of admiration and love for asyncioing all the

00:01:23.320 --> 00:01:23.720
things.

00:01:23.720 --> 00:01:29.360
I definitely do. It's one of those cases where the things that it enables is so different,

00:01:29.360 --> 00:01:32.920
and you have to think about everything so differently when you're using asyncio,

00:01:32.920 --> 00:01:39.860
that it's a nice challenge, but also has potentially really high payoff if it's done well.

00:01:39.980 --> 00:01:44.860
Yeah, it has a huge payoff. And I think that it's been a little bit of a mixed bag in terms of

00:01:44.860 --> 00:01:48.740
the reception that people have had. I know there have been a couple of folks who've written articles

00:01:48.740 --> 00:01:53.380
like, well, I tried it. It wasn't that great. But there's also, you know, I've had examples where I'm

00:01:53.380 --> 00:01:57.780
doing something like web scraping or actually got a message from somebody who listened. Maybe they were

00:01:57.780 --> 00:02:01.800
listening to Python Bytes, my other podcast. But anyway, I got a message from a listener after we

00:02:01.800 --> 00:02:06.040
covered some cool asyncio things and web scraping. They had to download a bunch of stuff.

00:02:06.300 --> 00:02:10.960
Like it takes, like, a day. Literally, it takes all day or something. It was really crazy. And then

00:02:10.960 --> 00:02:14.660
they said, well, now I'm using async and now my computer runs out of memory and crashes. It's

00:02:14.660 --> 00:02:19.820
getting it so fast. Like that's a large difference right there. Right. So there's certainly a category

00:02:19.820 --> 00:02:21.040
of things where it's amazing.

00:02:21.040 --> 00:02:27.680
Yeah, I think the case we've seen it most useful for is definitely doing those sorts of like concurrent

00:02:27.680 --> 00:02:33.880
web requests. Internally, it's also extraordinarily useful in monitoring situations where it's like you

00:02:33.880 --> 00:02:38.400
want to be able to talk to a whole bunch of servers as fast as possible. And maybe the amount of stuff

00:02:38.400 --> 00:02:42.380
that comes back from it is not as important as being able to just talk to them repeatedly.

00:02:42.380 --> 00:02:42.780
Yeah.

00:02:42.780 --> 00:02:46.920
But you're right. There's definitely a lot of cases where people are not necessarily using

00:02:46.920 --> 00:02:50.940
it correctly or they're hoping to like add a little bit of async into an existing thing.

00:02:50.940 --> 00:02:54.940
And that doesn't always work as well as just building something that's async from the start.

00:02:54.940 --> 00:02:59.220
Yeah. And there's more frameworks these days that are welcoming of async from the start,

00:02:59.280 --> 00:03:02.020
I guess. Yeah. We're going to talk. Yeah. We're going to talk about that. But before we get too

00:03:02.020 --> 00:03:05.680
far down the main topic, let's just start with a little bit of background on you. How'd you get

00:03:05.680 --> 00:03:06.600
into programming in Python?

00:03:06.600 --> 00:03:12.260
Sure. So my first interaction with the computer was when I was, you know, maybe like five or six years

00:03:12.260 --> 00:03:19.920
old. My parents had a TI-99/4A, which is like the knockoff Commodore attached to the television.

00:03:19.920 --> 00:03:25.900
Yeah. And I think back to that, like, how could you have like legible text on the CRT TV?

00:03:26.060 --> 00:03:27.760
It was pretty bad.

00:03:27.760 --> 00:03:28.500
It's bad, right?

00:03:28.500 --> 00:03:34.300
It's like my biggest memory of it is really just every time we would try to play a game and the

00:03:34.300 --> 00:03:39.040
cartridge or tape or whatever wouldn't work correctly, it would just dump you at a BASIC

00:03:39.040 --> 00:03:44.000
prompt where it's just expecting you to start typing some programming in. And like nobody in

00:03:44.000 --> 00:03:48.320
my family had a manual or knew anything about programming at the time. There was like, I think

00:03:48.320 --> 00:03:52.580
maybe we figured out that you could like print something to the screen, but nothing beyond that.

00:03:52.720 --> 00:03:58.460
And it wasn't until we ended up getting a DOS computer, you know, a few years later that we really

00:03:58.460 --> 00:04:04.140
started to actually do some quote unquote real programming where we were writing like batch

00:04:04.140 --> 00:04:11.480
scripts to do menus or like, you know, deciding what program to run, or things like an AUTOEXEC.BAT on a

00:04:11.480 --> 00:04:13.860
floppy disk in order to boot into a game.

00:04:13.960 --> 00:04:17.980
I was just thinking of all the AUTOEXEC.BAT stuff that we had to do. Like, oh, you want

00:04:17.980 --> 00:04:21.740
to play Doom, but you don't have enough high memory, whatever that was. And so

00:04:21.740 --> 00:04:25.840
you've got to rearrange where the drivers are. I mean, what a weird way. I just want

00:04:25.840 --> 00:04:28.060
to play a game. So I've got to rework where my drivers are.

00:04:28.060 --> 00:04:33.420
Make sure you don't load your mouse driver when you're booting into this one game that doesn't

00:04:33.420 --> 00:04:36.500
need the mouse because otherwise you run out of memory. Yeah, it was kind of crazy.

00:04:36.740 --> 00:04:41.900
And my biggest memory of programming there was that there was QBasic on it, and it came with

00:04:41.900 --> 00:04:47.180
this gorilla game where you just like throw bananas at another gorilla from like some sort

00:04:47.180 --> 00:04:48.500
of like city skyline.

00:04:48.500 --> 00:04:52.020
Like a King Kong knockoff, Donkey Kong knockoff type thing.

00:04:52.020 --> 00:04:57.580
Yeah, exactly. And I would struggle to figure out how that was actually doing anything.

00:04:57.580 --> 00:05:01.780
I'd try to poke at it and figure it out as I went. Didn't really do that much,

00:05:01.780 --> 00:05:06.980
but it was actually my first opportunity for quote unquote open source projects because

00:05:06.980 --> 00:05:10.280
there's a video game that I really, really liked called NASCAR racing.

00:05:10.280 --> 00:05:16.020
And one of the things that I learned, on the burgeoning part of the internet for

00:05:16.020 --> 00:05:21.020
me at least, was that people would host these mods for the game on, like, GeoCities or whatever.

00:05:21.020 --> 00:05:26.300
And so these would change like the models for the cars or the wheels or add tracks or

00:05:26.300 --> 00:05:30.940
textures or whatever. And I actually wrote a batch script that would let you like at the

00:05:30.940 --> 00:05:34.620
time that you wanted to play the game, pick which of the ones you had enabled because

00:05:34.620 --> 00:05:37.140
you couldn't have them all enabled. So it would like, right.

00:05:37.140 --> 00:05:41.220
It's basically just a batch script that would go and like copy a bunch of files around from

00:05:41.220 --> 00:05:45.520
one place to another. And then when you're done with the menus or whatever, then it would

00:05:45.520 --> 00:05:51.220
launch the game. And I remember posting that on GeoCities and, you know, having the silly

00:05:51.220 --> 00:05:56.500
little like JavaScript counter or whatever it was, tick up to like a couple hundred page

00:05:56.500 --> 00:06:01.280
views of people downloading just this script to switch mods in and out. And so that was

00:06:01.280 --> 00:06:05.700
like the first real taste of like open source programming or open source projects that

00:06:05.700 --> 00:06:10.660
I had, but that actually like led into the way that I really learned programming, which was

00:06:10.660 --> 00:06:14.560
I wanted to have my own website that was more dynamic than what GeoCities had.

00:06:14.560 --> 00:06:22.320
And so I ended up basically picking up Perl and eventually PHP to write web pages that I hosted

00:06:22.320 --> 00:06:26.960
on my own machine at home from like IIS and active.

00:06:26.960 --> 00:06:31.620
And how did you get a, what did you do? Did you use, like, DynDNS or something like that?

00:06:31.620 --> 00:06:37.420
Yes, exactly. DynDNS. It was the jankiest setup, but it at least worked and I could impress

00:06:37.420 --> 00:06:43.560
my friends and it wasn't until I got to college and I was working on my first internship where

00:06:43.560 --> 00:06:49.200
the main project I was working on was essentially improving an open source bug tracker written

00:06:49.200 --> 00:06:55.320
in PHP in order to make it do the things that my company wanted to be able to do in it. So

00:06:55.320 --> 00:07:00.820
like adding a plugin system and things like that. And in the process of that, I eventually

00:07:00.820 --> 00:07:05.900
became a maintainer of the project and they had a bunch of Python scripts for managing releases,

00:07:05.900 --> 00:07:10.780
like doing things like creating the release tarballs, running other sort of like linter type

00:07:10.780 --> 00:07:16.540
things over the code base. And that was my very first taste of Python. And I hated it because it

00:07:16.540 --> 00:07:21.660
was just like, I couldn't get past the concept of like, you're forcing me to do white space.

00:07:21.660 --> 00:07:27.380
Like how barbaric is this? But it actually didn't take long before I realized that that actually makes

00:07:27.380 --> 00:07:32.460
the code more readable. It's like, you can literally pick up anybody else's Python script and it looks

00:07:32.460 --> 00:07:34.760
almost exactly like how you would have done it yourself.

00:07:34.760 --> 00:07:39.880
Yeah. And you've got a lot of the PEP 8 rules and tools that automatically re, you know,

00:07:39.880 --> 00:07:45.260
format stuff into that. So it's very likely, you know, you've got black and PyCharm's reformat

00:07:45.260 --> 00:07:46.240
and whatnot, right?

00:07:46.240 --> 00:07:50.440
This was all before that. So I think this was when like Python 2.6 was the latest.

00:07:50.440 --> 00:07:52.080
This was quite a while ago.

00:07:52.080 --> 00:07:53.780
Right before the big divergence.

00:07:53.780 --> 00:08:00.080
Yeah, yeah, exactly. Like I had no idea what Python 3 was until like 3.2 or 3.3 came out because

00:08:00.080 --> 00:08:05.060
it was just sequestered in this world of writing scripts for whatever version of Python was on

00:08:05.060 --> 00:08:06.520
my Linux box at the time.

00:08:06.520 --> 00:08:10.960
Right. You know, I suspect in the early days, probably the editors were not as friendly or

00:08:10.960 --> 00:08:15.420
accommodating, right? Like now, if you work with PyCharm or VS Code or something, you just write

00:08:15.420 --> 00:08:20.460
code and it automatically does the formatting and the juggling and whatnot. And once you get used to it,

00:08:20.460 --> 00:08:23.740
you don't really think that much about it. It just magically happens as you work on code.

00:08:23.740 --> 00:08:29.160
I want to say at the time I was just doing something stupid like Notepad++ or,

00:08:29.160 --> 00:08:32.740
you know, one of the other like really generic free text editors.

00:08:32.740 --> 00:08:34.500
Like Notepad, but with a Consolas font.

00:08:34.500 --> 00:08:36.520
Or it was Eclipse. It might have been Eclipse.

00:08:36.520 --> 00:08:38.860
Yeah. Was it maybe PyDev?

00:08:38.860 --> 00:08:42.860
I don't think I ever used a Python-specific editor.

00:08:42.860 --> 00:08:48.680
Like, yeah, I think I've tried PyCharm exactly once and I do just enough stuff that's not Python

00:08:48.680 --> 00:08:53.360
that I don't want to deal with an IDE or editor that's not generalized.

00:08:53.720 --> 00:08:54.760
Right. Sure. Makes sense.

00:08:54.760 --> 00:08:59.060
Speaking of stuff you work on, what do you do day to day? What kind of stuff do you do?

00:08:59.060 --> 00:09:06.080
I'm a production engineer at Facebook on our internal Python foundation team. And so most of

00:09:06.080 --> 00:09:12.140
what I do there is, you know, building infrastructure or developer tools, primarily enabling engineers,

00:09:12.140 --> 00:09:21.120
data scientists, and AI or ML researchers to do what they do in Python every day. So some of that is like

00:09:21.120 --> 00:09:27.620
building out the system that allows us to integrate open source third party packages into the Facebook

00:09:27.620 --> 00:09:33.680
repository. Some of that is literally developing new open source tools for developers to use.

00:09:33.680 --> 00:09:51.840
A while back, I built a tool called Bowler that is basically a refactoring tool for Python. It's based off of lib2to3, which is in, you know, open source Python. It essentially gives you a way to make safe code modifications rather than using regular expressions, which are terrible.

00:09:52.020 --> 00:09:56.900
Yeah, for sure. And based on like the AST or something like that. Yeah, exactly. Yeah. Okay.

00:09:56.900 --> 00:10:18.020
And it's like, the benefit of LibCST is that it takes in the concrete syntax tree. So it keeps track of all the whitespace, comments, and everything else, so that if you modify the tree, it will then allow you to write that back out exactly the way the file came in. Whereas the AST module would have thrown all of that, you know, metadata away.
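The AST-versus-CST point can be seen with only the standard library: a round trip through the `ast` module discards the "metadata" being described. A minimal sketch (the source string is made up for illustration):

```python
import ast

# Source with a comment, extra blank lines, and odd spacing.
src = "x = 1  # important comment\n\n\ny   =   2\n"

# Parse to an abstract syntax tree and print it back out.
tree = ast.parse(src)
round_tripped = ast.unparse(tree)
print(round_tripped)
# The comment, blank lines, and spacing are gone: the AST never stored them.
# A concrete syntax tree (as in LibCST) would have preserved all of it.
```

This is why tools that rewrite code in place tend to build on a CST rather than `ast`.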

00:10:18.280 --> 00:10:20.680
Right, right. Formatting spaces, whatever. It doesn't care.

00:10:20.680 --> 00:10:45.520
Yeah. And one of the newer projects I've worked on is called µsort, as in "micro sort." Essentially, it's a replacement that we're using internally for isort, because isort has some potentially destructive behaviors in its default configuration. And our goal was essentially to get import sorting done in a way that does not require adding comment directives all over the place.

00:10:45.520 --> 00:11:15.500
Right, right, right, right.

00:11:15.500 --> 00:11:31.300
And you can't just put a skip directive on that function call, because that just means isort won't try to sort that one, but it'll sort everything else around it as well. And so what we ended up seeing was a lot of developers doing things like isort skip_file, and just turning off import sorting altogether.

00:11:31.660 --> 00:11:56.600
One of the guiding principles of µsort is, like, first, do no harm. It's trying its best to make sure that these common use cases are just treated normally and correctly from the start. In most cases, it's a much safer version of isort. It's not complete, it's not a 100% replacement, but it's the thing we've been using internally. And it's one of the cases where I'm proud of the way that we are helping to build better tools for the ecosystem.

00:11:56.600 --> 00:12:10.740
Yeah, this is really, I never really thought about that problem. One thing that does drive me crazy is sometimes I'll need to change the Python path so that future imports behave the same regardless of your working directory, if you don't have a package or something like that, right? Something simple.

00:12:11.000 --> 00:12:14.060
That's super common in the AI and ML type of workflows.

00:12:14.060 --> 00:12:25.340
Yeah, and I get all these warnings, like, you should not have code that is before an import. Like, well, but this one is about making the import work. If I don't put this, it's going to crash for some people if they run it weirdly and stuff like that, right?

00:12:25.340 --> 00:12:25.860
Yeah.

00:12:25.860 --> 00:12:30.500
Interesting. Yeah, very, very cool project. Nice. All right. So let's dive into async, huh?

00:12:30.500 --> 00:12:30.760
Sure.

00:12:30.760 --> 00:12:49.240
Yeah. So maybe a little bit of history. You know, it's hard to talk about asynchronous programming in Python without touching on the GIL, the global interpreter lock, normally spoken as a bad word. But it's not necessarily bad. It has a purpose. It's just that its purpose is somewhat counter to making asynchronous code run really quick and in parallel.

00:12:49.240 --> 00:13:08.940
I mean, it's one of those things where if you imagined what Python would be without the global interpreter lock, you end up having to do a lot more work to make sure that, let's say, if you had multi-threaded stuff going on, you'd have to do a lot more work to make sure that they're not clobbering some shared data. Like, you look at the way that you have to have synchronizations and everything else in Java or C++.

00:13:09.340 --> 00:13:26.760
We don't generally need that in Python because the GIL prevents a lot of that bad behavior. And the current efforts to kind of remove the GIL that have been ongoing for the past eight to 10 years, in every single case, once you remove that GIL and add a whole bunch of other locks, the whole system is actually slower.

00:13:26.760 --> 00:13:34.240
So this is one of those things where it's like, it does cause problems, but it also enables Python to be a lot faster than it would be otherwise.

00:13:34.240 --> 00:13:35.240
And probably simpler.

00:13:35.240 --> 00:13:35.600
Yeah.

00:13:35.780 --> 00:13:56.400
Yeah. So the global interpreter lock, when I first heard about it, I thought of it as a threading thing, and it sort of is, but, you know, it primarily says: let's create a system so that we don't have to take locks as we increment and decrement the ref count on objects. So basically all the memory management can happen without the overhead of taking a lock, releasing a lock, all that kind of weirdness.
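The reference counts in question are observable from Python. A tiny sketch of the bookkeeping the GIL protects:

```python
import sys

obj = object()
before = sys.getrefcount(obj)  # includes the temporary reference made by this call

alias = obj                    # binding a second name increments the ref count
after = sys.getrefcount(obj)

# Every one of these increments/decrements would need its own lock
# (or atomic operation) in a free-threaded CPython; the GIL makes
# them safe without that per-object overhead.
print(after - before)  # 1
```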

00:13:56.400 --> 00:13:56.600
Yeah.

00:13:56.600 --> 00:14:05.720
So we've got, like, a bunch of early attempts. I mean, threading and multiprocessing have been around for a while. There's gevent, Tornado. But then around, I guess, was it

00:14:05.720 --> 00:14:16.420
Python 3.4? We got asyncio, which is a little bit of a different flavor than, you know, like the computational threading or the computational multiprocessing side of async.

00:14:16.420 --> 00:14:37.260
It's actually an interesting kind of throwback to the way that computing happened in like the 80s and early 90s, where like Windows 3.1 or classic macOS, essentially you can, you know, run your program or your process and you actually have to cooperatively give up control of the CPU in order for another program to work.

00:14:37.260 --> 00:14:47.820
So there'd be a lot of cases where, like, if you had a bad behaving program, you'd end up not being able to do multitasking in, you know, these old operating systems because it was all cooperative.

00:14:47.820 --> 00:14:57.660
In the case of asyncio, it's essentially taking that mechanism where you don't need to do a lot of context switching in threads or in processes.

00:14:57.660 --> 00:15:08.460
And you're essentially letting a bunch of functions cooperatively coexist and essentially say when your function gets to a point where it's doing a network request and it's waiting on that network request,

00:15:08.460 --> 00:15:20.020
your function then will nicely hand over control back to the asyncio framework, at which point the framework and event loop can go find the next task to work on that's not blocked on something.
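The cooperative hand-over being described is the `await` keyword. A minimal sketch, with `asyncio.sleep` standing in for a network request (all names here are illustrative):

```python
import asyncio
import time

async def fetch(name: str, delay: float) -> str:
    # Awaiting hands control back to the event loop, which can then
    # run whichever other task is not blocked on something.
    await asyncio.sleep(delay)
    return name

async def main():
    start = time.perf_counter()
    # Three "requests" interleave cooperatively on a single thread.
    results = await asyncio.gather(
        fetch("a", 0.2), fetch("b", 0.2), fetch("c", 0.2)
    )
    elapsed = time.perf_counter() - start
    # Roughly 0.2s total rather than 0.6s, because the waits overlap.
    return results, elapsed

print(asyncio.run(main()))
```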

00:15:20.020 --> 00:15:24.740
Yeah. And it very often doesn't involve threads at all, or, you know, just the one main thread, right?

00:15:24.740 --> 00:15:27.760
Like, so, yeah, you know, it's not a way to create threading.

00:15:27.760 --> 00:15:31.120
It's a way to allow stuff to happen while you're otherwise waiting.

00:15:31.120 --> 00:15:34.000
Yeah. In the best case, you only ever have the one thread.

00:15:34.060 --> 00:15:41.720
And now in reality, it doesn't work like that because a lot of our, you know, modern computing infrastructure is not built in an async way.

00:15:41.720 --> 00:15:47.940
So like if you look at file access, there's basically no real way to do that asynchronously without threads.

00:15:47.940 --> 00:15:55.540
But in the best case, like network requests and so forth, if you have the appropriate hooks from the operating system, then that can all be completely in one thread.
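For the file-access case just mentioned, the usual workaround is to push the blocking call onto a worker thread via `run_in_executor`, which is roughly what async file libraries do under the hood. A small sketch using a temporary file:

```python
import asyncio
import tempfile

def read_file(path: str) -> str:
    # Ordinary blocking file I/O: the stdlib has no true async file API.
    with open(path) as f:
        return f.read()

async def read_async(path: str) -> str:
    loop = asyncio.get_running_loop()
    # Hand the blocking call to the default thread pool so the event
    # loop stays free to service other tasks in the meantime.
    return await loop.run_in_executor(None, read_file, path)

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello")
    path = f.name

print(asyncio.run(read_async(path)))
```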

00:15:55.540 --> 00:16:14.800
And that means you have a lot less overhead from the actual runtime and process from the operating system because you're not having to constantly throw a whole bunch of memory onto a stack and then pull off memory from another stack and try to figure out where you were when something interrupted you in the middle of 50 different operations.

00:16:16.800 --> 00:16:19.300
This portion of Talk Python To Me is sponsored by Linode.

00:16:19.300 --> 00:16:23.660
Simplify your infrastructure and cut your cloud bills in half with Linode's Linux virtual machines.

00:16:23.660 --> 00:16:27.720
Develop, deploy, and scale your modern applications faster and easier.

00:16:27.720 --> 00:16:35.120
Whether you're developing a personal project or managing large workloads, you deserve simple, affordable, and accessible cloud computing solutions.

00:16:35.120 --> 00:16:39.380
As listeners of Talk Python To Me, you'll get a $100 free credit.

00:16:39.380 --> 00:16:43.520
You can find all the details at talkpython.fm/Linode.

00:16:43.520 --> 00:16:49.240
Linode has data centers around the world with the same simple and consistent pricing regardless of location.

00:16:49.240 --> 00:16:52.100
Just choose the data center that's nearest to your users.

00:16:52.100 --> 00:16:58.780
You'll also receive 24-7, 365 human support with no tiers or handoffs regardless of your plan size.

00:16:58.780 --> 00:17:09.020
You can choose shared and dedicated compute instances, or you can use your $100 in credit on S3-compatible object storage, managed Kubernetes clusters, and more.

00:17:09.020 --> 00:17:11.500
If it runs on Linux, it runs on Linode.

00:17:11.500 --> 00:17:18.880
Visit talkpython.fm/Linode or click the link in your show notes, then click that create free account button to get started.

00:17:18.880 --> 00:17:27.920
Right, if it starts swapping out the memory it's touching, it might swap out what's in the L1, L2, L3 caches.

00:17:27.920 --> 00:17:28.760
Yeah, exactly.

00:17:28.760 --> 00:17:33.920
It can have huge performance impacts, and it's just constantly cycling back and forth out of control a lot of times, right?

00:17:34.080 --> 00:17:34.320
Yeah.

00:17:34.320 --> 00:17:39.760
In a lot of our testing internally when I was working on things that would talk to lots and lots of servers,

00:17:39.760 --> 00:17:48.140
it's like we would hit a point where somewhere between 64 and 128 threads would actually start to see less performance overall

00:17:48.140 --> 00:17:53.440
because it just spends all of its time trying to context switch between all of these threads.

00:17:53.440 --> 00:17:54.280
Right, right, right.

00:17:54.400 --> 00:18:01.680
You're interrupting these threads at an arbitrary point in time because the runtime is trying to make sure that all of the threads are serviced equally.

00:18:01.680 --> 00:18:06.600
But in reality, like half of these threads don't need to be given the context right now.

00:18:06.600 --> 00:18:14.860
So by doing those sorts of interrupts and context switches when the runtime wants to, rather than when the functions or requests want to,

00:18:14.860 --> 00:18:16.940
you end up with a lot of suboptimal behavior.

00:18:17.180 --> 00:18:17.740
Yeah, interesting.

00:18:17.740 --> 00:18:26.260
Yeah, and also things like locks, mutexes, and stuff don't work in this world because it's about what thread has access.

00:18:26.260 --> 00:18:27.620
Well, all the codes on one thread.

00:18:27.620 --> 00:18:33.880
So to me, the real zen of AsyncIO, at least for many really solid use cases, kind of like we touched on,

00:18:33.880 --> 00:18:36.260
is it's all about scaling when you're waiting.

00:18:36.260 --> 00:18:36.860
Yes.

00:18:36.860 --> 00:18:40.360
If I'm waiting on something else, it's like completely free to just go to it.

00:18:40.360 --> 00:18:48.040
If I'm calling microservices, external APIs, if I'm downloading something or uploading a file or talking to a database,

00:18:48.040 --> 00:18:51.620
or even maybe accessing a file with something like AIO files.

00:18:51.620 --> 00:18:52.020
Yeah.

00:18:52.020 --> 00:18:52.420
Yeah.

00:18:52.420 --> 00:18:57.960
Yeah, there's a cool place called Awesome asyncio by Timo Furrer.

00:18:57.960 --> 00:18:58.940
It's pretty cool.

00:18:58.940 --> 00:18:59.600
Have you seen this place?

00:18:59.600 --> 00:19:01.260
I have looked at it in the past.

00:19:01.260 --> 00:19:04.700
I end up spending so much time looking at and building things.

00:19:04.840 --> 00:19:09.160
It's like I haven't actually gotten a lot of opportunity to use a bunch of these.

00:19:09.160 --> 00:19:13.620
Most of my time, I'm actually not working high enough on the stack to make use of them.

00:19:13.620 --> 00:19:14.300
Right, right, right.

00:19:14.300 --> 00:19:15.960
A lot of these are more like frameworks.

00:19:15.960 --> 00:19:19.440
There are some other neat things in there as well, like AsyncSSH.

00:19:19.440 --> 00:19:20.380
I hadn't heard of that one.

00:19:20.380 --> 00:19:21.720
But anyway, I'll put that in the show notes.

00:19:21.720 --> 00:19:27.880
That's got, I don't know, 50, 60 libraries and packages for solving different problems with AsyncIO, which is pretty cool.

00:19:27.880 --> 00:19:28.120
Yeah.

00:19:28.120 --> 00:19:32.880
Whenever I talk about AsyncIO, one of the things I love to give a shout out to is this thing called Unsync.

00:19:32.880 --> 00:19:34.080
Have you heard of Unsync?

00:19:34.260 --> 00:19:36.580
I had not heard about it until I looked at the show notes.

00:19:36.580 --> 00:19:42.340
But it sounds a lot like some of the things that I've seen people implement in a lot of different

00:19:42.340 --> 00:19:50.480
It fills a very common sort of use case, like I was saying earlier, where people want to mix AsyncIO into an existing synchronous application.

00:19:50.540 --> 00:19:54.660
You do have to be very careful about how you do that, especially vice versa.

00:19:54.660 --> 00:20:08.340
Or a lot of the stumbling blocks we've seen tend to be cases where you have synchronous code that calls some Async code that then wants to call some synchronous code, but on like another thread so that it's not blocked by it.

00:20:08.580 --> 00:20:14.100
And you actually end up getting this like in-out sort of thing where you have like nested layers of AsyncIO.

00:20:15.060 --> 00:20:18.080
I'm not sure how much this may or may not solve that.

00:20:18.080 --> 00:20:21.060
I think this actually helps some with that as well.

00:20:21.060 --> 00:20:24.740
Basically, the idea is there's two main things that it solves that I think is really neat.

00:20:24.740 --> 00:20:30.320
One, it's like a unifying layer across multiprocessing, multithreading, and straight AsyncIO.

00:20:30.320 --> 00:20:30.700
Right?

00:20:30.760 --> 00:20:32.820
So you put a decorator onto a function.

00:20:32.820 --> 00:20:36.180
If the function is an Async function, it runs it on AsyncIO.

00:20:36.180 --> 00:20:38.300
If it's a regular function, it runs it on a thread.

00:20:38.300 --> 00:20:43.140
And if you say it's a regular function, but it's computational, it'll run it on multiprocessing.

00:20:43.140 --> 00:20:46.700
But it gives you basically an AsyncIO, Async and await API for it.

00:20:46.700 --> 00:20:48.540
And it figures out how to run the loop and all.
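The routing idea described above can be sketched with only the standard library. This is a toy simplification, not Unsync's actual code (Unsync, for instance, keeps one dedicated loop thread rather than calling `asyncio.run` per task):

```python
import asyncio
import functools
from concurrent.futures import Future, ThreadPoolExecutor

_pool = ThreadPoolExecutor()

def task(fn):
    """Decorator: async functions run on an event loop in a worker
    thread, plain functions run directly on a worker thread; either
    way the caller gets back a concurrent.futures.Future."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs) -> Future:
        if asyncio.iscoroutinefunction(fn):
            fut: Future = Future()
            def runner():
                try:
                    fut.set_result(asyncio.run(fn(*args, **kwargs)))
                except BaseException as exc:
                    fut.set_exception(exc)
            _pool.submit(runner)
            return fut
        return _pool.submit(fn, *args, **kwargs)
    return wrapper

@task
async def double(x):
    await asyncio.sleep(0.01)  # pretend I/O wait
    return 2 * x

@task
def triple(x):
    return 3 * x

# Same calling convention for both flavors of function.
print(double(21).result(), triple(7).result())
```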

00:20:48.540 --> 00:20:49.360
Anyway, it's pretty cool.

00:20:49.360 --> 00:20:53.140
Not what we're here to talk about, but it's definitely worth checking out while we're on the subject.

00:20:53.140 --> 00:20:59.320
Ultimately, it gives you just a future that you can then either await or ask for the result from, right?

00:20:59.320 --> 00:20:59.960
Yeah, exactly.

00:20:59.960 --> 00:21:00.580
Exactly.

00:21:01.060 --> 00:21:04.900
And the result, instead of saying, you've got to wait until it's finished before you can get the result,

00:21:04.900 --> 00:21:06.100
you just go, give me the result.

00:21:06.100 --> 00:21:07.500
And if it needs to, it'll just block.

00:21:07.500 --> 00:21:10.500
So it's a nice way to sort of cap the AsyncIO.

00:21:10.500 --> 00:21:14.740
You know, like one of the challenges of AsyncIO is, well, five levels down the call stack,

00:21:14.740 --> 00:21:15.860
this thing wants to be Async.

00:21:15.860 --> 00:21:17.340
So the next thing's Async.

00:21:17.340 --> 00:21:18.540
So the next thing up is Async.

00:21:18.540 --> 00:21:20.640
And like all of a sudden, everything's Async, right?

00:21:20.640 --> 00:21:22.400
And so it was something like this.

00:21:22.400 --> 00:21:23.800
I mean, you could do it yourself as well.

00:21:23.800 --> 00:21:26.240
You can like just go create an event loop, run it.

00:21:26.240 --> 00:21:29.100
And at this level, we're not going to be Async above it.

00:21:29.140 --> 00:21:31.580
But we're coordinating stuff below using AsyncIO.

00:21:31.580 --> 00:21:33.240
And here's where it stops.
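That "here's where it stops" pattern is just `asyncio.run` at a chosen level of the call stack. A minimal sketch, with the fetch function faked for illustration:

```python
import asyncio

async def fetch(url: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a real network request
    return "body of " + url

async def fetch_all(urls):
    # Coordinate the concurrent work down here with asyncio.
    return await asyncio.gather(*(fetch(u) for u in urls))

def sync_entry_point():
    # The async-ness stops here: everything above this function is
    # plain synchronous code that never sees a coroutine.
    return asyncio.run(fetch_all(["a", "b"]))

print(sync_entry_point())
```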

00:21:33.240 --> 00:21:33.680
Yeah.

00:21:33.680 --> 00:21:38.960
It sounds like a nicer version of what I see dozens of when you have lots and lots of engineers

00:21:38.960 --> 00:21:42.200
that aren't actually working on the same code base together, but they're all in the same

00:21:42.200 --> 00:21:42.700
repository.

00:21:42.880 --> 00:21:46.780
And we end up seeing these cases where everybody has solved the same use case.

00:21:46.780 --> 00:21:48.500
So I do think this would be useful.

00:21:48.500 --> 00:21:51.020
And I'm actually planning on sharing it with more people.

00:21:51.020 --> 00:21:51.280
Yeah.

00:21:51.280 --> 00:21:51.460
Yeah.

00:21:51.460 --> 00:21:51.800
Check it out.

00:21:51.800 --> 00:21:56.120
It's, like, in total, I think, 126 lines of Python in one file.

00:21:56.120 --> 00:21:58.040
But it's really cool, this unifying API.

00:21:58.040 --> 00:21:58.640
All right.

00:21:58.640 --> 00:22:01.760
I guess that probably brings us to OmniLib.

00:22:01.760 --> 00:22:03.080
You want to talk about that for a little bit?

00:22:03.080 --> 00:22:07.080
So this is what I thought it would be fun to have you on the show to really focus on: like

00:22:07.080 --> 00:22:12.400
AsyncIO, but then also you've created this thing called OmniLib, the OmniLib project that

00:22:12.400 --> 00:22:15.700
solves four different problems with AsyncIO.

00:22:15.700 --> 00:22:18.080
And obviously you can combine them together, I would expect.

00:22:18.080 --> 00:22:24.820
The origins of this really is like, AIO SQLite was the first thing that

00:22:24.820 --> 00:22:26.660
I wrote that was an Async framework.

00:22:26.660 --> 00:22:28.580
And then I'd built a couple more.

00:22:28.580 --> 00:22:32.380
And at one point I realized these projects are actually getting really popular and people

00:22:32.380 --> 00:22:38.140
are using them, but they're just like one of the hundred things that are on my GitHub profile

00:22:38.140 --> 00:22:38.780
and graveyard.

00:22:38.780 --> 00:22:43.960
So I really felt like they needed to have their own separate place for like, these are the

00:22:43.960 --> 00:22:45.800
projects that I'm actually proud of.

00:22:45.800 --> 00:22:50.880
I thought that was actually a good opportunity to be able to make a dedicated like project

00:22:50.880 --> 00:22:53.340
or organization for it.

00:22:53.340 --> 00:22:59.680
And essentially say that everything under this I'm guaranteeing is going to be developed under

00:22:59.680 --> 00:23:05.380
a very inclusive code of conduct that I personally believe in and want to try and also at the same

00:23:05.380 --> 00:23:07.540
time make it more welcoming and supportive.

00:23:07.540 --> 00:23:14.240
You know, other contributors, especially newcomers or otherwise marginalized developers in

00:23:14.240 --> 00:23:17.180
the ecosystem and try to be as friendly as possible with it.

00:23:17.180 --> 00:23:21.880
And it's like, this is something that I tried to do beforehand and I just never really formalized

00:23:21.880 --> 00:23:26.620
it on any of my projects other than like, here's a code of conduct file in the repository.

00:23:26.620 --> 00:23:27.180
Yeah.

00:23:27.180 --> 00:23:27.240
Yeah.

00:23:27.240 --> 00:23:32.240
But this is like really one of the first times where I wanted to put all these together and

00:23:32.240 --> 00:23:36.520
make sure that these are really like, whether or not enough people

00:23:36.520 --> 00:23:37.880
make it a community.

00:23:37.880 --> 00:23:40.620
I want it to be welcoming from the outset.

00:23:40.620 --> 00:23:40.940
Right.

00:23:40.940 --> 00:23:41.500
That's really cool.

00:23:41.500 --> 00:23:45.960
And you created your own special GitHub organization that you're putting it all under and stuff like

00:23:45.960 --> 00:23:46.160
that.

00:23:46.160 --> 00:23:49.540
So it's kind of the things that are graduated from your personal project.

00:23:49.540 --> 00:23:50.220
Is that the story?

00:23:50.220 --> 00:23:50.560
Yeah.

00:23:50.660 --> 00:23:56.420
And kind of the threshold I tried to follow is like, if this is worth making a Sphinx documentation

00:23:56.420 --> 00:23:59.840
site for, then it's worth putting on, you know, OmniLib project.

00:23:59.840 --> 00:24:01.960
So they're not all asyncio.

00:24:01.960 --> 00:24:06.940
That just happens to be where a lot of my interests and utility stand.

00:24:06.940 --> 00:24:10.860
So that's what most of them are, or at least the most popular ones.

00:24:10.860 --> 00:24:15.520
But there are other projects that I also have on the back burner that will probably end up

00:24:15.520 --> 00:24:20.720
there that are maybe not as useful as libraries or whatever, but either way, like I said, these

00:24:20.720 --> 00:24:21.900
are the ones that I'm at least proud of.

00:24:21.900 --> 00:24:22.180
Nice.

00:24:22.180 --> 00:24:22.500
That's cool.

00:24:22.500 --> 00:24:27.300
So you talked about being there to support people who are getting into open source and

00:24:27.300 --> 00:24:29.400
whatnot and having that code of conduct.

00:24:29.400 --> 00:24:32.580
So other than that, is there like a mission behind this?

00:24:32.580 --> 00:24:37.700
Like I want to make this category of tools or solve these types of problems, or is it just

00:24:37.700 --> 00:24:39.540
these are the things that you've graduated?

00:24:39.540 --> 00:24:41.080
It's something I've tried to think about.

00:24:41.080 --> 00:24:42.940
I'm not 100% certain.

00:24:42.940 --> 00:24:48.120
I would like it to have maybe more of a mission, but at the same time, it's like, especially

00:24:48.120 --> 00:24:51.820
from things I've had to deal with at work, it's like, I don't want this to be a dumping

00:24:51.820 --> 00:24:52.840
ground of stuff either.

00:24:52.840 --> 00:24:57.380
Like I want this specifically, as in the opening statement, to be a group

00:24:57.380 --> 00:25:00.820
of high quality projects that are, you know, following this code of conduct.

00:25:00.820 --> 00:25:06.200
So from that perspective, it's like, at the moment, it's like my personal interests are always

00:25:06.200 --> 00:25:11.340
in building things where I find gaps in, you know, availability from other libraries.

00:25:11.540 --> 00:25:16.720
So that's probably the closest to a mission of what belongs here is just things that haven't

00:25:16.720 --> 00:25:17.400
been made yet.

00:25:17.400 --> 00:25:17.700
Yeah.

00:25:17.700 --> 00:25:18.120
Yeah.

00:25:18.120 --> 00:25:18.340
Yeah.

00:25:18.340 --> 00:25:22.780
But either way, I just want to have that, you know, dedication to the statement of like,

00:25:22.780 --> 00:25:24.220
I want these to be high quality.

00:25:24.220 --> 00:25:25.260
I want them to be tested.

00:25:25.260 --> 00:25:30.780
I want them to be, you know, have continuous integration and testing and well-documented

00:25:30.780 --> 00:25:31.620
and so forth.

00:25:31.620 --> 00:25:31.860
Yeah.

00:25:31.860 --> 00:25:32.360
Super cool.

00:25:32.360 --> 00:25:32.820
All right.

00:25:32.840 --> 00:25:36.960
So there's four main projects here on the homepage.

00:25:36.960 --> 00:25:39.540
I mean, you do have the attribution one, but that's...

00:25:39.540 --> 00:25:40.140
Like helper tool.

00:25:40.140 --> 00:25:41.020
Exactly.

00:25:41.020 --> 00:25:45.520
Let's talk about the things that maybe they're the AIO extension of.

00:25:45.860 --> 00:25:48.660
So in Python, we have iter tools, right?

00:25:48.660 --> 00:25:55.080
Which is like tools for easily creating generators and such out of collections and whatnot.

00:25:55.080 --> 00:25:57.680
So you have AIO iter tools, which is awesome.

00:25:57.680 --> 00:26:00.780
And then we have multiprocessing, which is a way around the GIL.

00:26:00.780 --> 00:26:04.380
It's like, here's a function and some data, go run that in a sub process and then give me

00:26:04.380 --> 00:26:04.920
the answer.

00:26:04.920 --> 00:26:08.900
And because it's a sub process, it has its own sub GIL or its own separate GIL.

00:26:08.900 --> 00:26:09.520
So it's all good.

00:26:09.520 --> 00:26:11.960
So you have AIO multiprocess, which is cool.

00:26:11.960 --> 00:26:18.080
And then one of the most widely used databases is SQLite, already built into Python, which

00:26:18.080 --> 00:26:18.720
is super cool.

00:26:18.720 --> 00:26:20.940
And so you have AIO SQLite.

00:26:20.940 --> 00:26:25.720
And then sort of extending that, that's like a raw SQLite, you know, raw SQL library that's

00:26:25.720 --> 00:26:26.300
asyncio.

00:26:26.300 --> 00:26:29.840
Then you have AQL, which is more ORM-like.

00:26:29.840 --> 00:26:31.460
I'm not sure it's 100% ORM.

00:26:31.460 --> 00:26:34.400
You could categorize it for us, but it's like an ORM.

00:26:34.400 --> 00:26:39.400
Yeah, I've definitely used like in quotes, in scare quotes, ORM-like.

00:26:39.820 --> 00:26:46.100
Because I want it to be able to essentially be a combination of like well-typed table definitions

00:26:46.100 --> 00:26:50.960
that you can then use to generate queries against the database.

00:26:50.960 --> 00:26:57.680
As of right now, it's more like writing a, like a DSL that lets you write a backend agnostic

00:26:57.680 --> 00:26:58.960
SQL statement.

00:26:58.960 --> 00:26:59.360
Right.

00:26:59.360 --> 00:26:59.720
Okay.

00:26:59.720 --> 00:27:00.080
Yeah.

00:27:00.080 --> 00:27:02.940
DSL, domain-specific language, for people who aren't entirely sure.

00:27:02.940 --> 00:27:03.080
Yeah.

00:27:03.080 --> 00:27:07.940
So really it's essentially just stringing together a whole bunch of method calls on

00:27:07.940 --> 00:27:11.920
a table object in order to get a SQL query out of it.

00:27:11.920 --> 00:27:17.120
The end goal is to be able to have that actually be a full end-to-end thing where you've defined

00:27:17.120 --> 00:27:19.560
your tables and you get objects back from it.

00:27:19.560 --> 00:27:23.840
And then you can like call something on the objects to get them to update themselves back

00:27:23.840 --> 00:27:24.420
into a database.

00:27:24.420 --> 00:27:29.880
But I've been very hesitant to pick an API on it for how to actually get all that done

00:27:29.880 --> 00:27:34.880
because trying to do that in an async fashion is actually really difficult to do it right.

00:27:34.880 --> 00:27:40.580
And separately, like trying to do asyncio and have everything well-typed, you know, it's

00:27:40.580 --> 00:27:43.800
like two competing problems that have to be solved.

00:27:43.940 --> 00:27:44.040
Yeah.

00:27:44.040 --> 00:27:51.080
I just recently started playing with SQLAlchemy's 2.0, 1.4 beta API where they're doing the

00:27:51.080 --> 00:27:54.880
async stuff and it's quite different than the traditional SQLAlchemy.

00:27:54.880 --> 00:27:57.520
So yeah, you can see the challenges there.

00:27:57.520 --> 00:28:02.140
And it's also a case where it's like having something to generate the queries to me is more

00:28:02.140 --> 00:28:06.060
important than having the thing that will actually go run the query, especially for a lot of internal

00:28:06.060 --> 00:28:06.800
use cases.

00:28:06.800 --> 00:28:11.280
We really just want something that will generate the query or we already have a system that

00:28:11.280 --> 00:28:14.500
will talk to the database once you give it a query and parameters.

00:28:14.500 --> 00:28:20.500
It's the piece of actually saying, defining what your table hierarchy or structure is,

00:28:20.500 --> 00:28:26.120
and then being able to run stuff to get the actual SQL query out of it, but have that work

00:28:26.120 --> 00:28:32.260
for both SQLite and MySQL or Postgres or whatever other backend you're using.

00:28:32.260 --> 00:28:37.320
Having it be able to use the same code and generate the correct query based off of which

00:28:37.320 --> 00:28:39.460
database you're talking to is the important part.

00:28:39.460 --> 00:28:40.060
Yeah, cool.

00:28:40.180 --> 00:28:43.640
Well, there's probably a right order to dive into these, but since we're already talking

00:28:43.640 --> 00:28:48.240
about the AQL one a lot, maybe give us an example of what you can do with it.

00:28:48.240 --> 00:28:52.960
Maybe talk us through, it's hard to talk about code on air, but just give us a sense of what

00:28:52.960 --> 00:28:55.760
kind of code you write and what kind of things it does for us.

00:28:55.760 --> 00:29:00.040
This is heavily built around the idea of using data classes.

00:29:00.040 --> 00:29:04.460
In this case, it specifically uses attrs simply because that's what I was more familiar

00:29:04.460 --> 00:29:06.560
with at the time that I started building this.

00:29:06.860 --> 00:29:12.420
But essentially, you create a class with, you know, essentially, all of your columns specified

00:29:12.420 --> 00:29:15.140
on that class with the name and the type.

00:29:15.140 --> 00:29:16.220
Not like SQLAlchemy.

00:29:16.220 --> 00:29:17.840
But not like super heavy.

00:29:17.840 --> 00:29:18.380
Or Django style.

00:29:18.380 --> 00:29:18.900
Yeah, exactly.

00:29:19.140 --> 00:29:25.280
Like native types like id colon int, name colon str, not sa.column dot, you know, sa.string

00:29:25.280 --> 00:29:25.960
and so on, right?

00:29:25.960 --> 00:29:26.560
Yeah, exactly.

00:29:26.560 --> 00:29:32.340
Like I want this to look as close to a normal data class definition as possible and essentially

00:29:32.340 --> 00:29:33.980
be able to decorate that.

00:29:33.980 --> 00:29:39.040
And you get a special object back that when you use methods on it, like in this case, the

00:29:39.040 --> 00:29:41.640
example is you're creating a contact.

00:29:41.640 --> 00:29:45.700
So you list the integer id, the name of it, and the email.

00:29:45.700 --> 00:29:48.920
And whatever the primary key is doesn't really matter in this case.

00:29:48.920 --> 00:29:53.000
Whether the id ends up getting auto incremented, again, doesn't really matter.

00:29:53.000 --> 00:29:56.580
What we're really worried about is generating the actual queries.

00:29:56.580 --> 00:29:58.900
And you're assuming like somebody's created the table.

00:29:58.900 --> 00:30:01.200
It's already got a primary key for id.

00:30:01.200 --> 00:30:03.480
It's auto incrementing or something like that.

00:30:03.480 --> 00:30:03.700
Yeah.

00:30:03.700 --> 00:30:04.960
And you just want to talk to the thing.

00:30:04.960 --> 00:30:05.280
Yeah.

00:30:07.580 --> 00:30:10.520
Talk Python To Me is partially supported by our training courses.

00:30:10.520 --> 00:30:15.320
Do you want to learn Python, but you can't bear to subscribe to yet another service?

00:30:15.320 --> 00:30:18.580
At Talk Python Training, we hate subscriptions too.

00:30:18.580 --> 00:30:23.100
That's why our course bundle gives you full access to the entire library of courses for

00:30:23.100 --> 00:30:24.140
one fair price.

00:30:24.140 --> 00:30:25.260
That's right.

00:30:25.260 --> 00:30:30.020
With the course bundle, you save 70% off the full price of our courses and you own them

00:30:30.020 --> 00:30:31.200
all forever.

00:30:31.200 --> 00:30:36.280
That includes courses published at the time of the purchase, as well as courses released within

00:30:36.280 --> 00:30:37.320
about a year of the bundle.

00:30:37.320 --> 00:30:42.800
So stop subscribing and start learning at talkpython.fm/everything.

00:30:44.960 --> 00:30:51.060
And so essentially you take this contact class that you've created and you can call a select

00:30:51.060 --> 00:30:57.400
method on it that will then, you know, you can add a where method to decide which contacts

00:30:57.400 --> 00:30:58.180
you want to select.

00:30:58.180 --> 00:31:03.560
There's other methods for changing the order or limits or furthermore, if you wanted to do

00:31:03.560 --> 00:31:05.180
joins or other sorts of things.

00:31:05.180 --> 00:31:12.380
It kind of expects that you know what general SQL syntax looks like because you string together

00:31:12.380 --> 00:31:16.080
a bunch of stuff kind of in the same order that you would with a SQL query.

00:31:16.080 --> 00:31:20.780
But the difference is that in this case, like when you're doing the where clause, rather than

00:31:20.780 --> 00:31:26.940
having to do an arbitrary string that says, you know, column name like, and then some string

00:31:26.940 --> 00:31:32.820
literal, in this case, you're saying like where contact dot email dot like, and then passing

00:31:32.820 --> 00:31:34.620
the thing that you want to check against.

00:31:34.620 --> 00:31:39.280
And the other alternative is you could if you wanted to look for a specific one, you could

00:31:39.280 --> 00:31:44.800
say like where contact dot email equals equals, and then the value you're looking for.

00:31:44.800 --> 00:31:52.440
And so you're kind of using or abusing Python's expression syntaxes to essentially build up

00:31:52.440 --> 00:31:56.440
your query, definitely using a domain specific language in this case.

00:31:56.440 --> 00:32:01.460
But essentially having the fluent API, once you string all this together, you have this query

00:32:01.460 --> 00:32:08.100
object, which you can then, you know, pass to the appropriate engine to get an actual finalized

00:32:08.100 --> 00:32:12.700
SQL query and the parameters that would get passed if you were doing a prepared query.
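The expression-overloading trick being described can be sketched in a few lines. This is a toy illustration only, not AQL's real API: the `Table`, `Column`, and `Query` names here are invented, but they show how overloading `__eq__` and chaining fluent methods can yield a parameterized SQL string.

```python
class Column:
    """Toy column: comparisons build SQL fragments instead of booleans."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        # return a (fragment, params) pair rather than True/False
        return (f"{self.name} = ?", [value])

    def like(self, pattern):
        return (f"{self.name} LIKE ?", [pattern])

class Query:
    """Fluent query builder: methods chain in roughly SQL clause order."""
    def __init__(self, table):
        self._table = table
        self._where = None

    def where(self, clause):
        self._where = clause
        return self

    def sql(self):
        cols = ", ".join(self._table._columns)
        stmt = f"SELECT {cols} FROM {self._table._name}"
        params = []
        if self._where is not None:
            frag, params = self._where
            stmt += f" WHERE {frag}"
        return stmt, params

class Table:
    def __init__(self, name, *columns):
        self._name = name
        self._columns = columns
        for c in columns:
            setattr(self, c, Column(c))  # contact.email becomes a Column

    def select(self):
        return Query(self)

contact = Table("contact", "id", "name", "email")
stmt, params = contact.select().where(contact.email == "john@example.com").sql()
print(stmt, params)
```

The key design point is that `contact.email == "..."` no longer returns a boolean; it returns a query fragment, which is exactly the "using or abusing Python's expression syntax" idea mentioned above.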

00:32:12.700 --> 00:32:17.800
But you could also potentially like in the future, the goal was you would also be able

00:32:17.800 --> 00:32:24.000
to make manage your connection with a QL and basically be able to tell it to run this query

00:32:24.000 --> 00:32:25.380
on that connection.

00:32:25.380 --> 00:32:29.900
And regardless, you'd be able to do this the same with SQLite or MySQL or whatever.

00:32:29.900 --> 00:32:38.840
And the library is the part that handles deciding what specific part of the incompatible SQL languages

00:32:38.840 --> 00:32:41.280
that they all use will actually be available.

00:32:41.280 --> 00:32:41.780
Right.

00:32:41.840 --> 00:32:42.020
Yeah.

00:32:42.020 --> 00:32:45.700
Like, for example, MySQL uses question mark for the parameters.

00:32:45.700 --> 00:32:46.440
Yeah.

00:32:46.440 --> 00:32:49.860
SQL server uses, I think, at parameter name.

00:32:49.860 --> 00:32:51.760
There's like, they all have their own little style.

00:32:51.760 --> 00:32:52.780
It's not the same, right?

00:32:52.780 --> 00:32:53.200
Yeah.

00:32:53.200 --> 00:32:59.640
And some of that is kind of moot because of the fact that most of the engine libraries

00:32:59.640 --> 00:33:05.460
that we use commonly in Python, like AIO MySQL or AIO SQLite or whatever, they're already kind of

00:33:05.460 --> 00:33:13.260
unified around the, there's a specific PEP that defines what the database interface is going to look like.

00:33:13.260 --> 00:33:13.480
Right.

00:33:13.480 --> 00:33:15.400
The DB API 2 or whatever.

00:33:15.400 --> 00:33:16.060
Yes, that.

00:33:16.480 --> 00:33:21.020
So some of that work has already been done by the PEPs and by the actual database engines.

00:33:21.020 --> 00:33:28.880
But there's a lot of cases where it's a little bit more subtle, like the semantics, especially around using a like expression.

00:33:28.880 --> 00:33:35.620
MySQL does case-insensitive matching by default, but SQLite doesn't.

00:33:35.860 --> 00:33:39.560
AQL tries to kind of like unify those where possible.

00:33:39.560 --> 00:33:50.020
But also there's cases, especially when you're getting into joins or group bys, things like that, where the actual specific syntax being used will start to vary between the different backends.

00:33:50.020 --> 00:33:58.180
And that's where we've had more issues, like especially the whole point of SQLite for a lot of people is as a drop-in replacement for MySQL when you're running your unit tests.

00:33:58.180 --> 00:34:03.320
And so you want your code to be able to do the same thing regardless of what database engine it's connected to.

00:34:03.320 --> 00:34:04.680
And this is one way to do that.

00:34:04.680 --> 00:34:05.380
Okay, that's cool.

00:34:05.480 --> 00:34:09.320
Yeah, with SQLite, you can say the database lives in, you know, colon memory.

00:34:09.320 --> 00:34:10.180
Yeah, exactly.

00:34:10.180 --> 00:34:14.160
And then you can just spin it up for your unit tests and then it just goes away.
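For reference, the in-memory trick is just the standard library's sqlite3 module with the special `:memory:` path; the table and values below are made up for illustration:

```python
import sqlite3

# an in-memory SQLite database: nothing touches disk, and it
# disappears as soon as the connection is closed
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO contact (email) VALUES (?)", ("john@example.com",))
row = conn.execute("SELECT email FROM contact").fetchone()
print(row[0])
conn.close()
```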

00:34:14.160 --> 00:34:14.620
Nice.

00:34:14.620 --> 00:34:18.840
So maybe that brings us to the next one, the AIO SQLite.

00:34:18.840 --> 00:34:19.260
Sure.

00:34:19.260 --> 00:34:20.820
This one is all about asyncio.

00:34:20.820 --> 00:34:22.460
You can see from the example here.

00:34:22.460 --> 00:34:23.280
You want to tell us about that?

00:34:23.280 --> 00:34:32.220
Yeah, this was again, born out of, you know, a need for using SQLite, especially in testing frameworks and so forth to replace MySQL.

00:34:32.220 --> 00:34:33.220
Cool.

00:34:33.220 --> 00:34:44.100
And essentially what I was doing was taking the normal SQLite API from Python and essentially saying like, how would this look in an asyncio world?

00:34:44.100 --> 00:34:50.540
Like if we were re-implementing SQLite from the ground up in an asyncio world, how can we do that?

00:34:50.540 --> 00:35:08.900
And essentially, so in this case, we're heavily using async context managers and awaitables in order to actually run the database connection to SQLite on a separate thread and provide as much of an async interface to that as possible.

00:35:08.900 --> 00:35:18.420
So when you connect with AIO SQLite, it spawns a background thread that actually uses the standard SQLite library to connect to your database.

00:35:18.420 --> 00:35:25.900
And then it has methods on that thread object that allow you to actually make calls into that database.

00:35:25.900 --> 00:35:29.060
And those are essentially proxied through futures.

00:35:29.060 --> 00:35:45.280
So if you want to execute a query, when you await that query execution, it will basically queue the function call on the other thread and basically tell it, here's the future to set when the result is ready.

00:35:45.280 --> 00:35:59.620
So once the SQLite execution or cursor or whatever has actually completed doing what it's supposed to do on that background thread, it then goes back to the original threads event loop and says, you know, set this future to finished.

00:35:59.620 --> 00:36:06.480
And so that allows the thing originally awaiting it to actually come back and do something with the result.
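A minimal stdlib-only sketch of the pattern just described — not aiosqlite's actual internals, just the shape of them: one dedicated thread owns the sqlite3 connection, requests go onto a queue, and results come back by resolving futures with `call_soon_threadsafe`.

```python
import asyncio
import queue
import sqlite3
import threading

class ThreadedConnection:
    """Sketch: a worker thread owns the sqlite3 connection; callers
    await futures that the thread resolves on the caller's event loop."""

    def __init__(self, path):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, args=(path,), daemon=True)
        self._thread.start()

    def _run(self, path):
        db = sqlite3.connect(path)  # created and used on this thread only
        while True:
            item = self._requests.get()
            if item is None:  # shutdown sentinel
                db.close()
                return
            loop, future, sql, params = item
            try:
                result = db.execute(sql, params).fetchall()
                loop.call_soon_threadsafe(future.set_result, result)
            except Exception as exc:
                loop.call_soon_threadsafe(future.set_exception, exc)

    async def execute(self, sql, params=()):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        # queue the call for the worker thread, then suspend until
        # that thread reports back through the future
        self._requests.put((loop, future, sql, params))
        return await future

    def close(self):
        self._requests.put(None)
        self._thread.join()

async def demo():
    conn = ThreadedConnection(":memory:")
    await conn.execute("CREATE TABLE t (x INTEGER)")
    await conn.execute("INSERT INTO t VALUES (1), (2)")
    rows = await conn.execute("SELECT x FROM t ORDER BY x")
    conn.close()
    return rows

print(asyncio.run(demo()))
```

While one coroutine is suspended on `await future`, the event loop is free to run other coroutines, which is the whole point of the proxying.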

00:36:06.480 --> 00:36:08.860
Yeah, it sounds a little tricky, but also super helpful.

00:36:08.860 --> 00:36:13.840
And people might be thinking, didn't we just talk about the GIL and how threading doesn't really add much?

00:36:14.140 --> 00:36:22.340
But when you're talking over the network or you're talking to other things, a lot of times the GIL can be released while you're waiting on the internal SQLite or something like that, right?

00:36:22.340 --> 00:36:22.640
Yeah.

00:36:22.640 --> 00:36:31.180
So the internal SQLite library on its own will release the GIL when it's calling into the underlying SQLite C library.

00:36:31.180 --> 00:36:32.180
And that's where it's waiting.

00:36:32.180 --> 00:36:32.880
So that's good.

00:36:32.880 --> 00:36:33.280
Yeah.

00:36:33.280 --> 00:36:36.080
The other side of this is that it's one thread.

00:36:36.080 --> 00:36:43.140
I'm not really aware of anybody who's opening, you know, hundreds of simultaneous connections to a SQLite database.

00:36:43.140 --> 00:36:48.440
The way that people expect to do with, say, like AIO HTTP or things like that.

00:36:48.440 --> 00:36:59.700
So while it is, you know, potentially less efficient, if you wanted to do a whole bunch of parallel SQLite connections, the problem really is that SQLite itself is not thread safe.

00:36:59.700 --> 00:37:02.560
So it has to have a dedicated thread for each connection.

00:37:02.560 --> 00:37:06.280
Otherwise, you risk corruption of the backing database.

00:37:06.280 --> 00:37:07.220
Which sounds not good.

00:37:07.380 --> 00:37:07.580
Right.

00:37:07.580 --> 00:37:07.940
Yeah.

00:37:07.940 --> 00:37:24.160
It's like, basically, you end up either where two threads clobber each other or more specifically, what SQLite says is, if you absolutely try to talk to a connection from a different thread, the Python module will complain unless you've specifically told it, no, please don't complain.

00:37:24.260 --> 00:37:31.060
I know it's unsafe, at which point SQLite will be really upset if you try to do a write or modification to that database.

00:37:31.220 --> 00:37:37.380
So there are layers of protections against that, but it is one of the underlying limitations that we have to deal with in this case.

00:37:37.380 --> 00:37:45.300
So if you wanted to have simultaneous connections to the same database, you really have to spin up multiple threads in order to make that happen safely.

00:37:45.300 --> 00:37:55.720
You could always do some kind of thread pool type thing, like we're only going to allow eight connections at a time and you're just going to block until one of those becomes free and finished or whatever, right?

00:37:55.720 --> 00:37:57.160
It's definitely a tricky thing.

00:37:57.160 --> 00:38:04.060
So like the expected use case with AIO SQLite is that you'll share the database connection between multiple workers.

00:38:04.060 --> 00:38:13.420
So you'll like in the piece of your application that starts up, it would make the connection to the database and store that somewhere and then essentially pass that around.

00:38:13.420 --> 00:38:22.600
And so AIO SQLite is basically expecting to use a queue system to say whoever gets the query first is the one that, you know, gets to run it first.

00:38:22.600 --> 00:38:26.440
And whoever asked for the query second, you know, is the second one to get it.

00:38:26.440 --> 00:38:33.040
So you're still doing it all on one thread and it's slightly less performant that way, but it's at least safe and still asynchronous at least.

00:38:33.040 --> 00:38:33.860
Yeah, that's good.

00:38:33.860 --> 00:38:34.420
Very nice.

00:38:34.420 --> 00:38:43.320
And one of the things that looking at your example here, which I'll link in the show notes, of course, is Python has a lot of interesting constructs around async and await.

00:38:43.320 --> 00:38:47.560
You know, a lot of languages, you know, you think C# or JavaScript or whatever.

00:38:47.560 --> 00:38:50.560
It's kind of async function await function calls.

00:38:50.560 --> 00:38:51.200
We're good.

00:38:51.200 --> 00:38:58.320
But, you know, we've got async with, async for, a lot of interesting extensions to working with async and other constructs.

00:38:58.320 --> 00:39:01.560
Yeah, it actually makes it really nice in some ways.

00:39:01.560 --> 00:39:07.100
And essentially these are just syntactic wrappers around a whole bunch of magic methods on objects.

00:39:07.300 --> 00:39:07.500
Right.

00:39:07.500 --> 00:39:10.640
Like await thing, enter, do your thing, right?

00:39:10.640 --> 00:39:11.640
Then await exit.

00:39:11.640 --> 00:39:11.960
Right.
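Those magic methods can be sketched directly; the class and method bodies here are invented for illustration, but the protocol — `async with` awaiting `__aenter__` on entry and `__aexit__` on exit — is exactly what's being described.

```python
import asyncio

class Resource:
    """Toy async context manager: both hooks are awaited by the interpreter."""

    async def __aenter__(self):
        await asyncio.sleep(0)  # stand-in for an awaited setup step
        self.open = True
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)  # stand-in for awaited teardown
        self.open = False

async def demo_ctx():
    async with Resource() as r:
        assert r.open  # inside the block, setup has completed
    return r.open      # after the block, teardown has run

print(asyncio.run(demo_ctx()))
```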

00:39:12.420 --> 00:39:21.500
The nice part is that for some amount of extra work in the library, setting up all those magic methods everywhere and deciding, you know, the right way to use them.

00:39:21.660 --> 00:39:28.500
The benefit at the end is that you have this very simple syntax for asynchronously iterating over the results of a cursor.

00:39:28.500 --> 00:39:40.400
In that case, you don't have to care that after, you know, 64 elements of iteration, you've exhausted the local cache and now SQLite has to go back and fetch the next batch of 64 items.

00:39:40.400 --> 00:39:43.240
In that case, it's like that's transparent to your application.

00:39:43.240 --> 00:39:50.800
And that's where the coroutine that's iterating over that cursor would then hand back its control of the event loop.

00:39:50.920 --> 00:39:56.540
And the next coroutine in waiting essentially is able to then, you know, wake up and go do its own thing, too.
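The batched-cursor behavior above can be sketched with the `__aiter__`/`__anext__` protocol. The class and batch size are invented for illustration; the point is that the refill step is awaited, so control returns to the event loop between batches, transparently to the `async for` body:

```python
import asyncio

class BatchedCursor:
    """Toy async iterator: rows are fetched in batches, and the awaited
    refill hands control back to the event loop between batches."""

    def __init__(self, rows, batch_size=3):
        self._rows = rows
        self._batch_size = batch_size
        self._buffer = []
        self._pos = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._buffer:
            await asyncio.sleep(0)  # stand-in for awaiting the next fetch
            self._buffer = self._rows[self._pos:self._pos + self._batch_size]
            self._pos += self._batch_size
            if not self._buffer:
                raise StopAsyncIteration
        return self._buffer.pop(0)

async def demo_iter():
    out = []
    async for row in BatchedCursor(list(range(7))):
        out.append(row)  # caller never sees the batch boundaries
    return out

print(asyncio.run(demo_iter()))
```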

00:39:56.540 --> 00:39:57.280
Oh, how cool.

00:39:57.280 --> 00:39:58.540
I didn't even really think of it that way.

00:39:58.540 --> 00:39:58.880
That's neat.

00:39:58.880 --> 00:40:02.880
Maybe next one to touch on would be AIO multiprocess.

00:40:02.880 --> 00:40:03.300
Sure.

00:40:03.300 --> 00:40:06.640
It just now crossed a thousand stars today or recently.

00:40:06.640 --> 00:40:07.200
Oh, yeah.

00:40:07.200 --> 00:40:07.780
Yeah, it did.

00:40:07.780 --> 00:40:08.420
Yeah, very recently.

00:40:08.420 --> 00:40:08.860
That's awesome.

00:40:08.860 --> 00:40:11.900
That's my real pride and joy here is getting all those stars.

00:40:11.900 --> 00:40:16.380
There's this interesting dichotomy set up between threading and multiprocessing in Python.

00:40:16.380 --> 00:40:20.480
So with multi-threading, you're able to interleave execution.

00:40:20.480 --> 00:40:28.100
So with the GIL, it means that only one thread can actually be modifying Python objects or running Python code at any given time.

00:40:28.100 --> 00:40:31.380
So you're essentially limited to one core of your CPU.

00:40:31.380 --> 00:40:33.540
And these days, that's a big limitation, right?

00:40:33.540 --> 00:40:34.120
Right, right.

00:40:34.120 --> 00:40:34.400
Exactly.

00:40:34.400 --> 00:40:39.100
Like I see servers on a regular basis that are like 64 to 100 cores.

00:40:39.100 --> 00:40:42.520
So only using one of them is basically a non-starter.

00:40:42.520 --> 00:40:45.700
You get a lot of people with pitchforks saying, why aren't we using Rust?

00:40:46.020 --> 00:40:53.860
And so essentially the alternative to this is multiprocessing, where you're spinning up an individual process and each has its own GIL,

00:40:53.860 --> 00:41:01.300
and this does allow you, for CPU-intensive things, to basically use all of the available cores on your system.

00:41:01.300 --> 00:41:05.660
So if you're crunching a whole bunch of numbers with NumPy or something like that,

00:41:05.660 --> 00:41:09.720
you could use multiprocessing and saturate all of your cores with no problem.

00:41:10.060 --> 00:41:16.000
In this case, essentially what happens is it spawns a child process or forks the child process on Linux.

00:41:16.000 --> 00:41:21.600
And then it uses the pickle module in order to send data back and forth between the two.

00:41:21.600 --> 00:41:23.300
And this is great.

00:41:23.300 --> 00:41:24.800
And it's really transparent.

00:41:24.800 --> 00:41:29.220
So it's super easy to write code for multiprocessing and make use of that.

00:41:29.340 --> 00:41:36.640
But the issue becomes if you have a whole bunch of really small things, you start to have a big overhead with pickling of the data back and forth.

00:41:36.640 --> 00:41:36.880
Right.

00:41:36.880 --> 00:41:40.200
And the coordination back and forth is like really challenging, right?

00:41:40.200 --> 00:41:40.540
Yeah.

00:41:40.620 --> 00:41:53.680
So like if you're pickling a whole bunch of smaller objects, you actually end up with a whole bunch of overhead from the pickle module where you're serializing and deserializing and creating a bunch of objects and, you know, synchronizing them across those processes.

00:41:53.680 --> 00:42:00.600
But the real problem is when you start to want to do things like network requests that are IO bound.

00:42:00.600 --> 00:42:07.600
In an individual process, like with multithreading, you could probably do 60 to 100 simultaneous network requests.

00:42:07.600 --> 00:42:07.920
Right.

00:42:07.920 --> 00:42:10.520
But you guys maybe have more than 60 servers or something.

00:42:10.540 --> 00:42:10.720
Sure.

00:42:10.720 --> 00:42:11.100
Right.

00:42:11.100 --> 00:42:21.620
But like if you're trying to do this with multiprocessing instead, where you have like a process pool and you give it a whole bunch of stuff to work on, each process is only going to work on one request at a time.

00:42:21.620 --> 00:42:29.660
So you might spin up a process and it waits for a couple seconds while it's doing that network request and then it sends it back and you haven't really gained anything.

00:42:29.660 --> 00:42:34.120
So if you actually really want to saturate all your cores, now you need a whole bunch more processes.

00:42:34.120 --> 00:42:37.520
And that then has the problem of a lot of memory overhead.

00:42:37.820 --> 00:42:49.520
Because even if you're using copy on write semantics with forking, the problem is that like Python goes and touches all the ref counts on everything and immediately removes any benefit of copy on write forked processes.

00:42:49.520 --> 00:42:49.900
Right.

00:42:49.900 --> 00:42:51.600
Which might do like the shared memory, right?

00:42:51.660 --> 00:42:55.820
So if I create 10 of these things, like 95% of the memory just might be one copy.

00:42:55.820 --> 00:43:01.400
But if you start touching ref counts and all sorts of stuff, you know, Instagram went so far as to disable the garbage collector.

00:43:01.400 --> 00:43:02.020
Right.

00:43:02.020 --> 00:43:03.180
To prevent that kind of stuff.

00:43:03.180 --> 00:43:03.880
It was insane.

00:43:03.880 --> 00:43:04.160
Yeah.

00:43:04.160 --> 00:43:20.760
So it turns out that if you fork a process, as soon as you get into that new process, Python touches like 60 to 70% of the objects in the pool of memory, which basically means it now has to actually copy all of the memory from all of those objects.

00:43:20.760 --> 00:43:26.160
And so you don't actually get to share that much memory between the child and the parent process in the first place.

00:43:26.480 --> 00:43:35.200
So if you try to spin up, you know, a thousand processes in order to saturate 64 cores, you are wasting a lot, a lot of memory.
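As an aside, CPython's gc module grew a gc.freeze() call (Python 3.7+) partly for this fork-heavy pattern: it moves every existing object into a permanent generation so the cyclic collector leaves them alone after forking. A minimal sketch of the idea; the worker function and data here are illustrative, and note that freeze() does not stop the ref count writes John mentions, only the collector's traversal:

```python
import gc
import multiprocessing as mp

def worker(n: int) -> int:
    # Runs in the child; the frozen objects are left alone by the cyclic GC,
    # so fewer copy-on-write pages get dirtied by collection runs.
    return n * n

if __name__ == "__main__":
    big_shared = [object() for _ in range(100_000)]  # parent-side data
    gc.disable()  # avoid a collection between freeze() and fork
    gc.freeze()   # move every existing object into a permanent generation
    with mp.get_context("fork").Pool(2) as pool:
        print(pool.map(worker, [1, 2, 3]))  # [1, 4, 9]
```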

00:43:35.200 --> 00:43:46.620
So that's where I kind of built this piece of AIO multiprocess, where essentially what it's doing is it's spinning up a process pool and it only spins up one per core.

00:43:46.620 --> 00:43:51.500
And then on each child process, it then also spins up an asyncio event loop.

00:43:51.500 --> 00:43:51.820
Right.

00:43:51.820 --> 00:43:59.620
And rather than giving a normal synchronous function as the thing that you're mapping to a whole bunch of data points, you give a coroutine.

00:43:59.620 --> 00:44:09.780
And in this case, what AIO multiprocess is capable of doing is essentially keeping track of how many in-flight coroutines each child process is executing.

00:44:09.780 --> 00:44:22.980
And essentially being able to say that, like, if you wanted to have 32 in-flight coroutines per process and you had 32 processes, then of course you have whatever 32 times 32 is.

00:44:22.980 --> 00:44:25.340
I can't do that in my head because I'm terrible at math.

00:44:25.340 --> 00:44:31.020
Essentially, you get, you know, the product of those two numbers.

00:44:31.020 --> 00:44:35.780
And that's the number of actual concurrent things that you can do on AIO multiprocess.

00:44:35.780 --> 00:44:45.380
So the idea is like, instead of creating a whole bunch of one-off run this thing with these inputs over there, you say, well, let's create a chunk.

00:44:45.380 --> 00:44:50.260
Like, let's go 32 here, 32 there and run them, but do that in an async way.
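The shape of what John describes can be sketched with just the standard library: a process pool where each child runs its own asyncio event loop over a chunk of coroutines. The real aiomultiprocess API is richer (an awaitable Pool.map that tracks in-flight coroutines per child); this is only a rough stdlib approximation, and fake_fetch is a stand-in for real network work:

```python
import asyncio
import multiprocessing as mp

async def fake_fetch(host: str) -> str:
    # Stand-in for an IO-bound network request.
    await asyncio.sleep(0.01)
    return f"{host}: ok"

async def run_batch(hosts: list[str]) -> list[str]:
    # Inside one child: many concurrent coroutines on a single event loop.
    return await asyncio.gather(*(fake_fetch(h) for h in hosts))

def worker(hosts: list[str]) -> list[str]:
    # Each child process runs its own asyncio event loop over its chunk.
    return asyncio.run(run_batch(hosts))

if __name__ == "__main__":
    hosts = [f"host{i}" for i in range(8)]
    chunks = [hosts[0:4], hosts[4:8]]  # 2 processes x 4 in-flight coroutines
    with mp.Pool(2) as pool:
        results = [r for chunk in pool.map(worker, chunks) for r in chunk]
    print(len(results))  # 8
```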

00:44:50.260 --> 00:44:52.280
So you're scaling the wait times.

00:44:52.280 --> 00:44:53.380
Yeah, exactly.

00:44:53.380 --> 00:44:54.220
Anyway, right.

00:44:54.220 --> 00:44:55.600
Because you're probably doing network stuff.

00:44:55.600 --> 00:44:56.060
Yeah.

00:44:56.120 --> 00:45:03.920
And the benefit of this is essentially like you're scaling the benefits of asyncio with the benefits of multiprocessing.

00:45:03.920 --> 00:45:07.120
So for math, that's easier for me to figure out.

00:45:07.120 --> 00:45:20.560
In reality, what we've seen is that you can generally do somewhere around 256 concurrent network requests on asyncio on a single process before you really start to overload the event loop.

00:45:20.780 --> 00:45:28.000
Have you looked at some of the other event loop implementations like uvloop or any of those alternate event loops?

00:45:28.000 --> 00:45:35.800
So uvloop can make things faster, but the things that it makes faster are the parts that process things like network request headers.

00:45:35.800 --> 00:45:43.660
The real problem at the end of the day is that the way that the asyncio framework and event loops work is that for each task that you give them,

00:45:43.660 --> 00:45:48.260
it basically adds it to a round robin queue of all of the things that it has to work on.

00:45:48.640 --> 00:45:57.280
So at the end of the day, if you want to run a thousand concurrent tasks, that's a thousand things that it has to go through in order before it gets to any one task.

00:45:57.280 --> 00:45:57.620
Right.

00:45:57.620 --> 00:45:59.260
And it's going around asking, are you done?

00:45:59.260 --> 00:45:59.640
Are you done?

00:45:59.640 --> 00:45:59.860
Yeah.

00:45:59.860 --> 00:46:01.040
Or something like that, basically.

00:46:01.040 --> 00:46:09.180
And if you're doing anything with the result of that network request before you actually return the real result from your coroutine,

00:46:09.620 --> 00:46:17.220
then you're almost certainly going to be starving other coroutines on the same event loop of processing power.

00:46:17.220 --> 00:46:32.180
And so what we've seen actually is you end up with cases where you technically time out the request, because it's taken too long for Python or asyncio to get back to the network request before it hits like a TCP interrupt or something like that.

00:46:32.260 --> 00:46:32.740
That's interesting.

00:46:32.740 --> 00:46:33.040
Yeah.

00:46:33.040 --> 00:46:38.460
So this way you could say like, well, throw 10 processes or 20 processes at it and make that shorter.

00:46:38.460 --> 00:46:52.400
If you're willing to run 256 network requests per process and you have 10 processes or 10 cores, then suddenly you can run over 2,500 network requests simultaneously from asyncio and Python.

00:46:52.400 --> 00:46:58.220
At that point, you're probably saturating your network connection unless you're talking to mostly local hosts.

00:46:58.220 --> 00:47:07.860
At Facebook, when you're talking about a monitoring system, that's actually what you're doing is you're almost certainly talking to things that have super low latency to talk to and super high bandwidth.

00:47:07.860 --> 00:47:29.440
And so this was essentially the answer to that: run asyncio event loops on a whole bunch of child processes, and then do a bunch of really smart things to balance the load of the tasks that you're trying to run across all of those different processes, in order to try and make them execute as quickly as possible.

00:47:29.440 --> 00:47:35.840
And then also, whenever possible, try to reduce the amount of times that you're serializing things back and forth.

00:47:35.840 --> 00:47:53.460
So one of the other common things that having more processes enables you to do is actually do some of the work to process, filter, aggregate that data in those child processes, rather than pickling all the data back to the parent process and then, you know, dealing with it and aggregating it there.

00:47:53.560 --> 00:47:56.480
Right, because you've already got the like scale out for CPU cores.

00:47:56.480 --> 00:48:13.320
Yeah, so it kind of gives like a local version of MapReduce, where essentially you're mapping work across all these child processes, and then inside each batch or whatever, you're aggregating that data into the result that you then send back up to the parent process, which can then process and aggregate that data further.
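That local MapReduce shape can be sketched with the stdlib too: each child fans out coroutines, aggregates its own results, and only the aggregate crosses the pickle boundary back to the parent. The metric and hostnames here are made up for illustration:

```python
import asyncio
import multiprocessing as mp

async def fetch_metric(host: str) -> int:
    await asyncio.sleep(0.001)  # stand-in for a network round trip
    return len(host)            # pretend this is a sampled metric

def map_and_reduce(hosts: list[str]) -> int:
    # Child-side "reduce": aggregate before pickling anything back, so only
    # one small int crosses the process boundary instead of all raw results.
    async def batch() -> int:
        values = await asyncio.gather(*(fetch_metric(h) for h in hosts))
        return sum(values)
    return asyncio.run(batch())

if __name__ == "__main__":
    hosts = [f"host{i}" for i in range(100)]
    chunks = [hosts[i::4] for i in range(4)]
    with mp.Pool(4) as pool:
        total = sum(pool.map(map_and_reduce, chunks))  # parent-side reduce
    print(total)  # 590: total characters across all 100 hostnames
```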

00:48:13.560 --> 00:48:20.740
Yeah, super cool. You gave a talk on this at PyCon in Cleveland, one of the last real actual in-person PyCons.

00:48:20.740 --> 00:48:25.360
Yeah, first one I've ever attended and the first one that I've ever given a talk at.

00:48:25.360 --> 00:48:27.040
Yeah, that was a good one, the one in Cleveland.

00:48:27.040 --> 00:48:32.640
Yeah, the room was absolutely massive and terrifying, and I don't know how I managed to do it all.

00:48:32.640 --> 00:48:41.780
Yeah, it's just kind of block it out, block it out. But no, it's all good. Cool, yeah, so I'll link to that as well. People can check that out. And it really focuses on this AIO multiprocessing part, right?

00:48:41.780 --> 00:48:48.420
Yeah. Nice. All right, last of the AIO things at OmniLib is AIO iter tools.

00:48:48.420 --> 00:49:11.680
Yeah, so you kind of hinted at this before: iter tools is mostly a bunch of helpers that let you process lists of things or iterables in nicer ways. And AIO iter tools is just basically taking the built-in functions, like iterating, getting the next thing from an iterable, or mapping, or chaining between multiple iterables or whatever.

00:49:11.680 --> 00:49:27.360
And essentially bringing that into an async first world. So all of the functions in AIO iter tools will accept both like normal standard iterators or lists or whatever, as well as async iterables or generators or whatever.

00:49:27.520 --> 00:49:36.560
And essentially, it up converts everything to an async iterable, and then gives you more async iterable interfaces to work on these.
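That up-conversion idea can be sketched in a few lines; this ensure_aiter helper is our own illustration, not aioitertools' actual implementation, but it shows how one wrapper lets callers treat plain and async iterables uniformly:

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, Iterable, TypeVar, Union

T = TypeVar("T")

async def ensure_aiter(it: Union[Iterable[T], AsyncIterable[T]]) -> AsyncIterator[T]:
    # Up-convert: whatever comes in, callers can always use `async for`.
    if hasattr(it, "__aiter__"):
        async for item in it:  # already an async iterable: pass items through
            yield item
    else:
        for item in it:        # plain iterable: re-yield from the sync loop
            yield item

async def main() -> list:
    async def agen():          # a small async generator to mix in
        for n in (3, 4):
            yield n
    out = [x async for x in ensure_aiter([1, 2])]
    out += [x async for x in ensure_aiter(agen())]
    return out

print(asyncio.run(main()))  # [1, 2, 3, 4]
```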

00:49:36.560 --> 00:49:46.100
So I know how to create a generator with like yield. So I can have a function, it does a thing, and then it goes through some process and it says yield an item, like here's one of the things in the list.

00:49:46.100 --> 00:49:51.980
That's already really good because it does like lazy loading, but it doesn't scale the waiting time, right? It just waits.

00:49:52.500 --> 00:49:55.920
So for the async generator, what's the difference there?

00:49:55.920 --> 00:49:56.680
In this case,

00:49:56.680 --> 00:49:57.540
Never tried one of those.

00:49:57.540 --> 00:50:13.160
If you just declare the function with async def and then have a yield statement in it, it creates an async generator, which is just an async iterable object. Similar to how when you call a coroutine, you get an object, but it doesn't actually run until you await it.

00:50:13.160 --> 00:50:18.300
With an async generator, calling it creates the generator object, but you don't actually...

00:50:18.300 --> 00:50:20.680
Then the async part is done, right? At that point.

00:50:20.800 --> 00:50:30.020
Well, it still doesn't even start running until you actually start to use async for, or some other async iteration, to then iterate over it.

00:50:30.020 --> 00:50:38.820
If you're using the async iterator, you still get the lazy loading of everything like with a normal generator, but you also have the potential for your thing to be interrupted.

00:50:38.820 --> 00:50:49.800
The common use case here or the expected use case would be if you're doing something like talking to a whole bunch of network hosts and you want to return the results as they come in,

00:50:50.100 --> 00:51:02.140
as an async iterable, then you could use something like AIO iter tools to then do things like batch up those results or run another coroutine across every result as it comes in, things like that.
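That "return results as they come in" use case looks roughly like this; the hosts, delays, and probe coroutine are illustrative stand-ins for real network calls:

```python
import asyncio
from typing import AsyncIterator

async def probe(host: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for a network round trip
    return host

async def results_as_completed(hosts: dict) -> AsyncIterator[str]:
    # `async def` plus `yield` makes an async generator; nothing runs
    # until someone iterates it with `async for`.
    tasks = [asyncio.create_task(probe(h, d)) for h, d in hosts.items()]
    for fut in asyncio.as_completed(tasks):
        yield await fut  # hand back each result the moment it lands

async def main() -> list:
    hosts = {"slow.example": 0.05, "fast.example": 0.01}
    return [h async for h in results_as_completed(hosts)]

print(asyncio.run(main()))  # ['fast.example', 'slow.example']
```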

00:51:02.140 --> 00:51:07.620
The other added benefit in here is that there's also a concurrency limited version of gather.

00:51:08.100 --> 00:51:13.540
So as I said earlier, when you have a whole bunch of tasks, you're actually making the event loop do a whole bunch more work.

00:51:13.540 --> 00:51:24.660
One of the common things I've seen is that people will spawn 5000 tasks, and they'll all have some semaphore that limits how many of them can execute at once.

00:51:24.660 --> 00:51:28.580
But you still have 5000 tasks that the event loop is trying to service.

00:51:28.580 --> 00:51:33.460
And so you're giving a whole bunch of overhead every time it wants to switch between things.

00:51:33.460 --> 00:51:38.420
It's got to potentially go through up to 5000 of them before it gets to one that it can actually service.

00:51:38.820 --> 00:51:47.540
So the concurrency limited version of gather that AIO iter tools has lets you specify some limit, like only run 64 things at a time.

00:51:47.540 --> 00:51:55.360
And so it will, you know, try to fetch the first 64 things of all of the coroutines or awaitables that you give it.

00:51:55.360 --> 00:51:57.900
And it will start to yield those values as they come in.

00:51:57.900 --> 00:52:05.480
But essentially, it's making sure that the event loop would never see more than 64 active tasks at a time, at least from that specific use of it.
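The key trick is creating coroutines lazily so the loop never sees more than the limit, rather than gating 5000 already-created tasks behind a semaphore. A stdlib sketch of that idea; unlike the real aioitertools gather (which preserves input order), this simplified version returns results in completion order:

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Set, TypeVar

T = TypeVar("T")

async def gather_limited(factories: Iterable[Callable[[], Awaitable[T]]],
                         limit: int) -> List[T]:
    # Coroutines are created lazily from factories, so the event loop never
    # sees more than `limit` live tasks at once.
    it = iter(factories)
    pending: Set[asyncio.Task] = set()
    results: List[T] = []
    while True:
        while len(pending) < limit:
            try:
                pending.add(asyncio.create_task(next(it)()))
            except StopIteration:
                break
        if not pending:
            break
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED)
        results.extend(t.result() for t in done)
    return results

async def main() -> int:
    async def job(n: int) -> int:
        await asyncio.sleep(0)
        return n
    out = await gather_limited([lambda n=n: job(n) for n in range(10)], limit=3)
    return sum(out)

print(asyncio.run(main()))  # 45
```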

00:52:05.480 --> 00:52:07.260
Yeah, yeah, they're just hanging out in memory.

00:52:07.260 --> 00:52:10.360
They don't really get thrown into the running task queue.

00:52:10.360 --> 00:52:20.300
So one of the challenges or criticisms almost I've seen around asyncio is that it doesn't allow for any back pressure or whatever, right?

00:52:20.300 --> 00:52:25.540
Like if I'm talking to a database, it used to be that the web front end would have like some kind of performance limit.

00:52:25.540 --> 00:52:27.540
It could only go so hard against the database.

00:52:27.540 --> 00:52:33.280
But if you do just await it, like all the traffic just piles in until it potentially can't take it anymore.

00:52:33.280 --> 00:52:37.840
And it sounds like this has some mechanisms to address that kind of generally speaking.

00:52:38.240 --> 00:52:47.220
That's at least the general intent of it is to be able to use this concurrency limit to try and prevent overloading either the event loop or your network or whatever.

00:52:47.220 --> 00:52:55.060
So even if you have 5000 items, by setting the limit to 64, you know that, you know, you're only going to be doing that many at a time.

00:52:55.060 --> 00:53:02.880
And then you can build on that concurrency-limited gather, because the result of it is its own async iterable.

00:53:02.880 --> 00:53:13.720
And then you could also combine that with things like chain or other things in order to mix that in with the rest of the like iter tools functional lifestyle, if you will.

00:53:13.720 --> 00:53:14.740
Yeah, yeah, yeah.

00:53:14.740 --> 00:53:15.480
Super cool.

00:53:15.480 --> 00:53:18.680
I can imagine that these might find some way to work together.

00:53:18.680 --> 00:53:26.940
You might have some asyncio AIO iter tools thing that you then feed off to AIO multiprocess or something like that.

00:53:26.940 --> 00:53:28.040
Do you put these together any?

00:53:28.360 --> 00:53:29.020
Yeah, exactly.

00:53:29.020 --> 00:53:33.680
These are definitely a whole bunch of tools that I've put together in various different use cases.

00:53:33.680 --> 00:53:34.640
Yeah, very neat.

00:53:34.640 --> 00:53:35.040
All right.

00:53:35.040 --> 00:53:37.040
Well, we're getting quite near the end of the show.

00:53:37.040 --> 00:53:40.900
I think we've talked a lot about these very, very cool libraries.

00:53:41.080 --> 00:53:47.320
So before we get out of here, though, we touched on this at the beginning, but I'll ask you this as one of the two main questions at the end of the show.

00:53:47.320 --> 00:53:50.000
If you're going to write some Python code, what editor do you use?

00:53:50.000 --> 00:53:53.200
The snarky answer is anything with a Vim emulation mode.

00:53:53.200 --> 00:53:55.580
That was the thing that I learned in college.

00:53:55.580 --> 00:54:01.200
And I specifically avoided answering that earlier when we were talking about it.

00:54:01.200 --> 00:54:04.660
But that's what I learned when I was writing a whole bunch of PHP code.

00:54:04.660 --> 00:54:06.780
And that's what I used for years.

00:54:06.780 --> 00:54:09.620
And then eventually I found Sublime Text.

00:54:09.620 --> 00:54:10.680
And I really liked that.

00:54:10.680 --> 00:54:12.900
But it kind of seemed dead in the water.

00:54:12.900 --> 00:54:15.220
Atom came out, but Atom was slow.

00:54:15.340 --> 00:54:31.540
And so these days I'm using VS Code, primarily because it has excellent Python integration, but also because Facebook built a lot of things that we used to have on top of Atom, called Nuclide, which especially included a lot of remote editing tools.

00:54:31.540 --> 00:54:32.020
Okay.

00:54:32.020 --> 00:54:40.000
We've rebuilt a lot of those on top of VS Code because VS Code is faster, nicer, you know, has better ongoing support from the community and so forth.

00:54:40.000 --> 00:54:40.280
Nice.

00:54:40.280 --> 00:54:40.480
Yeah.

00:54:40.480 --> 00:54:43.500
VS Code seems like the natural successor to Atom.

00:54:43.500 --> 00:54:43.900
Yeah.

00:54:43.900 --> 00:54:55.480
And like I said before, it's like I had tried PyCharm at one point, but it's one of those cases where I touch just enough stuff that's not Python that I really want my tools to work and function the same way regardless.

00:54:55.480 --> 00:55:09.560
And so VS Code has the better sort of like broader language support where it's like there's some days where I just have to write a bash script and I want it to be able to do nice things for bash or, you know, I use it as a markdown editor and it has a markdown preview.

00:55:09.560 --> 00:55:10.500
Things like that.

00:55:10.500 --> 00:55:10.740
Yep.

00:55:10.740 --> 00:55:11.200
All right.

00:55:11.200 --> 00:55:11.400
Cool.

00:55:11.400 --> 00:55:11.840
Sounds good.

00:55:11.840 --> 00:55:13.500
And then notable PyPI package.

00:55:13.500 --> 00:55:15.960
I mean, I guess we spent a lot of time on four of them, right?

00:55:15.960 --> 00:55:16.420
Yeah.

00:55:16.420 --> 00:55:18.700
I've also talked about µsort, usort.

00:55:18.700 --> 00:55:19.000
Yeah.

00:55:19.000 --> 00:55:25.500
So the joke answer is I have a package called AIO Seinfeld that's built on top of AIO SQLite.

00:55:25.820 --> 00:55:35.820
And essentially you give it a database of Seinfeld scripts and you can search for things by actor or by keyword of what they're saying.

00:55:35.820 --> 00:55:42.400
And it will essentially give you back some elements of dialogue from a script that contains your search query.

00:55:42.400 --> 00:55:53.780
And this is powering a site I have called seinfeldquote.com, which is basically just a really old Bootstrap template that lets you search for pieces of Seinfeld quotes.

00:55:53.780 --> 00:55:59.520
I also implemented a chat bot in Discord for some of my friends that also uses this.

00:55:59.520 --> 00:56:14.220
The more serious answer would be the other one that we didn't talk about from OmniLib, which is attribution, which is essentially a quick program to automate the generation of change logs and to automate the process of cutting a release for a project.

00:56:14.220 --> 00:56:16.760
And so I use this on all of the OmniLib projects.

00:56:16.760 --> 00:56:20.440
And essentially I type one command attribution release.

00:56:20.440 --> 00:56:23.100
I'm sorry, attribution tag and then a version number.

00:56:23.100 --> 00:56:29.380
And it will drop a __version__.py in the project directory.

00:56:29.380 --> 00:56:32.020
It will create a git tag.

00:56:32.020 --> 00:56:35.300
It lets you then type in what you want the release notes to be.

00:56:35.300 --> 00:56:37.100
It's assuming, you know, a markdown format.

00:56:37.100 --> 00:56:44.500
And then once it's made that tag, then it regenerates the change log for that tag and retags it appropriately.

00:56:44.500 --> 00:56:52.680
And so you get this really nice thing where the actual tag of the project has both the updated change log and the appropriate version number file.

00:56:52.680 --> 00:56:55.160
So you only ever type the version in once.

00:56:55.160 --> 00:56:57.800
You only ever type the release notes in once.

00:56:57.800 --> 00:57:02.000
And it gives you, you know, as much help and automation around that as possible.

00:57:02.000 --> 00:57:02.520
Oh, yeah.

00:57:02.520 --> 00:57:03.400
Okay, very cool.

00:57:03.400 --> 00:57:04.280
That's a good one.

00:57:04.280 --> 00:57:04.720
All right.

00:57:04.720 --> 00:57:05.580
Final call to action.

00:57:05.800 --> 00:57:10.520
If people are excited about AsyncIO, maybe some of the stuff at OmniLib, they want to get started.

00:57:10.520 --> 00:57:11.220
What do you tell them?

00:57:11.220 --> 00:57:19.500
If they want to get started on the projects, going to omnilib.dev is the easiest way to find the ones that are currently hosted under the project.

00:57:19.500 --> 00:57:22.960
We're always welcoming of code review from the community.

00:57:22.960 --> 00:57:31.900
So even if you're, you know, not a maintainer, if you are interested in reviewing pull requests and giving feedback on things, always welcoming of that.

00:57:32.040 --> 00:57:36.760
There's never enough time in my own personal life to review everything or respond to everything.

00:57:36.940 --> 00:57:46.680
Otherwise, if there are things in these projects that you are interested in adding, like new features or fixing bugs or whatever, either open an issue or just create a pull request.

00:57:46.680 --> 00:57:52.400
And I am more than happy to engage in design decisions or discussions or whatever.

00:57:52.400 --> 00:57:59.160
Ideally, open an issue first to make sure you're not wasting your time on a pull request that's going in the wrong direction.

00:57:59.420 --> 00:57:59.840
Right, right.

00:57:59.840 --> 00:58:05.540
Because people might have this idea, but you're like, this is really inconsistent with where this project is going to go or whatever.

00:58:05.540 --> 00:58:08.000
So even if it's perfect, you can't accept it, right?

00:58:08.000 --> 00:58:09.020
So good advice.

00:58:09.020 --> 00:58:15.240
If it's just like a bug fix or something, then, you know, probably just worth creating a pull request and I'm not going to bite your head off.

00:58:15.240 --> 00:58:22.780
But otherwise, the only other thing I would say is that LGBTQ things are very personal to me.

00:58:22.780 --> 00:58:30.360
And so I would ask that if you're in a position to do so, that you please donate to an LGBTQ charity that will help in the community.

00:58:30.360 --> 00:58:32.560
There's two that I really like.

00:58:32.560 --> 00:58:33.780
One is called Power On.

00:58:34.060 --> 00:58:39.260
And that's a charity that donates technology to LGBTQ youth.

00:58:39.260 --> 00:58:41.340
They're either homeless or disadvantaged.

00:58:41.340 --> 00:58:45.680
And they have that at poweronlgbt.org.

00:58:45.680 --> 00:58:52.440
And then the other one is the Trevor Project, which is crisis intervention and a suicide hotline for LGBTQ youth.

00:58:52.440 --> 00:58:54.800
And that's at thetrevorproject.org.

00:58:54.800 --> 00:58:55.500
Yeah, awesome.

00:58:55.500 --> 00:58:57.280
Those are just two examples, but there are plenty.

00:58:57.280 --> 00:58:59.740
Worst case, just donate to a food bank near you.

00:58:59.740 --> 00:59:00.000
Cool.

00:59:00.000 --> 00:59:01.580
Yeah, that's great advice.

00:59:01.580 --> 00:59:02.600
Great call to action.

00:59:02.600 --> 00:59:06.980
Seems like your projects are also really open to new contributors, people getting into open source.

00:59:06.980 --> 00:59:10.080
So participating in that way seems like a great thing.

00:59:10.080 --> 00:59:10.700
Fantastic.

00:59:10.700 --> 00:59:11.300
All right, John.

00:59:11.300 --> 00:59:13.180
Well, thank you so much for being on Talk Python.

00:59:13.180 --> 00:59:14.500
It's been great to have you here.

00:59:14.500 --> 00:59:15.640
Thank you so much for having me.

00:59:15.640 --> 00:59:16.440
I really appreciate it.

00:59:16.440 --> 00:59:19.820
This has been another episode of Talk Python To Me.

00:59:19.820 --> 00:59:25.040
Our guest on this episode was John Reese, and it's been brought to you by Linode and Talk Python Training.

00:59:25.040 --> 00:59:30.140
Simplify your infrastructure and cut your cloud bills in half with Linode's Linux virtual machines.

00:59:30.140 --> 00:59:33.500
Develop, deploy, and scale your modern applications faster and easier.

00:59:33.500 --> 00:59:38.480
Visit talkpython.fm/Linode and click the Create Free Account button to get started.

00:59:38.480 --> 00:59:40.180
Want to level up your Python?

00:59:40.180 --> 00:59:44.320
We have one of the largest catalogs of Python video courses over at Talk Python.

00:59:44.320 --> 00:59:49.420
Our content ranges from true beginners to deeply advanced topics like memory and async.

00:59:49.420 --> 00:59:52.080
And best of all, there's not a subscription in sight.

00:59:52.460 --> 00:59:54.980
Check it out for yourself at training.talkpython.fm.

00:59:54.980 --> 00:59:57.080
Be sure to subscribe to the show.

00:59:57.080 --> 00:59:59.880
Open your favorite podcast app and search for Python.

00:59:59.880 --> 01:00:01.180
We should be right at the top.

01:00:01.180 --> 01:00:06.360
You can also find the iTunes feed at /itunes, the Google Play feed at /play,

01:00:06.360 --> 01:00:10.560
and the direct RSS feed at /rss on talkpython.fm.

01:00:10.560 --> 01:00:13.980
We're live streaming most of our recordings these days.

01:00:14.280 --> 01:00:17.380
If you want to be part of the show and have your comments featured on the air,

01:00:17.380 --> 01:00:21.760
be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

01:00:21.760 --> 01:00:23.660
This is your host, Michael Kennedy.

01:00:23.660 --> 01:00:24.960
Thanks so much for listening.

01:00:24.960 --> 01:00:26.120
I really appreciate it.

01:00:26.120 --> 01:00:28.020
Now get out there and write some Python code.

01:00:28.020 --> 01:00:48.640
I'll see you next time.


