#365: Solving Negative Engineering Problems with Prefect Transcript

Recorded on Monday, May 9, 2022.

00:00 How much time do you spend solving negative engineering problems, and can a framework solve them for you? Think of negative engineering as things you do to avoid bad outcomes in software. At the lowest level, this can be writing good error handling with try/except. But it's broader than that: logging, observability with tools like Sentry, retries, failovers as in what you might get from Kubernetes, and so on. We have a great chat with Chris White about Prefect, a tool for data engineers and data scientists meant to solve these problems automatically. It's also a conversation I think is applicable to the broader software development community as well. This is Talk Python to Me, episode 365, recorded May 9, 2022.

00:55 Welcome to Talk Python to Me, a weekly podcast on Python.

00:58 This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy, keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython.

01:10 We've started streaming most of our episodes live on YouTube. Subscribe to our YouTube channel over at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.

01:21 This episode is sponsored by Microsoft for Startups Founders Hub. Check them out at talkpython.fm/foundershub to get early support for your startup. And it's brought to you by us over at talkpython.training.

01:35 Did you know we have one of the largest course libraries for Python courses, and they're all available without a subscription? So check it out over at talkpython.fm. Just click on Courses. Transcripts for this and all of our episodes are brought to you by AssemblyAI. Do you need a great automatic speech-to-text API? Get human-level accuracy in just a few lines of code. Visit talkpython.fm/assemblyai.

01:59 Chris, welcome to Talk Python to Me.

02:01 Yeah. Thanks for having me. Happy to be here.

02:03 I'm happy to have you here. It's going to be a lot of fun, and we're going to talk about a lot of data engineering things.

02:08 And loop that back to the more traditional software development side. You have a really cool open source project and startup in Prefect, and we're going to talk about both the product and making open source a successful business model, which is really cool. Yeah, I'm very into that. That's very important to me, and I love to highlight cases of people doing that well. It looks like you all are excellent at it.

02:31 As we'll see as we get into it. It's a core part of how we view the world.

02:35 Yes, that's awesome. Before we get to all that, though, let's start with your story. How did you get into programming in Python?

02:40 Cool. So I was born in the 80s, and like a lot of people in my generation, I got into HTML and building websites on Angelfire and GeoCities. So from the early days, I was into playing around with computers. I would say I didn't really get into Python until probably high school, which is when I first started dabbling. And it was really just that I was working in a bank and trying to automate some small things. And to be clear, I did not get very far. It was a mildly successful undertaking. And then college was kind of a similar story. It's one of those things: I had a couple of books, I would play around with it. It was always just a fun activity for me, but I never had any serious focus on it. I think it was in college, when I started to really get into econometrics, that I started to play more seriously and try to understand some of the performance implications of what I was doing and things like that. And then, taking it to the next level, in grad school a friend of mine and I thought that we were smart enough to build some machine learning models for trading, and we did all of that in a combination of Python and R. Surprise.

03:51 Okay, that's amazing.

03:53 Yeah.

03:53 Lost some money, but it was a great lesson in understanding data and exactly what you're doing. And genuinely, I was studying pure math in grad school, and that experiment and kind of the actual visceral outcome of it is what really got me into studying machine learning more deeply, because I thought, wait a second, there's something interesting here. And so I started to dig in more.

04:16 That must have been a really cool experience. Even if you did lose money.

04:19 It's important to put some skin in the game.

04:21 Exactly. Right. It's a hobby until you start to take it seriously and try to get real outcomes, because there are always these layers, these levels. Right? Like, I'm going to learn this thing and poke around and kind of get it to work. Or I'm going to learn this thing so that it actually works. Or I'm going to learn this thing so I can explain it: when there are three ways that all kind of do the same thing, I can explain when to choose which one. Right. And the more seriously you take it and the more is on the line, the more you get that real understanding of it.

04:50 I mean, this is a little off of Python, but I think that resonates with me so much on every dimension. When it comes to learning something, it can't be a passive activity. You have to engage with it. In my opinion, that's the only way to learn anything: math, programming, business, whatever.

05:06 Totally agree. Speaking of math, you have a math background, right?

05:09 I do, probably. So, yes.

05:11 Yeah. Awesome. So do I. Just last episode I spoke with the SymPy (S-Y-M-P-Y) guys about doing symbolic math with Python, and that was pretty fascinating stuff. Have you played with any of the symbolic stuff?

05:25 I have played with it, yeah. I have no serious project, like genuinely just playing around with it to see kind of what it's capable of, but it is really cool. I will definitely go look up that episode.

05:35 Awesome. What kind of math did you study?

05:38 So I started my PhD program focusing on arithmetic geometry, which, for people out there, is one of the more abstract forms of math. I still really like that stuff a lot, honestly. But what I found was that... Like manifolds and stuff like that? Not quite. It's a little bit... So they use a lot of the same...

05:58 A lot of their arguments are kind of by analogy with things like manifolds. But you're studying geometric structures that are way more discrete than a manifold. And so the spectrum of prime numbers on the integers is like a geometric object for the purposes of arithmetic geometry, and hard to visualize. But it just turns out a lot of the formal definitions of geometry have these really deep analogs in arithmetic, and you can actually learn a lot. And the most famous example of this that I think a lot of people will probably be familiar with is Fermat's Last Theorem.

06:31 Yeah, that's right.

06:32 So that was all arithmetic geometry. It was solved using some of those techniques.

06:35 Yes, that's right.

06:36 So they're really heavy-duty tools to solve very simple-to-state problems.

06:39 There was a really good book about Fermat's Last Theorem by this guy named something like Amol or something like that. Maybe I'll find it for the show notes. Just an absolutely fascinating book.

06:54 Yeah, it really talks about the struggle that guy who solved it went through.

06:58 Cool.

06:58 Well, how about now? You're at Prefect, right? What do you do day to day?

07:02 I didn't finish where I went with grad school, which is relevant to how I got to Prefect: I ended up going into optimization theory. Still on the pure side, so still very much proofs. But that was right when machine learning was becoming a thing, right after I had done this experiment with my friend and started getting into it. And so that's kind of how I started to head more towards industry. And for me, I consider myself a problem solver. And so I was always very good at solving problems. But I'll admit I wasn't always the best at maybe, like, justifying a grant proposal or something like that. And so that's kind of when I started to think more about industry and started doing some consulting to test the waters.

07:39 Anyways, long story short, got into data, got into machine learning, got into tooling for machine learning, got into back end engineering for hosting the tooling, et cetera, et cetera. Jeremiah, who is the CEO of Prefect.

07:53 Just sucks you in, right?

07:55 Exactly. It's a black hole.

07:57 Yeah.

07:58 Cool. So you started out working at banks doing sort of homegrown data engineering, right? Is that maybe a good way to describe it?

08:08 Homegrown lots of things, yes. So homegrown data engineering was one of them. Homegrown model building was another big thing. Homegrown data governance. At one point, a homegrown data platform.

08:21 Actually, the data platform is particularly interesting because, looking back, it kind of felt like a microcosm of a lot of the tools that we've seen lately. The platform we were building was for data scientists to deploy their models to and connect them up to data sources that we would keep up to date. And that's where the data engineering comes in. And then business analysts were the actual downstream users, and they would interact with these models through an API that we would build for them on top of the models. It was still in Python, so they would actually have to write Python, which is really interesting. So you got to teach them classes on it.

08:52 That is interesting. Yeah.

08:55 I've seen non-developer types do that before. I've seen it in real-time stock trading, brokerages, and hedge funds.

09:03 Right.

09:05 You're going to learn Python because you need to talk to the tool. I've also seen people learn SQL who have no business knowing SQL otherwise, but for sort of a similar reason.

09:13 Yeah. And SQL is a little bit more approachable. I think it's easier to shoot yourself in the foot in Python if you don't really know it.

09:22 It's just more open ended. Right. It's just way more open ended. Cool. All right, so you talk a lot about negative engineering concepts and how you've structured Prefect to help alleviate, eliminate, solve, prevent some of those problems. So maybe we should start the conversation sort of twofold: maybe give us a quick overview of what you might call data engineering, and then what are these negative engineering things that live in that space?

09:51 Yeah.

09:51 And I think these two concepts are related, but also I think negative engineering is definitely more general.

09:56 Let me start with negative engineering, and we'll kind of drill in. So negative engineering: we've got a blog post that we published, I think, three years ago at this point on negative engineering. I encourage anyone who's interested to go read it. I think since we released that blog post, we have refined our own understanding and thinking about this. And one thing that I've kind of noticed is that negative engineering has gotten this sentiment of, like, it's just anything I don't really like to do, and that's not accurate.

10:22 Those are negative engineering. I'll tell you what. Nothing.

10:27 I don't think so. To be really precise with the way we think about this, positive engineering is code you write or software systems you interact with explicitly to achieve an outcome. So I run a SQL query to populate my dashboard or something like that. It's a very concrete connection to some sort of outcome. And negative engineering is the code you write and the systems you interact with that insure those outcomes, with an I, as in insurance. And so defensive code is a great example of negative engineering. When you're writing those try/excepts and everything, you're really hedging against anticipated failure modes that you're trying to account for right now.

11:12 Right. If the data was always well formed, it would never crash. If the servers were always up, it would never crash.
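As a rough illustration of the distinction drawn here, the sketch below separates a "positive" function that produces an outcome from a "negative" wrapper that hedges against anticipated failure modes. The function names and the shape of the order records are hypothetical, chosen only for this example; nothing here comes from Prefect.

```python
def average_order_value(orders):
    """Positive engineering: code that directly produces the outcome."""
    return sum(order["total"] for order in orders) / len(orders)


def safe_average_order_value(orders):
    """Negative engineering: the same computation wrapped in 'insurance'."""
    try:
        return average_order_value(orders)
    except ZeroDivisionError:
        # Anticipated failure mode: no orders yet. Return None rather than
        # a fake number, so a caller cannot mistake "no data" for a value.
        return None
    except (KeyError, TypeError) as exc:
        # Anticipated failure mode: malformed records. Fail loudly with context.
        raise ValueError(f"malformed order data: {exc!r}") from exc
```

The wrapper adds no new capability. It only narrows how the computation is allowed to fail, which is the sense of "insurance" used in the conversation.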

11:18 Exactly.

11:19 Then you get the reports, the Sentry messages or whatever.

11:23 Exactly. And so observability, I think, is completely negative engineering. Observability is not something you do for its own sake. You do it in anticipation of an unknown future failure mode, a negative outcome that you want to avoid. Yeah. Failure is the only first-class citizen here, right? Something failed and you want to figure out what happened so you can fix it. To really tie it to insurance even more directly, all of the things we're talking about are situations where a small error has a disproportionately large negative impact on an outcome. So scheduling is an example here. If you have cron running on a server, running a Python script, and you do something, maybe you load just far too much data in your script and the machine crashes out of memory, you don't get an alert. You wake up the next morning and 30 jobs have not run. You don't know why. You have to figure out why. By the time you figure it out, you're five hours deep. Maybe not that long, but two hours deep. And using a service or system like Prefect or some other type of observability or negative engineering tool, you would potentially get a text alert, or at a minimum, you would wake up and immediately see: that happened at 1:00 a.m., I know what happened, let me just fix it really quickly. And you're back up to speed.
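The cron scenario above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `send_alert` is a hypothetical stand-in for a text, email, or pager integration, and `nightly_load` is an invented job. An in-process wrapper like this also cannot report a machine that dies outright, which is one reason an external orchestrator is attractive.

```python
ALERTS = []  # hypothetical stand-in for a real SMS/email/pager channel


def send_alert(message):
    """Record an alert; a real system would notify a person instead."""
    ALERTS.append(message)


def run_with_alert(job_name, job):
    """Run a job and make sure a failure is reported, not silently lost."""
    try:
        return job()
    except Exception as exc:
        send_alert(f"{job_name} failed: {type(exc).__name__}: {exc}")
        raise  # still fail, so anything downstream knows not to proceed


def nightly_load():
    """Invented job that simulates loading far too much data."""
    raise MemoryError("loaded far too much data")
```

Calling `run_with_alert("nightly_load", nightly_load)` re-raises the error after recording one alert, so the failure is visible at the time it happens rather than the next morning.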

12:36 Interesting.

12:37 Yeah.

12:37 Frank on the audience says defensive programming, that means good handling on exceptions and so forth. And I think that's interesting, Frank. I do agree.

12:47 But it sounds to me, Chris, like you're even talking way broader.

12:52 If you're writing an API, you also have to think about hosting it, making sure that it's scaled out correctly, or observability, as in tracking and error reporting in the broad sense. Sure, you should be doing the small defensive programming to deal with these negative engineering problems, but there are also, like, businesses built around dealing with segments of it.

13:12 Exactly. And I think putting a word to it, as simple as it may seem, really helps, especially for building a company and a product, to refine and target: what are the features that are important to us and which ones are not important, at least at this time? And especially in orchestration and data engineering, I mean, it's very tempting to build cool stuff because there's lots of cool stuff you can build. But are you guaranteeing an outcome? Are you insuring against some outcome? Like, are you sure you know exactly what you're providing here when you build that cool thing?

13:42 Right. So you guys try to identify some of these areas of negative engineering that data engineers run into, and you're like, how do we build a framework so that they don't have to worry about or think about that.

13:53 Exactly. And so for data engineering, I think of this as any software engineering that you do that either moves data, cleans data, or prepares data, either for another person to ingest or maybe another system to ingest. But it's all the activities surrounding that. I know maybe not everyone listening is in the data space specifically. So just as the easiest example: we have a production database running behind some web server, some API, and you want to do analytics on it. Well, maybe you're using Postgres, not the best analytics database. And also you don't want to accidentally write a query that takes down the database. So what do you do? You take the data out, you put it into BigQuery or Snowflake or somewhere else. You run your analytics over there.

14:38 You probably totally change the schemas, because in a relational database (a document database maybe a little less, but definitely in a relational database) your job is third normal form. Like, how do I not have any data that repeats? I'll have a ten-way join rather than have something repeat. But when you want to do reporting, those joins are killers. You just want to do a straight query where this column is that, and just, like, wreck the normalization for performance reasons.

15:07 Right, right.

15:07 You just want to have fun so you can ask the questions in very interesting ways, like many ways in simple queries rather than being a SQL master.

15:18 Exactly. And keeping that system running, keeping the data fresh, keeping the schemas in sync. That's a lot of work, actually. And that's one of the classic examples of data engineering. There's a lot of other stuff, too. That's the classic.

15:32 This portion of Talk Python to Me is brought to you by Microsoft for Startups Founders Hub. Starting a business is hard. By some estimates, over 90% of startups will go out of business in just their first year. With that in mind, Microsoft for Startups set out to understand what startups need to be successful and to create a digital platform to help them overcome those challenges. Microsoft for Startups Founders Hub was born. Founders Hub provides all founders at any stage with free resources to solve their startup challenges. The platform provides technology benefits, access to expert guidance and skilled resources, mentorship and networking connections, and much more. Unlike others in the industry, Microsoft for Startups Founders Hub doesn't require startups to be investor backed or third-party validated to participate. Founders Hub is truly open to all. So what do you get if you join them? You speed up your development with free access to GitHub and Microsoft cloud computing resources and the ability to unlock more credits over time. To help your startup innovate, Founders Hub is partnering with innovative companies like OpenAI, a global leader in AI research and development, to provide exclusive benefits and discounts. Through Microsoft for Startups Founders Hub, becoming a founder is no longer about who you know. You'll have access to their mentorship network, giving you a pool of hundreds of mentors across a range of disciplines and areas like idea validation, fundraising, management and coaching, sales and marketing, as well as specific technical stress points. You'll be able to book a one-on-one meeting with the mentors, many of whom are former founders themselves. Make your idea a reality today with the critical support you'll get from Founders Hub. To join the program, just visit talkpython.fm/foundershub, all one word. The link is in your show notes. Thank you to Microsoft for supporting the show.

17:24 One of the things I see stand out, and I don't want to get to the API right away, but I just see coming out of the API that you all built that there's, like, retries, right, front and center. Like, here's a task and I want it to retry with this plan, either this number of times, but there's probably like a backoff story and stuff.

17:40 Exactly.

17:41 Another great example of a small error: the tiniest network blip in your Kubernetes cluster. I don't know, I've seen kube-dns sometimes just not do what it's supposed to do.

17:51 Somebody is flipping over the load balancer and you hit it at just the wrong time. And there it goes. Right?
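To make the retry and backoff idea concrete, here is a minimal sketch of a retry decorator in plain Python. It is not Prefect's API; the decorator name and its parameters (`max_retries`, `base_delay`, `backoff`, `sleep`) are assumptions made for this illustration.

```python
import functools
import time


def retry(max_retries=3, base_delay=1.0, backoff=2.0,
          exceptions=(Exception,), sleep=time.sleep):
    """Retry a function, waiting longer after each failed attempt."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_retries:
                        raise  # out of retries: surface the original error
                    sleep(delay)      # wait before the next attempt
                    delay *= backoff  # exponential backoff
        return wrapper
    return decorator
```

A function decorated with `@retry(max_retries=3, base_delay=0.5)` is attempted up to four times in total, pausing 0.5, 1.0, and 2.0 seconds between attempts. The `sleep` parameter exists so the waiting can be replaced in tests.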

17:55 Exactly. Now, next thing you know, I mean, a lot of different things can happen depending on the script you wrote. Maybe you did a lot of good defensive programming yourself and the try/except was a little bit too much. And so your next task actually runs despite the first one failing, and maybe it passes the exception downstream. And now you have this cascade of errors that you have no idea what they mean. And another thing in negative engineering is dependency management: making sure that if this fails, things that depend on it do not run unless they are configured to run on failure.

18:23 Yeah. Worst case scenario, they say, yes, this is a good investment, you should buy it. Or yes, this is a good decision.

18:29 Exactly.

18:31 Well, it's zero because the task failed to find the price. Of course, you should buy it.

18:34 Exactly. You want to know that it happened and make sure that the effect, the blast radius, is minimized. And that's really what it's all about. Retry is a perfect example of just one of those small things that can cascade in weird, unexpected ways.

18:46 What are some of the other areas or clusters of these problems you see in data engineering?
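The dependency management point, that downstream work should not run when something upstream failed, can be sketched with the standard library's `graphlib` module (Python 3.9+). This is a toy runner, not how Prefect implements it; `run_flow` and the state labels "Success", "Failed", and "NotRun" are names invented for this example.

```python
from graphlib import TopologicalSorter


def run_flow(tasks, upstream):
    """Run tasks in dependency order, never running past a failure.

    tasks:    maps a task name to a callable taking a dict of upstream results
    upstream: maps a task name to the set of task names it depends on
    """
    graph = {name: set(upstream.get(name, ())) for name in tasks}
    states, results = {}, {}
    for name in TopologicalSorter(graph).static_order():
        deps = graph[name]
        if any(states[dep] != "Success" for dep in deps):
            states[name] = "NotRun"  # limit the blast radius
            continue
        try:
            results[name] = tasks[name]({dep: results[dep] for dep in deps})
            states[name] = "Success"
        except Exception:
            states[name] = "Failed"
    return states, results
```

In the stock price example from the conversation, a failed price lookup would leave the buy decision in the "NotRun" state instead of letting it act on a missing or zero price, while unrelated tasks still run.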

18:50 Logging is a big one, just having a place where you can see some centralized set of important logs. And the more you use, like, Kubernetes, the more you kind of...

19:00 Oh yeah. The more you distributed-system, microservice it out, the harder it is to know what's going on in the logging story.

19:06 Right. And the definition we work with for the modern data stack is data tools that deliver their features over an API. And so if you think about that, you're dealing with, inherently, this giant microservice system that you want to coordinate and see in some centralized place in some meaningful way. Collaboration, versioning, those are all other things. Caching, so just configurable storage locations for things. And then maybe the biggest one, which is simple, but I see people building this internally all the time, is just exposing an API, a parameterized API, for just triggering some type of job. Next thing you know, it needs to be available throughout your whole network. It needs to be authed, it needs to be monitored and tracked and audited. And all these versions, all of a sudden you're like, okay, I'm building an entire system. My job is not done.

20:00 Yeah, it's one of those things that seems so simple. "I would love it if you would just really help us out, Michael, here. If you could just give me a little API that we could just call. Look, here's the JSON. It's like that big. And if I could just call it, things would just unlock." And then it's like a holiday and it's not working. And now I'm dealing with it, like, how did I get this job? Right?

20:25 And the second someone says just anything you're like, I'm on edge. What do you mean? Are you sure?

20:30 Exactly.

20:31 I'd give it to someone else.

20:34 Awesome. All right, well, maybe that's a good time to talk a little bit more in detail about Prefect. So you all have, on the GitHub page, if I track it down, an interesting way to discuss it. It says Prefect is a new workflow management system designed for modern infrastructure and powered by the open source Prefect Core workflow engine. Users organize tasks into flows, and Prefect takes care of the rest. So there's a lot of stuff here that I thought might be fun to dive into. So a new workflow management system, as opposed to what was there before? Maybe we could sort of take this apart a bit.

21:10 Yeah.

21:12 What do you mean by new workflow?

21:14 I know you also have a new new one coming as well, right?

21:17 Yeah, we have new new; always got to keep rebuilding. We have a great post on at least part of this that I encourage people to go check out, called The History of Data Flow Automation. Our head of product wrote it, and it's just a great kind of tour through the history. But so for us, workflow management, right: you have some set of business logic tasks that are strung together with some dependencies. There could be a lot of conditionals or something like that. You want to run it usually on a schedule, but sometimes ad hoc or maybe event-based. And there are a lot of different systems for managing these quote-unquote workflows.

21:51 Okay.

21:51 Many of them, I guess one way to think about it is they're cut by context. What context are you operating in? Is this like a data context? Or Zapier, for example, is very consumer facing. And what is the user persona?

22:04 Right. If you think about Zapier with all these different automations, all these triggers, and then all these actions, it's just like the plumbing of that must be insane.

22:11 Exactly. And that's a workflow. It's totally valid. And their user persona is a no-code person. Also totally valid. And so for us, new workflow management system means kind of the next generation after a lot of the Hadoop tooling. You can kind of see this in the post too: Hadoop caused an explosion of just really cool new tools. Airflow, Luigi, Azkaban is another one, maybe. A lot of these kind of came out of that era to manage these distributed jobs. And so I think of them as, like, distributed, state-based cron.

22:51 You can put them on a well-defined schedule, they do actual dependency management, which cron does not do, and they can do it kind of across multiple computers, which is really convenient. Yeah.

23:04 In a really simple way, it is kind of like cron: just look here for data and then just run this process against it. But it's so much more, with the dependencies, and then pass it to here, and then it's just the flow.

23:18 Exactly.

23:19 You would be insane to try that with just timing.

23:21 Exactly.

23:24 New for us can be a lot of different things, but I'll just say, as we just talked about, it's really taking the approach that scheduling is important and alerting on failures of scheduling is important, but we're expanding the vision there. And it's much more about this negative engineering, which includes observability, configuration management, and event-driven work, not just scheduled work. Scale is really important because data scientists have a lot of the same needs as data engineers, and those tools were not meant for data scientists.

23:50 Right.

23:50 Yeah.

23:51 You were talking I heard you speak about wanting to run a bunch of experiments, like hundreds of thousands of experiments as a data scientist. And some of the other tools would talk about running operations in tens per minute or tens, something like that. And you're like, I need something that does it in tens or hundreds per second.

24:09 Yes. Anything that allows you to just explore a search space of hyperparameters and do so in a way that makes it easy to quickly find some subset of those parameters and see whether they succeeded or failed. You can define that criteria. You can raise an exception, for example, if some output just violates some assumption you have, and then that way it shows up as red. It's like, you're not going to look at that. And managing an interface to the infrastructure is another big part of this. So I guess maybe I'm jumping ahead, but the next part is "designed for modern infrastructure," and modern infrastructure can mean a lot of different things. For us, it means, first off, that there's a diverse array of infrastructure people use, and so creating a system that can plug and play with a lot of them. So, for example, some of the more popular ways of deploying Prefect flows are in Kubernetes and Fargate, so kind of like a serverless-style model. Also, you can do it on your local machine. And so just having that kind of unified interface to interact with all these things is one aspect of modern. Another is the local development to cloud development story. That's really important, right? You want to make sure that these are as close as possible to each other so that you can debug things locally and things like that. So that's another aspect: we try really hard (2.0 gets this way better than 1.0, for the record) at mirroring exactly what code is running in cloud versus your local.
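The pattern described here, sweeping a parameter space and raising an exception when an output violates an assumption so that the run shows up as failed, can be sketched as follows. The scoring formula, the 0.5 threshold, and the names `run_experiment` and `sweep` are all made up for illustration; in an orchestrator each combination would be its own tracked run rather than a tuple in a list.

```python
from itertools import product


def run_experiment(learning_rate, depth):
    """Toy experiment: compute a made-up score and enforce an assumption."""
    score = 1.0 - abs(learning_rate - 0.1) - 0.05 * abs(depth - 3)
    if score < 0.5:
        # Violated assumption: mark this run as failed instead of recording it.
        raise ValueError(f"score {score:.2f} is below the 0.5 threshold")
    return score


def sweep(grid):
    """Run every parameter combination and record success or failure."""
    names = list(grid)
    outcomes = []
    for values in product(*grid.values()):
        params = dict(zip(names, values))
        try:
            outcomes.append((params, "success", run_experiment(**params)))
        except ValueError as exc:
            outcomes.append((params, "failed", str(exc)))
    return outcomes
```

Filtering `sweep(...)` on the status field then gives the quick "which subset succeeded" view mentioned above.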

25:34 Something that always makes me nervous when I hear people talking about, oh, this is cloud native and you can just there's like 50 different services in this particular cloud. And so why don't you just leverage like nine or ten of them? And I just always think, what is the development going to feel like for that if I'm on a spotty internet connection or something like that? Is it just inaccessible to work on? Do I have to just completely sort of live in this cloud world? And it sounds like there's a more sort of local version that you can try and work with as well here.

26:08 And one of the things that we achieved with 2.0 is we refactored where different aspects of orchestration take place. And so all of the true orchestration logic that we want to own runs behind an API. And the reason that I'm emphasizing that is that in 1.0 that's not 100% true. And so when you run a workflow locally, it's talking to an API. Maybe it's your self-hosted open source API, so it's maybe responding slightly differently. But the code path running on your machine and its requirements and everything else is exactly the same as what's going to run in production. It just might talk to a different URL.

26:45 Sure. Let me stick on this. We were halfway through your sentence. I do want to talk more about the cloud and stuff. So, powered by the open source Prefect Core workflow engine. Tell us about that.

26:58 So since day one, we always wanted to put as much in open source as is reasonable. One of the ways that we think about what we put in the open source (and then I'll tell you what a workflow engine is) is: what are the things that we are maximally leveraged to support extensions of and new configurations of? And our core workflow engine is definitely one of those things, right? We're the experts in it, and that engine is the thing that manages, for example, that a downstream dependency can't run if its upstream failed or maybe just hasn't completed yet. The caching logic is part of that workflow engine, the triggering logic for the workflow, the scheduling of the workflow, all of that stuff is open source.

27:39 That's the UI, the visibility, sort of the tracking bit as well, right?

27:44 100%, yes, that is all open source. And we built it as a part of it. So we actually have a dedicated front end team, and we build the UI and package it up in the Python package as a prebuilt website.

27:55 Yeah, I'm not sure where I would. Here we go. I found a cool little UI.

28:00 There we go. That's the 2.0. Yeah.

28:03 So, I mean, this UI to see what's working, what's not working, how often has it succeeded, what succeeded, what's failing, what jobs are unhealthy, for example. That's all negative engineering. Right. Your job wasn't to start out to build this observability web front end. Your job was to get the data in and then get it into the database and start doing it for analysis or whatever.

28:25 Exactly.

28:25 Predictions. But here you are in VueJS going after it, right. Or whatever it is.

28:30 No, it is Vue. Good call.

28:32 Right on.

28:33 One of the ways I think about this dashboard view is it gives you this landing page. You have some mental model of your expectations. You can check quickly if they are violated here and then, if so, dig in further, click around. And if not, we are more than happy when people exit out of the UI. And like, we're moving on. It's like, perfect, we did our job then.

28:51 Yeah, that's good.

28:53 So that's part of the core engine.

28:55 Yeah, 100%. Things like auth, for example, are not part of that. In that case, there are a lot of ways auth can get extended, a lot of different ways that we might implement it, and that's not exactly our competitive advantage: supporting different ways that you might deploy auth securely. And so it's like, no, that's our platform feature. We can do it in the way we know best and can do it securely.

29:17 Sure. And so it's worth pointing out, I suppose, that the way it works is there's the open source engine and then there's the Python API, and then you talk about different ways to run it and host it. Right. So one way to host it is to just use your cloud. Right. You've got the Prefect Cloud where it just runs with all these things. It's on there. And then the other is I could self-host that core workflow engine or just run it on my laptop or whatever.

29:45 So it's a little bit more complicated than that, actually, in an interesting way. Jeremiah and I both come from the finance world, and so a lot of our first early design partners and advisors come from that world. And one of the challenges one of our advisors gave us was, very genuinely: I don't want to learn your tech stack so that I can host it within my tech stack, and there's no way I'm ever going to give you my code or data because it's highly proprietary. That's your problem. We're like, okay, well, that sounds impossible, right?

30:16 I think companies are already so freaked out about losing the data without even meaningfully giving it to someone else, right? They're already like, well, we might lose this, we might be ransomware and there might be other things. Right? And so the idea of just handing it over does seem probably pretty far out there for a lot of them.

30:33 Exactly. And so what we designed after a long time, we really thought about it, but we did this back in 2018, maybe the beginning of 2018, was a model where orchestration takes place over an API. And if you really think about it, think of other orchestrators: Cron, or Kubernetes as a container orchestrator. They operate on metadata, they operate on container registry locations and specs for how you expect it to run. And once we had that insight, we designed the system so that the cloud-hosted API that we run operates purely on metadata: result locations, flow names, flow versions, things like that. And then you run an open source agent anywhere that you want, and it operates on a pure outbound polling model. So all of our features are based on the agent polling, and then your workflow also potentially doing some communication. And because of that, there's still this outbound connection you have to think about. You still have to trust us with some of your parameters. There's definitely still some security surface area that we have to think about, but we do not host your data and we do not have access to your execution, and that unlocked this problem for us. And so as long as we have enough agents that can be deployed in lots of different places, then we can deliver a lot of value.
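[Editor's note: the hybrid model described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Prefect's implementation: `MetadataAPI`, `Agent`, `FlowRunRecord`, and the `local://` result locations are hypothetical names, and the "API" is just an in-memory object. The point it shows is that the hosted side only ever sees metadata, while code and results stay with the agent.]

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class FlowRunRecord:
    """Metadata only: names, versions, states, and where results live."""
    flow_name: str
    flow_version: str
    state: str = "Scheduled"
    result_location: Optional[str] = None


@dataclass
class MetadataAPI:
    """Hypothetical stand-in for the hosted API. It never holds code or data."""
    runs: List[FlowRunRecord] = field(default_factory=list)

    def schedule(self, flow_name: str, flow_version: str) -> None:
        self.runs.append(FlowRunRecord(flow_name, flow_version))

    def get_scheduled_runs(self) -> List[FlowRunRecord]:
        return [run for run in self.runs if run.state == "Scheduled"]

    def report(self, run: FlowRunRecord, state: str,
               result_location: Optional[str]) -> None:
        run.state = state
        run.result_location = result_location


class Agent:
    """Runs inside the user's infrastructure and only makes outbound calls."""

    def __init__(self, api: MetadataAPI, local_flows: Dict[str, Callable]):
        self.api = api
        self.local_flows = local_flows   # flow name -> callable, kept locally
        self.local_results: Dict[str, object] = {}

    def poll_once(self) -> None:
        # Pull work from the API; the API never connects in to the agent.
        for run in self.api.get_scheduled_runs():
            flow_fn = self.local_flows[run.flow_name]
            try:
                result = flow_fn()
            except Exception:
                self.api.report(run, "Failed", None)
                continue
            # The result stays local; only its location is reported.
            location = f"local://results/{run.flow_name}/{run.flow_version}"
            self.local_results[location] = result
            self.api.report(run, "Completed", location)
```

[A real agent would call `poll_once` in a loop with a sleep between polls; the single call keeps the sketch easy to follow.]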

31:50 Yeah, that's pretty excellent. So you want to host it in AWS or Kubernetes or Linode or wherever? That's up to you.

31:58 Exactly. 100% up to you.

32:00 Is there a way where I can do it somewhat offline? Like, for example, with the open source core engine? Does that still go back to you guys, or is that sort of local?

32:10 No, that's totally local. And it's designed with the same hybrid approach. So you could have your platform team, maybe your DevOps team hosting the API for you and the database behind it, and then you, as the data team, can manage your agents. And just as long as you have access to the API, you can set it up the exact same way internally if you want. And we've seen places do that for sure.

32:32 If you're a regular listener of the podcast, you've surely heard about Talk Python's online courses, but have you had a chance to try them out? No matter the level you're looking for, we have a course for you. Our Python for Absolute Beginners course is like an introduction to Python plus that first-year computer science course that you never took. Our data-driven web app courses build a full pypi.org clone along with you right on the screen. And we even have a few courses to dip your toe in with. See what we have to offer at training.talkpython.fm, or just click the link in your podcast player.

33:07 Let's maybe talk through a quick example of using it. Oh, hold on. The last part of that sentence is users organized tasks into flows. And so let's look at a quick example maybe of the code that you might do here. Let's see here. This probably isn't it's always tricky to talk about code on audio formats, but just give us a sense of what does it look like to write code for Prefect?

33:35 Yeah. So, one of our design principles. Right. We talked a little while ago about this negative engineering problem. It kind of emerges, and eventually you're doing all those activities that you didn't care about. And in kind of an interesting way, we try to mirror that with the way Prefect gets adopted. So I love to call it incremental adoption. I want the complexity of what you're trying to achieve and the amount of code you have to write to scale, I mean, ideally sublinearly or something, but scale together. And so in the example we have here, and 2.0 takes this way further, we operate on this decorator model. So it's just really simple. You have Python functions, you already wrote them. You presumably already even have the script. You just want to sprinkle in some Prefect so that you get some observability into it. And then if you want to start to do more and more things, you might have to write more and more code. But it's appropriate for the activity that you're trying to achieve.

34:24 We try to be really simple. We like it when people get the feeling that this is like a toy package that you play around with that just has these heavy-duty impacts. So your tasks are the smallest unit of work that we can look at. Tasks can do things like retry, they can cache. They have well-defined inputs and outputs. Flows are containers for managing dependencies of tasks. They also have well-defined inputs and outputs, and they also have their own states. But flows are the things that can be scheduled and triggered via API. And tasks are just the smaller, more granular units of orchestration within those workflows.

35:02 So the way that this looks is it looks just like a function, and you kind of just call it with the arguments or whatever.

35:10 Yeah.

35:10 And then you put a task decorator on there, which is pretty interesting. And that's where the retry thing can be.

35:16 Exactly.
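[Editor's note: a minimal sketch of the "retry lives on the task decorator" idea, written with only the standard library. This is not Prefect's decorator; the `task` function below and its `max_retries` and `retry_delay_seconds` parameters are hypothetical names chosen for illustration. It shows how a decorator can add retry behavior without changing the body of the function it wraps.]

```python
import functools
import time


def task(max_retries=0, retry_delay_seconds=0.0):
    """Hypothetical stand-in for a task decorator with retry settings."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    # Give up once the retry budget is spent.
                    if attempt >= max_retries:
                        raise
                    attempt += 1
                    time.sleep(retry_delay_seconds)
        return wrapper
    return decorator


calls = {"count": 0}


@task(max_retries=2)
def flaky_extract():
    """Fails twice with a transient error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "payload"
```

[The caller still writes `flaky_extract()` like any other function call; the retry policy is declared once, next to the function it applies to.]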

35:16 Then you also have a context manager, which I think is a nice pattern. So you have a context manager to create the flow, and then you basically simulate doing all the work with an abstract parameter, and then you kind of set it off. Right.

35:28 So that is true in 1.0. However, okay, something new is coming up, which is important. With that context manager, all that code runs, like you called out. And so it compiles this DAG, which for everyone is a directed acyclic graph. What we realized in talking to a lot of our users on 1.0 is that confronting the DAG was confusing, because sometimes people would write their own Python code that wasn't Prefect in that context manager and it would actually run; it wouldn't be deferred. They'd get confused: why do I have to care about this? And we started to realize that this DAG model really came most likely out of the constraints of YAML flat file formats, and they were mirrored in all the different tools that were built on top of that. And then all of a sudden everyone's talking about DAGs. A data engineer, when they're writing a script to move data around, should focus on the script. They shouldn't focus on this abstract programming concept where you essentially can't do control flow without really thinking deeply about it. And so in 2.0, we removed this context manager. Flows are also now specified via decorator. So the deferred computation is just function definition. And now we will discover the tasks at runtime and you can implement native Python logic. That's totally fine by us. So it just unlocks the expressiveness of what you can write in Prefect really natively.
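[Editor's note: the confusion described here, where ordinary Python inside a graph-building context runs immediately while task calls are deferred, can be reproduced with a toy model. The `Graph` class and `deferred_task` decorator below are hypothetical and much simpler than any real orchestrator; they only assume that "calling a task inside the context records a node instead of running the function."]

```python
class Graph:
    """Collects task calls made while the context is active."""
    _active = None

    def __init__(self):
        self.nodes = []

    def __enter__(self):
        Graph._active = self
        return self

    def __exit__(self, *exc_info):
        Graph._active = None
        return False


def deferred_task(fn):
    """Inside a Graph context, record the call; outside, run it normally."""
    def wrapper(*args):
        graph = Graph._active
        if graph is None:
            return fn(*args)
        graph.nodes.append((fn.__name__, args))
        return f"<future result of {fn.__name__}>"
    wrapper.__name__ = fn.__name__
    return wrapper


executed = []


@deferred_task
def extract():
    executed.append("extract")
    return [1, 2, 3]


@deferred_task
def load(rows):
    executed.append("load")


with Graph() as graph:
    rows = extract()                  # recorded as a node, not run
    executed.append("plain python")   # runs right now, at build time
    load(rows)                        # recorded as a node, not run
```

[After the `with` block, `executed` contains only `"plain python"`, while `graph.nodes` holds the two deferred task calls. That mismatch between what looks sequential and what actually runs is the surprise being described.]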

36:49 That's awesome. So you can have like loops or if statements or whatever you want to write.

36:54 Oh, yeah. While statements even. Yeah. You can have flows to change structure from run to run. All of this.

37:00 Okay, so the thing that strikes me here is you kind of write regular Python code and you put a decorator or two on it and it just works in a different but similar way. And that's a little bit of that negative engineering influence as well. How do I take normal stuff without too much work and make it more general for pipelines?

37:19 Exactly. We call it workflows as code instead of code as workflows. Sorry.

37:24 Code as workflows, because you have the code, it is the workflow. And now you just want us to care about it. And so we should be minimally invasive when we do that, because the second you have to refactor your code significantly, you're back in negative engineering. You have to think about the consequences of the refactor and everything else. We want to avoid that as much as humanly possible. Or you should have to a little bit.

37:44 Yeah. A couple of things that I saw that stood out to me, checking out your API here, that was interesting. One was I can have async methods and async execution of these things. So async and await style async def methods and await operations. You want to talk about that support?

38:00 Yeah. So if you actually go to orion-docs.prefect.io, that's where a lot of our 2.0 docs are currently located while we're still in beta. So it's orion and then hyphen docs. Yeah. So this async, that's probably where I saw it.

38:12 Yeah.

38:13 Cool.

38:14 Shout out to Prefect engineer Michael Adkins, who really took a lot of time to dig into the guts of async, and he set it up so that you can do crazy things. You can have synchronously defined flows with asynchronous tasks, and our engine, the executor, will manage it all for you, just to make sure that they're running in the right loops and things like that.

38:36 We got to create a loop and just run this in a wait, because internally it's synchronized or something like that. Right?

38:41 Exactly.

38:42 And so it's really slick, and it gives users who know how to write async code this native feeling of parallelism. We all know it's not quite parallelism, but it gives you at least that feeling, especially when you're doing the modern data stack. If it's all API driven, you've got a lot of network I/O.

38:58 It's talking to databases, it's talking to file I/O, it's talking to external APIs. All of those are perfectly scalable.

39:04 Exactly.
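[Editor's note: one way to picture "a synchronous flow with asynchronous tasks" is a small helper that checks whether a function is a coroutine function and, if so, drives it on an event loop. This is a simplified sketch, not how Prefect's engine works internally; `run_task`, `fetch_row_count`, `normalize`, and `sync_flow` are hypothetical names, and `asyncio.sleep(0)` stands in for real network or database I/O.]

```python
import asyncio
import inspect


def run_task(fn, *args, **kwargs):
    """Call fn; if it is an async function, run it to completion on a new loop."""
    if inspect.iscoroutinefunction(fn):
        return asyncio.run(fn(*args, **kwargs))
    return fn(*args, **kwargs)


async def fetch_row_count(table):
    await asyncio.sleep(0)   # stands in for a database round trip
    return len(table)        # fake "row count" so the sketch runs offline


def normalize(name):
    return name.strip().lower()


def sync_flow():
    # A plain synchronous function that mixes a sync task and an async task.
    table = run_task(normalize, "  Orders ")
    count = run_task(fetch_row_count, table)
    return table, count
```

[Note that `asyncio.run` raises an error if it is called from inside an already running event loop, so a real engine has to do more bookkeeping than this helper does.]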

39:06 Yeah. Cool. So you can have @task and say, I'm just going to do an async def some function. The example you have in your docs, your Orion docs, is using HTTP async clients to go talk to the GitHub stuff.

39:22 Here's your @flow decorator, right, for this thing. Yes.

39:25 And another thing, too, that we did that I'm really proud of, and that I've already started to see be one of the ways people onboard to Prefect: previously with the API, you had to pre-register your flow and tell the API this thing exists, get ready for it, and then runs had to get created server side before they could run client side. With the new model, we set everything up, and all of this was this deep study in bookkeeping, like how can we create stable indices or stable identifiers for things that we can identify across processes and runs? And so in the new model, you can take this flow, and if you are just pointing to our Cloud API, you can call it as a function interactively, and it will still communicate with the Cloud API just as if it was a deployed workflow. And so what that means, sorry, just going back to the incremental adoption story, is you can use Cron and then you can just put one line of code on your main function: @flow.

40:22 Keep Cron running with that Python script and you've immediately gotten a really pretty record of all the jobs that Cron's running. And if it fails, you'll get the failure alerts and everything else. And Cron's still your scheduler, which is totally fine by us.
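[Editor's note: the "keep Cron, add one decorator, get a record of every run" idea can be sketched without any framework. The `recorded_flow` decorator and its `report` callback below are hypothetical; in the setup described above the record would go to Prefect's API, whereas here it goes to whatever callable is passed in (an in-memory list in this sketch) so the example stays self-contained.]

```python
import functools
import time


def recorded_flow(report):
    """Wrap a function so every run emits one record via report(record)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            record = {"flow": fn.__name__, "started": time.time()}
            try:
                result = fn(*args, **kwargs)
                record["state"] = "Completed"
                return result
            except Exception as exc:
                record["state"] = "Failed"
                record["error"] = repr(exc)
                raise  # Cron still sees the non-zero exit
            finally:
                record["finished"] = time.time()
                report(record)
        return wrapper
    return decorator


run_log = []  # stand-in for an API call or a log file


# The crontab entry stays as it was, for example:
#   0 2 * * * python nightly_export.py
@recorded_flow(run_log.append)
def nightly_export():
    return 42


@recorded_flow(run_log.append)
def broken_export():
    raise RuntimeError("disk full")
```

[The script's logic and its schedule are untouched; only the decorator line is new, which is the incremental adoption point being made.]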

40:34 Sure. Oh, that's interesting.

40:35 Yeah. And then at some point you want to start to see into the future, and that's when you have to use our scheduler instead of Cron. Yeah. But once again incremental adoption, yeah.

40:44 The API here is pretty wild. You're exploding a list comprehension of calls to the task into an asyncio.gather. That's a pretty intense line right there. But I like it. It's not intense in the way that it's like, oh, my gosh, what is this insanity going on there? Yes, there's the joke T-shirt. Maybe you've seen it. It says, "I learned Python. It was a great weekend." Right. That's true for variables and loops and functions. But then you see stuff like this, more patterns here.
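[Editor's note: the "intense line" pattern, unpacking a list comprehension of coroutine calls into `asyncio.gather`, looks roughly like this in plain Python. This is not the example from the Prefect docs; `get_stars` and `star_report` are hypothetical names, the repository names are placeholders, and the "star count" is faked from the string length so the sketch runs without network access.]

```python
import asyncio


async def get_stars(repo):
    """Pretend to look up a star count for a repository."""
    await asyncio.sleep(0.01)   # stands in for an HTTP request
    return repo, len(repo)      # fake value, computed locally


async def star_report(repos):
    # The list comprehension creates one coroutine per repo. The * unpacks
    # the list into separate arguments, and gather runs them concurrently,
    # returning results in the same order as the inputs.
    return await asyncio.gather(*[get_stars(repo) for repo in repos])


if __name__ == "__main__":
    for repo, stars in asyncio.run(star_report(["org/repo-a", "org/b"])):
        print(repo, stars)
```

[Because the waits overlap, the total time is close to the slowest single call rather than the sum of all calls, which is why this pattern suits I/O-heavy work.]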

41:15 It might be more than a weekend. Give me a moment.

41:17 Yeah.

41:18 No, this is really cool. I really like this new API. So when is 2.0 a thing? When is it released? Main way of working.

41:26 Release date, or I shouldn't say date, but just as a target, you can expect it in one of the weeks around July 1. But we are still releasing. So for anyone out there who is intrigued by this, especially if you're completely new to Prefect, I definitely encourage you to just start with one of our beta 2.0 releases. They're way slicker, way easier to get your head around, more interesting, and everything's working. There are some critical paths that we haven't fully released yet that we want to make sure are there and tested heavily before we go into GA.

41:59 Right. But if what's there works for people that they could use it.

42:02 Oh, yeah, it should definitely work. And if you run into weird bugs.

42:06 How does it plug into the cloud visibility layer and all that? If I run some one, some two, is it going to go crazy or.

42:16 No, they both will be configured to talk to the right API, and so you won't be able to see them in the same place. So that's unfortunate if you will. But you can definitely run them side by side. I mean, the environments aren't compatible, so you'll have to have different Python environments that you're running them in. But otherwise, I think some of our one clients are pip install prefect.

42:39 Is equal equal one something or equal equal two or something along those lines.

42:43 Right.

42:44 You need different libraries.

42:45 Yeah. So if you just did pip install prefect right now, you get an official 1.0 release. I don't remember the number. So you'll have to make sure that you allow for pre-releases in your command. So I think if you just specify ==2.0b3, I think we're at b3 right now, then you'll get it. But yeah, you have to explicitly call it out since it's still in beta.

43:06 Sure. I always like going to PyPi.org. It's 375,000. I know it's in projects now.

43:14 Yeah.

43:15 So 1.2.1 is the current one. But in here you're at 2.0 beta three.

43:21 And we are planning to cut another release later this week.

43:23 So you can expect b4, probably. It'll be b4 by the time people get around to hearing this.

43:31 But still really cool. So basically your advice to people who are like, hey, this sounds interesting. I want to check it out. Just start with two.

43:38 Yeah, I'd say just start with two. It's working easier to grab. And I think it's more powerful and more flexible for different use cases, especially if you're thinking outside of data.

43:47 Sure. So when I hear people talk about data engineering, if you go into that world, you see all these amazing tools that people have built, and it looks like, wow, these are really amazing. And to me they feel quite similar. Like Prefect and friends, it feels really similar to the web frameworks. Right. Like Flask or Django. Okay. So for example, what I mean by that is in Flask, all I have to do is say here's a function that goes to this URL, and I just write the code and return a dictionary or something like that. I don't have to think about headers, cookies, the keep-alive header, HTTP/2 traffic. I just do the little bit and it just puts it all together for me. And in the data engineering world, there's a bunch of stuff like that that I feel many people are wholly unaware of, probably.

44:36 Yeah. There's an explosion of tooling in data engineering right now and also in the adjacent analytics world. This kind of goes back to what I was saying about how we crystallized this concept of negative engineering. And I think all of these tools are coming from some very real use case. Right. And the way I talk to people about this stuff is you shouldn't really pick a tool just on its current feature set. You should pick it on its vision as well as whether it works for you today, because you're going to change a lot and you want to make sure that the tool is changing with you. These tools, especially with the explosion of startups, we're all changing quite quickly, and you want to make sure that we're changing in an aligned way. And having that fleshed-out vision is important. And if it's just a tool that seems cool, ask: what exactly is this doing for me, precisely? If you can't really articulate that, that's not to say you shouldn't keep using it or something, but that's always the exercise that I do with new tools.

45:29 Yes. When it's something as fundamental as this, you kind of have to think about, I'm going to live with this for a while.

45:35 Exactly.

45:36 I want to have this as my roommate when I come to work.

45:39 Right. Do I want to debug this? Do I want to extend it? Exactly. You're definitely going to do something weird with it. We've all done weird things with every tool we have used.

45:48 Yeah, absolutely. All right. So my question to you about this sort of like parallel to Flask and the Web frameworks and various other things. This is solving a lot of negative engineering problems for data scientists and data engineers.

46:03 Where do you see maybe people like me who mostly do APIs and Web apps and things along those lines? Should I be using stuff like this? Where do you see the solving problems for people who don't, like, traditionally put on the data science data engineering hat.

46:18 So there are two places that I think are relevant. I think the first is just like really kind of tactical, just tracking of background work, tracking of background tests. Right. Like Celery is a popular example for something like this. Yeah.

46:31 Let me give you an example.

46:33 In my world, I might hit a button. I have to send out thousands of emails because of that. Right. And then maybe based on that, if it bounces, take them out of the email list or whatever. Right.

46:43 Exactly. You want to record the fact. It's a perfect example. So just anything like that for a background task. And that's one of the things, too, that we're going to try to make even simpler, because we have focused a little bit on the data space and there are very easy changes we can make to extend that. And then the second thing, and this is the way I always like to think about Prefect, one way you can consider everything I've been saying, is we're kind of like the SRE toolkit at the business logic layer. And it's something that everybody could just use, just that single pane of glass. You get alerts, you get notifications, you can collaborate with people, and it's just all right there for you. And at the end of the day, you don't really have to manage the code that much if you're just using the UI. And so I think that's how we can expand, by just giving people that value prop. You want to look at the things that are happening. You want to see a place where all of your systems are just right there. And it's at the business logic layer. You're not looking to see CPU or memory all the time, although you could display that if you wanted to.

47:38 So how about this as an example? I created an ecommerce site, and I want to track I just want visibility into people buying stuff, what's working, what's failing, what's the rate, the bosses. I need something on the Web that I can look at this.

47:53 Exactly.

47:54 And get reporting.

47:55 And the key thing here, right, that you said that puts it in Prefect's camp and not in Datadog's camp is you want to track the user button click, for example, some business logic thing, whereas something like Datadog is an SRE or observability tool that's going to tell you your API throughput. Prefect isn't trying to do negative engineering for your raw infrastructure; it's trying to do it for your business logic.

48:21 Got it. Okay, yeah, very interesting. So Prefect open source. If I want, I can just take it and do my own thing, right?

48:28 Yeah, go for it. It is Apache 2.0 licensed as of maybe a month ago. So before we had a few different licenses floating around, but now we're all in on Apache 2.0.

48:39 Okay, give me the elevator pitch on Apache 2.0. So what does that mean?

48:42 I can do it means you can do quite literally anything, as long as you don't violate trademarks.

48:48 Essentially, and so don't violate it or something like that.

48:53 Yeah, exactly. It's very generous. You don't have to check with us or anything like that.

48:58 Sure. Okay, excellent. Yeah. You guys are doing a lot of stuff, not just with Prefect, but with other projects out there as well, right?

49:04 Yeah, we are. We really kind of, like I said at the beginning, trying to instill this kind of open source ethos. Even at the business layer. We're trying really hard to genuinely deliver value. Right. That includes to our customers and users, but also to just the broader ecosystem that we find ourselves in, which is exploding right now. And so we have a lot of different efforts that we can definitely go through and look at all the ways we try to contribute back to open source.

49:30 Yeah, give us a little bit.

49:32 So we do a few different things. So one thing that we do is we will send pizza to basically any conference or meet up talk that has a talk featuring Prefect.

49:42 And so you just have to submit a quick application. We'll probably reach out to you and then that's pretty much it there. If you are a Prefect engineer, we have kind of this advocacy program, and if you get involved with that, we've sent people to conferences before that are not prefect employees. So that's another thing that we.

49:59 Oh, nice. Okay.

50:00 Try to give back more concretely on the business side. Every engineering team at Prefect. So right now there are five kind of distinct teams. They each get a $10,000 annual budget to sponsor any and all open source projects or just maintainers directly that they think are impactful, maybe for their work or maybe for our ecosystem. And so some of them.

50:21 Just to give you an example.

50:22 One of the ones that kind of kicked this whole thing off was we sponsor the MkDocs Material theme, just really slick theming. And so that was the first one. We also sponsor a lot of Vue projects, and we're going to be expanding this to FastAPI and some other ones where we just have to dot our i's and cross our t's.

50:41 It's kind of an escalating intensity. The last thing is that Prefect, the company, has actually gotten into investing in certain open source tools that we think are very compatible with some of the things we want to do. And so the big one, the headline one here, is Textualize. So Will McGugan, I'm always afraid to say his last name because I'm going to say it wrong. He's the author of Rich and Textual, and as everyone knows now, it's all out there. So he's building the service technologies for hosting these text-based terminal applications and distributing them through the web. So in kind of an interesting sense, it's spiritually similar to the hybrid model where you can run one of these agents. And we've always wanted to expose richer interactions with Prefect agents running in your infrastructure through our UI. And when we talked to Will, it was like, this is it. This is perfect. And it's got all the right theming differences, so you'll be able to tell this is something you wrote. It's very text driven compared to our more branded assets lurking around the UI. We invested in his company in their seed round.

51:46 I'm really glad to hear that. I'm also really glad for Rich that he's got that going. He's been making such progress.

51:51 Yeah, he gets to the top of Hacker News, I feel like, every other day. Will, I always switch his name.

51:56 I feel so bad. I'm very happy for Will. He's been doing so much with Rich. My brother's named Rich. It's a problem. But I'm finding so many of these projects where it's like, oh, this is really interesting, especially for our Python Bytes podcast. We're just covering packages and things that are interesting that week. And more often than not, you're like, that's got a really cool UI. Oh, I see. In the dependencies, requirements.txt or pyproject.toml, well, Rich is at the heart of making that look good.

52:21 Yes, we use it.

52:23 It is great.

52:24 Yeah, absolutely. It totally is. So one of the final things I want to talk to you about is creating a business around open source with this very permissible model that you're giving away. And I think it's super admirable. I know there's other companies doing it to various degrees and to various degrees of success, like MongoDB comes to mind, for example, and Red Hat and stuff. But all these examples that I see are like, fantastic. Look at what you guys are doing. You're investing in other open source projects by having a successful business with this open source core engine as the core.

53:01 Right.

53:02 And so I just wanted to give you a chance to talk about the business model, maybe riff on that a bit and give people advice out there who are doing their own thing. Like, another one that comes to mind is you guys do a lot with Dask and Coiled is now sort of in a similar position with hosting dask as a service, sort of.

53:19 Right. Yeah. And we're partners with Coiled, too, so we keep our integrations there up to date. So, yeah, our business model for Cloud, and unsurprisingly it's going to get some slight changes, but spiritually it will be very similar. First and foremost, we really want to make sure that, especially for the hobbyists and the open source projects out there, you can come in and use the system to actually achieve powerful use cases for free. And so one of the ways that we came up with our free tier volume, which is 20,000 free task runs a month, is we asked ourselves, for just a very bare-bones Airflow deployment, how many tasks would you churn through in a month? And you could run a business's ETL processes off of that volume. And so that's roughly where we picked this number. So we do think that this satisfies a real business need. And then the reason that you would move out of it would be for pretty standard reasons. Right. You want to unlock more scale. Okay, so then you talk to us. You want to add more users because you're capped on users, and maybe you even want more teams. So if you're an actual enterprise, you're presumably going to have some more permission structures that you need to grapple with. And so that's when our enterprise tier comes in, SSO integrations, all of that fun stuff. And this isn't really going to change in spirit for 2.0. It's going to be pretty similar. There's going to be some sort of throughput metric. Maybe it's task runs, maybe it's storage, something else. A lot of it's free. And then you want to add more users, more workspaces, you start to talk to us, and then it kind of grows. If you start to have really big performance needs or you have requirements for data locality and things like that, you start to talk enterprise plans. So we tried to align it. And one of the things, too, that was an insight for me to really think about in the early days was business models.
You're not selling the code you write, and that's why open source works. There's some value that you're providing, and you have to figure out what that is. And for us, it's kind of almost funny: having to host and maintain an API locally is negative engineering. And so that's not what you're trying to do. You're trying to schedule jobs. And so migrating up to Cloud tends to be a natural thing unless you've got the resources to manage and scale it out, which is also perfectly fine.

55:34 And the thing is, there's a lot of expertise in running systems like this.

55:37 Yes, it really is in the database, too. You want it to be scalable.

55:43 I think there's a lot of value. I think it makes perfect sense. I give you the core for free and you can run it and you can maintain it. It can be your baby or it can be kind of hands off and we'll take care of it. Like you said, authentication and all that kind of stuff.

55:57 Yeah. And we have a really active, like a crazy active Slack community. So if you're doing that yourself and self-hosting, you can go out there and I'm sure you're going to get a lot of responses. I think there's 16,000 people, something like that, in there right now.

56:12 Wow. Okay.

56:13 It's active, like messages pretty much flying by pretty regularly. And then our Discourse as well is up and coming.

56:21 Yeah. I think it's worth maybe just highlighting you guys have almost 9000 stars on GitHub, which is quite far up there. That's pretty awesome.

56:29 Yeah.

56:30 So, yeah, you all must be proud.

56:32 Yeah.

56:33 It's always fun when it's your open source project. We reached the number one trending repo on GitHub one time. This was maybe two years ago.

56:41 We had a random happy hour that night just to celebrate.

56:46 How long has it been out? When did you start it?

56:47 So let's see.

56:49 Prefect, the company, is four years old, and I believe we open sourced, I want to say, in December of 2018. So the core at least has been out for quite a while. And then Cloud was maybe six months later, when the first version of Cloud got out.

57:07 Cool. So not brand new, but not super old.

57:10 Yeah. Battle tested, but definitely still got a lot to build.

57:14 Yes, for sure. Did you guys build it on Python three only just at that point?

57:18 Python 3 only, yes. We tried to build for the future. Right. Like, shrinking audience. Yeah.

57:24 You end up in a place with a lot of negative engineering. If you try to support too much, too far back, I would say for the community, you got the slack and that's pretty awesome. You got the discourse, then you also have Club 42. And if people just go to your website and they go to community, you'll see this, you'll see that I want a pizza if you've got a user group. But I'm guessing this has to do something with Hitchhiker's Guide. What's the story with this Club 42 things?

57:47 Yeah. So Hitchhiker's Guide definitely is a theme for us. Everyone gets a free copy when you join. Club 42 is application only, so you can apply; it's not invite only, per se. It's a private group of kind of external advocates of Prefect. So people who just really want to get early access to things, who have proven themselves to be positive forces in our community, which, to be clear, doesn't mean that they're necessarily technical experts. Some of them are, but you don't necessarily have to be. The point is just that you help our community succeed in whatever way makes sense. Just keep it healthy.

58:20 Yeah, that's fantastic.

58:21 We run special events with them, and they get early access to everything. They were the first people that got Cloud 2.0 alpha access.

58:27 I think that's really valuable. I think more companies should be doing it. I know MongoDB did that for a while. I think they stopped. Microsoft has Microsoft MVPs, and I'm pretty sure Docker has something similar. But, yeah, it's a cool idea. I'm glad you guys are doing it.

58:40 Yeah, it's really fun. And just to kind of get ready, you get a bunch of people all caring about one thing kind of together, and interesting conversations always happen for sure.

58:49 All right. We're getting short on our time here, Chris. But I guess one more thing I can imagine.

58:56 I know a lot of people who are working for amazing companies are starting to reevaluate the amazingness of it now that they've got to go back to the office or like, the rules have changed and then they've changed again and they may be thinking of other positions. What's the hiring situation? You guys have open positions to work on this fun stuff?

59:15 We do have open positions. We're fully remote, so no worries on that. Although we do have plenty of opportunities for meeting people in person as well, optionally. So we have kind of these homes that people can apply to, and they'll show up like 20 people at a time and have, like, a mini internal conference. We're having our first full company, I'll say, later in July. Really fun. But anyway, yes, we have open roles. Highly encouraged if you don't. So right now, the biggest roles on my mind are kind of more in the platform space. So SRE style roles, platform engineer roles. And so if that appeals to you or you have any experience there, let us know. And if you don't see a role on our website that maybe fits you, don't be shy about reaching out, because sometimes these things take a long time unless it's formally ship. Keep a conversation going.

01:00:00 Sure. I have this special power. It doesn't match one of your three listings, but I bet it could help somehow, right?

01:00:06 Exactly. I mean, it happens. It's happened at Prefect before, so, yeah, my email is chris@prefect.io, so you can feel free to just email me.

01:00:16 Yeah, fantastic. That's great. All right. I think that might be where we got to leave it for Data Engineering and Prefect, but very interesting. But before you get out of here, you've got to answer the final two questions. If you're going to write some Python code these days, what editor are you pulling out?

01:00:31 Oh, put me on the spot. I'm still a Vim user. Platform survivalist: if I find myself in the corners of an old system, I want to feel powerful, for sure.

01:00:43 I think there's still a ton of people who are on Vim and Emacs. I mean, they can't talk to each other; they don't get along.

01:00:51 I started out programming doing Emacs on Silicon Graphics machines. But for editor choice right now, these days I'm on PyCharm.

01:01:00 Okay, good choice.

01:01:01 PyCharm and VS Code seem very... I use VS Code as well for, like, small little things. And if I'm like, here's my big project, then PyCharm is the choice.

01:01:10 And then, notable PyPI package. Prefect is one of them, but like some library that you've seen that you're like, obviously it's amazing, I really should tell the world about it.

01:01:20 Rich and Textual, for sure.

01:01:21 Oh yeah.

01:01:22 FastAPI, I think, is amazing. I really think that you can scale out some pretty powerful web servers with that.

01:01:28 It's really pretty basic. I was just doing that before we jumped on the call.

01:01:32 Oh, nice.

01:01:33 And then, we did mention it earlier, Dask is a really powerful Python framework for distributed computing. Definitely easy to get started with and really powerful as you scale. Those are the ones that come to mind immediately. Cool.

01:01:47 Those are all fantastic. All right, Chris, final call to action. People are interested in Prefect. What do you tell them? What do they do?

01:01:53 Definitely join our Slack and go on GitHub. That's where you'll really be able to immediately kind of get involved in the action. Figure out what's going on. Just ask around for best practices, how to get started, get some project ideas, whatever the case may be.

01:02:06 Yeah. And you have a nice tutorial. Maybe what you would recommend is the Orion docs, orion-docs.prefect.io, like the tutorial to follow along for now?

01:02:17 Yes, I would definitely recommend Orion docs to get started and we'll slowly start making these more discoverable over the coming weeks.

01:02:26 I'll put it in the show notes so people can get to it.

01:02:27 Awesome.

01:02:28 Alright. Well, it's really great to chat with you. Thanks for being here.

01:02:31 Yeah, thank you so much, Michael. It was really fun.

01:02:33 Yes, it sure was.

01:02:34 Bye.

01:02:35 This has been another episode of Talk Python to Me. Thank you to our sponsors. Be sure to check out what they're offering. It really helps support the show.

01:02:44 Starting a business is hard. Microsoft for Startups Founders Hub provides all founders at any stage with free resources and connections to solve startup challenges. Apply for free today at talkpython.fm/foundershub. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like memory and async. And best of all, there's not a subscription in sight. Check it out for yourself at training.talkpython.fm. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

01:03:31 We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
