
#294: Oso authorizes Python Transcript

Recorded on Friday, Oct 23, 2020.

00:00 When we think about accounts and security, we often think about identity: walking in and proving who we are. But for many applications, especially internal apps at large organizations, that's just step one. The next step is, what can we do and what can we not do? On this episode, you'll learn about a new library called Oso. It's a declarative way to create policy code that maps your mental model of who is allowed to do what in your system. We have two guests, Graham Neray and Sam Scott from the Oso project, to tell us all about it. This is Talk Python to Me, episode 294, recorded October 23, 2020.

00:46 Welcome to Talk Python to Me, a weekly podcast on Python: the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy, and keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython. Before we get to the interview, let me tell you about a brand new course that we just launched. At Talk Python, we run a bunch of web apps and web APIs. These power the training courses, as well as the mobile apps on iOS and Android. If I had to build these from scratch again today, there's no doubt which framework I would use. It's FastAPI. To me, FastAPI is the embodiment of modern Python and modern APIs. You have beautiful usage of type annotations, you have model binding and validation with Pydantic, and you have first-class async and await support. If you're building or rebuilding a web app, you owe it to yourself to check out our newest course, Modern APIs with FastAPI, over at Talk Python Training. This is the first course in a series we're building on FastAPI, and for just $39 it'll take you from interested to production with FastAPI. To learn more and get started today, just visit talkpython.fm/fastapi, or click the link in your podcast player show notes. Sam, Graham, welcome to Talk Python to Me.

02:03 Thanks for having us.

02:04 Thanks. Yeah,

02:05 it's great to have you guys here. I'm excited to talk about this whole managing what people can do on computers, and from a slightly different perspective, from the authorization side of things, which I think gets underserved in programming in general. So that's going to be a lot of fun. But before we get to all that stuff, let's start with your stories. How'd you get into programming and Python? Sam, you wanna go first?

02:27 Yeah, sure. So I think for me, it was probably the kind of typical programmer entry, which was I had a very monotonous data entry job, and I was like, surely there's a better way. I was young enough that I reached for VB macros. A few years later, though, I actually ended up picking up Python, primarily through my master's degree. I had a professor who was very into number theory and worked on the SageMath package. Oh, yeah.

02:54 SageMath is fantastic. Yeah, I've had William Stein on.

02:57 it is incredible. Oh, nice.

02:59 Yeah, it's cool. what they're doing.

03:00 Yeah, I actually dug it up. My first open source contribution is a SageMath ticket from like nine years ago. Okay, cool.

03:08 Yeah. What kind of math are you studying?

03:10 So that was my undergrad maths. So that stuff was, yeah, the number-theoretical side of things. After that I actually went on to do a master's in cryptography and a PhD in cryptography and security, which is

03:21 our nice how I got here, basically. Yeah, it's sort of indirectly around about leaves you here. Yep. Graham, how about yourself,

03:29 I actually, I took an entry level CS course when I was an undergrad, and actually at the end of my undergraduate experience, which took me way farther than I ever would have expected it to. And I don't get to do that much programming on a day to day basis. But I try to whenever I can, including at a recent company hackathon. So I still like to dabble when I can.

03:50 Yeah, super. These days, you both work at Oso. Give us the rundown, I guess. Maybe introduce what Oso is, your company, since you both work there, and then what do you guys do day to day?

04:02 Yeah. So Oso, as a company, what we're all about is putting security in the hands of developers. That's how Sam and I got to know each other. That's the thing that we really connected on as the thing that we want to do. And the way that we think about doing that is by building consumer-quality developer tools for security. And so the area that we're starting with, which is the area that we'll talk to you about today, is authorization. But that's really the ethos of the company. And what's nice for Sam and me: I think for a lot of founders, it's not always clear how to think about division of responsibility. But for Sam and me, it tends to be pretty clear. I take responsibility for the business side of the company, so sales, marketing, financing, everything on the operational side. Sam is responsible for everything on the technical side of the business, so running the engineering team. Sam built the first versions of the product by himself, and we share responsibility for the product roadmap.

05:01 Yeah, sounds pretty clear. And think it's a cool project. How long has the company been around for? Super? All right.

05:07 So we've been working together for a little over two years. But we only open sourced the project about 10 weeks ago. Okay,

05:15 so on the open source side, it's quite new. But yeah, still, two years is pretty young for a company. And it's easy to think of the stuff that you're building just as technology. It's clearly developer tools and APIs and things like that, but that marketing stuff and getting the word out and sales, without that you just can't go. Man, that's the hard stuff. Exactly. Give me some cryptography and some compiling and language interop, but don't make me write a landing page. I'm serious, though, that is a super hard part and such a critical part of technical companies, open source companies and so on. And it's easy to overlook that side.

05:55 Yeah, absolutely. But it's fun.

05:57 Yeah, for sure. So we're going to talk about one of the three A's. I was recently told there might be four A's in this whole identity authorization sort of story. But I don't remember the fourth one. So I'm gonna go with the three A's. We've got authentication, we've got authorization. So authentication, who are you? authorization, okay. Now I know who you are, what can you do? And then auditing, what have you done? All right. You guys. You're a fan of the middle a?

06:24 Yeah, that's right. Yeah. And I think you just about nailed it. And a lot of the products out there really focus a lot on authentication, which is, I think the thing that most for instance, like consumer users would be most familiar with, like, logging into getting a login page, having your username and password, doing things like password reset, or more recently, things like two factor authentication, how all that stuff is managed. That's the authentication bit, just making sure that you can get in the door.

06:52 Even the sign in with Google Sign in with GitHub is really primarily about just right. Usually, that's about two things. One, who are you? And then sometimes it's about what part of GitHub, do you want to let this app access? Or what part of this app Do you want to let access your Google data, but it's not it doesn't work in the reverse way, it doesn't tell you what the user is allowed to do on that application. It's just connecting those two apps together from a data side. So even the social auth stuff is really just authentication.

07:22 Yeah, this landscape is pretty blurry, right? Because you're allowing some other website to access information about yourself so they can check who you are. So there is an element of authorization going on between those two services. But you're right, the result of that is authentication.

07:35 Yeah, I guess when I say that there's not really any authorization, I'm thinking that it doesn't tell the app what you're allowed to do. But it does tell, say, Twitter: is this app allowed to tweet on your behalf? According to Twitter? Yes, exactly. Right. But it doesn't help within your app. Like if I run an app, I want to know, okay, this user, they can view invoices, but they can't create invoices, precisely, and they can't ever see the bank details of anyone, right? There's no social auth that's gonna help you with that side of the story, is what I was thinking.

08:07 That's exactly right.

08:08 Right. That's fundamentally an authorization question, what you just asked.

08:12 Yeah, okay, exactly. Cool. And so this is the core problem that you guys are trying to solve. And we're going to talk about some of the open source stuff that you've done, and the Python API's and all that. But you maybe let's just continue this part of the conversation by talking about some common access patterns. And what is out there, like we talked about the social auth. And what that means. We talked about creating users with usernames and passwords, and what are some of the patterns you all are seeing out there?

08:38 Yeah. So basically, every application out there needs to let its users in some way, shape or form, see their data and do something to their data. And so fundamentally, that's doing some kind of authorization. And then you'll have so in a basic, like social app that might be like, what posts Can you see and what posts Can you edit or something like that?

08:58 And then you'll have... Yeah, so in some sense, I guess there's an implicit or default authorization that every application has. And it's usually: I can see my stuff and public stuff, and that's it, right? There are no rules beyond that. Okay, when I go to Twitter, I just see my stuff and public stuff, right? I could go to my profile, but there's no expectation I would ever be able to see someone else's profile and those sorts of things. So I guess if you don't do anything, that's generally the access pattern people have: I create an account, and that account can see its stuff.

09:30 Yes, absolutely. And I think that's the common paradigm, in particular in a consumer application context. When you start to look at B2B business applications, the patterns very quickly get a lot more complex. It could be an HR application or a CRM application or a medical records application. And very quickly, what people will do in building these types of apps is they'll reach for a pattern called roles, where they'll group a set of permissions or capabilities together and lump them into something called a role. They'll say anyone that has this role, like admin or billing or duress or whatever it may be, can do these sorts of things. And then if you want to be able to do those things, you've got to get assigned that role. And that's a handy thing, because it means that every time you want to make sure that someone can do those things, you don't have to repeat that work, which is kind of nice, right? But it's also kind of limited, because effectively what that's doing is representing all your permissions as a two by two: you basically have a bunch of roles on one axis, and you've got a bunch of capabilities on another axis. And the truth is that for most apps, the underlying data model isn't a two by two. They may have all kinds of other things going on. They may want to represent hierarchical patterns, like to represent an organization, they may want to represent inheritance, they may want to represent some kind of graph. And so oftentimes, after you adopt a role model... Let me throw it out here, and you could tell me what you think of that. So for example, if I'm a manager at a company, I can see my work,

11:10 right, plus my team's work exactly, right. But I don't want to see another team's work, I don't want that person to be able to see everyone's work, just the people for whom they are the manager like that scene. And then you know, how do you you can't really easily manage that setup, right?

11:25 That's a perfect example. Because that's not a role that's like a manager in that context isn't a static thing that you just assigned to someone that's kind of dynamic. based on where you sit in the org, you might be a manager of one team, you might be a VP, in which case, you're a manager of like five teams. And so it's not this thing that's assigned to you. It's more like a function of maybe some other data that sits elsewhere in your application. Yeah. So you end up having to do all kinds of crazy things to hack around the role model and make that work for your application, which is all the kinds of stuff that we see. That's where it starts to get fun.
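Editor's sketch: the contrast drawn here, between a static role table and a check that depends on where someone sits in the org, can be illustrated in a few lines of plain Python. This is a minimal sketch, not code from the episode; the role names, capability names, and the `MANAGER_OF` data are all hypothetical.

```python
# Hypothetical role table: each role maps to a fixed set of capabilities.
# This is the "roles on one axis, capabilities on the other" grid.
ROLE_CAPABILITIES = {
    "admin": {"view_invoice", "create_invoice", "view_bank_details"},
    "billing": {"view_invoice", "create_invoice"},
    "employee": {"view_invoice"},
}


def role_allows(role: str, capability: str) -> bool:
    """Static check: look the capability up in the role's row of the table."""
    return capability in ROLE_CAPABILITIES.get(role, set())


# Hypothetical org data: employee -> direct manager.
MANAGER_OF = {"alice": "carol", "bob": "carol", "dave": "erin"}


def can_view_work(viewer: str, owner: str) -> bool:
    """Relationship check: you see your own work and your direct reports' work."""
    return viewer == owner or MANAGER_OF.get(owner) == viewer
```

The second function cannot be expressed as a row in the table, because "manager" is computed from the org data for each pair of people rather than assigned once.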

11:59 Yeah, exactly. So that probably means, instead of just having, say, a decorator or some simple if statement that says if they are a manager, there's usually some custom logic checking in that section, right? There's code that's been written somewhere that basically does those checks and says, okay, they're a manager, but who do they manage, and so on. Then all of a sudden this roles idea somewhat falls down, and you're coding the logic into your app. Right,

12:28 right. And there's, you know, other examples of where the roles model stops and other things begin. And you have to start just adding more, whatever it is, if statements, or maybe you bring the logic to some other part of the application, because that's where it makes more sense to you. But yeah, the example you gave is a perfect one.

12:47 The thing about that kind of stuff that scares me is, what if I forgot, right? I've got a web app with hundreds of endpoints, and one section is like an admin section. And if that doesn't do the proper checks, all sorts of badness is going to happen, right? Absolutely. So you put your code in there. I mean, I don't know, how do you guys feel about that? Because I'm always triple checking. And then I'm like, I gotta go back and check this again, like this is,

13:14 this could be bad. It's super, super common. And we see that a lot in larger organizations. That's where a security team will often spend a large portion of their time. We've seen security teams who have their own little regex that they use to go and find every method and see if it has the piece of code they're expecting to see, or they sit on every code review so that they can make sure that's not missed. For people without that kind of security team, though, then yeah, it's just a case of hope you don't forget.

13:39 Yeah. It might also be kind of the opposite of that, which is that you might be good enough to include the logic everywhere, but rather than trying to sufficiently extend your roles model to account for all the different intricate scenarios that you're trying to properly represent, you might just say, I will just let this person be an admin or something like that, just so that they can do the thing that they need to do inside your application. And then all of a sudden, what you end up with is all these scenarios where people or, for instance, internal services are over-provisioned, because that was the fastest way to make it possible for them to get done the thing that they needed to get done. Yeah, which can be equally painful to undo later on, or risky.

14:21 Yeah, definitely sounds risky. It's easy to have the admin/non-admin flag and just go with that. That's probably fine for a small team, but as you grow, it's no longer gonna work. Talk Python to Me is partially supported by our training courses. Do you want to learn Python, but you can't bear to subscribe to yet another service? At Talk Python Training, we hate subscriptions too. That's why our course bundle gives you full access to the entire library of courses for one fair price. That's right. With the course bundle, you save 70% off the full price of our courses, and you own them all forever. That includes courses published at the time of the purchase, as well as courses released within about a year of the bundle. So stop subscribing and start learning at talkpython.fm/everything. Let's talk about some of the coding approaches in current Python projects. And I guess I talked about one. Sam, you put it down as DIY, do it yourself. And that's the, well, if we've got to have managers and people whom they manage, we've got to write some code and put that logic in there. And maybe you've got that overlaid on some groups. Maybe give us some of the common approaches you might see in common Python web apps.

15:40 Yeah, so overwhelmingly common is that this is seen as just the regular code in the application, just the things you have to do in an app to build it. In that case, it ends up baked into every method you have; each is going to have a certain amount of this logic. Either you're gonna see people who are just sprinkling this throughout the codebase, adding it where it's necessary, and that's just considered something they deal with. Or sometimes people will try the approach of pulling it out into, you know, maybe something like a decorator, like you said, and that ends up becoming this 500-line decorator, which has 10 levels of nesting and if statements and things like that,

16:20 decorators are already hard to put your mind around, although that's what I do on my stuff. I'll have like a permissions decorator. Even though it has some disadvantages, the thing that I like is I can go to the functions, and I don't have to read the function to know whether this thing is being dealt with. If it has the decorator, then the function is okay, you know what I mean?
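Editor's sketch: a permissions decorator of the kind described might look roughly like the following. This is a minimal illustration in plain Python, not the host's actual code; `requires_permission`, `PermissionDenied`, and the dict-shaped `user` are hypothetical names chosen for the example.

```python
import functools


class PermissionDenied(Exception):
    """Raised when the current user lacks a required permission."""


def requires_permission(permission):
    """Decorator factory: reject the call unless the user holds `permission`."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(user, *args, **kwargs):
            if permission not in user.get("permissions", set()):
                raise PermissionDenied(f"{user.get('name')} lacks {permission!r}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


@requires_permission("create_invoice")
def create_invoice(user, amount):
    """The body stays free of authorization checks."""
    return {"created_by": user["name"], "amount": amount}
```

The benefit mentioned in the conversation shows up at the definition site: a reader can see from the decorator line alone that `create_invoice` is guarded. The downside is that anything beyond a flat permission lookup has to be pushed into the decorator, which is how the large nested decorators described earlier come about.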

16:41 Yep, yep, exactly. And, I mean, that is one set of things. It's not like nothing exists to do these things we're talking about, depending on the application, right? A lot of the stuff we've been describing doesn't have to happen in the code. It might be stored in an identity management system or something like Active Directory. Typically this is a place where you can store all the information about users, and you can add them to groups and assign them permissions to different things. That is the manual admin approach. A lot of people might be familiar with it as a way to manage permissions inside an organization. But it's not suitable for the kinds of things we've been talking about, like a B2B application or a consumer application, where you don't want to have someone manually going into an Active Directory thing and assigning people roles and permissions, right? So that's one set. And similarly, some of the Python frameworks out there have built-in things for similar patterns. Django has things like Django admin, which again is a baked-in UI and system to manage users and permissions. But it's more of the AD flavor. It's a UI where someone might manually go in and configure, you know, all right, Sam is in this group, and this group can do these things, or Sam has these permissions. It's not for the question of, how am I going to provide a consistent interface to all my end users? How do I do that sort of dynamic, automatic configuration?

18:02 How do you feel about multi tenant apps, so I've got a cloud service. And maybe I've got one company, and within that company, they have certain roles, but then other customers come along, they buy setup for their system, you think like Slack, yep, or GitHub with organizations or something like that.

18:21 So you just gave two very interesting examples. I think GitHub is a pretty good example of doing multi-tenancy in a reasonable way, in that you have your single user account, and you can belong to many organizations, you can have different roles inside the organizations, you can even have roles inside repositories, although it's not that obvious. But you can be, you know, an owner or a collaborator of a repository, right? And it handles all of those in a reasonably consistent way. If you don't dig too deep, it kind of makes sense. But you can imagine that on the back end, to support that, what they have to do is have a reasonably complex data relationship model between users, organizations, and repositories. If you go deeper, right, there's teams, sub-teams, infinitely recursive sub-teams, things like that. Yeah, yeah. Slack, I feel like, at least initially, they kind of forced you to create an account for every workspace. So presumably, on the back end, this looks somewhat different, where they didn't try to make it so that you could map a user to multiple organizations with roles. It's like you have a user inside an organization, and they have a role. Yeah. And I can imagine this might be something based on how they originally did authorization. It might have even painted them into a corner, where they're now stuck with a model that's hard to get away from.
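Editor's sketch: the two multi-tenancy shapes contrasted here can be written down as two small data models. This is only an illustration of the idea as the speakers describe it; the class and field names are hypothetical, and nothing here reflects how GitHub or Slack actually store their data.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """One login that can hold a different role in each organization."""
    username: str
    org_roles: dict = field(default_factory=dict)  # organization name -> role

    def role_in(self, org):
        """Return this account's role in `org`, or None if not a member."""
        return self.org_roles.get(org)


@dataclass(frozen=True)
class WorkspaceUser:
    """A separate user record per workspace, each carrying exactly one role."""
    workspace: str
    username: str
    role: str


# Single account, many organizations.
sam = Account("sam", {"oso": "owner", "sagemath": "contributor"})

# Same person, but two unrelated records that happen to share a username.
sam_at_oso = WorkspaceUser("oso", "sam", "admin")
sam_at_other = WorkspaceUser("other", "sam", "member")
```

In the first shape, "what organizations am I in?" is a lookup on one object. In the second, the application has no built-in link between the two records, which matches the login friction discussed next.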

19:34 Yeah, side note or sidebar: the Slack authentication model drives me crazy, that I can't just log in and see what groups I'm in. I'm with you. I've got to remember the prefix, it's brutal, the first part of the domain that belongs to it, and then the password may or may not be the same. Like, why am I doing this? It's so brutal, man.

19:55 Yep. I'm not saying it's because the authorization approaches wrong, but maybe it was

20:00 Well, I mean, it just shows that you need to be careful about how you think about this. And yeah, the fact that we are talking about it at all means that it's an issue for users and people's experience. In an ideal world, you shouldn't even think about it. It's like, well, I guess I didn't really think about it, but yeah, it is restricting me to what I should be doing. And it just works, right?

20:22 Yep. Yep. Yeah. And you bring up interesting point

20:24 That's why people say architecture is like wet cement.

20:30 That's right. For a while. Yeah. Yeah, absolutely.

20:35 Yeah, I was just gonna say, the idea about it being exposed to users as well, this is a really interesting one. I think a lot of the struggles or frustrations people have with security are about similar things as well, where you're trying to use a website or something, and they have these crazy, complicated permission and role systems where you're trying to decide what you can do within the app. And when you talk to teams like that, you realize it's because they've implemented an authorization system, and they're basically just exposing the internals of that through to the end users. So you almost need to understand how the app works in order to decide what you can do inside it. And that is just kind of crazy.

21:08 Yeah, for sure. So you all saw a problem out there, and then you've built this policy system, this authorization system. And I guess one of the things we want to be clear about, and I see this as an advantage, is that it's specifically not about logging in users, managing their passwords or their third-party auth stuff. It's about once you know who someone is, regardless of username and password, or Google logon, whatever you feel like, once you've got that, now what can they do? Right?

21:36 That's exactly right. And in addition to that, there can be multiple different ways that people authenticate or have identities, right? They might log in in the web app, they might have an API key. And authorization might depend on that. That would be a main input to decisions.

21:50 Yeah, interesting. So for our mobile apps for the training courses, we have you log in with username and password, but then it actually exchanges that for an API key. Basically, the login gets your API key, and from there on it's sent with all the calls. So yeah, I hadn't even really thought about the API side of things as well. But that makes a lot of sense, that you don't want to treat those separately. Yep. Yeah. It doesn't matter how you log in, if it's an API key or you do it with the username and password. It's like, all right, well, now we're going to figure out what they can do. So I guess, tell me what problems it was that you saw, where you thought, well, we've got to do this differently? And then tell us about Oso, and how we can use it in Python and so on?

22:28 Yeah, absolutely. So I think the biggest problem that we saw out there is that pretty much every single engineering team we've spoken to has repeated this work themselves from scratch, in all the ways we just discussed, right? Whether it's through that code, as a decorator, whatever it is, everybody's repeating the same logic. And nobody's gonna get it perfect on the first attempt. And so they end up, you know, having to iterate and refactor over time and add things, so they don't quite get it right.

22:55 Well, and another thing, the way I think about this stuff is it adds no value to your application in a sort of unique-feature aspect. It's one of those things that has to exist. It's table stakes for being in the game. But it's not like someone's like, I love that app because of the auth, or the authz. Yes. So yeah, it's only a drag, like molasses, if you have to do it yourself and you get it wrong, but it's not a bonus. So the reason I say that is it's not something you want to try to be inventive in or whatever, right? You just want it to work really well and get out of your way.

23:28 Absolutely, exactly. It's one of those if-you-do-it-right-they-won't-even-know-it's-there kind of things. Yeah, exactly. So that's basically what we set out to solve: to make that experience that much better for everyone involved, from the developers who are building this and spending time on it, and hopefully not getting it wrong, to the end users who are dealing with these crazy two-by-two matrices of permissions to navigate. So basically, the way we solve this is through an open source policy engine called Oso. There's two main pieces of this. There's the policy language, called Polar. This is what you write your authorization logic in, and I can speak about that a bit more in a sec. Yeah, that's piece one. And then piece two is the library, which is, you know, the policy engine itself. It reads in those policy files, and basically has a very, very simple API, a single method effectively, to make an authorization decision,
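Editor's sketch: the "policy in one place, a single method to decide" shape described here can be illustrated without any third-party library. The `TinyPolicy` class below is a hypothetical stand-in written for this note; it is not Oso's API, and its rules are Python callables rather than Polar.

```python
class TinyPolicy:
    """Hypothetical stand-in for a policy engine; not Oso's actual API."""

    def __init__(self):
        self._rules = []

    def add_rule(self, rule):
        """Register a rule: a callable (actor, action, resource) -> bool."""
        self._rules.append(rule)

    def is_allowed(self, actor, action, resource):
        """Single decision point: allow if any rule matches, deny by default."""
        return any(rule(actor, action, resource) for rule in self._rules)


policy = TinyPolicy()
# Owners may do anything to their own resource.
policy.add_rule(lambda actor, action, resource: resource["owner"] == actor)
# Anyone may read a public resource.
policy.add_rule(lambda actor, action, resource: action == "read" and resource["public"])

post = {"owner": "sam", "public": True}
```

Application code then asks one question, for example `policy.is_allowed("graham", "edit", post)`, instead of repeating the owner and visibility checks at every call site.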

24:14 Right. And for Flask, like I was looking through some of your docs, you have some built-in integration. Yeah, that's right. Right, you create a FlaskOso thing and you just say initialize app, you give it the Flask app, and then you tell it what routes to authorize, basically,

24:29 exactly. Yeah. So the library itself is actually available for multiple languages. Currently, we support Python, Ruby, Node.js, Java, and Rust. So that's the core of this. And then we build additional framework integrations. So you're exactly right, now we have one for Flask and one for Django. And for each of those, we try to provide framework-idiomatic approaches to authorization. So, you know, Flask is pretty keen on things like decorators; with Django, it's more about middleware and automatically registering and making available things like Django model data. How hard would it be to add it to a new shiny framework that

25:04 didn't make your list? Like I'm super excited about FastAPI right now. Yeah. Yeah. Right. It's looking really nice. But probably you don't have integration with that yet. It seems like it probably wouldn't be that hard to replicate what you've done with Flask or something along those lines, exactly.

25:19 Each framework integration is effectively equivalent to how hard it is to add Oso to an application in general. And we're talking about pip install oso, then creating your Oso object, loading a policy file, and you're good to go. Yeah. And so, in the case of Django, for example, we'll automatically register the Django models, because the policy file can actually access objects and classes from the application. So there's a little bit of work there, where we just automatically register those for you,

25:44 right? Yeah, it's very Django like to do that.

25:47 Exactly.

25:48 Nice. Okay. So I talked about this complicated story of I want to be the manager, but there's only some people where I am the manager of and I could be managed to myself, and so on. And there's other people who I have no relationship with other than CO employee, and so on. One of the options was we could write code in the application to do that. Yep. It sounds like this polar language policy file is where that would go here. Is that right?

26:15 Yeah, that's right. So the Polar language is a declarative language. It actually takes inspiration from a logic programming language called Prolog, this decades-old, pretty well-established logic programming language. But Prolog itself is known to have a pretty high barrier to entry; it's kind of hard to learn. Yeah. So that's what we started with, but then we pushed ourselves to make it as easy to use as something like Python. And so along the way, we've added stuff that you would expect to find in Python: you can look up attributes on your Python objects, you can call methods, you can use variable and keyword arguments, the logic is written in letters as opposed to, you know, some arcane glyphs, things like that. Yeah. So that's the language, that's Polar. And basically, building on something like Prolog makes expressing logic like you just described, representing complex hierarchies or things like that, very, very powerful. You can write a recursive rule which says you're a manager of an employee if you are the employee's manager, or you're the employee's manager's manager, or something like that. You can write a little recursive rule in a couple of lines, and you can now use that throughout your policy. So you could write another rule, which would say a user can read some employee's personnel data if they're the target employee's manager, right? And now there's two rules combined together. You've just written those in your flat policy, and the underlying engine is the one that navigates through those, searches through, and tries to find if that's true or not. Cool.
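Editor's sketch: the two rules described here (a recursive "manager of" relation, and a "can read personnel data" rule built on top of it) can be mirrored in ordinary Python to show the logic. This is not Polar and not Oso's engine; the `MANAGER` table and function names are hypothetical, and a declarative engine would search for a proof of the rule rather than run a function like this.

```python
# Hypothetical org chart: employee -> direct manager (None at the top).
MANAGER = {"alice": "bob", "bob": "carol", "carol": None, "dave": "erin", "erin": None}


def is_manager_of(user, employee, seen=None):
    """True if `user` is anywhere above `employee` in the management chain."""
    seen = set() if seen is None else seen
    boss = MANAGER.get(employee)
    if boss is None or boss in seen:  # top of the chain, or a cycle in the data
        return False
    if boss == user:  # base case: direct manager
        return True
    seen.add(boss)
    return is_manager_of(user, boss, seen)  # recursive case: manager's manager


def can_read_personnel_data(user, employee):
    """Second rule, built on the first: managers can read their reports' data."""
    return is_manager_of(user, employee)
```

With this data, `carol` can read `alice`'s personnel data through `bob`, while `erin`, who manages a different chain, cannot.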

27:39 And I'm working through the syntax here. It's clearly not Python, but it's not that far from Python. You know what I mean? Exactly. A Python person could jump in here and go, okay, well, I create an object. Yeah, I've got to put the new keyword, but new person and such and such, that seems pretty straightforward.

27:56 Yeah, yeah. The rules look like methods. Now, they have type specializers, which look exactly the same as Python type hints do, but they're actually enforced at runtime. Okay. Things like that.

28:06 So how does data get from my application over into one of these policy execution instances, like when calling or running the policy? Basically, I've got a user in my, you know, request session or something like that.

28:22 Yeah. So they just get passed straight in, right? With the Oso library in the Python app, you pass it regular objects from Python: you pass in the request user, you pass in the thing they're trying to access. And basically there is an interface between the Python-specific library and the internal policy engine that lets Python deal with the objects, while the engine does the policy evaluation over those objects as they are. So if you say user.username, once the policy evaluation gets there, it will be like, hey, Python, what's the user's username? And Python just says, oh, it's this, Sam. And it continues on. It's like, cool, now that's the string.

29:00 Yeah. Nice. Super cool. So one of the things that's interesting about Python and some of the other languages you mentioned with the integration, like Ruby, for example, is that we have a REPL, a read-eval-print loop, right? If you just type python (hopefully that runs Python 3, if not, type python3), you just get the triple greater-than prompt, and you can start typing in Python commands and go from there. And I do feel sometimes when people are learning Python, they lean too heavily on that, like they don't just go create a file, because it's a pain to make any corrections and stuff. But it is really nice as an exploratory mode. And what surprised me when we first spoke about this is you guys have a REPL for this Polar policy thing, right?

29:41 We do, we exactly have a REPL. And we have a debugger, and

29:45 the debugger as well as

a debugger. That's right. Because, okay, as you said, right, it's clearly not Python, but it kind of looks like Python, and the underlying model is different from what you're expecting. It's not imperative; it's logic-based, it's declarative. And so we appreciate there is a degree of having to learn how the language works. And so for us, if you're going to build a language and you want the power of a language, you want to have a REPL that you can dive into and test things out, check something as simple as syntax, or just sense-check that you've got the expected result back. So we have a REPL; you can load in your policy files, and it allows you to interactively query them. So you can just dive in and make sure things are working as you'd expect.

30:22 Yeah, I suspect that would be hugely valuable. I mean, I haven't actually tried it. But, you know, I think of things like YAML configuration files and stuff, and you're like, it's not working. Why? You know, it's those times you just want to yell at your computer: why don't you work? It looks right. You know, like,

ah, indentation.

30:43 Yeah, or something. Right. Oh, man, it's always coming back as a single key, not a list. Yeah. And the way you put it in YAML, or whatever it is, right? Having the ability to step through it is really interesting.

30:55 Yeah, the debugger is great for exactly that. You just drop into the debugger and just kind of like hammer next, and just like watch it doing its thing?

31:01 Yeah. Do you have, or have you dreamed of, any IDE integrations, like VS Code or PyCharm? Or more broadly, IntelliJ?

31:10 Yeah. So I think dream is partially the right word for it. You know, we currently do have syntax highlighting available for VS Code. I think there's actually have been configured there as well. There's some stuff we want to do with, like, the Language Server Protocol and hooking up the debugger to IDEs like VS Code, so that you get that experience in your IDE. Yeah, this is some of the stuff we want to do.

31:31 Oh, all right. Well, I've been beating my example of manager and employee to death. Give us more use cases where you might see people using this kind of stuff.

31:41 Yeah. So I think one of the initial users we have for this is kind of your prototypical authorization use case: they are building electronic health records software, deployed currently in hospitals. And the stuff they come up with every day just astonishes me, some of the authorization. But, you know, there you get, you can imagine, that really classical stuff, which is like a doctor can see a patient's records if they saw them in the last seven days, or if they have an upcoming visit. That's the kind of level of granularity a hospital might want to go down to.
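
As a rough sketch of the hospital rule just described (a doctor can view a patient's records if they saw the patient in the last seven days or have an upcoming visit), the check might look like this in plain Python. The visit dictionaries, field names, and function name are assumptions made for illustration; they do not come from the episode or from any real records system.

```python
from datetime import date, timedelta


def doctor_can_view_records(doctor_id: str, patient_visits: list, today: date) -> bool:
    """Allow access for a visit in the last seven days or any upcoming visit."""
    window_start = today - timedelta(days=7)
    for visit in patient_visits:
        if visit["doctor_id"] != doctor_id:
            continue  # visits with other doctors grant nothing
        if window_start <= visit["date"] <= today:
            return True  # seen within the last seven days
        if visit["date"] > today:
            return True  # upcoming visit
    return False


today = date(2020, 10, 23)
visits = [
    {"doctor_id": "dr_a", "date": date(2020, 10, 20)},  # three days ago
    {"doctor_id": "dr_b", "date": date(2020, 9, 1)},    # long ago
    {"doctor_id": "dr_c", "date": date(2020, 11, 2)},   # upcoming
]
```

With this data, `dr_a` and `dr_c` are allowed and `dr_b` is not. The point of a policy language is to keep conditions like these in one readable place rather than scattered through application code.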

32:13 Yeah, yeah, that sounds like a perfect use case. You know, just, here's a general heuristic for deciding whether some company organization might have a good use for this. If they use SharePoint, what do you think about that? That's a pretty good. SharePoint is like, it's just like all these weird permissions. And it's all about like, well, we can't really I don't know, just always feel like, Alright, this is like messed up enough that you really need some help here.

32:38 So I was going to go maybe a different direction with that, which is, if you would be sort of embarrassed or out of business if some of your data was exposed, then you've

32:46 heard this. Yeah, also, yeah. Yeah, for sure. For sure. All right, give us some more examples.

32:52 So we spoke about the social media one at the beginning, which is kind of an interesting one, because there's a few slices of that. And actually, we recently did a bunch of blog posts on a social media feed app that we're putting together. But even some of the simple ones that you mentioned can be reasonably complex: users can see posts they posted, or they can see their friends' posts, or they can see them maybe if they were tagged in a post, things like that. They can be pretty involved, having to look at the post and where it was put and who's referencing it, things like that. There's kind of two sides of that as well. You know, there's the users of that, and then internally, how a company like Twitter manages how employees can access things. Obviously, this was a pretty hot topic not too long ago.
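
The post-visibility rules mentioned above (you can see your own posts, your friends' posts, and posts you were tagged in) can be written as a small Python predicate. This is a minimal sketch under assumed data shapes; the `User` and `Post` classes and the `can_see` function are hypothetical, not taken from the blog posts mentioned.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    friends: set = field(default_factory=set)  # names of this user's friends


@dataclass
class Post:
    author: str
    tagged: set = field(default_factory=set)  # names of users tagged in the post


def can_see(user: User, post: Post) -> bool:
    """A user sees their own posts, friends' posts, and posts they are tagged in."""
    return (
        post.author == user.name
        or post.author in user.friends
        or user.name in post.tagged
    )


alice = User("alice", friends={"bob"})
own_post = Post(author="alice")
friend_post = Post(author="bob")
tagged_post = Post(author="carol", tagged={"alice"})
stranger_post = Post(author="carol")
```

Each `or` branch corresponds to one of the rules listed in the conversation, which is roughly how separate rules with the same head combine in a logic-style policy.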

33:31 It was super hot. Yeah, there was. I am blanking on the details. Maybe you remember it and help people. But yeah, there was it had to do with celebrities. Right?

33:40 Right. I think it was effectively that internal employees at Twitter were able to do way more than they should be, such as act as anybody. Exactly, yeah. Which no one should ever really need to have. But I imagine it's a pretty convenient way to build it: you want to test how it looks, you see that. And you see it a lot for legitimate reasons, right? Like your customer is having a problem. They're like, hey, I'm unable to maybe tweet, you know, I'm unable to do this in my account. You want your support to be able to step in and help them out and be like, hey, I can see your permissions aren't quite right, let me try this. And, you know, there are real legitimate reasons you'd have that much power, but it just overlays this entire extra dimension. It's like, your user permissions, and then your customer support reps behaving as if they're a user, but with different ones, and so on.

34:23 And it's one thing if that's like internal data, and Okay, so they probably should be able to do this. But if you log in as them they can, like in the app, you could maybe do a little more with it. But it's another to have that on production. Yep. In a live broadcast to the world. Like I can make this random politician or celebrity say this, do this thing. Yep. And to a varying degree, it'll be believed, right?

34:47 Yep. Yep. And then, yeah, so those are good ones. And then beyond that, all of the typical cases you could imagine. It's very common inside an HR thing because of the manager-employee relationship you mentioned, right, very organization-driven access control. So HR, payroll, things like that, you'll see this a lot. Similarly in banking and finance, any of those cases where you imagine that the data is sensitive and you have some concept of groups or hierarchies or organizations is where this comes up a lot.

35:15 Yeah, yeah. Cool. So you said the APIs are available for a lot of different languages. And one of the things that's interesting here is you all decided to build it with Rust. That's right. Which is a pretty hot, neat language. And at some point, though, Rust has to talk to Python. So I've seen a few examples of people doing this. The traditional example is, I'm going to recreate some lower-level thing for Python, so I'm going to use C, maybe I'll go crazy and use C++ but expose it as C. But I think I'd rather write Rust. How did you pull that off? How do you do that integration?

35:52 Yes, the ecosystem for embedding Rust in Python, or even vice versa, you know, embedding Python in Rust or calling into Python from Rust: there's actually a few tools out there which solve that specific problem. They have interfaces specifically for exposing a Rust struct as a Python class, things like that. We don't really have that option available to us, because we want to support multiple languages. We needed an API that was simple enough that it wouldn't matter what the language or the runtime is. You can bind Python and Rust tightly together as an option, and probably if your goal is only to write it as a base for some Python thing, it

36:25 might make sense. But exactly, that wasn't your goal, right?

36:29 No, exactly. So instead, okay, one of the engineers on the team had this pretty great vision of how this would look, which is kind of an event-driven API. So the Rust code is driven through a very, very simple API from the host language side. So from Python or Ruby, it's a simple API, just kind of like, do the next thing. The internal Polar evaluation is done through a virtual machine. So it's like, virtual machine, go do your next instruction, go do your next instruction. And it returns a JSON blob of data back to say, like I said earlier, hey, what's the username field on object number one, which maybe is the user, Sam, for example. So there's this really nice conversation between Python and Rust, where Python is like, hey, do more work, do more work, do more work, until Rust comes back and says, I need more information. Which means that when the policy is not running, there is no background thread, there's nothing running there. Python is free to pause our virtual machine for as long as it needs. We haven't done this for Python yet, but in the case of Node, where you have asynchronous code, that just works. It's super nice.
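
The "do the next thing" conversation described above can be imitated with a Python generator standing in for the virtual machine. This is a toy model of the idea only: the event dictionaries, their keys, and the `evaluate` and `run` functions are invented for illustration and are not Oso's actual interface, and plain dicts stand in for the JSON blobs mentioned.

```python
def evaluate(attribute, expected):
    """Toy engine: asks the host for one attribute, then reports a result."""
    # Pause and ask the host language for data, like "what's the username
    # field on object number one?"
    value = yield {"kind": "lookup", "instance_id": 1, "attribute": attribute}
    yield {"kind": "result", "allowed": value == expected}


def run(engine, instances):
    """Host-side loop: keep telling the engine to do more work until it has a result."""
    event = next(engine)
    while event["kind"] != "result":
        obj = instances[event["instance_id"]]      # host owns the real objects
        answer = getattr(obj, event["attribute"])  # host performs the lookup
        event = engine.send(answer)                # resume the engine with the answer
    return event["allowed"]


class User:
    def __init__(self, username):
        self.username = username


instances = {1: User("sam")}
allowed = run(evaluate("username", "sam"), instances)
```

Because the engine only advances when the host calls `next` or `send`, nothing runs in the background between steps, which matches the point that the host is free to pause the virtual machine for as long as it needs.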

37:34 Yeah. Oh, yeah. That's cool. That would be nice in Python also. My mention of FastAPI earlier, yeah, that one's all about async and await. So it's not quite yet a big deal, but some of the new frameworks are going down that path. At the same time, you could still use it; it just won't benefit from the async and await, I would suspect.

37:51 Right, exactly. So where this would matter, and this would be potentially a pretty important thing to do for something like FastAPI, is, as I said, because the policy language can call into the application to fetch data, that call might itself result in, like, a database query. Yeah. And so maybe if you're running multiple threads, you're trying to serve multiple requests, you're going to want that async so that the policy is asynchronously getting that data

38:12 back. Yeah, for sure. Every time you can await some other external resource, you're just better off for doing it. Yeah, exactly. Okay. Very cool. Well, it sounds like a neat integration. And I guess it's a challenge I didn't really expect. I figured you'd have to integrate it with Rust, but I didn't expect that it needed to be sort of bidirectional communication and work across the different languages. That's a pretty good accomplishment. Yeah.

38:37 Yeah, it was. I mean, it was a lot of fun to build. I'd say the hardest part of it all was the packaging and CI, funnily enough, because now we have this Rust library we're trying to add to the Python package. I can't remember what number we're up to at this point, but we have maybe like 300 CI jobs for every operating system and Python version, to test every combination.

39:01 So you want to, like, build a wheel for macOS on Python 3.8, macOS on Python 3.9, and so on. Yeah,

exactly. So there's manylinux, which is a format or a spec or something. We use the manylinux Python approach so that we can pre-build all these wheels, including the Rust code. But once you get them out, you want to make sure that those 50 different wheels you built all work, though.

39:27 Yeah, exactly. Exactly. Is this interaction, this Rust-Python interaction layer, part of the open source stuff that you have out there?

39:34 Absolutely. Everything is open source. Yeah, yeah. Actually, you'll probably see a blog post coming fairly soon on how we built that, because I think it's a really nice, simple approach to doing this kind of thing. And I'm personally a huge, huge Rust fan, which is kind of a big reason why we're using it. And so I'd love to see people taking this approach to building cross-platform, cross-language Rust code. I think that'd be awesome.

39:55 Yeah, me too. All right, Graham, let me throw it back to you with a business question. Sure. Yeah, so we've been talking about how cool it is that it's open source. And yet, we started off the conversation saying you guys started the business two years ago. I'm really fascinated by and admire companies that are able to make legitimate, meaningful open source things, and then offer some interesting extra thing that you get if you support them, or if you buy some product or service from them. And it sounds like that is the kind of thing you all are building as well, right? Because the library and the debugger and the REPL and all that stuff is open source on GitHub. People can fork it today, and that's that, right?

40:35 Absolutely. Okay, so

40:36 what's the story? What is your specific plan here?

40:39 Yeah,

so in the near term, we're focused on open source. And the reason for that is, we believe that the right way to build this company and the right way to build this community is to put enough weight behind the body of people who are actually writing code in Polar, and giving them everything that they need to be successful. And so that's the focus for us for the next year or two years plus. Obviously, we're a company, and we have every intention of being around for the long term, and so the way that we need to do that is to create a sustainable business. The way that we think about doing that is by offering a path for teams that want to run and secure Oso in production, and giving them things that make that really viable and easy. So I'll give you some examples. Right now, Oso is packaged as a library. Imagine a scenario where you want to run a bunch of Oso libraries in a microservices context. And we've had folks already ask us for this today. So now you've got a bunch of libraries with different policies running across a bunch of different services, and you want a way to ensure that those always have the most up-to-date policy and the most up-to-date version of the library, and that they're all properly versioned, and so on and so forth. So an Oso service that would handle something like that is one way you could imagine monetizing,

41:59 right? Because if you're one of these complicated SharePoint sort of organizations, there's stuff everywhere, like everywhere, and it's so easy for like one app to get its policy out of sync with the other. And how do you know you've got them all? Like, it just sounds like a nightmare.

42:13 Yeah, or security teams equally have assets. I mean, Oso being a library on the critical path of every request puts it in a unique position to be auditing requests, which is something that you talked about back at the beginning. And this is something that a lot of security teams, surprisingly, really struggle with. It's not an easy problem to solve, but it's something that this particular piece of software is in a unique position to do. And so you could easily imagine Oso providing auditing capabilities to security teams in the future, showing them who was authorized to do what, at what point. And because we're making the authorization decision itself, we can actually tell them why they were authorized: because they were in this role, because they sit in this department and they report up to this person, stuff like that.

42:58 Yeah, that's super interesting. Because you're right, you are already in the middle of all of those exchanges. So it's easy for you to add that visibility.

43:07 Yeah. So I mean, for us, as I said, the philosophy is relatively clear. We want to give developers the tools that they need to be successful with Oso, period, and that technology will always be open source. The way that we think about the technology that we'll use to sustain the commercial side of the business is that it will be the sort of organizational pieces that larger businesses rely on in order to be secure, be compliant, and run large operational teams and applications in production.

43:37 Yeah, cool. Well, I think that's a, it sounds like a pretty solid idea, right? You've got this legitimate open source thing that's meaningful and useful, and grow that and get the companies that got the deep pockets who are often unlikely or unwilling or incapable of contributing back to open source, give them a thing that they'll pay for that it will indirectly basically give back to open source?

44:00 Absolutely. I mean, as an example, before this, I worked at an open source company called MongoDB for many, many years. And over that period of time, we invested several hundred million dollars' worth of R&D into the database product, which came straight from the companies that were providing revenue to the business through the paid products. So there's a very clear tie between, you know, the companies paying money to the company and building the product itself.

44:27 Yeah, I was gonna ask you about MongoDB as well, like what inspiration you got there. Because I think one of the things that, at least me and the folks who I've spoken to, are starting to realize is that it doesn't matter how much money a company has, they won't donate. The idea of a donation is like, I don't know where that goes into the accounting spreadsheet. It doesn't make sense. I can't tell my shareholders that we donated a million dollars to Django, because, I don't know, they just can't put that into their structure, right? But we pay for service level agreements, we pay for additional services, we pay for better support: that fits into their accounting software. And I think that's the story that's going to work. And so if you can offer them something more, they're very likely to pay for it to get that, like you are.

45:18 Yeah, absolutely. And I think for us, again, like, whatever, there's books upon books upon books written about this topic, not in the context of open source software, but like in the context of like, philosophy over like, in the like, you know, 16th 17th 18th century, you know, people writing about tragedy of the commons, this is not a new topic, like in the world. Yeah. But in the context of like, open source, our philosophy is, we need to give something that's good enough for someone to be able to use on their own, where they wouldn't feel like they're going to be held hostage, if they're not going to pay money, that's just not a sensible thing for anyone to do. We try to put ourselves in the shoes of our users, we would never adopt a product where we felt we'd be at risk of being held hostage. But yeah, but instead give them an opportunity for, hey, here's something that you can, that you can take advantage of, and that you can get value from. And that if you find yourself in this other scenario where you think you want to, you want to get something like auditing, you want to get additional visibility, you want this way to run something at scale, then we're gonna be there for you. And we're gonna provide a commercial product for you in that situation.

46:17 Yeah, that's fantastic. And I'm a huge fan of MongoDB. I told you this earlier, before we hit record, but all of our stuff runs on Mongo, and it has for four, five, six years, and it's been beautiful. I actually just looked at the Stack Overflow developer survey from 2020. And under the most wanted database, MongoDB is out by like 5% above Postgres, and then those are the two that are way out front. So pretty neat. What lessons did you take from your time at MongoDB that maybe you wouldn't have otherwise brought to this

46:48 venture? By far the number one thing that I learned there is focus on the developer. And I mean, if you look at the mission and vision of our company, it's that we put security in the hands of the makers. That is all we care about; we have a singular focus on developers. If you woke up anyone on the team, shook them awake at night, and asked them who's the number one focus of this company, I guarantee you that anyone would say developers. And that has been clear from the beginning and will continue to be clear for us. And that was definitely the main thing I took away from my time at MongoDB. Yeah,

yeah. Super cool. All right. Awesome. Well, I think we're about out of time, but what a neat project, and I wish you guys good luck with it. Let me ask you the final two questions before you get out of here. Graham, you first. If you're gonna write some code, what editor do you use these days? You said you do some, not a ton. But if you are, what are you using?

47:38 It was definitely VS code.

47:40 All right, right on that's a popular one, Sam. Yeah, I

think that's probably because of my love of VS Code. I have forced it on Graham.

47:48 Yes. Influenced by Sam for sure.

47:51 I went through his computer uninstalled all the stuff that wasn't VS code. Well, you gotta edit that. Here you go. text edit is gone. Sorry.

48:01 Yeah. It kind of stuns me how VS Code, like, seems to be taking over the world. But

it's interesting. It surprises me a little bit because it came out of the whole Microsoft side of things. I thought that there would be a lot of communities that would just go, no. But somehow it's hit the right notes, and people really love it. So yeah, it's definitely successful these days. And then a notable PyPI package, anything, Sam, you've come across? Maybe one of the cool libraries where it's like, oh, man, you should really know about this. Or maybe, we did our Rust integration with that, or so on.

48:34 So funnily enough, one of the things I find most hilarious with Python is sometimes I'm not even aware if I'm using a package or it's just built into the standard library. Yeah, yeah. So I'm a big fan of all the typing stuff that Python has been gradually adding in. Mm hmm.

48:48 I think partially that's because I went through my Rust phase and came back wanting all the types. So, the typing extensions, I love the type stuff. I put it on all the Python code that I write, on the boundaries, I'd say. Not every bit of code, but where some part of code is written and some other part is going to be externally consuming it, types go on that straightaway.

49:08 And yeah, I think it's another standard library stuff, but I love all the metaprogramming stuff you can do with Python, it's, you can really do some crazy stuff with it. But it's, it's kind of fun.

49:18 Yeah, awesome. I'll throw two things out there for people that are really related there. So mypy is a static type checker that will verify all the types you put in are consistent. And then there's mypyc, which will actually compile some of your Python to native code based on those types. Understand, I haven't done anything meaningful with it. But anyway, a bunch of fun stuff around the types out there. Yep, yep. All right, final call to action. People are interested in letting someone else handle their authorization in some library, maybe putting that into one of these Polar files. How do they get started? What do they do if they want to get started with the project, you guys?

Go to osohq.com, that's probably the fastest way. There's a big button on the front that'll take you to the quickstart. I would check that out. Awesome. All right. Well,

50:03 thank you both for being on the show and for working on this project for the last couple years. It looks really helpful.

50:07 Thanks for having us. Awesome. It

50:09 was great to be here. Thank you.

50:10 Yep, you bet. Bye. Bye, guys. See ya. Bye. This has been another episode of Talk Python To Me. Our guests in this episode were Graham Neary and Sam Scott. It's been brought to you by us over at Talk Python Training. Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps course. Or if you're looking for something more advanced, check out our new async course that digs into all the different types of async programming you can do in Python. And of course, if you're interested in more than one of these, be sure to check out our everything bundle. It's like a subscription that never expires. Be sure to subscribe to the show: open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Get out there and write some Python code.
