Transcript for Episode #70:
Pythonic cover songs at Loudr
Some of the best songs are cover songs of popular music. If you're a musician who wants to create a cover song and actually sell it, you'll be diving deep into complex legal agreements with record labels. Sounds like no fun to me.
But this is where Python comes to the rescue! The guys and girls over at Loudr are using Python to create a service for creating, selling, and distributing cover songs. This week you'll meet one of the co-founders, Josh Whelchel. He's here to tell us all the cool ways Python makes this possible, including a touch of machine learning!
This is Talk Python To me, episode 70, recorded July 29th, 2016.
Welcome to Talk Python To Me, a weekly podcast on Python- the language, the libraries, the ecosystem and the personalities.
This is your host, Michael Kennedy, follow me on Twitter where I am at @mkennedy, keep up with the show and listen to past episodes at talkpython.fm and follow the show on Twitter via @talkpython.
This episode is brought to you by Rollbar and SnapCI. Thank them both for supporting the show on Twitter via @rollbar and @snap_ci.
Hey everyone. Quick follow up from last week's episode about writing a programming blog with A. Jesse Jiryu Davis. Several people went to the show page and commented saying that this episode tipped them into deciding to finally create a programming blog and start contributing more to the community; I thought that was really awesome. So, good luck to everyone creating a blog as a result of that episode and I appreciate you adding your comments there, and thank you Jesse for inspiring everyone. If you want to add a comment to any of the episodes just go to the episode page, the one for the blog was talkpython.fm/episode/show/69.
Now let's do a quick update on the Kickstarter. The Python for Entrepreneurs Kickstarter is into its second week, you are probably not surprised that it's already funded and everyone seems pretty excited about it. In fact, I want to say thank you to everyone who backed it or who is still considering backing it. The campaign was fully funded in just an hour and 15 minutes, that's pretty awesome, it's a good sign that I think Matt Makai and I are onto doing something cool for the community. I really truly appreciate all of you who are contributing to it and helping me bring this course alive, so I want to say thank you so much. You can learn more about the Kickstarter campaign and the course at talkpython.fm/launch.
Now, let's talk about the intersection of music and Python with Josh.
2:34 Michael: Josh, welcome to the show.
2:35 Josh: Hi, thanks for having me.
2:37 Michael: Oh, yeah. It's going to be really cool to talk about music. We've talked about all sorts of different things in the show, a lot of programming, we've talked about particle 2:44 we've talked about font design, but we've never talked about music. So, I'm really excited to have you here, it's going to be fun.
2:51 Josh: I'm really excited to get into this conversation, there are very few opportunities to talk about music and programming together in the same interest group, so it's going to be fun.
2:59 Michael: Yeah, it's going to be awesome. But before we get to the story of Loudr and what you guys are up to at the company, let's just talk about you, how you got into programming, how did you get into Python, like what's your story?
3:10 Josh: Sure thing, so I was one of those kids who was- I grew up on Doom, right, basically on video games and I think the first big memorable experience I had getting into programming was as a kid, I came across on the internet, the very early internet for that, a little program that lets you make random levels for Doom, and I think I actually learned most-
3:43 Michael: So, you were making your own Doom Wads, is that right?
3:45 Josh: Yeah, Doom wads, for sure, there you go. I think that's how I learned math, I was actually tweating a gram divine and later had a conversation where I went to like class in the fifth grade and I knew the word vertices and I corrected the teacher when she said vertaxes, and that was my first moment when I knew I was doomed to be a nerd for the rest of my life. But yeah, so there was a random level generator written in C, it was called slidge, it might have been actually dlidge or something with the d at the time but yeah, so I just wanted to learn what the ins and outs of that were, and so I got into programming pretty quick-
4:19 Michael: Programming the Doom Wads, I never actually did it, I just sort of consumed it, I think my favorite was like a Sims and so on, but those were not actual, they weren't done in Python, were they?
4:30 Josh: No, no and I think that might have been before Python's popularity too.
4:35 Michael: Yeah, I think you're right, I think it just barely came out at that time, right, ok, yeah.
4:38 Josh: Absolutely, yeah, it was very early. So Python for me really started, you know, as I got older, I say older but I'm still very young. As I got older, I got some contract work for a company called Clickteam, which was making a product called Multimedia Fusion. If anybody remembers, it was called Klik & Play in its earliest days; it was sort of a grid-based, conditional event-action system for programming interactive experiences, and most of the time those would be games.
5:05 Michael: That's cool, a little like a flash type stuff?
5:09 Josh: Kind of like Flash but it would actually make binary executables, right, so these are stand alone deliverables which is really awesome.
5:16 Michael: Like blender, maybe. Is that closer?
5:18 Josh: No, Blender is definitely more focused, and it has scripting. Clickteam didn't really get into scripting too much, which is what opened the window of opportunity for me. They have an extension system, where you can basically program an extension that would get packaged with the game as a dynamic library, so that was really cool; I learned a lot about programming fundamentals and the low-level stuff from that experience. But I basically built the library that you would use if you wanted to play a module or like an Impulse Tracker file, so that's like very DOS kind of music, you know what I mean?
5:53 Michael: Right, of course. That's awesome, so you basically made it possible to plug in like soundtracks into the games?
6:00 Josh: Yeah, and the neat thing about these soundtracks is you had MIDI, and MIDI was already built in. You know, MIDI is different on everyone's operating system basically, especially at the time, because the sound fonts would be different, and there was a real desire to have sample-based music. If there are any big video game nerds, I think the one moment in video gaming history where you could tell the difference between a sample and MIDI was in 6:25 at the end there is a choir that comes in, it's mixing MIDI data with waveform audio. So modules very much kind of achieve that because it's all sample based; basically, let's say I wanted to have a violin that is plucking a string, a pizzicato, you would just record like a C4 on that and then the program would basically stretch the frequency, by using just sample playback rates and manipulation around that kind of idea, to create music. And you know, that's basically how most sample-based synthesizers worked, and still work today; it was really cool to get into the low level of that too.
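As a rough illustration of the playback-rate idea Josh describes, here is a minimal sketch, not code from any tracker or from Loudr. It assumes equal temperament (one semitone is a factor of 2^(1/12)), uses a synthetic sine wave in place of a recorded C4 pluck, and the names `playback_rate`, `resample`, and `SAMPLE_RATE` are made up for the example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (arbitrary choice for the sketch)


def playback_rate(semitones):
    """Speed factor that shifts a sample by the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def resample(sample, rate):
    """Read through `sample` at `rate` times normal speed, with linear interpolation.

    A rate of 2.0 skips every other frame, which halves the length
    and doubles the perceived frequency (one octave up).
    """
    out = []
    pos = 0.0
    while pos < len(sample) - 1:
        i = int(pos)
        frac = pos - i
        out.append(sample[i] * (1.0 - frac) + sample[i + 1] * frac)
        pos += rate
    return out


# Stand-in for a one-second recording of a C4 (about 261.63 Hz).
c4 = [math.sin(2 * math.pi * 261.63 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

# The same recording played an octave higher (C5) and a fifth higher (G4).
c5 = resample(c4, playback_rate(12))
g4 = resample(c4, playback_rate(7))
```

One recording can cover a whole range of notes this way, which is part of why module files stay small; the trade-off is that higher notes also get shorter, since the sample is simply read faster.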
6:58 Michael: That's cool, and it sounds like the timing when you got into this was really good, like there is a real magical the whole- you know, everything is open, and can be invented at this time, right?
7:10 Josh: Oh my gosh, yeah. I mean, it was such a magical time, there was just so much fruit. I remember one of the things that I learned during that time: I was also really interested in how, you know, different video game consoles were doing music as well, and it came to my attention that there was a format called PSF that was taking PlayStation soundtracks and squishing them. I came to realize that the library code and the sample code for some of the most iconic soundtracks I listened to growing up fit into like a 2 MB window of data, and I was like, oh my gosh, how does this work? I still don't know how some of that works, and I actually got to do an interview with the Final Fantasy Tactics composers where I pointed that out and they were like, how did you know that, nobody knows that. It was a lot of fun; that was something that we did with Loudr in the very early days.
7:55 Michael: Oh wow, that's really cool. Yeah, I feel like people who programmed back then, I don't know, I wouldn't necessarily say they had a different skill, but they certainly focused on different things, and it was so different than today, isn't it?
8:09 Josh: Absolutely, I mean, the demoscene basically is kind of an icon of that ideal, right, like, there are people that just wanted to make things because they could make things, and something that has been really ripe for creativity too is working within limitations, and because we had so many limitations back in DOS and early Windows, it was just really fun. You had to figure out like ok, well, we can't just 8:31 in, we can't have people download 500 MB just for the music, so how do you make really good music fit into a much smaller space? Well, there are lots of creative ways to do it, and so that's what I was exploring at the time.
8:42 Michael: Ok, that's awesome. So, let's jump up to today, let's talk about Loudr, so L-O-U-D-R.
8:53 Josh: Yeah, Louder without E.
8:54 Michael: If you want to check it out it's at loudr.fm. So, I think this is a pretty cool idea, but why don't you tell everyone what Loudr is for start?
9:02 Josh: Sure thing. So, primarily in the public's eye, Loudr is a distribution platform that artists can come onto, similar to DistroKid and CD Baby and that sort of thing; this is a tool you would use if you wanted to get your music on iTunes or Spotify, that sort of place. The thing with Loudr, though, that is distinctly different is that we do cover song licensing. As an artist, when you want to do a cover song, you have to get a license, you have to deliver something in the mail, and then even worse, you have to do monthly tracking of your sales and send checks to the publishers whose rights you used, right. So Loudr took this whole process and built a platform around it. This was something that the team was doing prior to quote-unquote Loudr, as A Cappella Records or Joypad Records: we did it specifically for a cappella artists on iTunes, when nobody else was, and we did it specifically for the video game music cover crowd, which was very niche at the time, but it's definitely grown into a very big thing now. We took that model and said, you know what, we need to make this an open platform, where people can come on and just put cover songs online. So if I were to do a cover song right now, I would do it on Loudr; I don't have to pay anything upfront, and all the royalties for the publishers and the rights holders actually come out when the sales come in, so I'm not actually paying out of pocket to do any of this process, which is really neat. And when we launched Loudr, that was not a thing that existed at the time.
10:24 Michael: Yeah, so prior to this was it pretty hard if you wanted to do cover song, like one of the songs I really like that is a cover song is the Boys of Summer, the Don Henley song but covered by Ataris, like this punk rock theme, that's a cool song, those guys you know, they don't own Boys of Summer right, Don Henley does.
10:44 Josh: Correct.
10:44 Michael: So, that that's hard, that's not so easy for people to do right?
10:48 Josh: No, it's not trivial, and it's not trivial because there are legal aspects, there's the actual execution, which you know has details, and they're not fun details, and then you've got to be on top of this process every month, which is a chore, right. So you would typically see cover songs, when done legally that is of course, coming out from people who are represented by labels who knew how to execute licenses very well, and it would be really common to see labels do covers within their own catalogs, right. There's a bit of a misnomer in that statement, in that the labels who do the sound recordings and the publishers who actually write the songs and represent the lyrics and the musical notes, if you will, are actually different entities. That's a conversation for another time I think, because it gets really far down the rabbit hole there, but I think it demonstrates pretty succinctly how difficult this problem can be.
11:39 Michael: Yeah, I can imagine. I don't have any musical skills, so this is not a problem that I suffer from, but if I wanted to cover some song I come and I create account with you and what do I do, I just like record it, or do I have to like get you guys to work with the original artists, or you maybe it's not even the artists, right, maybe it's the label that was-
12:06 Josh: Yeah, it's the publisher, right. So you basically come on, you would have already probably recorded your album or your single, right, so you're coming to us to execute the distribution aspect as well as the licensing for a product you hopefully already created before you signed up, but it doesn't really matter, and you can do originals too, that's also not a big deal. But the way you would do it on Loudr is you just sign up, and with your free account you would submit your album once you've gotten cleared; because it's a free platform, we have some upfront tax stuff we have got to figure out, just because of receiving a lot of money, and the overhead of doing this for free was a thing that we had to deal with, and that was our solution. Once you've done that, you just upload your album, you send us the metadata, you let us know what songs you've covered, and we take care of the rest. Fortunately, it's not a very long process, because in most cases there's not really a negotiation aspect anymore like there used to be. There is a lovely little section of copyright law that allows us to send what's called a compulsory license, meaning that as long as the song you're covering has been released in the United States on either a digital phonorecord or a physical one, meaning specifically it has to not be something that's in the background of a video, it had to actually come out as its own product, right, then we can cover it.
13:11 Michael: Wow, I had no idea, that's pretty interesting ok. So that definitely takes it the legal part the human involved part out of it that's cool, but the rest of it sounds like it involves a lot of software, and let's dig into the Python stuff but before we do, you told me right before we hit record on this episode, that Loudr has its team sort of scattered about some origins in San Francisco, but that recently you guys transitioned to being a fully distributed team whereas that was not the case before; can you give me a little bit- what's the story there, like, how's that working out for you?
13:44 Josh: It's actually worked out really well. So, in the office we were a bunch of young guys, and just talented, everybody on the team really. The thing we were suffering from in a small space was, when we have focused conversations, do we have a conference room for that or can we do that on the floor, and what's the consequence if the operations team is getting into a big conversation while the engineers are focusing? These are all great problems to have, but it seemed that we were losing the ability to really hone in on specific problems, just because of all the things happening at once. I think this is something that happens with a lot of startups, but especially for a start-up tackling such an immense problem like the one I just described to you, and I really only scratched the surface of this, right. You know, it wasn't becoming of us to constantly interrupt one another to try to figure out how we're going to solve this next problem which is immediate, and much of this also has to do with solving legal problems, right; every legal problem is an immediate problem, it's not necessarily something you can defer, and if you can't defer your problems, how can you take the time you need to solve them?
14:47 Well, all that said, we wanted to keep the small space, we love it; the operations team and the business team are still in San Francisco, and they are thriving, doing so much great work there now, and the engineers are really benefiting. The funny thing about it, and you know, since Slack came out particularly, is that a lot of the stuff that engineers typically do today is already really remote compatible, right. I'm not going to get on the whiteboard and copy/paste my code onto a physical medium, right, I'm going to send it over Slack, or I'm going to get a code collaboration tool of some sort to actually get in there and collaborate on a computer screen, over the internet. So if I'm already doing that big part of it, the next big step was to figure out, ok, well, how do we manage our tasks, how do we stay in sync, and how do we ideate? Well, we're already doing that online too; we used Asana for task management, which is sort of like Trello, and we were doing that when we were in the office as well. Whiteboarding stuff we were already doing in a few tools that were available online, kind of switching around, but MindNode was really succeeding for us.
15:46 I have crappy handwriting so I love to do things when I can just type it out and really get things out fast. And then I think with tools like Screen Hero we can share a screen and actually do cooperative coding. We were already the remote team, just in the same room, so it worked out really nicely, it was super smooth transition, the time zone thing I think is the biggest challenge, so what happened is we were all in San Francisco and then due to some family things, with a few of our employees that kind of happened at the same time a bit synchronistically, we just said you know, this is an opportunity give this a try and see if it works. We did a trial period, we found that we were super productive as a result of going remote, and we've stuck with it.
16:25 Michael: Oh, that's a great story, and I'll definitely have to get links to those tools that you're talking about, especially the collaborative, the Screen Hero one, that sounds cool and I've done a lot of Skype and various other ones, but I haven't tried that one.
16:38 Josh: Oh yeah, we love Screen Hero. I think Slack actually bought Screen Hero, so it should be integrated any day now, because they bought them a while ago. If you're in Slack, if you've used it before, the call function is actually built by, it was either Screen Hero or Speak; actually, I could totally be making false statements, so double-check. We used a tool called Speak as well before, and we didn't need to use it anymore.
17:00 Michael: Ok, yeah that sounds really cool. And I totally agree with you; for some reason, and I guess it makes a lot of sense, we as developers do seem to already use tools that are extremely compatible with distributed or asynchronous work, you know, GitHub pull requests for example, all these things. So I think it makes a lot of sense. I've spent probably the last 10 years working either remotely in distributed teams or, you know, this last year I've been doing my own solo thing, but I've lived that life and it is really nice, I definitely feel more productive. The time zones are painful; like, I'm spending this year in Germany, so yeah, it's great, but the problem is I have a lot of meetings in California, so it's like, hey Michael, can you meet at 11pm on a Friday night? It's like, you know, I didn't want to see my wife anyway, right. So it depends how far it goes, but when it's in the US, within a few hours, it's pretty nice; the time almost works for you, right, you get like that dead zone where you can actually get stuff done.
18:01 Josh: Absolutely, and it's funny, so you're in Germany, we've actually got contractors in Berlin right now, and I think what's kind of nice about me being on the East Coast is that I'm right there smack in the middle of the West Coast and our contractors who are in Germany, so with the team that we have on the West Coast, it's really easy for me to kind of get the sync from them at the end of their day, and at the beginning of the West Coast day just kind of give them a brain dump on what's going on. And that's worked pretty nicely. The other thing too, you mentioned you've been happier, I think that's something we've definitely felt; when you're able to live where you need to live, there are definite advantages. Speaking as somebody who's just recently moved away from the Bay Area, I'll tell you, the amount of money that we are saving just on the cost of living, it was like getting a huge raise, right, but no actual money needed to be spent by the team to achieve that. We've had some frustration just setting up Ohio taxes and stuff, but that kind of stuff has to happen, you know.
18:58 Michael: Yeah sure, yeah that's really cool.
This portion of Talk Python To Me is brought to you by Rollbar. One of the frustrating things about being a developer is dealing with errors, relying on users to report errors, digging through log files trying to debug issues, or a million alerts just flooding your inbox and ruining your day.
Rollbar has put together a special offer for Talk Python To Me listeners- visit rollbar.com/talkpythontome, sign up and get the bootstrap plan for free for 90 days, that's 300,000 errors tracked, all for free. But hey, just between you and me, I really hope you don't encounter that many errors. Rollbar is loved by developers at awesome companies like Heroku, Twilio, Kayak, Instacart, Zendesk, Twitch and more. Give Rollbar a try today, go to rollbar.com/talkpythontome.
20:34 Michael: Let's talk about Python. So Python, you said, is pretty central to what you guys are doing. How is it involved? Can you describe your architecture, what pieces, what moving parts are in there, and how does Python fit into it?
20:45 Josh: Absolutely. Ok, so just to set the scene, Loudr is built on Google App Engine, which is on Google's cloud platform, and we've been on App Engine pretty much in one way or another since its inception. Actually, I met the Loudr team before I was working for them, or with them rather, to make Loudr; it was on a product called the Game Music Bundle, where I needed to build a store really quickly, and the way to do that, I found, was to build it on this new thing called App Engine. It wasn't really a risk, because I don't think that failure would have been that big of a deal for something that was new like this and just something I built in my free time with some friends; but to have it succeed in the way it did was really a proof that, okay, you can develop fast, you can iterate quickly, you can deploy quickly, and you can scale massively, and that was like, holy crap, this is really cool. So I became immensely fond of App Engine at that time and I brought that into Loudr.
21:39 Michael: What year was that?
21:41 Josh: I want to say 2012, maybe even 2011.
21:43 Michael: Ok yeah, App Engine probably still had that new cloud smell right?
21:49 Josh: Oh yeah, it had the new cloud smell, it had the undocumented appeal, the whole thing was there, it was diving deep.
21:59 Michael: That's cool, and Python was one of the two first available languages or was it the first available language?
22:04 Josh: I wouldn't know App Engine's history well enough to know. I know that as far as what's on their managed platform, if you're looking for managed runtimes, I guess you call these, you've got Python, Java and Go; Go wasn't even around at the time for sure.
22:18 Michael: Right and Java came later, so I think it was Python ok.
22:20 Josh: I'd say it's probably Python, because I don't recall Java being an option. I knew Java and I probably would have started with Java to be honest; very glad I didn't. [laugh]
22:30 Michael: Yeah for sure. So, App Engine is really good at getting you started; I haven't done much work on it, but I did check it out for a few projects and it seems like you can run real legitimate stuff there, at least you know, back in the time frame that you're talking about, for almost free whereas the other hosting options in that time frame were way more expensive right?
22:52 Josh: Yeah, the nice thing too about Google is that they have been really aggressive on keeping up with pricing. A few years ago I was at Next, before it was Google Next, right, the conference that they had, and they were cutting their data storage costs; I don't know what happened at one point, but they completely undercut Amazon in a big way and Amazon responded to it, so it was really nice to kind of be on the front edge of those pricing changes, and data was really cheap. And the nice thing, just to kind of bring it back to Python here, when you're using App Engine is you can just open up- this is not a good Python practice by the way, but you could just open up one file, right, stick your basic handlers in there, put your data schema in there in the form of a Python class, and integrate it with one of their API frameworks, which is just really beautifully written. I think Guido actually is the guy who wrote the NDB API in Python for them, so whenever I work with it I've always got this like tingle of magic, because it's basically doing like future management with generators.
23:54 Michael: That's really cool yeah.
23:54 Josh: Yeah, so I really love it, and there's definitely some grace to how it was designed. And then, you know, all the APIs that they have available: you've got a built-in data store, you've got obviously the actual platform itself, and they've gotten to the point with Cloud Platform now where you've got a container engine via Kubernetes and Docker, you've got managed VMs, which is the auto-scaling compute instance sort of thing. So there's a lot of stuff in Google, and I think people come and they look at Amazon and they think that's the de facto, but in reality, you've got a really aggressively smart platform with Google. And then also, when you consider the things that Google is doing with like Dataflow and machine learning, and we all know where Google goes with machine learning, right, to have them expose those tools as really smart frameworks and APIs is really exciting as well, and it's always been really fun to work with that kind of technology.
24:47 Michael: Yeah, I'm sure it was. So back then, it sounds like you were probably working with the platform-as-a-service, not infrastructure-as-a-service, because that came along later. One of the things that I think is really cool, and you talked about how cool it was, is that there are all these APIs you can use, but it also feels like if you write your code for Google App Engine, it's kind of like it's theirs, right, because you can't extract it and, say, move it to AWS or to a VM, right?
25:18 Josh: Sure yeah, and that's one of those things when you're looking at it and you're thinking, ok, well, we need some data liberation; how are we going to deal if Google one day is like, this is not a priority, this is a waste of our assets, we need to just cut App Engine, right. Fortunately, and this is really awesome, I don't remember when this came about, but there was a complete initiative, a team of people dedicated to making sure that all of the core APIs and frameworks for App Engine were available on an open-source stack. I don't actually have the name of it, but I'll definitely try to find it and send it your way; there is a way that I could take the whole code base and pull it off App Engine proper and actually run it on something like AWS if I wanted to. Now, I think there are certain APIs that would be a challenge to change. Fortunately, internally at Loudr, when it doesn't make sense to use one of the crazy cool frameworks, we use something else; we are, for example, using Elasticsearch, and you know, we could use BigQuery or something as well. But there has been an emphasis on making sure that, if the day comes that Google's like, screw you guys, we are going home, we're not going to be left in the dirt.
26:24 Michael: You know, we're taking our ball and we're going home. Yeah, I don't see that happening, I mean this whole cloud computing thing really seems to have some actual legs. Like, I just read, I think it was today, that Amazon's profits, not revenue, profits, jumped 800%.
26:40 Josh: Oh I'm sure.
26:41 Michael: And the majority of that was from AWS, right, it wasn't from doing better with books or, you know, things like this. So very cool. I think Google App Engine will definitely stick around. So maybe tell us what your architecture looks like: what are the pieces, what kinds of things do you guys need to do in there?
26:58 Josh: Sure, so we very much have that microservice mentality, right, we're building very segmented APIs to solve specific problems. I think one of the cornerstones of the Loudr framework is built on this concept of a task queue; it's literally called Task Queue in Google, but anyway, Cassandra 27:14 you know, you can implement tasks in several ways of course. But Google's got this nice task queue concept, and so we're building a lot of things, you know. So I guess our problem space, we like to divide into three ideas or three core problems, and these problems are matching, fulfillment, and administration. I talked to you about how you could come to Loudr and you could say, ok, I'm going to cover Boys of Summer, right; it's our job to match your sound recording and the information you provided to us with the actual composition, and know who the publishers attached to that are. So we're getting lots of ingestion from rights holders, lots of data feeds are involved, and there are indexing problems, so that we can search and find things quickly; that's good for Elasticsearch, right. There's a lot of machine learning involved, because we have to do this at scale; it's not one or two people coming to do licenses, we have an enterprise service where we're doing this for big businesses, like DistroKid for example, which means we can't just have humans doing everything, so there's that aspect of it as well.
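To make the matching problem a little more concrete, here is a toy sketch of lining up a submitted track title with a composition record. It is an assumption-heavy illustration, not how Loudr does it: the episode mentions Elasticsearch and machine learning, while this sketch swaps in plain string similarity from Python's standard library. The catalog entries, publisher names, the `normalize` and `best_match` helpers, and the 0.85 threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical composition catalog; titles and publishers are placeholders.
CATALOG = [
    {"title": "The Boys of Summer", "publisher": "Hypothetical Publisher A"},
    {"title": "Winter Lights", "publisher": "Hypothetical Publisher B"},
    {"title": "Morning Train", "publisher": "Hypothetical Publisher C"},
]


def normalize(title):
    """Lowercase, drop parentheticals like "(Punk Cover)", punctuation, and articles."""
    title = title.lower()
    title = re.sub(r"\(.*?\)", " ", title)
    title = re.sub(r"[^a-z0-9 ]", " ", title)
    words = [w for w in title.split() if w not in {"the", "a", "an"}]
    return " ".join(words)


def best_match(submitted_title, catalog, threshold=0.85):
    """Return the most similar catalog entry, or None if nothing is close enough."""
    key = normalize(submitted_title)
    scored = [
        (SequenceMatcher(None, key, normalize(entry["title"])).ratio(), entry)
        for entry in catalog
    ]
    score, entry = max(scored, key=lambda pair: pair[0])
    return entry if score >= threshold else None


match = best_match("Boys Of Summer (Punk Cover)", CATALOG)
no_match = best_match("Completely Unrelated Tune", CATALOG)
```

Returning `None` below the threshold is the interesting design choice: anything the automated pass is unsure about can be routed to a person instead of being licensed against the wrong composition.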
28:12 And I think the way we look at it, we kind of take those core problems and we segment our code into microservices that represent them. Then of course, you've got the shared microservices for things like user management. I think it gives you an idea of how we work and what we think about. We also have a big emphasis on not reinventing the wheel, so we like to utilize and contribute back to open source projects. Google's worked on something called the Pipeline library for App Engine, specifically in Python, that we've given a lot of contributions to, just because we so heavily rely on it. We've also been working with one of our really talented contractors, who helped us build an open source framework for an API in Python called Pale, and I really like the work that we've done there, simply because it's the most Pythonic way to do a RESTful API that I've seen yet; the built-in documentation, built into the way you manage arguments on the API level, is just something that's really fluent. I'll share the link to that with you at the end of this call so you can kind of get that out.
29:11 Michael: Yeah, awesome, yeah I'll definitely put that into show notes. It sounds really cool. So, if I'm writing a web app on Google App Engine do they have their own framework or is it like Flask, or what's the web actual API?
29:26 Josh: Yes, so you've got a few choices. webapp2 is what we use, so you're using that framework and they are kind of just built on top of it. You've also got Flask available, and when you're on App Engine you do have the choice between the managed platform, the managed runtimes I think they're formally called, and what's called a managed VM, which is just basically like an EC2 instance, a virtual instance, right. And you could do whatever you want on a virtual instance; on the App Engine runtime that's managed, that we use for a lot of code, you've got a really tight integration with the datastore, super low latency, you've got access to all those- I think I mentioned the NDB API where you could use generators to manage futures, that kind of writing is really Pythonic and it's helping us write really efficient code really quickly. I think with the problem of dealing with lots of data too, and all the complicated sort of business stuff we have to do, it really behooves us to have a situation and a platform built for us where we can say, ok go fetch this entity, and while you're doing that communication, get this work done, and then this work is done, ok, now maybe that response is ready, and we'll block when we need to.
30:27 And writing in tasks where you're just yielding futures, the way that the runtime and the loop works in there, it feels a little bit like black magic, but you can actually just go look at it. I am fairly confident, I know this is true because I look at it all the time, the NDB framework itself is also just out in the open, you can go look at the code for it. So it's just nice to be in that kind of space where you've got the Google-level engineers who have built a platform that is highly technical, and like I mentioned, Guido being involved, like this is just out of this world. Obviously he's at Dropbox now, on another really fun thing to do, but I think you get the gist here. As far as I'm concerned, being on a platform-as-a-service allows me to not worry, especially as a CTO, about the sort of operations problems like, oh is this instance screwed up for this reason, oh do I need to hop in a shell to fix this trivial problem; most of that stuff's already done for us. So it's really allowed us to focus on the problems that we need to solve, and be really efficient, making sure that we're not spending our time doing work that we really probably shouldn't be doing, right.
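The "yield a future, block only when you need the result" style Josh describes can be illustrated with a minimal, standard-library-only sketch. This is not NDB itself: `fetch_entity`, `run_tasklet`, and `load_album` are hypothetical names, a thread pool stands in for the datastore, and the tiny driver loop is only an assumption about the general shape of such a runtime.

```python
import time
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)


def fetch_entity(key):
    """Hypothetical stand-in for a datastore read; returns a Future immediately."""
    def _work():
        time.sleep(0.05)  # simulate network latency
        return {"key": key, "title": "Track " + key}
    return _pool.submit(_work)


def run_tasklet(gen):
    """Drive a generator that yields Futures, sending each result back in."""
    try:
        future = next(gen)
        while True:
            # Block only at the point where the generator needs the value.
            future = gen.send(future.result())
    except StopIteration as stop:
        return stop.value


def load_album(keys):
    # Kick off every fetch first so the reads overlap...
    futures = [fetch_entity(k) for k in keys]
    titles = []
    for future in futures:
        entity = yield future  # ...then wait for each one in turn.
        titles.append(entity["title"])
    return titles


titles = run_tasklet(load_album(["a", "b", "c"]))
print(titles)
```

Because all three fetches are started before the first `yield`, the simulated reads run concurrently and the generator only pauses when it actually needs a value.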
31:31 Michael: Yeah, yeah it's awesome. Yeah I mean, would you ever consider having like a data center in your own company again? Maybe it's pretty far-fetched though.
31:42 Josh: Usually what I would say is that the thing that ever makes an engineer frown is business reasons, 31:51 reasons to have data centers. Like, fortunately we've got Google's level of encryption, so we've got confidence that they can't read certain things that have gone on there, and whenever we have anything super sensitive we're using public key encryption on the server so that the private key is only in the hands of developers; people who even get access to the data couldn't possibly read what's in there for a lot of the really sensitive stuff. So with that kind of technology really blossoming, especially lately, and encryption, I don't feel that there's too many cons on the cloud-
32:18 Michael: I should put a caveat 32:18 in there saying technical reason.
32:22 Josh: Yeah I know, you probably should have, no I can't see much technical reasons, cost might be one but like you've got the pro and the con you got a balance right, like what happens if your business blows up overnight and you haven't you know taken the time to build up your data center, put the resources in, have those guys build that sort of thing and it's like what's the point. I think that nowadays if you're going to start a new project what you should do, what might work for you is to build it on the cloud first and then figure out if you have a need to go to a data center, unless you've got a very specific reason going and why you don't want to be on the cloud right. And then, thinking about what if Facebook was built today right, like Facebook is- I have recently seen a lot of advertisement targeted at me because I'm a developer, I guess about how they've got this awesome data center and the brightest minds are working on the Facebook product at that level, and that's really exciting and really cool but honestly, would they have done that in this environment today- I'm not entirely so sure.
33:16 Michael: Yeah, I agree, although it is interesting, you mentioned Guido and Dropbox, and I doubt Guido has anything to do with this, but they're moving out of AWS into their own data center. But it's interesting, right, it's like that's a pretty long way to grow before they outgrow the cloud, so, yeah, very cool.
33:33 Josh: Absolutely. On the flip side, you've got company like Spotify who just went on to Google cloud right, so 33:41 data centers and go in the opposite direction.
33:43 Michael: Yeah, it's an exciting time to watch all these things happen. So, you mentioned machine learning, what's the story there? Is that like scikit-learn and what problems are you actually solving?
33:52 Josh: Yeah totally, so like I mentioned, a big aspect of what we do is matching content to content, right, so we've got metadata; this is kind of one of the funny problems in the music industry that I'll just quickly set up the scene for you- I did mention earlier how publishers, the entities who represent the rights holders of the music itself, the notes and the lyrics, the guys who are actually doing the notation, that sort of thing, are separate from the labels who represent the artists, the sound recordings, right. The band, the Killers 34:20 might be this group of guys, and then the actual people who wrote their songs are represented by a publishing team that has a different set of tasks. If you ever go to Hallmark and you buy a card that has like the Beatles lyrics on it, you bet your bum that Hallmark has to pay the publisher that represents the Beatles, right, and I'm not sure that that's Apple Records because Apple Records is the label side.
34:39 So when you've got this segregation between the publishing data and the label data, one of the fun things that happens is the publishers don't always know who is using their music, and when they send out the song information to their partners, they're not always including like, ok, here's the song Happy Birthday, it was written by these people, right. They know who it was written by but they don't necessarily know who recorded it, and then you can imagine, especially in that example, a lot of people have recorded Happy Birthday To You, right. And then, the labels on the other side will send content to distributors and they won't include the publishing data because they don't necessarily know it either, it's not a connected chain. So when you're looking at two sets of metadata and you need to start drawing lines between them, you start thinking ok, I can either throw rocks at this, or I can have a statistical method for solving that problem.
35:27 I think as everyone who's listening knows Python really excels at these machine learning problems, there's lots of frameworks and lots of libraries, and lots of tools that come to mind very quickly when thinking about machine learning in Python. What's nice is that we have internally built this really abstract framework for doing machine learning-based problems, in terms of how it integrates with the ES level and with the platform so we are able to quickly look at two different models of information, and do some really cool things on the learning side with exporting training data, with setting a model, with changing features that sort of thing. So I think Python really paid off there.
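To make the matching problem concrete, here is a minimal sketch of scoring a recording's metadata against candidate compositions. It is only an illustration under stated assumptions: Loudr's actual framework is internal and not described in detail, so the feature choices, the hand-picked `WEIGHTS`, and all function names below are hypothetical. A trained model would learn the weights from labeled matches instead of hard-coding them.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Fuzzy string similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def features(recording, composition):
    """Turn a (recording, composition) pair into numeric features."""
    title = similarity(recording["title"], composition["title"])
    rec_writers = {w.lower() for w in recording.get("writers", [])}
    comp_writers = {w.lower() for w in composition.get("writers", [])}
    union = rec_writers | comp_writers
    # Jaccard overlap of the writer names, 0.0 when neither side lists any.
    writers = len(rec_writers & comp_writers) / len(union) if union else 0.0
    return title, writers


# Hand-picked weights for illustration only (an assumption, not learned).
WEIGHTS = (0.7, 0.3)


def score(recording, composition):
    return sum(w * f for w, f in zip(WEIGHTS, features(recording, composition)))


def best_match(recording, compositions):
    return max(compositions, key=lambda c: score(recording, c))


recording = {"title": "The Boys of Summer (cover)", "writers": ["Don Henley"]}
compositions = [
    {"title": "Boys of Summer", "writers": ["Don Henley", "Mike Campbell"]},
    {"title": "Summer of '69", "writers": ["Bryan Adams", "Jim Vallance"]},
]
match = best_match(recording, compositions)
print(match["title"])
```

Swapping features in and out, or replacing the weighted sum with a trained classifier, only changes `features` and `score`; the matching loop stays the same.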
36:05 Michael: Awesome, and that's scikit-learn? Or is it different? It's your thing?
36:08 Josh: It's a bit of an internal thing and honestly I'm not the guy who does the machine learning stuff-
36:14 Michael: Yeah, no worries.
36:17 Josh: To know what tool we are really using, but I do oversee the integration on the platform side, and that's where it's been really fun. So machine learning has been a very big learning process for me going into this, and right now I think you are going to ask me about hiring later, but we've been looking for people who are really talented in that space and we've actually brought on really recently some great engineers in that space and it's like oh my gosh the things you can do it's really quite fascinating.
This episode is brought to you by SnapCI, the only hosted, cloud based continuous integration and delivery solution that offers the multi stage pipelines as a built in feature.
SnapCI is built to follow best practices like automated build, testing before integration, and it provides high visibility into who is doing what. Just connect Snap to your GitHub repo and it automatically builds the first pipeline for you, it's simple enough for those who are new to continuous integration, yet powerful enough to run dozens of parallel pipelines. More reliable and frequent releases, that's Snap.
For a free, no obligation 30 day trial just go to snap.ci/talkpython.
37:42 Michael: I don't know a ton about machine learning, I know a little bit about sort of neural networks and those kinds of things, but do you think that we're going to get to a future where we have software that does things, and we don't know why, we cannot know why?
37:56 Josh: Oh boy, yeah let's go down the rabbit hole. [laugh] So, that is a great question, and you think about neural networks and you think about throwing rocks, I think that we've got the limiting factor with Moore's Law a little bit on that. However, as we get into quantum computing that's when things get pretty interesting, and I think machine learning is just kind of inept into that, like if you're combining machine learning and quantum computing, I think you could start to scratch the surface on questions like, what is consciousness, can we really even quantify that anymore, when you've got a machine making decisions and it might identify the need to change some of its internal programming. Like, if a machine-learning quantum computer can train itself to iterate upon itself, what happens next, right?
38:46 Michael: Yeah, I think it's pretty crazy. I mean, even before it goes that far, like if you're trying to just debug something or you're trying to manage, like, complex machine learning in production, it might start making decisions, and you're like, I don't really know why it's doing that, right, but maybe it should- maybe it shouldn't.
39:08 Josh: Yeah, something like that we already have today, right. So we have a trained model, right, and we might see that it's doing something a little fishy on certain results and we might want to dive in. Well, it's not necessarily that you just look at it and go, oh well, because the code path is this, and if this then that, it's really easy to see. Actually it's not anymore, because you've got a statistical model you're basing it on; you've got to look at how different features impact one another, you have to look at how combinations of features work together: if this feature is scored highly and this other feature starts to go down, does the confidence go up or does it go down? None of it is necessarily intuitive anymore, and it's all numbers. So the next step from that is exactly what you're saying, like what about when the machine starts to do things that we can't even debug? And I think that's where you're looking at internal code patching, where the program can look at and understand its own code base enough that it can change it in such a way that it starts to only work in a statistical way, right. So it trains itself based on a conditional pattern, and then you feed it correct and incorrect results and it starts to learn, you know, this model makes more sense than that model, and then at some point it gets so confident, so well-trained, and you define the features so precisely, that it no longer has a conditional code path, because everything that it does can be decided on a statistical model. And that's just talking about, like, the two-dimensional equivalent of machine learning, I think; when you get into neural networks and you get into the sort of conditional based machine learning, that's when it goes next level.
40:38 Michael: Right, then you get like the 4 way switcher, it gets really interesting. The part you started, if people are interested and they want to see a really great movie that sort of explores that in a science-fiction way, Ex Machina is a really good movie. I recently watched that, and it really freaked me out.
40:59 Josh: Yeah that's one of those things, I mean just for the people who haven't- I think it's worth talking about a little bit. You've got a situation where this machine has learnt, you know, basically what it needs to survive, and it starts to learn how to interpret human emotion based on the output of the human response, right, so like if we can look at a face and determine that this emotion is that, and then it's got, let's say, the entire history of all the data in Google to train upon, let's just use that as a real-world example. If you can actually build a physical medium for this thing to exist, then yeah, it's going to start to do some really interesting things, and like what you touched on, I think there's this point when it stops being conditional, it stops being functionally programmed. It's getting to the point where it's saying, is this working for me or is this not, if not- try something new. And I think that's, you've got some really talented people right now who have spoken out against AI, and it just feels like the natural progression. Throw a rock at me if you want for going down this path, but like when you've got the guys coming and saying it is a bit scary, it's because, in my opinion, if a computer is going to do that trial and error, and the trial can be somewhat random, who knows what that means and who knows how to quantify success, because with the motivations of a machine that doesn't necessarily have features, if you will, that are targeting emotional human factors and responses, you're definitely getting a situation where that can be kind of scary. And that movie is a perfect quantification of the worst case scenario for an artificial intelligence.
42:40 Michael: Yeah, it's definitely worth checking out, I recommend it. Super interesting. The people who spoke out about AI, a lot of them are kind of like well you know how much really do they know about software, it's kind of like an amorphous concept. But, Elon Musk like that's a guy who is actually creating the future, so I definitely give his statements more credit.
43:02 Josh: Stephen Hawking too, I mean you've got a lot of smart guys who think of stuff like this. I have to find this for you too, but I found this talk by the guys who are working on the first quantum computers, and for my sci-fi nerds it's really fascinating. They're big black boxes and he refers to them as black monoliths, but he's talking and he literally talks about the way qubits work, and he gets into this idea that it is kind of like you're pulling from- this is going to go totally fringe, but it's like pulling from two different universes, right, that are parallel, but you've got a point where they converge, and that point is the qubit, and everything that comes beyond the qubit is sort of like the shadow of what happens. And the way that these things work, he goes and talks about how the biggest challenge is how you keep all the chips at near absolute zero temperatures, which is a scientific feat in and of itself. And the stuff that he says in this talk is just like, the predictions he's making and his colleagues are making, in 15 years computers are going to be able to do anything humans are going to be able to do, and much better, because they can train and iterate themselves. And you know, we like to talk about and joke that it's science fiction, but science fiction is becoming reality, it always has followed that trend. We've completely diverged from music, but like, absolutely, it's one of these topics that is really fascinating. I think quantum computing is something to keep your eye on, especially in the next five to ten years.
44:25 Michael: Yeah I totally agree, and to me honestly, quantum computing is scarier than AI. You put them together, you get all sorts of trouble, but even basic things like encryption, what would happen if tomorrow we found out that all encryption was useless, what would that do to the economy, to jobs, to businesses like yours that need privacy. I think it would be really an immediate problem.
44:57 Josh: I think it would be an upheaval of current paradigms that we have in society but one thing that humans are really good at is adapting right, so you know let's say encryption goes out the window, I think humans will very quickly become aware that ok there is a real need for transparency in the world, and we'll find a way to adapt and you want to look at an example where that's happened- bitcoin was the perfect prototype of for example currency that was trying to solve so many problems by being transparent. And I can tie this back to music because one of the things that's really trending in this industry that I'm in right now is the idea of having a blockchain for music royalties, so whenever a usage is generating royalties what they want to do is use a Bitcoin like technology to attribute the stream or whatever is generating the money, and transparently show who is getting paid to that would attempt to solve this huge problem of why aren't artists actually getting paid for their content, why aren't songwriters getting paid for their content. Right now though, and to go back to transparency, the problem is that the blockchain doesn't work unless the data it's attached to is transparent, and right now rights holders and other interested parties are not always ready just saying oh this is all the music we own, and this is how much to pay us because there is value in that data, and they're able to make intelligent business investments and make some money by putting a value to that information.
46:17 Michael: Yeah that's a really cool idea, but I can definitely see the real world challenges. All right, so let's bring it back to Python just for a little bit. So you talked about machine learning, you also talked about some other cool things, you said this RESTful API that you guys open sourced called Pale was really cool- tell us a bit more about that.
46:36 Josh: Yeah, so Pale is nice, and the reason I like Pale so much is we are writing things in such a way that- let's just start from the beginning- if I'm going to build a new API, and actually I'm doing this by making a website for my own music, where you can come and I just wanted to have like Bandcamp music integrated into it, so I could link everything to Spotify. So the way that I did this was I just pulled down Pale, right, and Pale you can implement with Flask or webapp2, and you just get off the ground writing a RESTful API in a very Pythonic way. So you manage information that comes in responses in the form of resources, and you define those very succinctly, and they can nest really nicely, so I can have a user resource that mentions a password resource, if that was structured in a smart way.
47:22 And then you've got the concept of arguments, which is going to handle casting; so you've got a string argument, you've got a boolean argument, and the reason why this is so good is because you kind of remove serialization from how information is handled, so you can have a JSON input, which is ideal in my mind, but you could also have an XML input. And what you're doing is you're doing a really nice kind of object-oriented solution of keeping the problem where it belongs, so like I'm not going to try to struggle with a string argument problem outside of the string argument class, right. And this is a trivial example with strings, but it matters when you get into structures like currency types or other structured types of information, where you might have a need to do all sorts of fun stuff. Permissions is a great one, so on Loudr for example, in our API we have, let's just say, like an album resource, and it represents an album on the store.
48:15 The thing is, when you are the owner of the album versus the public user looking at this album, you're not going to necessarily need to see the same fields, and one of the challenges when building any new framework is how do I actually manage those permissions in an intelligent way, where my code and my permissions logic isn't all over the place. Pale provides a solution to that in that we've got this ongoing context, and the context is passed every time any field or resource is going to be rendered. The context is going to carry all the things about the request, the HTTP request, all the arguments that come in that have already been deserialized to Python dictionaries, and it allows you just to look from- let's say I am coding the album resource and I need to say these are the fields I want to expose, and I've already defined the fields at the class level, so the album has like a track, number of tracks, which is an integer argument, it's got the name string, all that stuff, and all I have to necessarily do is say, context, tell me who's accessing this, let me determine what the permissions model is, and that itself can be abstracted however you want to when you implement.
49:18 And then, it's the resource's responsibility to decide how it's going to render the argument. So, this makes things like expanding nested relationships really a lot easier to tackle, because you don't have to go into some really nasty conditions, if you're thinking about just like how would you do this from scratch, what would you do. I think there's other cool API frameworks out there, but this is the first time that I've used one where I felt- and I don't know if it's maybe my App Engine or webapp2 kind of background coming in where this context idea is so fruitful, but having that context and having all your permissions and all these challenges, whatever you want to do, be passed at that level where the request is kind of contained like that, it just makes programming this whole problem really nice.
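The two ideas in this stretch, argument classes that own their casting and resources that consult a request context to decide which fields to render, can be sketched in plain Python. This is not Pale's actual API: every class and function name below (`Argument`, `parse_arguments`, `Context`, `AlbumResource`) is a hypothetical stand-in, and the owner check is an assumed permissions model chosen only to show the shape of the pattern.

```python
class Argument:
    """Base class: each argument type owns its own casting logic."""
    def cast(self, raw):
        raise NotImplementedError


class StringArgument(Argument):
    def cast(self, raw):
        return str(raw)


class BooleanArgument(Argument):
    def cast(self, raw):
        if isinstance(raw, bool):
            return raw
        return str(raw).strip().lower() in ("1", "true", "yes")


class IntegerArgument(Argument):
    def cast(self, raw):
        return int(raw)


def parse_arguments(spec, raw_input):
    """Cast already-deserialized input (from JSON, XML, ...) via the spec."""
    return {name: arg.cast(raw_input[name])
            for name, arg in spec.items() if name in raw_input}


class Context:
    """Carries per-request information to every resource being rendered."""
    def __init__(self, user_id, args):
        self.user_id = user_id
        self.args = args


class AlbumResource:
    public_fields = ("name", "track_count")
    owner_fields = public_fields + ("revenue_cents",)

    def render(self, album, context):
        # Assumed permissions model: only the owner sees private fields.
        is_owner = context.user_id == album["owner_id"]
        fields = self.owner_fields if is_owner else self.public_fields
        return {field: album[field] for field in fields}


album = {"name": "Covers Vol. 1", "track_count": 10,
         "revenue_cents": 12345, "owner_id": "josh"}
spec = {"expand": BooleanArgument(), "limit": IntegerArgument()}
args = parse_arguments(spec, {"expand": "true", "limit": "5"})

resource = AlbumResource()
public_view = resource.render(album, Context("visitor", args))
owner_view = resource.render(album, Context("josh", args))
print(public_view)
print(owner_view)
```

Keeping the permission decision inside `render` means the endpoint code never has to branch on who is asking; it just passes the context along.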
50:03 So I'm going back to my open source like website thing that I'm working on, it's really easy for me to quickly build, and I think in a day when I was using Pale I was able to scrape Bandcamp, put all my albums into the API, have a situation where I could write a React front end, and actually show the albums, have search for the albums, have album specific pages, have those streams play off Bandcamp and off Spotify, literally within a day. And that feels ridiculous to me for some reason, because I guess when I got started doing this in 2012, 2011, whatever it was, it was not that fast, like especially to do
50:39 Michael: It was like they got a three-month project or something right?
50:42 Josh: Right, and do it in a RESTful way, especially I guess at that time there wasn't necessarily as large of a push for it, but Pale's got this real big emphasis on being RESTful, which is the most intelligent way to do an HTTP API at this point. Like, it's just beautiful, you get to specify endpoints as classes, and you can determine this endpoint and this endpoint might share the same URL, but they have different methods. And it's very straightforward.
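As a rough illustration of "endpoints as classes" where two endpoints share a URL but differ by HTTP method, here is a small framework-free sketch. It does not use Pale, Flask, or webapp2; the `Endpoint` base class, the `ROUTES` table, and `dispatch` are all hypothetical names invented for this example.

```python
ALBUMS = {"1": {"name": "Covers Vol. 1"}}  # toy in-memory store


class Endpoint:
    url = None
    method = None

    def handle(self, request):
        raise NotImplementedError


class GetAlbum(Endpoint):
    url = "/albums/<id>"
    method = "GET"

    def handle(self, request):
        return {"status": 200, "album": ALBUMS[request["id"]]}


class UpdateAlbum(Endpoint):
    url = "/albums/<id>"  # same URL as GetAlbum, different method
    method = "PUT"

    def handle(self, request):
        ALBUMS[request["id"]].update(request["body"])
        return {"status": 200, "album": ALBUMS[request["id"]]}


# Route on the (method, url) pair so endpoints can share a URL.
ROUTES = {(cls.method, cls.url): cls() for cls in (GetAlbum, UpdateAlbum)}


def dispatch(method, url, request):
    endpoint = ROUTES.get((method, url))
    if endpoint is None:
        return {"status": 405}  # no endpoint registered for this method
    return endpoint.handle(request)


before = dispatch("GET", "/albums/<id>", {"id": "1"})["album"]["name"]
dispatch("PUT", "/albums/<id>", {"id": "1", "body": {"name": "Covers Vol. 2"}})
after = dispatch("GET", "/albums/<id>", {"id": "1"})["album"]["name"]
print(before, "->", after)
```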
51:08 Michael: That's really cool, yeah. Very nice.
51:09 Josh: In addition to that, to go back to this argument concept, I can have a URL argument and take advantage of webapp2's uri_for, for example, and just dynamically determine what the URL should be for a canonical resource. And that's, like, a lot about Pale in a small amount of time, but yeah, I mean that's why I'm excited about it.
51:29 Michael: Ok, cool yeah, I hadn't heard of it, so I'm definitely going to check it out that sounds awesome.
51:33 Josh: I like it. There's one example app that's included in the code base and I would like to 51:38 that by having more examples. I'm talking about this thing that I'm building, it's not quite open source, because when you start your project you want to make sure it looks nice when you put it out and that it's usable for others, but that'll actually be a good example of an App Engine Pale integration that people can throw up on their website very quickly. If all goes well, I think that the more technically inclined musicians that are trying to do some really interesting things with their websites might transition to it, that's the goal.
52:02 Michael: Yeah that sounds very cool. So, one of the sort of hot topics right now given that we're coming up on the end of life for Python 2 in 2020 is whether people are using Python 2 or Python 3, how about you guys?
52:17 Josh: Yeah it's 2.7, so these are things that we have to face. App Engine's got Python 3 support as of relatively recently I think. Fortunately, in terms of how we've written the code so far, I've done some early assessments and I don't think it's going to be too problematic to port over, it's just one of those things whenever you're doing like a big version change that it's like, oh my gosh how are we going to gracefully transition, are we going to be able to do it in time, oh my gosh what's this and that. And I think Python says that they're going to stop in 2020, it's going to be the end of life, but I don't think that we're going to functionally see Python 2 die, in the same way that Josh Whelchel still plays Doom on his computer and it's not the new Doom. [laugh]
52:59 Michael: I think Python 2 is going to die like Windows XP is dying, which is to say that I go to plenty of places and they take my credit card and it interacts with the thing that has Windows XP on it, which makes me nervous but they do it, or I think the US military at least for a time was paying Microsoft extra for support even when it became unsupported, things like this. I think that's how Python 2 is going to go.
53:26 Josh: Like pip install requests and then pip is going to be like installing Python 3, suck it. [laugh]
53:34 Michael: No Kenneth what are you doing? Yeah, exactly. How awesome. All right, so you talked a little bit about having a lot of cool opportunities for people and bringing out some new folks that were able to really make the machine learning zing and how awesome that was. Some of my most popular episodes are when I talk about people getting jobs or sort of what you could do with Python, and so on. Do you guys have open positions you're looking for Python people?
54:05 Josh: Yeah, so we've actually just put out a call for a senior backend engineer, and if you've got machine learning experience that's going to be absolutely a big plus, the same goes for like natural language processing, that sort of thing. My expectation is that we're going to see a blossoming of opportunities at Loudr in the next, probably after a window of three to six months, somewhere in there, just simply looking at the growth in terms of the clients that are integrating with us at this point and our need to scale to the amount of clients that we'd like to. There's definitely going to be some ripe opportunities, especially in machine learning. Right now, we have some really talented Python contractors doing some really dedicated machine learning work, and actually if you've got experience with machine learning and you are a contractor particularly, and you want to get your hands wet or dirty or whatever, in a practical example where Python machine learning actually comes into a business practice, then you should definitely give me a- don't call me, email me for sure.
55:00 Michael: That's awesome, you can give me some link that we can use for them to get in touch in a proper way that is not waking you in the middle of the night. All right, so we are kind of getting near the end of the show, and let me ask you a couple of questions that I always ask my guests at the end. We talked a little bit about some of the PyPI packages that you are using, but you know there's 80,000 of them, and there's always, you know- I know, it blows my mind, it might be quite a bit higher, I need to look probably. There's so many that are amazing, like this Pale one that you told me about, that people don't necessarily know about, but you have a lot of exposure, so is there one you really like, is that Pale or is that another one?
55:46 Josh: I can't say Pale because that would just be biased but-
55:52 Michael: Well I'll nominate Pale for you and you can pick another one.
55:55 Josh: There's just so many Python packages it's like hard to choose, like I think you've probably gotten the answer requests a lot, I would think. I don't want to be lame, I'll make it music related. There's one called mutagen which is great for handling metadata inside of an mp3 or WAV file, that saved us a lot of time in terms of just being able to look at what's inside of files. The trick that came with it of course is, and I think this might be why it's my favorite, it required me to solve a problem, it required me to iterate on it a little, and I like that kind of interaction with anything I'm working with. If I actually have to get my hands dirty to solve a problem it has, I'm going to have a better experience with it, and I know that might sound strange considering a lot of people say oh yeah, I put it in and it works, but I find that when you get to actually modify it, you learn and understand what you're working with a lot better, and mutagen was one of those. So one of the things that we had to solve was how to get mutagen to look at the metadata, or the ID3 tags if you will, inside of a file object rather than inside of something that was just a file name or on the file system, and it initially hadn't been written that way. It was actually written in a way where most of the architecture was working with an open file resource, but for whatever reason the API itself, in the opening and all that stuff, wasn't right there at the time, and that's probably changed, it has been a few years since I had to do this. But I really do enjoy using mutagen because it's just like, any time you can bring music into Python and solve some problem like that, so that I don't have to call a command-line, I don't have to get into the bytecode or do any like frame decompression, yada yada, I'm going to really appreciate it, and I think mutagen is that for me.
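To show what "looking at the ID3 tags inside a file object rather than a file name" can mean in practice, here is a small standard-library sketch that reads only the 10-byte ID3v2 header from any binary file-like object. It deliberately does not use mutagen, and `read_id3v2_header` is a hypothetical helper name; a real application would rely on a tagging library for the frames themselves. The header layout used here (the `ID3` marker, two version bytes, a flags byte, and a 4-byte synchsafe size) follows the ID3v2 format.

```python
import io


def read_id3v2_header(fileobj):
    """Parse the ID3v2 header from a binary file-like object, or return None."""
    header = fileobj.read(10)
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    major, revision, flags = header[3], header[4], header[5]
    # The size is "synchsafe": four bytes carrying 7 significant bits each.
    size = 0
    for byte in header[6:10]:
        size = (size << 7) | (byte & 0x7F)
    return {"version": (major, revision), "flags": flags, "tag_size": size}


# An in-memory "upload": no file name and nothing on the file system.
fake_mp3 = io.BytesIO(
    b"ID3" + bytes([3, 0, 0]) + bytes([0, 0, 2, 1]) + b"\x00" * 257
)
info = read_id3v2_header(fake_mp3)
print(info)

not_tagged = read_id3v2_header(io.BytesIO(b"RIFF....WAVEfmt "))
print(not_tagged)
```

Because the function only calls `read`, the same code works on an open file, an uploaded request body, or an in-memory buffer.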
57:23 Michael: That's awesome, nice recommendation. And then, the other question is when you're writing Python code what editor do you open up?
57:30 Josh: I am a Sublime guy, I've been using a little bit more Vim these days but for some reason I still really like Sublime, I like the Python packages and I haven't gotten into Atom really like, just Sublime all the way.
57:43 Michael: All right that's definitely a popular one, that's cool, All right, Josh, any final call to action? What should people do if they have got this music problem, how do they get started working with you?
57:48 Josh: Yeah I mean, so this should be fun, if you want to build a website sort of A la SoundCloud or something, and you want to do it and make sure you don't get your butt sued when you do it, you would want to talk to Loudr, so if you ever like streaming music on the Internet or you have ever distributed content or let's say that you are a small label actually I kinda like this example because we work with a few distributors, so like if you are looking to solve this problem that you have where you have artists or you are an artist yourself and you need to distribute cover songs or you need to secure mechanical license you should definitely get hold of us, because we are solving this problem in a way that's very technical and scalable which means that the 58:35 maturity of the business level of integration you're going to have access to some really cool api's. The stuff we're working on now is really exciting and I think it's going to bring the level of technical understanding in the music industry to the next degree. I think the joke about the music industry and technology is it's about 10 years behind and that's because I think technology has typically screwed the music industry in some form or another. You can have whatever opinion you want on it but I think there is some truth to that statement. So I would say if you're getting into being a label or especially if you're doing some cool integrations and you're maybe automating your distribution or something like that, or you like working with software, talk to us at Louder like we'll help you solve this problem in a really cool fun way for you, because it can be done in code, and who doesn't like to solve all their problems in code.
59:23 Michael: Well, I think the people listening really do like to solve their problems in code, so it will definitely be well received. All right, Josh, thanks for sharing your story. It's really interesting what you guys are up to.
59:36 Josh: Absolutely, thanks so much. I hope that we didn't go on too many crazy things; that was a lot of fun.
59:40 Michael: Yeah, that was definitely fun, talk to you later.
59:41 Josh: Yeah, bye.
This has been another episode of Talk Python To Me.
Today's guest was Josh Whelchel and this episode has been sponsored by Rollbar and Snap CI. Thank you both for supporting the show!
Rollbar takes the pain out of errors. They give you the context and insight you need to quickly locate and fix errors that might have gone unnoticed until your users complained. As Talk Python To Me listeners, track a ridiculous number of errors for free at rollbar.com/talkpythontome
Snap CI is modern continuous integration and delivery. Build, test, and deploy your code directly from GitHub, all in your browser with debugging, Docker, and parallelism included. Try them for free at snap.ci/talkpython
I'm so excited to finally be able to unveil my Python for Entrepreneurs course that I told you about at the top of the show. If you want to learn Python web development and launch an online business, you owe it to yourself to check out the Kickstarter at talkpython.fm/launch. I hope we can build something amazing together.
You can find the links from the show at talkpython.fm/episodes/show/70
Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, Google Play feed at /play and direct RSS feed at /rss on talkpython.fm.
Our theme music is Developers Developers Developers by Cory Smith, who goes by Smixx. Cory just recently started selling his tracks on iTunes, so I recommend you check it out at talkpython.fm/music. You can browse his tracks for sale on iTunes and listen to the full-length version of the theme song.
This is your host, Michael Kennedy. Thanks for listening, I really appreciate it.
Smixx, let's get out of here.