« Return to show page
Transcript for Episode #51:
SigOpt: Optimizing Everything with Python
You've heard that machine intelligence is going to transform our lives any day now. This is usually presented in a way that is vague and nondescript.
This week we look at some specific ways machine learning is working for humans. On Talk Python To Me you'll meet Patrick Hayes, the CTO at SigOpt, whose goal is to accelerate your machine learning by "optimizing everything". That's a pretty awesome goal!
Listen in on this episode to learn all about it!
This is episode number 51, recorded March 3rd 2016.
Welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem and the personalities. This is your host Michael Kennedy. Follow me on Twitter where I am @mkennedy, keep up with the show and listen to past episodes at talkpython.fm, and follow the show on Twitter via @talkpython.
This episode is brought to you by Hired and Snap CI. Thank them for supporting the show on Twitter via @hired_hq and @snap_ci.
A couple of quick updates for you before we get to the interview with Patrick. First I want to thank all of you who participated in helping me launch Talk Python Training through Kickstarter. I finished the course, got the website online at training.talkpython.fm, and added all the Kickstarter backers who have filled out the survey to the course.
The feedback so far has been really positive. I got this from one student just today: "Absolutely loving the course. I learned so much, not just Python principles but your methodology with designing an app. Your lessons in PyCharm have really helped a ton too. I've bought so many books and online classes but nothing has brought it all together like you have, so a big thanks to you Michael!" To celebrate the launch of my course, I am giving away a free seat to a friend of the show. Just enter your email on talkpython.fm to be eligible to win. I'll draw a winner at the end of the week.
Next, I had the honor of telling the story of my career, how I went from super junior, know nothing developer to development instructor, to owning my own business and launching the show. I was on Developer On Fire podcast last week, if you are interested in these kinds of stories, check out Dave Rael's podcast at developeronfire.com. My episode was 112.
Now, let's get to this interview with Patrick.
2:29 Michael: Patrick, welcome to the show.
2:30 Patrick: Thanks Michael.
2:31 Michael: Yeah, it's super exciting that you are here. I am ready to optimize everything.
2:35 Patrick: Perfect, that's what I am here for!
2:35 Michael: Very cool. Now, before we optimize everything, let's start at the beginning like- what is your story, how did you get into programming?
2:42 Patrick: Sure. So I started when I was a kid. I was young, on my parents' computer, playing around with Visual Basic or Java or things like that, just making little toys myself. They sent me to computer camp at one point and that was when I started getting excited about it. Then I went to the University of Waterloo where I studied computer science, and there of course I got much better fundamentals and started building my career from there. And then I started with Python on one of my first internships. I was an intern at Blackberry, and this was in 2007. At that point my programming experience was still pretty junior, and I was used to Java and these kinds of verbose languages. Then one of my previous colleagues had written a script in Python and one of my tasks was to update it, so I got in there, started playing with it, and realized actually this Python stuff is pretty cool, I can work more quickly. That is when I started becoming a big fan, and I've been one since.
3:32 Michael: A lot of people who don't work in Python have this sense that it's kind of like bash but you can make websites with it.
3:39 Patrick: [laugh] Right.
3:40 Michael: or, you know, they don't see the entire landscape of all the amazing stuff you can do, and so a lot of us when we come to it, we are like, "Oh, it's just a little script... oh my goodness, look, there is PyPI, and oh my gosh, look what else you can do!"
3:56 Patrick: Exactly.
3:56 Michael: That's awesome.
3:57 Patrick: Increase in productivity, you know, is like night and day from what I was used to.
4:02 Michael: So what did you study in college, and what did you work on at Blackberry?
4:05 Patrick: I studied mathematics, computer science and also pure mathematics. So I went to the University of Waterloo, and there you do six internships. I was at Blackberry for my first one, and there I was working on the automated testing team.
4:19 Michael: Oh, that's cool.
4:19 Patrick: Yeah, I was just getting started, kind of my very first internships, so it was good.
4:24 Michael: I think it's cool that the university has six internships, because a lot of times you come out of college and you are like, ok, now what?
4:30 Patrick: Yeah, it makes a huge difference, definitely I felt that in my career. You graduate and you have six four-month internships, so you basically have almost two full years of work experience at that point. So it's really night and day versus if you haven't done an internship, in terms of what kind of practical experience you have. Anyway, it complements the education well, because what you learn in class is very theoretical and good fundamentals.
4:52 Michael: Yeah, somewhat detached.
4:55 Patrick: Yeah, exactly, but then when you get into the real industry you are really learning totally different things, and you need both to be a good engineer in this day and age.
5:01 Michael: Right. Of course. So the company that you guys founded, SigOpt, has machine learning as a sort of central feature that it provides, right. So, could you tell everybody a little bit about just the mechanics of machine learning and describe how some of the pieces work, before we get into the details?
5:21 Patrick: How machine learning works?
5:20 Michael: Well, yeah, so let's just take a concrete example, right, suppose I've got some data about a bunch of shoppers and I've got a store, and I'd like to optimize something- give me a sort of how you go through that with machine learning.
5:34 Patrick: Yeah, so a great example: if you have a store and you have a bunch of historical purchase data, maybe you want to minimize the number of fraudulent purchases you receive, because that affects your bottom line; every fraudulent purchase costs you money. The problem that machine learning can help you solve is: given that you have all this historical data that you have built up over the years, how can you use that to make predictions in the future, about events you haven't seen yet? So if I have seen a million purchases in the past, I know things about those purchases: I know the country it came from, I know how much money the purchase was for, whether it was from a sketchy IP address or something like that. And I also know the truth you need to learn from, which is: was that purchase fraudulent or not? And now you have this data set of past truth, and then you can apply machine learning methods on that data to produce a model that will make predictions in the future.
6:22 Michael: Ok. So basically as people are checking out you can ask, hey, should I let this go through even though the credit card system said yes, right?
6:29 Patrick: Exactly, so even though you don't know for sure yet whether it is fraudulent, you can make a guess, a good guess, based on the data you see. A good machine learning model will be as accurate as possible. It always has some false negatives and false positives, you know, saying a purchase is fraudulent when it might not be, or vice versa, thinking a fraudulent purchase is ok when it actually is fraudulent. But you want to minimize that of course, and a good machine learning model and machine learning process will reduce that.
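The workflow Patrick describes, learning from labeled historical purchases and then scoring new ones, can be sketched in a few lines of Python. This is not SigOpt's code; it is a minimal logistic-regression-style classifier on made-up features (a normalized purchase amount and a sketchy-IP flag), trained with plain gradient descent:

```python
import math

def train_fraud_model(purchases, labels, epochs=2000, lr=0.1):
    """Fit a tiny logistic regression: features -> probability of fraud."""
    n_features = len(purchases[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(purchases, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted fraud probability
            err = p - y                       # gradient of the log loss
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict_fraud(model, x):
    """Score a new, unseen purchase against the learned model."""
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Made-up historical data: [normalized amount, sketchy_ip flag],
# plus the "past truth": was each purchase fraudulent?
history = [[0.1, 0], [0.2, 0], [0.9, 1], [0.8, 1], [0.15, 0], [0.95, 1]]
truth = [0, 0, 1, 1, 0, 1]

model = train_fraud_model(history, truth)
print(predict_fraud(model, [0.9, 1]))  # risky purchase: scores well above 0.5
print(predict_fraud(model, [0.1, 0]))  # benign purchase: scores well below 0.5
```

In practice you would reach for scikit-learn rather than hand-rolling the training loop; the point is just the shape of the problem: features plus past truth in, a probability of fraud out.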
6:50 Michael: Let's talk about SigOpt a little bit, your theme or your tagline is to help customers optimize everything.
6:58 Patrick: You got it.
6:59 Michael: That's a pretty audacious goal so, what is this company you guys founded?
7:03 Patrick: So, SigOpt is a SaaS optimization platform that helps companies build better machine learning models with less trial and error. We have an API that exposes the most cutting-edge optimization research to our users, to help them increase the accuracy of their machine learning models as quickly as possible. So in the example of a fraud detector, our APIs would help them tune that model to really reduce those cases where it would be inaccurate. And we want to get our users the most accurate models as quickly as possible; we don't want them to spend days or weeks or months trying different possibilities, kind of in the dark.
7:33 Michael: All right, that makes sense. It sounds like a big challenge, but it makes sense. So you said you are a SaaS company; do you actually run the machine learning algorithms, do you provide the compute and data processing, or do you somehow just say, hey, you should feed these inputs to scikit-learn along with your data?
7:54 Patrick: Yeah, that's a great question. The way we help our customers is that they have their own models that they are running on their own clusters with their own data. So we are not actually running the models for them, and we are also not looking at any of the sensitive data. What we are doing is, our APIs are able to suggest what is the next version of your model you should try. The models have parameters, or variables, that affect how well the model works. And finding the best values for those parameters is a very tough problem, and that is what SigOpt solves in the most efficient way possible.
8:24 Michael: Ok, that's really interesting. Where did the idea for this company come from?
8:28 Patrick: My co-founder Scott was doing a lot of research in this field during his PhD at Cornell, and he had this great idea for how to build this kind of black box optimizer. That was the beginning of the idea, and we've built the company around it, put an API around it to make it really easy to use, and to handle all the difficult computations that are required to serve these optimal suggestions.
8:52 Michael: Ok, do you have some interesting stories around stuff that you help people optimize? I mean, I heard one of them was something like shaving cream and another was synthetic rhinoceros horns, that's quite varied, right?
9:05 Patrick: Exactly, correct. So, I've been talking about machine learning here, but really it's able to optimize any problem where trying something new is very expensive, because we want to get you the best version of your product in as few tries as possible. That is true for a machine learning model, where you have to train it or test it on real user traffic and that's very expensive, but another thing that's really expensive is these chemical processes. So we have one customer who makes shaving cream, and the expensive part for him is to go to the lab, mix up the ingredients into a jar or something, wait 100 hours for it to settle, and then test the quality of the shaving cream. And based on that, he wants to find the shaving cream that makes the most profit. In this case, there are many ingredients that go into shaving cream, and mixing them in different amounts is a very chaotic process. As a chemist, he is an expert in his field, so he is able to say, "Oh, this acid should probably be somewhere between 5 ml and 7 ml," but at some point he gets to this area where he doesn't have as much knowledge about the interactions. That is where SigOpt comes in.
10:07 Michael: Right, it gets down to actual experimentation rather than theory, right?
10:10 Patrick: Yeah, trial and error. So now SigOpt is able to model these chaotic effects, and it will suggest: here is the next version of your shaving cream you should try. Then he goes back into the lab, makes another batch with what we suggested, tells us how well it did, and then we use that to update our model and give another best suggestion. Before using SigOpt, this chemist had mixed up a hundred different trials, just keeping track of it in his lab notebook, and hadn't really gotten anywhere. But then he uploaded his data to SigOpt, and now SigOpt is able to take the data, analyze it and say, hey, here is what you should try next. And within just five new batches he already had a new version of his shaving cream that was 20% more profitable.
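The suggest-and-observe loop Patrick walks through can be sketched like this. The class below is a hypothetical stand-in, not the SigOpt service: its "suggestions" are plain random search inside the chemist's stated ranges, where the real product would use much smarter Bayesian methods:

```python
import random

random.seed(0)  # deterministic for the example

# Hypothetical stand-in for an optimizer like SigOpt: it suggests the next
# batch to try and remembers the best result reported so far.
class MockOptimizer:
    def __init__(self, bounds):
        self.bounds = bounds                 # e.g. {"acid_ml": (5.0, 7.0)}
        self.best_params = None
        self.best_value = float("-inf")

    def suggest(self):
        """Propose the next recipe to try (here: random search)."""
        return {name: random.uniform(lo, hi)
                for name, (lo, hi) in self.bounds.items()}

    def observe(self, params, value):
        """Record how a suggested recipe actually performed."""
        if value > self.best_value:
            self.best_params, self.best_value = params, value

# Hidden "lab result" the optimizer never sees directly; in reality this is
# 100 hours of waiting for a batch of shaving cream to settle.
def run_batch(params):
    return -(params["acid_ml"] - 6.2) ** 2   # profit peaks near 6.2 ml

opt = MockOptimizer({"acid_ml": (5.0, 7.0)})
for _ in range(25):                          # each iteration = one lab batch
    params = opt.suggest()
    opt.observe(params, run_batch(params))

print(opt.best_params, opt.best_value)       # best recipe found so far
```

The point of a smarter optimizer is to shrink that loop count: instead of a hundred blind batches, use each observation to pick the next batch more intelligently.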
10:47 Michael: Wow, that's really cool. And you are modeling physical stuff; it's not just that you are taking in a bunch of data and processing it, you are helping this guy make shaving cream.
10:58 Patrick: Yeah, exactly, in this case it is helping with this very physical problem. And you also mentioned this, you are right, there is another customer using us for synthetic rhinoceros horns, and this is the same thing. They want to make a rhino horn that has these properties and there are various variables that go into that, like how much of different chemicals you use. And we are helping them with that process as well.
11:15 Michael: That's cool, I have to ask, like why do they want a synthetic rhinoceros horn?
11:17 Patrick: It's a great question. My understanding is there is a large black market for rhinoceros horns. There are two key things here: they want to reduce that black market, they want to stop poaching of rhinos, and one way to do that is to flood the market with these synthetic horns. Right now you can actually sell these synthetic horns for a very high price because they mimic the very expensive real rhino horns, but the synthetic ones can be produced at a much larger scale. So once the market is flooded with horns that are chemically identical but much cheaper to produce, it is no longer profitable to actually poach a rhino.
11:48 Michael: That's a really good thing for everyone. That's awesome. What other sort of stories do you have about what you helped people optimize?
11:55 Patrick: Yeah, we also have customers doing synthetic egg whites, music recommendation models; it can really be applied across the board on all these very difficult black box problems.
12:05 Michael: Ok. One of the things I think is pretty awesome is that some of the stuff at the heart of what you are doing is actually open source software, right? Some of the algorithms and whatnot. And yet, you are building a business on top of it. I talked to a couple of companies lately, and the show that is about to come out next week has a really cool story of taking something open source and turning it into a business. I want to dig into that, but one of the places you guys started was YCombinator, right?
12:32 Patrick: Yeah. That's correct.
12:33 Michael: What was it like being in YC?
12:34 Patrick: It was a lot of fun, it was very exciting, definitely a lot of emotions. We had a lot of fun but you also have a lot of stress, you are very busy, you learn a lot, so it was a really fantastic experience. When my co-founder and I started, we didn't really know much of anything about how to build or run a business, and just the first three months at YCombinator, it's very transformative. You learn what to think about, what not to think about, how to stay focused, how to talk to users and build something that your customers want. And then, 13:05 that's when you present to a bunch of angel investors, getting the fundraising you need. So that makes a huge difference; we wouldn't be where we are today if we hadn't gone through YCombinator.
13:15 Michael: That's awesome, so how do you actually get into YCombinator, is there like a call for applications, are there like open demos, what's the story, like how if someone out there is listening like I would like to be a part of YCombinator, what do they do?
13:29 Patrick: They may have changed it a little bit since I did it, I applied in 2014, but essentially, yeah, twice a year they have an open call for applications, and they ask what you are building and who you are building it with. With the application, they ask do you have any customers, do you have any traction, do you have any revenue, what have you built so far. And then they also ask a lot of questions about you as the founder: what have you built in the past, where have you worked. One of the notable questions they ask is what is a system you've hacked in the past, not hacking in the infiltrating-servers sense, but what is a creative thing you did in the past to skirt around some rules, not breaking the rules, but doing something clever and creative.
14:08 Michael: That's really cool, would you recommend other people go in there and try it, like you found it to be a pretty positive experience.
14:13 Patrick: Yes, I would say it's positive, and I mean, I can't recommend it for everyone, I don't know everyone's specific situation, but I would definitely recommend it. It really did take me and my co-founder from a state of no experience to 14:30 company in a very short time.
14:32 Michael: Yeah, I think one of the challenges of starting a company, especially technical companies, is that the founders are technical, right? Part of starting a company, especially one like yours that is a highly technical science and math sort of company, requires that, but that's not enough for user growth and business deals, and marketing and growth hacking and connections. That's a really big challenge; YC helps with that, right?
15:00 Patrick: So both me and my co-founder were technical, so we had 15:05 skills that we had no experience with. What I commonly say is, before, I got 15:06 I got to work on what I was good at every day: I was an engineer, I worked on engineering problems. Now, as a co-founder, I work on things I am bad at every day, which is very different. You spend more time trying to get customers, or building the company, thinking about culture, thinking about hiring, or thinking about marketing or sales, all these things that I had no experience with prior to starting the company. So we definitely got a lot of exposure to totally new things, which are essential to running a successful business, but you kind of have to learn it on the fly. So YC 15:38 for this. Also, the network is really helpful; there are over 1000 people in the YCombinator network at this point, you know, past founders, and it's a very friendly and welcoming community. You can message them if you ever need advice, or have questions or things like that.
15:53 Michael: Oh yeah, that sounds super helpful.
This episode is brought to you by Hired. Hired is a two-sided, curated marketplace that connects the world's knowledge workers to the best opportunities.
Each offer you receive has salary and equity presented right up front and you can view the offers to accept or reject them before you even talk to the company.
Typically, candidates receive 5 or more offers in just the first week and there are no obligations ever.
Sounds awesome, doesn't it? Well, did I mention the signing bonus? Everyone who accepts a job from Hired gets a $2,000 signing bonus. And, as Talk Python listeners, it gets way sweeter! Use the link hired.com/talkpythontome and Hired will double the signing bonus to $4,000!
Opportunity is knocking, visit hired.com/talkpythontome and answer the call.
17:05 Michael: When you got through Demo Day, I saw that you guys actually had two seed rounds, one for $120,000 and one for $2 million. Congratulations! That's pretty exciting, right? Was that first part sort of the end of YC?
17:19 Patrick: Yeah, the first part is from YC, and then at the end of YC there is Demo Day, and that is when we raised our larger round; that was to get us to the next stage, so we could start from there.
17:29 Michael: Yeah, so what is it like to get YC money?
17:33 Patrick: That's another skill that you don't have any exposure to before starting a company, or at least I didn't. So yeah, fundraising is a very interesting beast: you have to have your pitch ready, you spend a lot of time talking with investors, you deal with rejection, and also you are communicating in this very salesy way that you are not used to.
17:51 Michael: Yeah, you have to have a very crisp message, right?
17:53 Patrick: It's also different from selling to customers, right? You know, with customers you are talking about what their immediate needs are and how can I satisfy them now. With investors you are talking about the grand vision: what is SigOpt going to look like in 10, 20 years, how big can we possibly be?
18:09 Michael: Yeah, the customer wants to know how are you going to help me make my store or chemical process better. The 18:15 want to know how you are going to reach like large scale growth and continue user acquisition, like the customer doesn't care at all about that, right?
18:22 Patrick: Exactly. It's good talking with investors, you know; they can really give you a lot of help, they can give you a lot of great resources, especially the good investors, which we were fortunate enough to have.
18:31 Michael: It's another sort of connections and network, almost as much as anything, right? Cool. So you guys started in November 2014, right?
18:39 Patrick: That's correct.
18:39 Michael: And, you started with Python from day 1. What was the reaction within YC for using Python?
18:46 Patrick: [laugh] They give you a lot of freedom to build whatever you want, so I don't think anyone else really would have cared what we built it in, but it made a lot of sense for us as a company. You know, both me and my co-founder thought this would be a great way to get started and move quickly, especially when we had this short timeline of 3 months to get out the door before Demo Day; we made sure to be as productive as possible right from day 1. Python was a pretty natural choice.
19:13 Michael: Yeah, I guess especially in a sort of accelerator-type scenario, speed to market is more important than CPU cycles.
19:22 Patrick: Exactly.
19:23 Michael: You guys are sticking with it, right? You've been going strong.
19:47 Michael: Ok, that's cool. And is that Python 2 or Python 3?
19:50 Patrick: We are Python 2.
19:51 Michael: Ok. And, you said you considered maybe using Go or Scala or some other languages. Can you just sort of discuss that?
20:00 Patrick: Sure. So, I had some experience with both of those. I think Scala was really nice when I did use it; it's very safe, the type safety is really good, but it's still very expressive, much more expressive than Java or something, which it is based off of, and that's really nice. I think I would be less confident that we could move as quickly as we can with Python. You can be very expressive, but there are a lot of issues, like the compile times can be very slow, and also it's perhaps not as well known, so you spend some amount of time training new hires or other people to use Scala as well. I don't think that is necessarily a bad investment to make always, but when we are at this stage where we want to move very quickly, then we should stick with something that we can be confident about. And then I have some opinions about Go, which is that it is a great language and really has all these nice features, but at the time, I believe it was still less mature than it is today. I recall wanting to use an AWS library, and for Go there wasn't one at the time (I think there is now), but Python of course had a great AWS library just ready to go. So those little decisions were just like, I want to spend my time working on the core business and not tooling and not building third party packages which could already exist.
21:11 Michael: Right, there is zero business benefit in you building a Go implementation of Boto or whatever, right.
21:18 Patrick: Exactly. There is a lot of tradeoffs to be made and that's kind of where we were relating at that time, and I am really happy with it now, so I think it's- we've built a lot and I don't think we'd be where we are today if we had perhaps started with a different language.
21:33 Michael: Sure. So, how much do SciPy and NumPy and all those pieces play into this? Was that pretty critical to making it work?
21:41 Patrick: Yes, so we use a lot of that for the high-capacity computation and optimization. They are really great tools that we love, both in the prototyping phase, where we want to try something new on some data set or something like that; SciPy, scikit-learn and NumPy are fantastic for that. And they also have the performance we need to actually put them in production where appropriate. We also have, of course, things that we built ourselves that are very well tuned; as I mentioned before, typically those are in C++. But having this wide array of tools available to us in Python is essential.
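As a concrete taste of the SciPy tooling he mentions, here is what a quick optimization prototype might look like. This is a generic sketch, not SigOpt's code: the objective is a toy stand-in for "model error as a function of its parameters", and Nelder-Mead is a derivative-free method, handy when the objective is a black box you can only evaluate:

```python
import numpy as np
from scipy.optimize import minimize

# Toy objective standing in for "error of a model as a function of its
# parameters": a simple bowl with its minimum at (1, -2).
def objective(params):
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2

# Nelder-Mead needs no derivatives, only function evaluations, which is
# what makes it usable on black-box problems.
result = minimize(objective, x0=np.array([0.0, 0.0]), method="Nelder-Mead")
print(result.x)   # converges close to (1, -2)
```

This kind of few-line prototype is exactly the appeal: try an idea in minutes with SciPy, and only later rewrite the hot path in tuned C++ if the numbers justify it.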
22:12 Michael: Cool. And I heard you say AWS in there; is that where you guys run your product?
22:17 Patrick: Yeah, that's correct.
22:18 Michael: Ok, are you doing multiple data centers and things like this, or is that US East 1?
22:23 Patrick: US West. I would say we are still pretty small, so there is lots of improvement to do in a lot of areas, but specifically with AWS, we are big fans of it.
22:37 Michael: Ok, cool, yeah so am I.
Gone are the days of tweaking your server, merging your code and just hoping it works in your production environment. With Snap CI's cloud-based, hosted continuous delivery tool you simply do a git push, and they autodetect and run all the necessary tests through their multistage pipelines. If something fails, you can even debug it directly in the browser.
With the one-click deployment that you can do from your desk or from 30,000 feet in the air, Snap offers flexibility and peace of mind. Imagine all the time you'll save.
Thank SnapCI for sponsoring this episode by trying them for free at snap.ci/talkpython.
23:38 Michael: You guys have to really focus on API design, because that's one of the primary ways people interact with your entire business as a SaaS product. How do you guys think about that?
23:50 Patrick: With optimization tools like SciPy, or other open source optimization tools, one of the problems is that they are very difficult to use, for a lot of reasons. One of them is the API might be very obtuse, or not very well matched to your problem, or things like that. Another problem is administration: you have to have the servers and set them up, and you have to know what kind of capacity you need to optimize your problem; if the machines you use are too small then it will be very slow, and so on.
24:21 Michael: Right, but if you buy too many, you are just going to waste your money on big machines doing nothing, right?
24:24 Patrick: Yes, so we wanted to make sure that we can take this very valuable tool, the optimization tool, and find the easiest possible way to expose it to our customers. For that reason, one of our top priorities is having an API that is very clean, very predictable, very easy to use; that's what you want, a small number of endpoints. You don't have to dig into the nitty gritty, you are not tweaking flags in our optimization, you are just saying: here is my problem, tell me what to do next.
24:52 Michael: Right. So you really try to optimize for the simple getting-started case and then provide additional features as needed? How do you do that?
25:00 Patrick: We want to make sure that, yeah, getting started is as easy as possible. So if you have your problem, you can use our API to define it and say, "here are the parameters that I am searching over"; you can also use our API to say, "here is what I have tried so far"; and then our API will tell you what to try next. So it's very simple, like three calls: one to get started, one to ask for a suggestion, and one to report an observation. That is actually all a user needs to get started with SigOpt. And then, to the extent that there are expert-level flags, like if you really know that you want this particular optimization behavior, we want to make sure it's possible, but we really don't want users to have to know about it or be confused by it just to get access to the really powerful tooling inside SigOpt.
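The three-call flow he describes (define the problem, ask for a suggestion, report an observation) looks roughly like the sketch below. This is a paraphrase of the shape, not the actual SigOpt client library; the class and method names here are made up:

```python
import random

class Experiment:
    """Hypothetical client mirroring the three-call flow described above."""

    def __init__(self, parameters):
        # Call 1: define the problem, i.e. the parameters to search over.
        self.parameters = parameters
        self.observations = []

    def suggest(self):
        # Call 2: ask what to try next (here: plain random sampling,
        # standing in for the service's real optimization logic).
        return {p["name"]: random.uniform(p["min"], p["max"])
                for p in self.parameters}

    def observe(self, assignments, value):
        # Call 3: report how that suggestion performed.
        self.observations.append((assignments, value))

exp = Experiment(parameters=[{"name": "learning_rate",
                              "min": 0.001, "max": 0.1}])
for _ in range(10):
    assignments = exp.suggest()
    # Stand-in metric; in reality you would train and evaluate your model here.
    accuracy = 1.0 - abs(assignments["learning_rate"] - 0.01)
    exp.observe(assignments, accuracy)

print(len(exp.observations))  # 10 tuning rounds recorded
```

The expert-level flags he mentions would be optional keyword arguments layered on top of this; the default path stays at exactly these three calls.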
25:43 Michael: Yeah, that seems in keeping with your overall mission of democratizing machine learning and optimization in general, right? Very cool. So, let's talk a little bit about the open source engine that is at the heart of a lot of what you are doing. Your co-founder Scott used to be at Yelp, is that right?
26:02 Patrick: Yes, that's correct.
26:03 Michael: Yeah, he worked on this thing called "metric optimization engine" or MOE.
26:07 Patrick: Yeah, that's right.
26:08 Michael: Ok, and this is sort of one of the core components of your system, right?
26:11 Patrick: Yeah, we've definitely built more on top of it, and SigOpt these days is powered by an ensemble of multiple optimization methods. But how we got started from the very beginning was this MOE that Scott had built using his expertise in the field. That was the first prototype for making this black box optimization kind of approachable, kind of easy to use, kind of something that people would want to interact with. MOE was very well received on GitHub; lots of companies and individuals were using it, and some of the very common feedback was that it is really hard to use, or perhaps it doesn't work, and then the reason it didn't work was because this one flag wasn't set. So he and I knew, and could see, that there is real value here, but the biggest problem is: how can this be even easier to use, so that companies of any size can have access to this really powerful stuff?
26:58 Michael: Yeah, that's great. So there is this opportunity: you've got this great open source tool, but you want to build a business. Can you take me through the thinking? You are still giving away this thing for free, and yet you need to make something special that customers will buy and love. What was the thought process? Just that it's too hard, and how could we make it not so hard and accessible?
27:23 Patrick: Yeah, I would say that's the biggest barrier to using MOE: it's very hard to use. I think if people still want to use MOE then they can, but they are going to have a lot of headaches with it. That's our goal: we want to be the people who remove those headaches. But as I said, we started with MOE and now we have a wide variety of other optimization methods that we've built, and those are not open source. So using SigOpt also gives you this wider array of benefits: other techniques, other cutting-edge optimization research. But you are right that it is still open source, it is still available, and it's still good for the community to have access to this open source stuff, so they can see where we are coming from, what's been built, and the background behind it.
28:07 Michael: Yeah, I think there are a bunch of stories coming out of people building really amazing businesses on top of things that they are giving away, and I think that is becoming much more common. But still, I think it's really special to see examples of it in action, like with you guys. Recently I spoke to the Scrapinghub guys, who took Scrapy and turned that into kind of a SaaS platform, web scraping as a platform if you will. There are more obvious examples like MongoDB and Red Hat, but it's really cool to see you guys turning this into a business starting from this kernel of open source. Do you still make a lot of contributions to MOE, or are there a lot of other people working on it? What's the story there?
28:50 Patrick: Some people work on it- some individuals in the open source community have been contributing to it. These days our contributions are mostly focused on these other methods. We were able to diagnose these are the problems that MOE works well for and these are the problems that MOE doesn't work well for, and ask how we as a company can attack those and help our customers who have those kinds of problems. So we spent a lot of time focusing on that, and we are still working on it. At this time that is still closed source to SigOpt.
29:17 Michael: Right, of course, you've got to keep the secret sauce a little bit secret, right?
29:21 Patrick: Right.
29:21 Michael: Awesome, so can you talk a little bit about how some of the internal systems work, we talked about your deployments being on AWS but is there like interesting architecture that you might want to talk about?
29:31 Patrick: Sure. So we have a web API that is, as I mentioned before, all Python; we use Flask in production to do the web serving. For the most part our API is a very thin layer that just accepts API requests from our customers, where they are describing their problem. But most of the work that we do, we offload asynchronously to these high capacity compute machines that are doing the optimization. So for these problems that are very expensive, we want to be able to run SciPy or these custom C++ algorithms that we are working on.
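The pattern Patrick describes- a thin API layer that accepts a request and hands the expensive work off asynchronously to separate compute workers- can be sketched in miniature with just the standard library. All the names here (`api_layer`, `worker`, the job payloads) are illustrative stand-ins, not SigOpt's actual code, and a thread plus an in-process queue stands in for the separate high capacity machines:

```python
import queue
import threading

# Toy stand-in for the architecture described above: a thin API layer
# that enqueues work and returns immediately, while a separate worker
# (standing in for the high-capacity compute machines) does the
# expensive optimization.
jobs = queue.Queue()
results = {}

def api_layer(job_id, payload):
    """Thin layer: accept and hand off; never do heavy work inline."""
    jobs.put((job_id, payload))
    return {"status": "accepted", "job": job_id}

def worker():
    """Compute worker: pulls jobs and runs the expensive part."""
    while True:
        job_id, payload = jobs.get()
        if job_id is None:  # sentinel to shut down
            break
        # stand-in for the SciPy / custom C++ optimization routines
        results[job_id] = sum(x * x for x in payload)
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()
print(api_layer("job-1", [1, 2, 3]))  # returns immediately
jobs.join()                           # wait for the async work to finish
jobs.put((None, None))
t.join()
print(results["job-1"])               # 14
```

In production the queue would be a real message broker and the worker a separate machine, but the split- cheap acceptance up front, expensive computation elsewhere- is the same.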
30:03 Michael: Ok, yeah, of course, you definitely want to buy the highest end compute machine you can. It seems to make a really big difference if you sort of double your VM on AWS- it seems like the low end is really low.
30:19 Patrick: Yeah, that's right, you have this kind of flexibility to split up your architecture that way.
30:23 Michael: Right. That's part of the beauty of the cloud, right- it's a checkbox or an API call. Are you guys doing anything with GPUs, given how computational this is?
30:33 Patrick: We do. We have experimented with that in the past, definitely with some of these algorithms which really parallelize over GPUs. That is still something we are working on and experimenting with.
30:43 Michael: Cool. So maybe you could just in a few sentences tell people what is the deal with like computation and GPUs, like isn't that for graphics?
30:51 Patrick: So I'll be honest and say this is not one of my deep areas of expertise, but my understanding is that GPUs are very highly optimized for this widely parallel computation, so definitely when you are doing these kinds of expensive, compute bound optimization techniques, you can use GPUs to get those done as quickly as possible.
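The "widely parallel" workloads Patrick mentions have a characteristic shape: the same function applied independently to every element, with no shared state. That shape is what lets a GPU's thousands of cores each take one element. A toy illustration of the shape (with a thread pool playing the role of the GPU cores- this is not GPU code, just the same decomposition):

```python
from concurrent.futures import ThreadPoolExecutor

def kernel(x):
    # the same operation applied to every element, independently-
    # exactly the shape of work a GPU can spread across its cores
    return 3.0 * x * x + 2.0 * x + 1.0

data = list(range(8))

# sequential version
sequential = [kernel(x) for x in data]

# data-parallel version: the pool stands in for many GPU cores
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(kernel, data))

print(parallel == sequential)  # True: same answer, parallelizable shape
```

Problems that don't decompose this way- where each step depends on the previous one- are the ones that don't map well to GPUs, which is the distinction Michael raises a moment later.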
31:12 Michael: Yeah. It's pretty amazing when you look at them. I mean, when I first heard about it, I was like, really- people are doing math on the GPU? But of course, if you look at video games- for a while I worked on 3D simulators, and the amount of computation those graphics cards do just to render a scene is mind boggling, for one scene, and then they do it like 600 a second, it's crazy. And that was so long ago, many years ago that I was doing that, and I was still super impressed. If you look at the parallelism- I've got a MacBook Pro and it's got a pretty high end CPU, which I think has 4 real cores, and each one is hyper-threaded so it looks like 8, but some of those graphics cards have over 1,000 cores. So comparing 8 and 1,000 for parallel work- that's pretty insane. I remember a few years ago I saw on AWS one of the machine types was like a cluster GPU, and I thought, what is that doing in the cloud? But yeah, very cool. So you said you experimented with it- is it looking promising, or-
32:17 Patrick: So far so good, yeah, it definitely seems like something we might want to go forward with.
32:21 Michael: Yeah, I haven't really tried this computational stuff with it. There are some really interesting projects, but I just haven't had that much computation, I guess. It seems like if you find a case where it works, it works crazy good, but it's not like a general computer, right- you can't just give it any problem. So there are certain types of problems and algorithms that are really appropriate, and certain ones that aren't, so I guess that's kind of a big decision, whether it makes sense or not. We are getting kind of near the end of the show, let me ask you a few questions I ask all my guests. When you are going to write some Python code, what editor do you open up?
32:53 Patrick: I use Vim, I've been using Vim for about 8 years now. I think it's pretty well optimized, I feel more productive in it than just about anything else, so that is what I use for everything.
33:04 Michael: Yeah, very cool. It's very unscientific- maybe I could pass a few data points off to your system and ask it. But I would say Vim seems to be winning the popularity battle, among my guests anyway; I am not sure if they are representative of the overall community. Yeah, very cool. And PyPI has 75,000 packages now, it's insane how many things are out there that you can just grab and bring into your apps in Python. Which ones would you recommend, or what are really important ones people should know about?
33:36 Patrick: I would be self serving and say SigOpt's API client Python package. As for ones that I use in my own personal day to day, definitely IPython- I've gotten a huge amount of value out of it. Just for the REPL, I am using IPython notebooks, and the increase in productivity over the regular REPL is kind of astounding. This is obviously a pretty common one, but I think Requests has become a necessary part of my toolkit; it feels shocking that it's not part of the standard library at this point.
34:00 Michael: Yeah, that's a really interesting comment. I agree that Requests absolutely should be out there- it's the most popular package on PyPI.
34:07 Patrick: Oh is it?
34:10 Michael: Yeah, it is so clean and so useful. I was talking to Kenneth Reitz- he was on this show- and he was saying they were considering making Requests part of the standard library. But they decided not to, because they wanted to be able to rev the features, security fixes and various other things of Requests faster than they do Python itself. So they decided to keep it separate, but to make it the recommendation- basically to recommend against using urllib and things like that, and say no, no- pip install requests, this is how we do it, let's just all agree on that. But yeah, it's pretty interesting, right- they actually considered making it part of Python, but for versioning reasons they decided not to. Ok, awesome. If people want to get started with SigOpt, what do they do, how do they get started?
34:59 Patrick: Head over to sigopt.com and sign up- we have a free trial. If you have any of these problems, whether it's a machine learning model that you want to make more accurate, or some other kind of process that you are trying to optimize, you just sign up, get started with our API and that's it.
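The core workflow behind an API like this is a suggest/observe loop: the service proposes the next point to try, you evaluate your expensive model or process there, and you report the result back. Here is a toy sketch of that loop shape- `ToyOptimizer` is a hypothetical random-search stand-in written for this illustration, not SigOpt's actual client or algorithm:

```python
import random

class ToyOptimizer:
    """Hypothetical stand-in for an optimization service (random search)."""

    def __init__(self, low, high, seed=0):
        self.low, self.high = low, high
        self.rng = random.Random(seed)
        self.best = None  # (value, x) of the best observation so far

    def suggest(self):
        """Service proposes the next point to evaluate."""
        return self.rng.uniform(self.low, self.high)

    def observe(self, x, value):
        """You report back how well the suggested point did."""
        if self.best is None or value > self.best[0]:
            self.best = (value, x)

def objective(x):
    # your expensive model evaluation would go here; this one peaks at x = 2
    return -(x - 2.0) ** 2

opt = ToyOptimizer(low=0.0, high=4.0)
for _ in range(50):
    x = opt.suggest()
    opt.observe(x, objective(x))

best_value, best_x = opt.best
print(best_x)  # should land near 2, the true optimum
```

A real service replaces the random `suggest` with something much smarter (the Bayesian methods discussed earlier), but the loop you write as a user looks the same.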
35:14 Michael: Ok, that sounds great, yeah. So, pip install sigopt, sign up- off you go, right?
35:18 Patrick: You got it.
35:20 Michael: All right, Patrick, it's been good to have you on the show, this was really interesting, you guys are doing some cool stuff and continue to optimize everything, yeah?
35:27 Patrick: Yeah, we will, thanks Michael!
35:29 Michael: All right, see you later!
This has been another episode of Talk Python To Me.
Today's guest was Patrick Hayes and this episode has been sponsored by Hired and Snap CI. Thank you guys for supporting the show!
Hired wants to help you find your next big thing. Visit hired.com/talkpythontome to get 5 or more offers with salary and equity right up front and a special listener signing bonus of $2,000 USD.
Snap CI is modern continuous integration and delivery. Build, test, and deploy your code directly from GitHub, all in your browser with debugging, Docker, and parallelism included. Try them for free at snap.ci/talkpython
Are you or a colleague trying to learn Python? Have you tried boring books and videos that just cover the topic point-by-point? Check out my online course Python Jumpstart by Building 10 Apps at https://training.talkpython.fm.
You can find the links from the show at talkpython.fm/episodes/show/51
Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes and direct RSS feeds in the footer on the website. Our theme music is Developers Developers Developers by Cory Smith, who goes by Smixx. You can hear the entire song on talkpython.fm.
This is your host, Michael Kennedy. Thanks for listening!
Smixx, take us out of here.