#11: PyImageSearch and Computer Vision Transcript
00:00 Does a computer see in color or black and white? It's time to find out on episode 11 of Talk Python to Me with our guest Adrian Rosebrock, recorded Thursday, May 20th 2015
00:00 [music intro]
00:40 Welcome to Talk Python to Me. A weekly podcast on Python the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on twitter where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpythontome.com and @talkpython on twitter.
00:56 This episode we'll be talking to Adrian Rosebrock about Computer Vision, OpenCV, and PyImageSearch.
00:56 Hello everyone. I have a bunch of cool news and announcements for you. First, this show on PyImageSearch is a listener suggested show. Thank you to @jllorencetti for reaching out to me and suggesting this topic! You can find his details in the show notes.
01:21 As always, I'm also excited to be able to tell you that this episode is brought to you by Codeship. Codeship is a platform for Continuous Integration and Continuous Delivery as a service. Please take a moment to check them out at codeship.com or follow them on twitter where they are @codeship.
01:39 Did you know most of our shows come with full transcripts and a cool little search/filter feature? If you're looking for something you heard in an episode, just click the "Full Transcript" button on the episode page.
01:50 Also, I want to say thank you to everyone who has been participating in the conversation on Twitter where we are @talkpython. It's a great feeling to see all the feedback and thoughts every week when we release a new show. But if you have something more nuanced to say or you want it more permanent than Twitter, every episode page has a Disqus comment section at the bottom. I encourage you to post your thoughts there.
02:15 This week I ran across a really awesome GitHub project called python-patterns. You can find it at http://github.com/faif/python-patterns It's a collection of really crisp design patterns implemented in a pythonic manner. For example, you'll find patterns such as the adapter, builder, chain, decorator, facade, and flyweight patterns. I think you'll learn something if you check it out.
02:43 Finally, I put together a cool YouTube playlist you should check out. This is a series of 9 lectures from Dr Philip Guo, a professor from the University of Rochester in NY. Find him on Twitter where he's @pgbovine. The video series is entitled CPython internals: A ten-hour code walk through the Python interpreter source code. Check it out at http://bit.ly/cpythonwalk
03:16 Now, let's get to the interview with Adrian.
03:16 Let me introduce Adrian. Adrian Rosebrock is an author and blogger at pyImageSearch.com. He has a PhD in computer science with a focus in computer vision and machine learning, and has been studying computer vision his entire adult life. He has consulted for the National Cancer Institute to develop methods to predict breast cancer risks using breast histology images and authored a book, Practical Python and OpenCV, on utilizing Python and OpenCV to build real-world computer vision applications.
03:54 Adrian, welcome to the show.
03:57 Thank you, it's great to be here.
03:58 I'm very excited about computer vision and sort of merging the real world with computer science and robotics. I think there is just some really neat stuff going on, and you are doing a very cool part in that.
04:10 Oh, thank you.
04:11 So we are going to talk about PyImageSearch, we are going to talk about OpenCV, and some of the challenges and even the future of these types of technologies. But before we get there, you know, everyone is interested in how people got started in programming and Python. What is your story?
04:24 I started programming when I was in high school. I started out with the basics of HTML, JavaScript, and CSS, and I did some basic programming. And you know, I will probably get a lot of hate mail about this, but when I first started learning how to program, I did not like the Python programming language so much. This was around the early version 2 of Python. I didn't like the syntax, I didn't like the white space, and for a long time I was really, really put off by Python. That was a huge mistake on my part. I don't know what was wrong with me back then, I guess it was just high school ignorance or something. But by the time I got to college, I started working in Python a lot more, and that's especially true in the scientific area.
05:07 You see all these incredible Python packages like NumPy and SciPy that just integrate with computer vision and machine learning and all of these types of libraries, and more and more people are transitioning over from languages like Matlab to languages like Python, and that's so cool. It really wasn't until college that I got into Python, and I remember this one girl in my machine learning class who had a sticker on the back of her laptop that said: "Python will save the world, I don't know how, but it will." And it resonated with me. I'm like, that sticker is true, that is absolutely true, it's such a great language. So unfortunately I didn't have the best first experience with Python, and it took me 4 or 5 years to actually come around, but now that I am here, I love it and I can't imagine programming in any other language. It's almost a freeing feeling, a relaxing zen, when you are coding in Python.
06:07 Yeah, that's a funny story; it really is a wonderful language. I also took a while to get there, but you know, looking back I would have enjoyed being there sooner, I went from Matlab to C++ so I had a bit of a torturous introduction, but it was all good.
06:07 [music]
06:07 Codeship is a hosted continuous delivery service focused on speed, security and customizability. You can set up continuous integration in a matter of seconds and automatically deploy when your tests have passed. Codeship supports your GitHub and Bitbucket projects. You can get started with Codeship's free plan today. Should you decide to go with the premium plan, Talk Python listeners can save 20% off any plan for the next three months by using the code: TALKPYTHON. All caps, no spaces. Check them out at Codeship.com, and tell them "thanks" for sponsoring the show on Twitter where they are @codeship.
06:07 [music]
07:19 So, you are focused on computer vision and image processing. Where did that story begin?
07:27 That story also started in high school. Originally, I had this idea that I wanted to go work for Adobe and I wanted to work on developing Photoshop and Illustrator. I loved the idea of being able to write code that could analyze an image. And for whatever reason it just really captured my imagination. I could see these algorithms running in Photoshop and I was like, what is really going on behind the scenes? How are they manipulating these images, what does this code look like?
07:56 So for the longest time, I really wanted to develop these graphic editing applications. But I didn't have the math experience. This might surprise some people given that I have a PhD in computer science but up until late high school I did not do well in mathematics courses, I got Cs in algebra and geometry and it really wasn't until I kind of really put my back against the wall and I said you know, I got to learn calculus and statistics and so I did a self study in AP Calc and I took AP Statistics and I did well in those, and I was like man, math is fun now, I understand this.
08:40 So, I got to college and we took one computer vision course, because the school I went to didn't really have a computer vision focus. They had a wonderful machine learning focus but not really a computer vision focus. And what I found out was that, you know, you don't need a mathematical background to get started in computer vision. I think this is true in a lot of areas of computer science, whether or not people want to admit it.
09:06 A lot of people talk themselves out of getting started with stuff, especially challenging things, because they are just scared of it, they don't want to fail. And that's the cool thing about Python: you almost do not have to worry about the code, you get to focus on learning a new skill, and for me that was computer vision and the OpenCV library, an open source library that makes working with computer vision a lot easier. So it really wasn't until college that I really started to get interested, or more so, was able to take action on what I wanted to do.
09:43 Right, maybe you know it felt a little unattainable like I'm going to go be an engineer but I don't know math so there is no way I can do this, but once you kind of get over that hump then that is not a big deal right?
09:53 Right.
09:53 So you mentioned OpenCV and your project is PyImageSearch, what's the relationship there?
10:01 So, OpenCV, again, is a computer vision library that makes working with images a lot easier. It abstracts the code that loads an image off of disk, or does edge detection, or any of the simple image processing functions like that. It allows you to actually build complicated computer vision programs: you could do things like tracking objects in images or video streams, for example, detecting faces, and recognizing whose face it is. OpenCV really facilitates this process. And OpenCV really is the de facto library for computer vision and image processing. You have bindings for it in countless languages. The library itself is written in C and C++, but you can get bindings and access it in Java, any of the .NET frameworks, and Python. So again, while you can access it from any programming language, I have this love for Python now, and when I took the computer vision course in college I realized, man, people are spending a lot of time writing their class projects in C and C++. Why are they doing that? You are fighting over these weird compile-time errors, and you are not really learning anything. And that's kind of the idea behind PyImageSearch: it's a blog that I run, dedicated to teaching computer vision, image processing, and OpenCV using the Python programming language.
11:31 That's great. I think that using Python seems like the perfect choice. You are sort of orchestrating these high level functions that are calling down into the C++, doing high performance stuff and then giving you the answer, and that seems like the right way to be using Python. So, what is the actual package I use? If I were to say pip install something, what do I type to get started?
11:52 So unfortunately OpenCV is not pip installable. I wish it was, but it is not, and it is not the easiest package to get installed on your system. If you are using Ubuntu or any Debian-based operating system you technically can do an apt-get install, but that's going to pull down a previous version of OpenCV, you are going to run into a lot of problems with the Python bindings, and it is not a very good experience. So what you actually have to do is compile it from source: download the code from their GitHub or SourceForge account and manually compile and install it. And in fact that is really the only way to do it if you are interested in using virtual environments, which I think most Python developers are.
12:42 Absolutely. Ok, so I go and download that and then what packages are in there that I would work with? Is that cv2, is that the one I would import?
12:55 Yes, so if you were to open up your favorite editor you would just type in import cv2 and it will give you access to all your OpenCV bindings.
13:04 Ok, great, and now I looked at some samples on your blog about how I might go and grab like an image from a camera. Maybe we could talk a little bit about the type of hardware that you need and the spectrum of devices you can interact with and that kind of stuff before we get into the more theoretical bits.
13:25 Sure. That's kind of the cool thing about OpenCV: it is meant to be run in real time, so you can easily process video files and raw video streams without too much of a problem, again depending on the complexity of your algorithm. And OpenCV is meant to run on a variety of different devices. I personally develop applications on my MacBook but also on a Raspberry Pi. There is a camera module for the Raspberry Pi, and using OpenCV I can access the Raspberry Pi video stream and then actually build, like, a home surveillance system using nothing but OpenCV and a Raspberry Pi.
14:08 I built this one project where I had a Raspberry Pi camera mounted on my kitchen cabinets looking over the front door of my apartment. And it would detect motion, such as when you are opening the door and someone is walking inside. So once it detected motion, it would snap a photo of whoever is walking inside, try to identify their face, and then it would take that screen capture and upload it to my personal Dropbox, so I had this real-time home surveillance system. That was really, really cool to develop and again, this is using simple, simple hardware. The Raspberry Pi is not a powerful machine, but you can still build some really cool computer vision applications with it.
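The motion detection step described here can be illustrated with simple frame differencing. This is only a minimal sketch under stated assumptions, not the code from the project: frames are assumed to be small grayscale images stored as lists of rows, and the function name and both thresholds (`pixel_delta`, `min_changed`) are hypothetical values chosen for the demo.

```python
# Hypothetical sketch: frame differencing on grayscale frames
# (lists of rows holding 0-255 intensities).

def motion_detected(prev_frame, frame, pixel_delta=25, min_changed=0.02):
    """Return True if a large enough fraction of pixels changed between frames."""
    changed = 0
    total = 0
    for prev_row, row in zip(prev_frame, frame):
        for a, b in zip(prev_row, row):
            total += 1
            # A pixel counts as "changed" when its intensity moved by more
            # than pixel_delta; this ignores small sensor noise.
            if abs(a - b) > pixel_delta:
                changed += 1
    return total > 0 and changed / total >= min_changed


# A static 4x4 scene, then the same scene with a bright object entering.
background = [[10] * 4 for _ in range(4)]
with_person = [row[:] for row in background]
with_person[1][1] = with_person[1][2] = 200

print(motion_detected(background, background))   # identical frames
print(motion_detected(background, with_person))  # 2 of 16 pixels changed
```

In a real system the frames would come from the camera stream, and a positive result would trigger the photo capture and upload.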
14:50 Yeah, and it's cheap too, right?
14:51 Yeah, the Pi itself is I think $35 and probably another $20 for the camera module.
14:59 Yeah, that's really easy to get started. So, very cool, very cool. How does computer vision work? I mean, I have a little bit of a background in trying to identify things in images, I worked at this place called "IYI tracking" and we did a lot of stuff with image recognition and detecting eyes; I know enough to know that it seems really hard, but how does that work?
15:26 So, computer vision as a field is really just encompassing methods for acquiring, processing, analyzing, understanding and interpreting the contents of an image. For humans this is really, really easy. We see a picture of a cat and we are like, that's a cat; we see a picture of a dog and, you know, we obviously know that's a dog. But a computer doesn't have a clue. It just sees a bunch of pixels, just a big matrix of pixels, and the challenging part, as you suggested, is writing code and writing these algorithms that can understand the contents of an image. You can't open up your Python source file and write if statements that say, if this pixel equals whatever RGB code then it is a cat, and if it equals this pixel value then it is a dog. You can't do that.
16:15 So, what happens is computer vision really leverages machine learning as well. We can take this data driven approach and say "here is a ton of examples of a cat, and here is a ton of examples of a dog". Let's see how we can abstractly quantify and represent this huge image as just a small, what they call, feature vector. It's a fancy academic way of saying a list of numbers. I'm going to quantify this big 3000 by 3000 pixel image into a feature vector that is 128 numbers long, and then I can compare them to each other, I can rank them for similarity, I can pass them to machine learning algorithms to actually classify them. So, the field of computer vision is very large and again it spans so many different areas of processing and analyzing images, but if we are talking strictly about classifying an image and detecting objects in an image then we are most likely leveraging some machine learning at some point.
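To make the "list of numbers" idea concrete, here is a minimal sketch that quantifies a grayscale image as a normalized intensity histogram and compares two images by Euclidean distance. The histogram descriptor, the bin count, and the tiny 2x2 "images" are assumptions picked for illustration; the interview does not say which descriptor is meant, and real descriptors are far richer.

```python
import math


def histogram_features(pixels, bins=8):
    """Quantify a grayscale image (rows of 0-255 ints) as a normalized histogram."""
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            # Map the 0-255 intensity onto one of `bins` equal-width buckets.
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def euclidean(a, b):
    """Distance between two feature vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up example images: two dark ones and a bright one.
dark = [[10, 20], [30, 15]]
also_dark = [[12, 25], [28, 5]]
bright = [[240, 250], [230, 245]]

f_dark, f_also_dark, f_bright = map(histogram_features, (dark, also_dark, bright))

print(euclidean(f_dark, f_also_dark))  # distance between the two dark images
print(euclidean(f_dark, f_bright))     # distance between a dark and a bright image
```

Whatever the size of the input image, the output always has `bins` entries, which is what makes the vectors directly comparable and usable as input to a classifier.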
17:15 Ok, cool. And when you say machine learning, is that like neural networks or what is going on back there?
17:22 The machine learning algorithm you would use really depends on your application. Deep learning has gotten so much attention over the past few years, and deep learning has its roots in neural networks, so we see a lot of that. You also see very simple machine learning methods like support vector machines and logistic regression, you see a lot of that as well. And while these methods are simple, the bulk of the work is actually happening in describing the image itself, you know, quantifying it. So if you have a really good quantification of an image, it's a lot easier for the machine learning algorithm to take that and perform the classification.
18:06 Right, sure, and how much of this exists in external libraries like scikit-learn or OpenCV or something like this, and how much of that is like I've got to create that system for myself when I am getting started, based on my application?
18:24 So OpenCV does include some machine learning components but I really don't recommend that people use them, just because they are a little finicky and they are not fun to use. And especially in the Python ecosystem, you have scikit-learn so you should be defaulting to that. And, to give an example, I wrote my entire dissertation and gathered all the examples using OpenCV and scikit-learn. I took the results that OpenCV was giving me and I passed them on to the machine learning methods in scikit-learn.
18:59 Right. It sounds very useful, I think a lot of the challenging aspects of getting started in something new like this if you are not already involved in it is just knowing what exists, what you can reuse, and what you have to write yourself, so knowing that it is out there is really nice.
19:14 Yeah, for sure, and some of these algorithms you definitely do not want to be implementing yourself.
19:21 No, I'm sure you don't. Unless you are really, really into high performance matrix multiplication and other types of processing that make your day, right.
19:30 Exactly.
19:31 I have some sort of mental models of how I might use computer vision, and then we have the Hollywood models, right, like "Minority Report" and so on, but what is the current state? Where do you see computer vision really prominently being used in the world?
19:46 So, computer vision is used in everyday life whether you realize it or not. And it is kind of scary but it is also kind of cool. Back about a year ago, I was traveling back and forth between Maryland and Connecticut on the East Coast of the United States constantly for work related activities. And one day I was exhausted, it was Friday, I just really wanted to leave Maryland and get back up to Connecticut and sleep in my own bed and just pass out. So I left work a little early and I started tearing up 95. It was a beautiful summer day, sunlight, my windows open and the wind blowing in, and if you have ever driven on 95, specifically on the East Coast of the United States, you know there is always like a ton of traffic, or just lots of construction to ruin your drive. And, for whatever reason, this day there was no traffic, there was no construction, I was just flying down the road.
20:49 And I made excellent time getting home. However, two weeks later, I get my mail and I notice that there is a speeding citation addressed to me. Apparently, I had passed one of the speeding cameras that was mounted along the side of the road. It detected that my car was traveling above the posted speed limit, it snapped a photo of my license plate, and then it applied what is called automatic license plate recognition, where it takes the image, automatically analyzes it, finds my license plate, looks it up in the database, and mails me a ticket. I was like, "Man, I've written code to do this, I know exactly how this worked." So it was like the only time I had a smile on my face as I was writing up the $40 check or whatever it was. I'm like, "You guys got me, and I know how you did it."
21:41 Yeah, exactly. Like your love for image recognition is slightly turned against you, just briefly there.
21:49 Just briefly.
21:50 You said you worked on this thing called ID My Pill. What's that?
21:54 So ID My Pill is an iPhone application and an API that allows you to identify your prescription pills in the snap of your phone. The general idea is that we put a little too much faith in our pharmacists and our doctors, and not to say anything against them, but mistakes do happen and people do get hurt, they get sick, and some of them do die every year due to taking the wrong medication. So the idea behind ID My Pill is to identify your prescription pills. It's also a way for pharmacies and health care providers to facilitate better care for their patients. So you just take your pills, snap a photo of them, and then computer vision algorithms are used to automatically analyze and recognize the pill. That way you can validate that yes, this is the correct pill, this is what it says on the pill bottle, and I know what I am taking is correct.
22:51 That sounds really useful. What do you think the chances are that something along these lines could be automated, so as the pharmacists are actually filling the prescription, you know, the computer knows what they are filling and it sees what they are putting in the bottles, and it could say "whoa whoa whoa, this does not look legit"?
23:09 Yeah, I think it absolutely can be automated. In the current systems that pharmacies use right now, especially within hospitals, some of them are using RFID chips so, you know, when you take a pill bottle off the shelf it is able to validate that you are taking the correct medication and filling it. But again, that's not perfect and pills can get mixed up. So in a perfect world, what you would end up doing is you would have that RFID mechanism in place, and then as the pharmacist is filling the pill bottle you have a mounted camera looking down at their work station, at their desk, and it validates the pills in real time, and they get a nice little thumbs up on the screen or whatever heads up display they have in front of them, and they can continue filling the medication.
23:58 Yeah, that sounds really helpful. So I used a slightly less productive, contribute-to-society sort of computer vision last night. I was out with some friends and my wife, we were having some wine, and there was this really cool iPhone app called "Vivino". You can just take a picture of a bottle, even if the background is all messy and there are people around and so on, and it will tell you what the ratings are and how much it should cost, sort of the standard retail price. I was really impressed with that. You know, there are a lot of interesting sort of consumer-style uses as well, I think.
24:35 Oh for sure. Vivino is one of the major ones, and another one where you can see computer vision used a lot, and people don't really notice it but they appreciate it, is within sports. So, in America if you are watching a football game you will notice that they have like a yellow line drawn across the field marking the spot of the ball along with the first down marker, and these lines are drawn using calibrated cameras. So computer vision is used to calibrate the cameras and then know where to actually draw that line on the broadcast of the game. And then similarly, you can use computer vision and machine learning on the back end to analyze games to determine what the optimal strategy is, and this is actually done in Europe a lot in soccer games. So, they will detect how players are moving around, how the ball is passing back and forth, and they can almost run these data algorithms to learn how you are going to beat the other team and try to learn their strategy. It's pretty cool.
25:43 Yeah, that really is amazing. I feel like these days watching sports on television is actually a better experience than going to them live a lot of times because of these types of things, right? It's very clear: oh look, they have got to go like a foot and a half forward and then they get the first down, otherwise they are going to fail. And you don't really get the same feel live, which is ironic.
26:03 Right, it's almost, it used to be that you didn't get the full story unless you were there, and now it's kind of flipped around, you get more than the full story, if you are watching the game on tv, you get all the detail that you could possibly want.
26:16 Yeah, that's right. So, you know, that sort of leads into the whole story of augmented reality and stuff like that with Microsoft HoloLens, Google Glass, and a bunch of iPhone apps and other mobile apps as well. What kind of interesting stuff do you see out in that world?
26:31 I have not had a chance to play around with HoloLens. I have used Google Glass and played around with that. Most of the applications that I see, and again this is just because of my work with ID My Pill, are medical related. So you'll see surgeons going into really, really long 10+ hour surgeries where they perform these complex operations and they need to look up some sort of reference material while they are doing the surgery. Instead of having an assistant do that for them, they can put on a Google Glass and have this information right there in front of them. So you see a lot of that, and since the Glass has a camera, there is a lot of research, especially within medicine, focused on identifying various body parts as you are performing the surgery, so you can have the documentation on the procedure brought up in front of you as you are working, and there is no need to instruct the Google Glass to do it for you.
27:28 Yeah, that's pretty good. I can imagine calibrating some kind of augmented reality thing to the body that is there, and you can almost see like the organs and stuff overlaid. Some really interesting new uses there. Very cool. What about things like the Google self driving cars, what role do image recognition and computer vision play there versus GPS, versus laser, versus whatever else? Do you know?
27:54 That remains to be seen. I guess I am a little bit of a pessimist when it comes to computer vision being used for driving cars-
28:03 You maybe know too much about the little problems you run into right?
28:06 Well, that's the thing. I don't believe computer vision itself will ever be adequate, again, by itself, for self driving cars. I think you need a lot more sensors, I think you do need things like radar to help you out with that and, you know, help detect objects in front of you. And that's a problem computer vision does face: when you take a photo of something, it is a 2D representation of a 3D world. So it makes it very hard to compute depth based on that. And now we have things like the Xbox 360 Kinect and stereo cameras where we can compute depth, which is making things like self driving cars a lot more feasible, but again, for something like a self driving car it doesn't make sense to rely strictly on computer vision. I think you want to incorporate as many sensors as you possibly can.
29:01 Yeah, I think that makes a lot of sense, and it might actually make sense for the road to have things like RFID-style stuff in it, so the car can be sure it is between the lanes, at least on the major long parts of the drive.
29:14 Oh, for sure. And I think that is kind of the point I would make: just because you can use computer vision to solve something doesn't necessarily mean that you should. There might be better solutions out there. You don't want to be the solution in search of a problem. You want to be the solution to a problem.
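For the stereo camera case mentioned here, depth can be recovered from how far a point shifts between the left and right images (its disparity). The sketch below uses the standard relation for a rectified stereo pair, depth = focal length x baseline / disparity. The function name and the camera numbers are made up for illustration, and this does not describe how any particular car or sensor actually works.

```python
def stereo_depth(focal_length_px, baseline_m, disparity_px):
    """Depth in meters of a point seen by a rectified stereo pair.

    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two camera centers, in meters
    disparity_px:    horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 700 px focal length, cameras 12 cm apart.
near = stereo_depth(700.0, 0.12, 42.0)  # large shift between images
far = stereo_depth(700.0, 0.12, 4.2)    # small shift between images
print(near, far)
```

Nearby objects shift a lot between the two views and distant ones barely move, which is why depth estimates from stereo get less precise with distance.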
29:38 Yeah yeah. There is a really good show I would like to recommend to people by the way about this whole computers driving cars. Nova did a show called "The Great Robot Race" and you could just google it or whatever and it is all about this competition that kind of preceded the whole Google self driving cars and so on and it is like 2 hour sort of technical documentary on like the problems these teams technically face, and so, it was very cool so if you want to learn more check that out.
30:08 Nice.
30:09 Yeah. This is a really good sort of conceptual idea of what is going on, but what do I do, what kind of code do I write as a Python developer to get started? I mean, it is tough to talk about code only, so can you just give me a sense of what kind of code I would write to maybe grab an image and identify something in it?
30:29 Yes, so let's take a fun example: if you have ever used your smartphone to scan a document, you have basically taken, you have like set a piece of paper on your desk, held your phone above it, and snapped a photo, and then the document is already scanned and stored as an image on your phone and you could email it to yourself or text it to someone. What is really cool is that while those programs and those applications almost seem like magic, they are really really simple to build.
31:02 So, if I were to build a mobile document scanner, I would basically say first capture the image, so I'm going to read it from disk, or I'm going to read it from the camera sensor. Nice. I'm going to convert it to grayscale because color doesn't really matter; I'm going to assume there is enough contrast between the document and the desk that I will be able to detect it. And I am probably going to blur it, to get rid of any type of high frequency noise, just allowing me to focus more on the structural objects inside of the image and less on the detail. From there, I'll say, yeah, let's perform edge detection, let's find all the edges in the image, and based on the outlines of the objects in the image I'm going to take these and I'm going to look for a rectangle.
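The first three steps (grayscale, blur, edge detection) can be sketched in plain Python on a tiny synthetic image. This is an illustration under stated assumptions, not the scanner's real code: a 3x3 box blur and a simple gradient threshold stand in for the Gaussian blur and proper edge detector one would normally use, and the threshold value is arbitrary. With OpenCV, the corresponding calls would typically be `cv2.cvtColor`, `cv2.GaussianBlur`, and `cv2.Canny`.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of luminance values (0-255)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb_image]


def box_blur(gray):
    """3x3 mean filter; border pixels average whatever neighbours exist."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(window) // len(window)
    return out


def edge_map(gray, threshold=40):
    """Mark a pixel as an edge (1) when intensity jumps to its right or below."""
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = abs(gray[y][x + 1] - gray[y][x])
            gy = abs(gray[y + 1][x] - gray[y][x])
            edges[y][x] = 1 if max(gx, gy) > threshold else 0
    return edges


# Synthetic scene: dark desk on the left half, white paper on the right half.
desk, paper = (40, 30, 20), (250, 250, 250)
image = [[desk] * 3 + [paper] * 3 for _ in range(6)]

edges = edge_map(box_blur(to_gray(image)))
for row in edges:
    print(row)
```

The edge pixels cluster around the boundary between desk and paper, which is exactly the outline the next step searches for a rectangle.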
31:47 And if you consider the geometry of a rectangle it just has 4 points, 4 vertices. So I'm going to loop over the largest regions in this image and I'm going to find the largest one that has 4 vertices. And I'm going to assume that is my document. And once I have the region of the image, I'm going to perform a perspective transform to give me this top down bird's eye view of the document. And from there, you basically built yourself a document scanner, you can apply OCR optical character recognition, and convert the text in the image to a string in Python, or you could just save the scanned document as a raw image that works too. It's really actually not a complicated system to build.
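The "largest region with 4 vertices" step can be sketched as follows. The contours here are hand-written lists of (x, y) vertices and the function names are hypothetical; in a real pipeline they would come from a contour finder such as `cv2.findContours`. The perspective transform that follows is left out of this sketch; with OpenCV it is typically done with `cv2.getPerspectiveTransform` and `cv2.warpPerspective`.

```python
def polygon_area(points):
    """Shoelace formula: area of a simple polygon given as (x, y) vertices."""
    area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


def find_document(contours):
    """Return the largest contour that has exactly four vertices, or None."""
    for contour in sorted(contours, key=polygon_area, reverse=True):
        if len(contour) == 4:
            return contour
    return None


# Made-up candidate regions found in a photo of a desk.
contours = [
    [(0, 0), (50, 0), (25, 40)],                # triangle, e.g. a shadow
    [(10, 10), (90, 12), (88, 120), (8, 118)],  # the sheet of paper
    [(20, 20), (30, 20), (30, 30), (20, 30)],   # small square, e.g. a logo
]

document = find_document(contours)
print(document)
```

Sorting by area first means the small square printed on the page is never mistaken for the page itself.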
32:39 How do you deal with the slight imperfections of reality, like a crumpled receipt or document or something like that?
32:48 So to handle, as you suggested, these slight imperfections, I would suggest for the document scanner example doing what is called "contour approximation." Again, a region of an image can be defined as a set of vertices, and due to the crumpling of the piece of paper maybe I find a region of the image that has 8 vertices, or 12 vertices. It is not a perfect rectangle, it is kind of jagged in some places. What I can actually do is approximate that contour: I can basically do line splitting and try to reduce the number of points that form that contour. That is actually a very critical step in building a mobile document scanner, because you are not going to find perfect rectangles in the real world, just due to the noise that comes with capturing the photo-
33:41 Even perspective right?
33:42 Yeah, even perspective, that can dramatically distort things.
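One common way to do this kind of contour approximation is the Douglas-Peucker algorithm, which is also what OpenCV's `cv2.approxPolyDP` is documented to use. The pure-Python version below is a minimal sketch, not the code from the blog; the `epsilon` tolerance and the 8-vertex "crumpled" contour are made up, and the closed-contour handling assumes the first vertex is a true corner.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        # a and b coincide (closed contour): fall back to point distance.
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def simplify(points, epsilon):
    """Douglas-Peucker: drop points closer than epsilon to the current chord."""
    if len(points) < 3:
        return list(points)
    distances = [point_line_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__) + 1
    if distances[worst - 1] <= epsilon:
        return [points[0], points[-1]]
    # Split at the farthest point and simplify each half recursively.
    left = simplify(points[:worst + 1], epsilon)
    right = simplify(points[worst:], epsilon)
    return left[:-1] + right


def approximate_closed(contour, epsilon):
    """Simplify a closed contour by repeating its first vertex at the end."""
    closed = list(contour) + [contour[0]]
    return simplify(closed, epsilon)[:-1]


# A slightly crumpled sheet: 8 vertices instead of 4.
crumpled = [(0, 0), (50, 2), (100, 0), (101, 40),
            (100, 80), (50, 79), (0, 80), (-1, 40)]

approx = approximate_closed(crumpled, epsilon=5.0)
print(len(crumpled), "->", len(approx), approx)
```

Choosing `epsilon` is the judgment call: too small and the jagged points survive, too large and real corners get removed.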
33:45 Are there libraries out there that help you with that kind of stuff or is that where you need to know some math?
33:50 You do not really need to know that much math at all for it. It is all about gluing the right functions together at the right time, and I think that is one of the harder points of learning computer vision and why I run the PyImageSearch blog. I want to show people real-world solutions to problems using computer vision, and when I do that it is not really the actual code that matters, it's learning why I am applying certain functions at different times in the context of the end goal. It is about taking these functions and gluing them together in a way that gives you a real solution.
34:24 Ok, that makes sense. It sounds a little bit like the code variation of: I open up Photoshop and there are 1000 things it does, and individually they are all simple, but I do not know how to go from a bad picture to a great picture with all my little pieces. It's kind of like that with the code libraries for image stuff, right?
34:43 For sure.
34:44 There are a couple of really cool apps that are, you know, coming to mind when you talk about it: one is called Word Lens, do you know about Word Lens?
34:52 Yeah, that's the one where you have your iPhone or your smartphone and you can hold it over a foreign language and it automatically translates it for you, is that right?
35:01 Yeah, that's right, and it will basically replace a menu, a street sign, with whatever language you want, English in my case. That seems like a really cool use as well. Another one that I like is this thing called peak.ar and if you are hiking in the mountains you could hold that up and it will like highlight the various mountain top names and how high they are, and how far away they are using computer vision.
35:26 Very nice.
35:26 Yeah, that's pretty cool. Let's talk about your blog, and your book a little bit. Like, what kind of stuff do you have on there for people to go learn, you have like a short course people can sign up for and I saw you also have like a video course that they could take as well, is that right?
35:39 So it's not a video course. I have two small mini courses on the PyImageSearch blog; one is like a 21 day crash course on building image search engines, which is my area of expertise within computer vision. We are all familiar with going to Google, typing in our query, and finding results, and that is essentially what image search engines are, except instead of using text as our query we are using images. We are analyzing the image and then searching databases for images with similar visual content, and again, this is without any knowledge of text associated with the image.
36:17 Right. That's really cool. You know, I think I first saw that with Google image search where I could take a picture of like the Sydney bridge and it would say hey that's the Sydney bridge. But it seems like there might be more practical uses for that...?
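The search-by-visual-content idea can be sketched as "describe each image with a feature vector, then rank the database by distance to the query's vector." The example below is a minimal illustration under stated assumptions, not the method taught in the course: the color-histogram descriptor, the chi-squared distance, the function names, and the tiny hand-made "images" (lists of RGB tuples) are all hypothetical choices made so the code runs with the standard library alone.

```python
def color_histogram(pixels, bins=4):
    """Describe an image as a normalized, flattened 3D RGB histogram."""
    hist = [0.0] * (bins ** 3)
    step = 256 / bins
    for r, g, b in pixels:
        idx = (int(r // step) * bins + int(g // step)) * bins + int(b // step)
        hist[idx] += 1
    total = len(pixels)
    return [count / total for count in hist]


def chi_squared(a, b, eps=1e-10):
    """Chi-squared distance between two histograms (0 means identical)."""
    return 0.5 * sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))


def search(index, query_features, limit=3):
    """Rank indexed images by distance to the query, smallest first."""
    results = sorted(
        (chi_squared(features, query_features), name)
        for name, features in index.items()
    )
    return results[:limit]


# Made-up stand-ins for real images: each is just a list of RGB pixels.
images = {
    "sunset.jpg": [(250, 120, 30)] * 8 + [(40, 40, 90)] * 2,
    "ocean.jpg": [(20, 80, 200)] * 9 + [(240, 240, 240)],
    "forest.jpg": [(30, 140, 40)] * 10,
}

# Step 1: index the dataset by extracting features from every image.
index = {name: color_histogram(pixels) for name, pixels in images.items()}

# Step 2: describe the query image the same way and rank the dataset.
query = [(240, 110, 20)] * 7 + [(30, 30, 100)] * 3
results = search(index, color_histogram(query))
for distance, name in results:
    print(f"{name}: {distance:.3f}")
```

No text or tags are involved at any point: the mostly orange query ranks `sunset.jpg` first purely because their color distributions are close.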
36:29 Oh, there definitely are. I have used image search at the National Cancer Institute, actually. I worked as a consultant there for about 6 months and we developed methods to automatically analyze breast histology images for cancer risk factors. So we would actually develop algorithms that would go in and automatically analyze these cellular structures and then, based on all of that, be able to kind of predict that person's cancer risk factor 5 years from now, 10 years from now, 15 years from now. But you could also apply image search and say, you know, here is an example of these cellular structures, go and find me other cellular structures in this image data set that look like that, and then you can compare their cancer risk factors and look for correlations. So image search is a wonderful field, I personally love it. There is an incredible number of uses for it and it is not all consumer facing; it is also on the back end, on the business-facing end, on the government end and all the other sectors.
37:41 Yeah, that's cool. One company that I saw- I can't remember where I saw them, but they were basically starting a company that would help you identify parts of like machines and cars and stuff, like "I have a part, I know it is broken, I've no idea what it is", take a picture of it and it will say "oh this is you know widget x and it cost $32 here is how you order it."
38:01 Oh cool.
38:01 Yeah. So it doesn't help people quite as much but still, very cool. Tell us about your book, what's the title?
38:07 So I have a book called "Practical Python and OpenCV" and it is just a really quick kick start guide to like get you through learning the basics of computer vision and image processing. It starts off with very rudimentary things like "here is how you load an image off of disk and here is how you detect edges in an image" and then it goes on to show you how to detect faces in images and video streams, how to track objects in images, and how to recognize handwritten characters.
38:37 So it really takes you from assuming you have very basic programming experience and no prior experience in computer vision all the way to hey look, I'm solving some real world problems using computer vision. And again, it is a quick read and you can learn a ton about computer vision really quickly. And I would also love to offer a discount to Talk Python listeners: if they go to pyimagesearch.com/talkpython, they can get 20% off of any of the books that I offer on that page. And I'll keep that open for the next two weeks.
39:14 That's awesome, thanks for doing that.
39:16 Oh, thank you, thank you so much for having this opportunity it is great.
39:20 Yeah, so people should check out your book, that's really cool. Is it used for like any university classes or is it more not so academic, a little more like industry focused?
39:29 It is much more practical and hands-on, so it is not used that much in college level classrooms. However, I have seen it used in Amsterdam and a few universities in the United States as well.
39:44 Yeah, that's really cool. It's got to make you feel good?
39:45 It certainly does, I'm happy that I can contribute and give back to a field that I love so much.
39:52 Yeah, that's awesome. A lot of people build web apps and web apps do not have attached cameras, I guess you can get one from the browser, you can do something with that. What is some like web scenario rather than say iPhone or Raspberry Pi type scenarios for computer vision?
40:09 Let's say that we are going to build like a Twitter or Facebook style application where a user has a profile picture. And we are going to build the functionality to upload a profile picture and then set it. Well, there is a naive way of doing it where you just kind of upload a profile picture and maybe you ask the user to crop the image, or maybe you just automatically assume that the user's face or their profile is directly in the center, so you just automatically resize the image and crop it. Those are viable solutions and there is nothing wrong with them, but we can also use computer vision and create a much more fluid user experience. So the user can upload the profile picture and in the background we are running these computer vision algorithms so we can analyze the image, find the face, automatically crop that face from the image, and set it as the user's profile picture. And again, the goal here is to make it a much more seamless experience.
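The cropping half of that profile-picture workflow can be sketched without any vision library. The example below assumes a face detector has already returned a bounding box as `(x, y, w, h)`, which is the rectangle format produced by, for example, OpenCV's Haar cascade `detectMultiScale`; the detection step itself is not shown. The function name, the `margin` parameter, and the sample numbers are hypothetical.

```python
def crop_box_around_face(image_size, face_box, margin=0.5):
    """Return (left, top, right, bottom) of a square crop centered on a face.

    image_size is (width, height); face_box is (x, y, w, h) from an assumed
    face detector. margin is the extra padding on each side, expressed as a
    fraction of the face size.
    """
    width, height = image_size
    x, y, w, h = face_box
    # Pad the face, but never ask for a square larger than the image itself.
    side = int(max(w, h) * (1 + 2 * margin))
    side = min(side, width, height)
    # Center the square on the face, then clamp it inside the image bounds.
    cx, cy = x + w // 2, y + h // 2
    left = min(max(cx - side // 2, 0), width - side)
    top = min(max(cy - side // 2, 0), height - side)
    return left, top, left + side, top + side


# Face near the middle of a 640x480 upload: the crop is centered on it.
centered = crop_box_around_face((640, 480), (270, 190, 100, 100))
# Face in the top-right corner: the crop is shifted to stay inside the image.
corner = crop_box_around_face((640, 480), (500, 50, 100, 100))
print(centered, corner)
```

Compared with the naive center crop, the only extra input is the detected face rectangle, so a failed detection can simply fall back to the center crop.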
41:04 That's really cool. I think people's expectations of consumer-focused and even b2b type apps to some degree, things like Slack and so on, are- they definitely have gone up in the last 3 or 4 years, and I blame the iPhone. We can no longer build just ugly apps that users are forced to use, right? They expect a cool experience, and something like that is just really a nice touch.
41:27 Yeah, for sure. And from a business perspective, I mean, making a good impression on your users is crucial, so you want them to have this really, really great experience when using your app, even if it is something as simple as a profile picture. I mean, things like those do matter from a user experience level.
41:44 Right. If somebody goes "Oh my gosh, you wouldn't believe what happened, I uploaded this and it made the best profile picture ever", I think that could possibly be something worth word-of-mouth marketing, that would be cool.
41:54 Yeah, and then, getting back to the Facebook example; If you've uploaded a picture to Facebook recently you probably noticed that it can not only detect your face or other peoples' faces in the image, but it can recognize whose face that belongs to. So these images are being automatically tagged for you and maybe that's creepy maybe not, kind of beside the point but again, it is these computer vision algorithms running behind the scenes.
42:20 Right. That's really fantastic. Have you played with Google Auto Awesome from Google Plus?
42:24 No, I haven't.
42:25 Oh my gosh, there is amazing stuff. I don't do much with Google Plus, I mean, there are some people on it and I check it every now and then, it's ok and so on, but for images Google Plus is crazy. It'll do things like this: if you have like 3 pictures of a group, and in one picture there is somebody frowning, and in another picture someone else has their eyes closed, but in the other picture the person with closed eyes has them wide open, you know, it can combine them so they all look good. It can take out the bad faces and put in the good faces from the other pictures, and it will do that automatically for you.
43:03 Wow.
43:03 So, it will like take a set of best pictures and make the best possible like unified group picture of all the different faces that's pretty crazy-
43:12 That is.
43:11 And other things like that as well, like focusing on faces and so on, so I'm sure there is some really cool computer vision stuff going on there.
43:20 Yeah, I will definitely have to check that out.
43:21 Yeah, you can just like install Google Plus, turn on your iPhone and just tell it to auto back up and then like you will get these notifications every now and then and you will get these great pictures.
43:30 Nice.
43:32 Yeah, So, suppose I am new, I am pretty new to computer vision, how do I get started?
43:36 So, if you want to get started, the first thing I would suggest doing is installing OpenCV. That's pretty important. Another library that you might want to look into, one that is pip installable, is scikit-image. I love OpenCV, but scikit-image releases updates a lot, lot faster. It's a smaller library but it is very good at what it does, and in my opinion it stays a little bit closer to the state of the art. So if an academic paper is published that implements some new algorithm, scikit-image is much more likely to have that implementation than OpenCV, simply because it's a smaller but faster-moving community, so they can release updates faster.
44:25 Ok, yeah, that sounds really cool. And do I need hardware? Do I need a Raspberry Pi and the camera kit or can I just use my web cam?
44:31 You can use pretty much whatever you want. I recommend just starting out with your laptop or your desktop system; there is no reason that you need any special hardware, and realistically, you don't even need a webcam unless you plan on processing real time video. And in the meantime, if you do, you can still go ahead and get started working with pre-recorded videos, maybe ones you recorded years ago, from files you have lying around on your system. There is no reason why you can't get started. I think that's what's really cool about computer vision: back in high school I had this impression that computer vision and image processing was just magic going on behind the scenes, but it's really not, and it is possible to learn these algorithms and understand them without having a degree in computer science or a focus in machine learning, or tons of understanding of mathematics.
45:25 Yeah, very cool, especially with things like OpenCV where you don't have to understand like the video encoding format and that kind of stuff, right?
45:32 Yeah that is taken care of for you.
45:35 All right. Maybe you could tell everyone a few of your favorite packages or the libraries out there that you like to use, everyone has their own flavor and things they've discovered... So have you got any favorites?
45:48 Yeah, so I mentioned two of my favorites already. The first one is scikit-learn, for machine learning, the other is scikit-image for computer vision. OpenCV is great but again, not pip installable. If you do decide to play around with computer vision I would suggest that you check out the imutils package that I created. It is basically a set of basic functions for image manipulation such as resizing, rotation, and translation, things that could take a little bit of code to do in OpenCV, and it reduces each of them to a single function call inside the imutils package.
46:26 Yeah that's great, I'll put that in the show notes.
46:27 Yeah, it is definitely worth taking a look at. The other fun one is "color transfer". The idea is that, let's say you went to the beach at 1 pm in the afternoon and you took this really great picture, but you want to make it look like it was taken at sunset, so you would have this beautiful orange glow of the sun over top of the ocean. Maybe you could create that effect in Photoshop, or you could just have an example image of the sunset and then algorithmically take the color space from the sunset image and transfer it to the image you took. So if you want to play around with that, that's in the color_transfer package on PyPI.
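One common way to "take the color space" from one image and apply it to another is to match per-channel statistics: shift and scale each channel of the target so its mean and standard deviation equal those of the source. The sketch below shows that idea on a single channel held as a flat list of values. It is an illustration under assumptions, not the color_transfer package's code: real implementations work on whole image arrays and usually in a perceptual color space such as L*a*b*, and the function name and sample values here are hypothetical.

```python
from statistics import mean, pstdev


def transfer_channel(source, target):
    """Shift and scale target values so their mean/std match the source channel."""
    s_mean, s_std = mean(source), pstdev(source)
    t_mean, t_std = mean(target), pstdev(target)
    # Guard against a flat target channel, which has zero spread.
    scale = s_std / t_std if t_std else 1.0
    # Clip to the valid 8-bit range after the transfer.
    return [min(255.0, max(0.0, (v - t_mean) * scale + s_mean)) for v in target]


# Made-up values for one channel (say, red) of each photo.
sunset_red = [200, 220, 240]   # source: warm, bright, low contrast
beach_red = [50, 100, 150]     # target: darker, higher contrast

result = transfer_channel(sunset_red, beach_red)
print([round(v, 1) for v in result])
```

Running the same function once per channel gives the beach photo the overall color statistics of the sunset while keeping its own structure, since the relative ordering of pixel values is preserved.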
47:15 That sounds really interesting.
47:18 It's a lot of fun, not exactly real world applicable package but it is fun.
47:23 Yeah, it sounds really fun. Ok, any final shout outs or things you want to tell the listeners?
47:29 Yeah, if you want to learn more about computer vision, definitely head over to PyImageSearch and check out the blog posts I have. At the time of this recording I have at least 80 free blog posts that discuss computer vision and solving real world problems. I have two mini courses that you can take to learn computer vision, and again, I have my book "Practical Python and OpenCV" that you can purchase and read through to learn computer vision and get up to speed really quickly. And again, I'm offering that 20% off to all Talk Python podcast listeners at pyimagesearch.com/talkpython.
48:08 All right, that's really interesting, I appreciate the offer. And I think everyone out there should go and check out your blog and look at some of the tutorials you have. They are really short, and they walk you through the code as well as, like, maybe setting this up on the Raspberry Pi with the camera board and all those kinds of things. I find it really helpful, and if you haven't tried it out, I encourage you to do so.
48:29 Cool. Thank you so much for having me on the show, this was a great experience.
48:33 Yeah, Adrian it was great to talk to you and love to bring a different perspective on the cool stuff you are doing in Python, so here is one more awesome thing that people can do if they want to try it, right.
48:45 Yeah, for sure.
48:47 Thanks for being here, talk to you later.
48:48 This has been another episode of Talk Python To Me.
48:48 Today's guest was Adrian Rosebrock and this episode has been sponsored by CodeShip. Please check them out at codeship.com and thank them on twitter via @codeship. Don't forget the discount code for listeners, it's easy: TALKPYTHON. Remember, you can find the links from the show at talkpythontome.com/episodes/show/11. And if you support the show, please check out our patreon campaign at patreon.com/mkennedy where you can contribute as little as 1 dollar an episode.
48:48 Be sure to subscribe to the show. Visit the website and choose subscribe in itunes or grab the episode rss feed and drop it into your favorite podcatcher. You'll find both in the footer of every page. This is your host, Michael Kennedy. Thanks for listening! Smixx, take us outta here.
48:48 [music]