Transcript for Episode #9:
Docker for the Python Developer

Recorded on Tuesday, Apr 28, 2015.

It's time to jump into the brave new world of containers and Docker, with Patrick Chanezon. This is Talk Python To Me, episode number 9, recorded Monday, May 4th 2015.

[music]

0:40 Hello, and welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I am @mkennedy, and keep up with the show and listen to past episodes at talkpythontome.com. This episode, we will be talking with Patrick Chanezon about Docker.

1:01 Before we get to Patrick, I'm really excited to tell you that this episode is brought to you by Codeship. Codeship is a platform for continuous integration and continuous delivery as a service. Please take a moment to check them out at codeship.com, or follow them on Twitter where they are @codeship.

1:17 Now let me introduce Patrick. Patrick Chanezon is a member of the technical staff at Docker Inc. He helps build Docker, an open platform for distributing applications for developers and sysadmins. Patrick is a software developer and storyteller. He spent ten years building platforms at Netscape and Sun, and ten more evangelizing platforms at Google, VMware and Microsoft. His main professional interest is in building and kickstarting the network effects for these wondrous two-sided markets called platforms. He has worked on platforms for portals, ads, commerce, social, web, distributed apps and, of course, the cloud.

1:54 Patrick, welcome to the show.

1:57 Hey Michael, thanks for having me.

1:59 Hey, I'm really glad that you are here, and I'm super excited to have Docker on the show. There are a few times I've learned about some technology and it has completely taken me over, and somehow I am obsessed with learning it. When I came across Docker, when you guys announced it a year or two ago, I was just like, "Oh my gosh! I need to learn this, I need to understand this, because this is going to be huge". And I really think Docker and these container technologies are going to change the way we build and deliver software. So, let's start by telling the world what Docker is, for those who don't know.

2:41 Sure. So, Docker is an open source project that was announced by Solomon Hykes, the founder of dotCloud and Docker, two years ago at PyCon, so it is sort of a natural fit for Python developers because it all started at PyCon. Solomon and his team were building a platform as a service called dotCloud, and they were struggling with adoption. At that time I was at Google, working on Google App Engine, and there was Heroku, which had been bought by Salesforce. So dotCloud had trouble attracting developers to their platform as a service, even though it allowed you to deploy applications in multiple languages, PHP among them, while at the time a platform like Google App Engine, for example, was limited to Python and Java as runtimes.

3:38 And so they were struggling with adoption, but they had this gem inside of dotCloud, which they were using to build that platform as a service, that was called Docker. It was an easy way to package and distribute, to build, ship and run, Linux containers. So they open sourced the project at PyCon two years ago, and then it took the world by storm: developers started adopting it, and the whole industry has restructured around that notion of the Linux container.

4:15 So Docker essentially is a set of tools that allow you to build, ship and run distributed applications based on Linux containers. There is the Docker command line, which is a single binary that contains both a client and a daemon. The daemon lets you get images for containers, on your local machine or on a server, and run them in a Linux container, isolated from other workloads on that machine, so you can have your own file system, your own users, your own network namespace. Docker is based on Linux namespaces and cgroups.

5:01 And then you have Docker Machine, which lets you provision machines running the Docker Engine on any cloud provider, like Google, Amazon or Microsoft, but also locally on your machine on VMware, VirtualBox or Hyper-V. So there is Docker itself, Docker Machine to provision machines, and another project called Docker Compose, which allows you to assemble all the containers that you are using to build your microservices into a single unit that you can run on your laptop for development or ship to the cloud.

5:42 So for your typical Python app, maybe you are using Django or Flask, you would have one container that has your Django application with its code in it, and you would link that to another container that has Redis or MySQL, and then you can start all this with a single command, Docker Compose.

6:04 So you can, with Docker Compose you can just start-

6:07 Yes, exactly. In Docker Compose you say, "I have my image for my Django app, let's call it something like chanezon-my-django-app, and I want to launch that with port 80 on the server on which it's running mapped to port 8080, maybe, internally, where I am running it. Then I want to tie that to a Redis instance that is using the Docker image redis that's on Docker Hub". And then inside of my code I'm just going to refer to redis, and it automatically ties that to the other container that's running.
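
A minimal sketch of the setup described here, written as Python that only renders a Compose-style file as text and never calls Docker. The image name `example/my-django-app` and the helper `render_compose` are hypothetical, and the `image`, `ports` and `links` keys follow the early `docker-compose.yml` layout, which is an assumption about the version being discussed.

```python
# Hypothetical description of the two containers from the conversation.
SERVICES = {
    "web": {
        "image": "example/my-django-app",  # assumed image name, not from the episode
        "ports": ["80:8080"],              # host port 80 -> container port 8080
        "links": ["redis"],                # lets the app reach the host name "redis"
    },
    "redis": {
        "image": "redis",                  # the Redis image on Docker Hub
    },
}


def render_compose(services):
    """Render a dict of services as simple Compose-style YAML text."""
    lines = []
    for name, spec in services.items():
        lines.append(f"{name}:")
        for key, value in spec.items():
            if isinstance(value, list):
                # Lists become YAML sequences with quoted items.
                lines.append(f"  {key}:")
                for item in value:
                    lines.append(f'    - "{item}"')
            else:
                lines.append(f"  {key}: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_compose(SERVICES), end="")
```

Saved as `docker-compose.yml`, text like this is what `docker-compose up` would read to start both containers together.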

6:46 [music]

Codeship is a hosted continuous delivery service focused on speed, security and customizability. You can set up continuous integration in a matter of seconds and automatically deploy when your tests have passed. Codeship supports your GitHub and Bitbucket projects. You can get started with Codeship's free plan today. Should you decide to go with the premium plan, Talk Python listeners can save 20% off any plan for the next three months by using the code TALKPYTHON. All caps, no spaces. Check them out at codeship.com, and tell them "thanks" for sponsoring the show on Twitter where they are @codeship.

[music]

7:37 So before we get into details how we run all that, you know, it's interesting that Docker kind of grew out of what was supposed to be kind of a larger project, right, platform as a service company if you will... That was how you guys were trying to internally achieve greater density for your hosting.

7:56 Yeah, that's exactly right. The notion of running Linux containers has been around for a really long time. Actually, containers in general have been around for a long time: in 2005 or so there was already the notion of Solaris containers. And that is the way Google runs its workloads inside its data centers, using Linux containers for isolation, in order to be able to run workloads with very high density, because that is operating-system-level virtualization as opposed to running on top of virtual machines.

8:45 Right, so I think that's a super important distinction to make between these containers and virtual machines. If I create a virtual machine, maybe it's a couple of GB for the file system and the operating system and all my data and everything I've done to it, and maybe it uses 2 GB of RAM. When I start and stop it, it's got to boot up like a regular machine; you know, it's a big blocky thing. How big are these Docker containers?

9:12 Yeah, so Docker containers compared to virtual machines can be much smaller. They can be big as well, but usually they are much smaller. Another advantage they have over VMs is that the image format for Docker images is layered. So, when you build an image you can say "I'm going to inherit from the Ubuntu layer, and then I'm going to install Python on that, and then I am going to use pip to install my requirements on that".

9:47 Each of these instructions generates a different layer, and when you use docker pull to pull the image or docker run to run that image, what Docker does is look in its local cache and say "hey, I have Ubuntu already in there, so I am just going to download the layer that adds Python to it, and then the one with the requirements for my app".

10:09 So there is additional density in terms of storage as well. When you are packaging your app for a VM, you are building a monolithic VM image with everything in it, from the version of Linux or Windows that you are running, the full operating system, to your application packaging, and every VM you start has all this data. Whereas if you are running, like, 10 different Python applications packaged as containers on one host, you would have the data for the Ubuntu layer that's your base one time, then maybe one layer for Python, and then small layers, 10 times, for the requirements and the code base of each Python application.
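
The storage arithmetic above can be sketched with a toy model. This is a simplification under stated assumptions: layers are represented by made-up names rather than real layer IDs, every app is assumed to consist of exactly four layers, and the helper names are hypothetical.

```python
def layers_to_pull(image_layers, local_cache):
    """Return the layers of an image that are not yet in the local cache."""
    return [layer for layer in image_layers if layer not in local_cache]


def stored_layer_count(images):
    """Count the distinct layers kept on a host when images share layers."""
    return len({layer for image in images for layer in image})


# Ten hypothetical apps that all build on the same two base layers.
apps = [
    ["ubuntu", "python", f"requirements-{i}", f"code-{i}"]
    for i in range(10)
]

if __name__ == "__main__":
    cache = {"ubuntu"}                        # Ubuntu was pulled earlier
    print(layers_to_pull(apps[0], cache))     # only the missing layers
    print(stored_layer_count(apps))           # shared layers are stored once
    print(sum(len(image) for image in apps))  # count if nothing were shared
```

In this model the host keeps 22 layers for the ten apps instead of 40, because the Ubuntu and Python layers are stored only once.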

11:01 So the images are lighter, which means that they download much faster. Another aspect is that the runtime is much faster: starting a container once your image is cached locally will take just a few milliseconds, while, as you know, starting a VM takes a good 20 to 30 seconds.

11:23 Yeah, that's really amazing, and I like the layering concept. So, every layer you add brings you closer and closer to the specific type of operating system and environment you need for your app; so at first, it's just Linux and that could be anything; and then it's Linux plus your version of Python, it's a little closer. And then maybe it's that plus Nginx and then it's a little closer, and then it's that version of Linux plus Python, plus Nginx plus all the packages your web app depends upon, and finally maybe it's your code and so, you can use every one of those layers as the building block to create new ones and things like that, right?

12:04 Yeah, it's very composable. Once you have created your base layer, anybody in your team, or anybody at all if you make it public on Docker Hub, can inherit from it and add their own stuff. Or you can go up the inheritance chain of layers and say "Oh, I don't like the official Python layer, so I'm going to see what operating system they use to run it and maybe I am going to create my own". So you can branch off at any time; it's a very flexible system.

12:38 You talked about flexibility; at the same time, to me the big difference is the speed aspect: the fact that downloading images is much faster because images are layered, and that starting a container is 10 times or maybe 100 times faster, depending on the case. When you make things faster it usually changes the experience, and that's what developers realize when they start to use Docker. Usually they have this aha moment: "Oh, then I can build all the time, and create my own images, and start breaking my application that was monolithic into smaller microservices that are easier to compose with something like Docker Compose".

13:26 So it goes hand in hand with two big shifts in software engineering that happened in the past 5 years. One of them is microservices, breaking up your application into smaller services that can be scaled and deployed independently, and the other one is DevOps, where dev teams and ops teams work together on continuous integration and continuous delivery, using the same tools to achieve continuous delivery of their applications. And so the container is a really nice unit of collaboration between devs and ops.

14:11 I think you are right that the size makes that really even possible to some degree. So let me give you my mental model for the DevOps story, and maybe you can tell me how I have that slightly wrong or improve on it, whatever. In my mind, I'm thinking there are some developers building their app, and probably they have some kind of continuous integration set up, where they check in, some automated build happens, some tests run, stuff like that.

14:36 So instead of just checking in and running my code and my tests directly on, say, the build server, I can create a container: when I do a check-in, the build server could actually create one of these containers, push my code into it, and run that code in that actual container environment. And if I'm happy, maybe I could automatically push that out to some cloud provider, and the thing that is actually running is the thing I tested and built locally...

15:03 Yeah.

15:04 So there are no environmental differences or anything like that; that container is the thing I actually ship, almost like infrastructure?

15:11 Yeah, I think you got it right. One of the things I would add is that this kind of refactoring of applications into microservices works really well for people who follow the Twelve-Factor App manifesto that Heroku published a few years ago, where you try to build your application as stateless, you run data services in a centralized way, and then you configure the connectivity between different microservices with environment variables. So typically what would happen is that, as a developer, you would run locally on a single box with a test MySQL container that has a volume attached with test data. Then when you push to a build server, the build server deploys that into a cloud environment where automated tests are run, and there the container will be tied to a test MySQL database that maybe is maintained by operations.

16:22 So this is where the split between operations and dev happens: the devs specify all the requirements that they have for their app in the container, and the ops provide the right plugs to the right services in the different environments. And then you eventually push that to production, where typically there would be someone, a sysadmin, who decides "hey, this build is ready to go to production", and that just triggers the deployment in production of the same image that they had on the dev box, but there it's configured to work with the production MySQL cluster.
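
A minimal sketch of the environment-variable configuration described above, so that the same image can run unchanged on a dev box and in production. The variable names `MYSQL_HOST`, `MYSQL_PORT` and `MYSQL_DATABASE`, the defaults, and the host name in the example are all hypothetical; a real project would pick its own.

```python
import os


def mysql_settings(env=None):
    """Read MySQL connection settings from environment variables.

    The variable names are illustrative assumptions, not a standard.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "port": int(env.get("MYSQL_PORT", "3306")),
        "database": env.get("MYSQL_DATABASE", "test"),
    }


if __name__ == "__main__":
    # Developer box: nothing set, so the local test database is used.
    print(mysql_settings({}))
    # Production: ops inject the values, the code and image stay the same.
    print(mysql_settings({"MYSQL_HOST": "mysql.prod.internal",
                          "MYSQL_DATABASE": "shop"}))
```

The design choice is that the code never hard-codes where its database lives; whoever starts the container supplies that through the environment.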

17:06 That's really amazing. So, how do I deal with persistent data? Let's say I created one of these Docker containers that contains MongoDB. MongoDB has to have a data store somewhere in the file system. How do I deal with versioning that? Is it actually in the container, or is it somewhere else? If the database schema changes as the new container comes up, what do I do? That seems a little tricky.

17:36 Yeah, definitely. So Docker has this notion of volumes, where in your container you can mount a directory that is on your host machine, and this is where you keep the data. There is also another notion called a data container, which is just a container that has a volume attached to it, and then you mount the volume from that container into your runtime container.
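
The two volume styles mentioned here can be sketched as `docker run` command lines, built in Python as argument lists without executing anything. The `-v` and `--volumes-from` flags are real `docker run` options; the helper names, the host path, the container name `mongo-data`, and the assumption that the `mongo` image keeps its data in `/data/db` are illustrative.

```python
def run_with_host_volume(image, host_dir, container_dir):
    """Build a `docker run` command that mounts a host directory as a volume."""
    return ["docker", "run", "-d", "-v", f"{host_dir}:{container_dir}", image]


def run_with_data_container(image, data_container):
    """Build a `docker run` command that reuses another container's volumes."""
    return ["docker", "run", "-d", "--volumes-from", data_container, image]


if __name__ == "__main__":
    # Data lives in a directory on the host (paths are assumptions).
    print(" ".join(run_with_host_volume("mongo", "/srv/mongo-data", "/data/db")))
    # Data lives in a separate data container (name is hypothetical).
    print(" ".join(run_with_data_container("mongo", "mongo-data")))
```

On a machine with Docker installed, either list could be handed to `subprocess.run` to start the container.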

18:07 And then changing databases, say from development to production, would just be a matter of changing the container you are mounting your volume from. So that is the notion of volumes, which lets you have persistent containers. There is one issue in Docker today that lots of people run into when they try to deploy clusters, which is that volumes can be attached only from local file systems, on the host on which the engine is running. There is a company called ClusterHQ which is building an add-on to Docker called Flocker.

18:51 We are working with them to make it a plugin, so there is work under way to make that happen right now. What Flocker does is address this issue of moving a persistent container from one host to another, which shows up when deploying in a large orchestrated Docker setup, where the orchestrator tells the engine "hey, I want to move that Docker container that has a persistent volume from host A to host B". What they are doing there is using ZFS under the hood: they take a ZFS snapshot, which is fast, then they move the snapshot to the other machine, which can take a few hours. Once the snapshot is moved, they can stop the container, take a diff against the snapshot and push it to the other side, which is faster, and then they restart the container on the other side.

19:50 That sounds really helpful, because it seems like there is a lot of complexity as more and more containers get involved and more horizontal scale gets in there...

19:56 Yeah. Container persistence is really an interesting issue where there is lots of research right now and lots of companies looking into that and how to enable this.

20:06 Excellent. So you have spoken a little bit about Docker Hub. Can you tell the listeners what that is?

20:10 Yeah, definitely. So Docker itself is a client and a daemon that's running either on the local machine or on a server, but what you are building with the Docker client is an image of the system on which you are running. Once your image is ready, typically you would push it to Docker Hub, which is a hosted service by Docker, so it's software as a service. It lets you share your images with the world: you can make your image public, and there are lots of open source images in there. Or you can host them privately: Docker Hub is free for public images, but if you want private images you can buy a subscription that allows you to have several private images for your own projects. Then your engines download the images from the Hub as well, and it's integrated with Docker. We also have a version called Docker Hub Enterprise, which enterprises that want to adopt Docker for their internal workflow can install behind the firewall, so they have complete control over the policies for images.

21:26 Right, that makes sense; you might not want to just go grab a bunch of public images and download them. Is this actual source code or actual binary executables that I'm grabbing, or is there another way to get an image from Docker Hub?

21:38 So, what you grab from Docker Hub are these images, which are tar archives for each of the layers, with a bunch of metadata explaining to the Docker Engine how to reconstitute the image of your operating system from them.

21:53 Ok, excellent. So tell me how to install Nginx or how to install Python 3 or something like that?

21:59 Yeah, exactly. We host official Docker images for many open source projects. For example, for Python, I used to take images from the community, and then I switched to the official image, which is called python. So in order to run Python in Docker you just say docker run python. You could also qualify that to say whether you want Python 2 or Python 3; there are labels on these images that let you specify a more fine-grained version of Python. And once you run the image, it comes with, I don't remember which flavor of Linux, and then Python all installed, with the tools for developing and running Python workloads.

22:49 That sounds like a really easy way to get started for the Python developers, that's cool.

22:53 Yeah, and that's one of the things I love. I joined Docker just two months ago; before that I was at Microsoft, where my role as part of the developer experience team was to bring the whole Docker partner ecosystem onto Azure. So I started doing that, and I built a bunch of tools to deploy clusters on Azure. The tooling I built for that was in Python, and I interacted with lots of startups using Python. When I started talking about my tool to other Microsoft people, these guys were on Windows and didn't have Python installed, so they said "how do I run that script?" In order to make that easier, I bundled all my scripts in a Docker image and told them: hey, you just run that image and you'll have Python 2 installed, with all the requirements for the tool that I built, and the Azure Python SDK already installed, and you don't need to install anything else. It's just a docker run away...

24:05 That's really cool. It sounds like an interesting world to live in, working with a lot of the Windows stuff, and actually, one of the things I'd like to talk about in a minute is the Docker view from different operating systems. I mean, if I'm on Linux it makes perfect sense; it's kind of a Linux thing, I can run Docker on Ubuntu, things like that. But if I'm on, let's say, my Mac, what do I do then?

24:27 Yes, on the Mac you can download an installer from docker.com; we always release all of our versions on Linux, Mac and Windows. You download the installer, and because the engine relies on primitives in Linux, we have this tool, downloaded as part of the installer, that's called Boot2Docker. Behind the scenes it starts a VirtualBox VM running Linux, where the engine is installed, and then you run the client on your Mac and your client targets that VirtualBox instance. And then, by changing environment variables, instead of targeting the local VirtualBox instance on your Mac you can target a box on Azure, on Google Compute Engine or on Amazon.
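
A minimal sketch of retargeting the client through environment variables. `DOCKER_HOST` is the variable the Docker client reads to find the daemon; the helper name, the documentation-range IP address, and the choice of port 2376 are assumptions for illustration, and a real remote setup would normally need TLS settings as well.

```python
import os


def docker_client_env(host, port=2376, base_env=None):
    """Return a copy of the environment that points the Docker client at a remote daemon."""
    env = dict(os.environ if base_env is None else base_env)
    env["DOCKER_HOST"] = f"tcp://{host}:{port}"
    return env


if __name__ == "__main__":
    # 203.0.113.10 is a placeholder address, not a real server.
    env = docker_client_env("203.0.113.10", base_env={"PATH": "/usr/bin"})
    print(env["DOCKER_HOST"])
```

Passing the returned dict as `env=` to `subprocess.run(["docker", "ps"], env=...)` would make that one command talk to the remote engine while leaving the rest of the shell untouched.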

25:25 So that's pretty easy. In terms of operating systems, one of the things that I am super happy about, which launched a few weeks ago and was announced at the Build conference last week by Microsoft, is that Microsoft is really all in on Docker. They have contributed a lot of code to Docker so that now the Docker client runs on Windows. So you can run the Docker client and then target a Linux machine, and you can use Boot2Docker on Windows as well, so you can run your Linux workloads from Windows with that and develop-

26:05 That's cool, so that kind of puts it on par with OS X maybe?

26:06 Yes, exactly, it puts it on par with OS X, but they are going even further. At Build last week, Mark Russinovich, who is the CTO for Azure, sported a Docker T-shirt and demoed the further step that they want to take, which is that they implemented the Docker Engine for Windows Server. So that's really big news: Microsoft liked the Docker workflow for devs and ops so much that they wanted to give it to Windows developers and Windows sysadmins. So they modified Windows Server to have the same kind of isolation that exists in Linux, and then they implemented the Docker daemon in terms of that. That's not out yet, they just gave a demo last week, but it is going to ship with the next version of Windows Server and there will be a preview this summer. That means you will be able to develop .NET applications running on Windows Server inside Docker containers, which means that developers, sysadmins and enterprise people working with Python, Java and .NET will be able to deploy microservices all together, each in the technology that suits it best.

27:30 Yeah, that's really cool, I'm glad you brought that up. I was watching the Build conference last week as well, and just for the listeners, because of time shifting: we are recording on Monday, May 4th, and I think that was announced last Wednesday, so 4 or 5 days ago. One of the very first things they announced at the entire conference was your CEO, Ben Golub from Docker, coming out and saying "Look, we are working with Microsoft doing amazing things, and we started out maybe trying to bring it on par with OS X, but the Microsoft guys were ready to go even further and actually add features to Windows, the OS, that would allow it to have containers for its operating system and its processes", because until this moment there was no possibility for something like Docker on Windows for Windows apps.

28:18 Yeah, it's pretty incredible.

28:21 So, I mean, to me that means we now have the two major operating systems that run software on the servers and in the data centers both supporting containers and both working with Docker, and that's huge for you guys, right?

28:32 Yeah, it is. And I think it's very good for the Windows ecosystem as well, because it will bring the same kind of tools that Linux developers and admins have been enjoying for a long time. And it really allows for denser workloads on the servers, etc.

28:51 Yeah, that's really cool. So the DevOps story just gets way better over there I think.

28:54 Yeah. Exactly.

28:56 So, I'd like to talk a little bit about Python with you, but before we do, what do you think the future holds for Docker and containers, where are things going in this whole world?

29:04 So we have DockerCon, which is a conference that's happening in June. As I explained before, Docker takes care of lots of aspects, but it still has to mature in lots of areas, and especially when people start moving to production there are lots of needs that are still unfulfilled by the tool. So one of the big aspects that we have been working on recently is to make it pluggable. We are working on the plugin model with people who are building virtual networking or portable volumes, so that they can run as plugins to the Docker Engine. You would then be able to define networks for your containers that span your own data center and maybe Azure and Google, like two public clouds.

30:02 So that's on the networking side, and on the volume side it's things like moving volumes from one host to another. We are also working on making sure Docker Machine lets you provision Docker daemons everywhere, and there have been lots of improvements to Docker Compose. Also, one of the big topics right now is that once you start building all your apps as microservices running in containers, the next step is how you orchestrate them, and there are at least 6 or 7 contenders in that space. Again, as I said before, the industry has reorganized along those lines: before, people were building platforms as a service, and now the industry has refocused on orchestrating Docker containers, so the unit now is the microservice packaged as a container. Among the orchestrators you have Docker Swarm, which is the one from Docker itself, a native orchestration solution that works multi-cloud. There is Kubernetes from Google, which is inspired by what's running internally at Google and is open source. There is Apache Mesos by Mesosphere, and they are running in production in lots of different places; they are coming from Twitter originally, and I think Netflix runs it as well. And there are smaller players like Deis, which runs on Azure as well. So there is a lot of activity in that space of Docker container orchestration...

31:39 Yeah that seems like where the real challenges lie, and if that gets easier and easier then, you know...

31:45 Yeah.

31:44 The sky is the limit, and that's amazing.

31:46 Yeah, on the Docker side we are trying to make the whole experience easier. With Machine, you can provision new machines in different clouds, and it's integrated with Swarm so you can say "oh, I want all these machines to be in a single cluster". And with Docker Compose there is the start of an integration where you can say "my app is composed of multiple services, please deploy it to a Swarm instance", and it will just provision it in the right place in the cluster. So we are really trying to make the experience much easier for developers and sysadmins.

32:25 Cool. The more you guys encourage and make it easy to break our apps into small little services, the more the challenge becomes linking them together, so it sounds like you are doing some great work there. Fantastic.

32:36 Yeah, and one advantage I have to add to this microservice approach that Docker enables is that once you start breaking your app into microservices, you can start innovating in terms of what language and platform you use for each microservice. So as opposed to being stuck, for example, with a Django app running on Python 2.7 because you made that choice and there is some legacy code that cannot be ported to Python 3, you could say "oh, I'll just break my app into ten different microservices, and for this microservice I don't have any legacy, so I'm going to write it in Python 3". And for the developers it's just a matter of saying "oh, I'm inheriting from Python 3 as opposed to Python 2".
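
A minimal sketch of what "inheriting from Python 3 as opposed to Python 2" could look like in practice: a small Python helper that renders a Dockerfile whose only difference between services is the tag on the `FROM` line. `FROM`, `WORKDIR`, `COPY`, `RUN` and `CMD` are standard Dockerfile instructions; the helper name, the file layout and the `app.py` entry point are hypothetical.

```python
def render_dockerfile(python_tag):
    """Render a minimal Dockerfile for a Python service on the given base image tag."""
    return "\n".join([
        f"FROM python:{python_tag}",            # the only line that changes
        "WORKDIR /app",
        "COPY requirements.txt /app/",
        "RUN pip install -r requirements.txt",
        "COPY . /app",
        'CMD ["python", "app.py"]',             # assumed entry point
    ]) + "\n"


if __name__ == "__main__":
    legacy = render_dockerfile("2.7")  # existing service stays on Python 2.7
    modern = render_dockerfile("3")    # new microservice starts on Python 3
    print(legacy)
    print(modern)
```

Each microservice carries its own file, so a legacy service and a new one can run side by side on the same host without sharing an interpreter.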

33:26 That really adds some... maybe it's a gateway to a little more flexibility in the technology too...

33:32 Yeah, and I think especially for the Python community, where that big gap between Python 2 and Python 3 developers happened, breaking up large monolithic applications into smaller microservices, where you have more freedom to just do stuff, may be a good way of introducing Python 3 and getting it running...

33:53 For sure. Do you see anything like that happening on the data side of things? You know, if I'm using MySQL, a relational database, in some big monolithic way, could my different microservices be using NoSQL, like MongoDB or Redis, here and there, with maybe one other part still using a relational database?

34:12 Oh yeah, I've seen a lot of that. Once your microservice has a smaller footprint and a single functional goal, it is easy to say "I'm going to back that with Redis or MongoDB" as opposed to storing everything in MySQL. Maybe I'm building an e-commerce application and I need the relational aspect for parts of it, but for tracking referrals, or for the social media aspect of how people are tweeting about products on my website, maybe these small services can use different types of databases.

34:58 Yeah, that seems like a perfect way to choose the right database... Cool. So you guys- do you use Python internally- obviously you can run Python in Docker containers really well, but do you guys use it as a language for yourself?

35:12 Actually we do, though not everybody. Docker itself is written in Go, as are Docker Machine and Swarm. Docker Compose is actually written in Python, so it's a Python project.

35:26 Python, how is that- are people pretty happy with Python internally?

35:29 Yeah, and I think they even tried to move to Go at some point, but then they said "hey, we have high velocity with Python, we have a good base that people understand well, so let's stick with that".

35:50 Yeah, that's really cool. So if I am a listener and I'm out there hearing this and I'm like this is awesome I have to get started, what do I do? How do I get started?

35:57 Yeah, to get started you go to docker.com, where we have lots of tutorials. If you are on a Mac, just download Docker for Mac; for Linux, depending on the distro, there are packages; and for Windows there is an installer as well. There is also another project you may want to take a look at called Kitematic. That one runs only on Mac OS, and it's a Docker client GUI. It shows you all the containers that you can find on Docker Hub, and you can say "oh, just start this one, launch a terminal where I can script it or connect to it", and it shows you the logs and the volumes that you have attached, and you can go edit the files and all that. So that's a very easy way to get started, Kitematic I would say. And else-

36:49 Yeah, I have Kitematic installed and I enjoy it, it's nice. Clean, and works well.

36:54 Yeah, there is a lot of stuff to add in there, it's really good stuff and it's very easy to get started.

36:59 Excellent. So, I think it's probably a good place to call it a show, Patrick. What else would you like to add while we've got the chance to talk...

37:09 I mean, thanks very much for the opportunity. I really encourage Python developers to go take a look at Docker. I believe one of the angles that is most interesting for Python developers is that you can use it to ship portably to your customers and run your code anywhere, without having a complicated environment to set up: as long as they have Docker, they can just start your application as a container. And the other one, I think pretty important, is introducing Python 3 microservices into your app. Packaged as containers, it just doesn't matter; you don't need to set up a complicated environment, it is just a change of the label in the FROM line of the Dockerfile to start using Python 3.

38:03 I am very excited about Docker and this whole container world and I appreciate you taking the time to share with everyone.

38:10 Thanks Michael.

38:10 You are welcome Patrick.

38:14 This has been another episode of Talk Python To Me. Today's guest was Patrick Chanezon, and this episode has been sponsored by Codeship. Please check them out at codeship.com and thank them on Twitter via @codeship. Don't forget the discount code for listeners, it's easy: TALKPYTHON, all caps, no spaces. Remember, you can find the links from this show at talkpythontome.com/episodes/show/9, and if you are feeling generous please check out our Patreon campaign at patreon.com/mkennedy to contribute and support the show. And be sure to subscribe to the show. Visit the website and choose subscribe in iTunes or grab the episode's RSS feed and drop it into your favorite podcatcher. You'll find both at the footer of every page. This is your host, Michael Kennedy, thanks for listening.

39:01 [music]
