
#38: Continuous Integration and Delivery at Codeship Transcript

Recorded on Thursday, Dec 10, 2015.

00:00 Have you heard about the works on my machine certification program? It's a really awesome certification for developers. It was created by Joseph Cooney and enhanced by Jeff Atwood (of stackoverflow fame). Here's how it works: Compile your application code. Getting the latest version of any recent code changes from other developers is purely optional and not a requirement for certification. Launch the application or website that has just been compiled. Cause one code path in the code you're checking in to be executed. The preferred way to do this is with ad-hoc manual testing of the simplest possible case for the feature in question. Omit this step if the code change was less than five lines, or if, in the developer's professional opinion, the code change could not possibly result in an error.

00:00 Check the code changes into your version control system.

00:00 Congratulations! You're fully certified.

00:00 On this episode of Talk Python To Me, you'll meet Florian Motlik from Codeship. He's here to tell us all about continuous integration and continuous delivery. Maybe he can help keep you and your team from getting certified, in a bad way! This is episode number 38, recorded December 10th, 2015.

00:00 [music]

00:00 Welcome to Talk Python to Me. A weekly podcast on Python- the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on twitter where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm and follow the show on twitter via @talkpython.

00:00 This episode is brought to you by Hired and Digital Ocean. Thank them for supporting the show on twitter via @hired_hq and @digitalocean.

00:00 Hi everyone. Thanks for listening today. I have a few prize winners to announce before we get to the show. Two weeks ago, JetBrains and I gave away three PyCharm Professional licenses. The lucky winners are

00:00 * Drew Kerr

00:00 * Ben Kelly

00:00 * Seraphim Alvanides

00:00 And in episode #37, on computer security, Justin Seitz gave away a copy of Black Hat Python. And the lucky winner is Arek Fijalkowski.

00:00 Congrats to everyone. If you want to be eligible for these give-aways, all you have to do is be a friend of the show. Just visit talkpython.fm and click 'friends of the show' in the navbar.

00:00 Now let me introduce Flo.

00:00 Florian Motlik is the Co-Founder and CTO of Codeship, a Continuous Delivery Service. While studying and working at the Technical University of Vienna he stumbled into running test infrastructure for other teams and hasn’t stopped since. He sometimes blogs on the Codeship blog (blog.codeship.com) or on his personal blog (flomotlik.me). You can also find him on twitter where he's @flomotlik.

02:59 Flo, welcome to the show.

03:00 Thanks for having me.

03:02 I'm really excited to talk about building software, testing software, automating the whole thing with you. So, it'll be fun.

03:07 Yes, it will be.

03:09 For sure. So we are going to talk about continuous integration, continuous delivery, maybe throw in a little Docker, infrastructure, all those kinds of things. But before we get into these topics, maybe you could give me your story, how did you get into software and programming and this whole thing?

03:23 Sure. So like most people, at some point I started playing games on my dad's old computer. That got me into a little bit of scripting and stuff like that, trying to look a little bit under the hood at like 10 or 11. And then when I was 15, the school I went to had a computer science branch, and there I really started to get into programming and what it actually means to develop software. So that was in school from 15 on, and I've basically loved it ever since. I am originally from Vienna, and I went to university here, the technical university, and studied software engineering. At some point the whole web development side of things really captured my interest, and so web development and infrastructure is what I have done since.

04:21 Oftentimes I ask people how they got into Python, because a lot of my guests are explicitly doing Python, but you guys are doing all sorts of languages. Can you maybe just tell me what areas you work in, what technologies?

04:31 Sure. So in the past I started with the typical Java stack at school and university, so I had been doing Java for a long time. When we started Codeship, I looked at various technologies out there, and since it was a small startup and I was the only developer, I looked into Ruby on Rails for getting stuff up and running quickly. So most of our stack, including our back end and the build infrastructure that we run for customers, was written in Ruby. And then in terms of production systems, we have started moving more and more stuff to Go, so we've re-implemented a lot of the back end and a lot of that stuff is now in Go. That is on the side of developing our own systems. But since we basically run software and run the tests for many, many other people and support a lot of different technologies, I've worked in Python, in Java of course a lot, in Ruby, and many other languages that we support in our stack. So I've been playing with and enjoying the ecosystem of many different languages, but mostly programming in either Ruby, Java in the past, or a little bit of Go now, although I don't get to code a lot anymore, sadly.

05:41 That is unfortunate. So, I've worked for very large companies and very small companies. And I've been in the situation like you just described, where I am the only developer, and I have to say, it's pretty cool to start something new where you are the only developer and you kind of get to look across the entire tech landscape and say, "I can pick anything- where do we start?" That's a really fun place to be. You work for Codeship, obviously, and you guys have been sponsors of the show for a long time, so thank you for that. Why don't you tell everyone what you guys do at Codeship?

06:19 Sure. So, the main idea, just in terms of the message, is that Codeship is a hosted continuous integration and continuous delivery service. The main point there is that when you build software, you want to test it as fast and as often as possible to really know when something breaks- you want to do test driven development, or at least write tests constantly. And then you need a separate system so that whenever you change any of your code, whenever you push anything into the repo, it runs all of your tests, your whole build basically. Doing that on your own local development machine is kind of a hassle and oftentimes it's not done- we are all human, we think, "This small change won't break my whole application." And then it does.

07:00 Yeah, of course it does, you know, maybe it doesn't break on your machine, it breaks on someone else's machine.

07:06 Exactly.

07:10 Have you seen the "It works on my machine" certification program?

07:11 No, I haven't actually.

07:13 So everyone out there listening, you should Google for "It works on my machine certification", it's a really great certification program that basically marks you as a bad developer.

07:24 I have to look at that. No, I haven't heard of that one before. Yeah, that whole thing is a huge problem, and that's where Codeship came from- we ran, or I ran, CI for a lot of companies, and at university before, and it's such a hassle, it's so painful to run it yourself and maintain that stuff when you actually want to get productive work done and work on your customers' projects, or your own project. So that's where it came from: instead of having everybody else go through the boring task of maintaining the test infrastructure, we take that over for them and do it much better for everybody, basically.

08:01 Yeah, that's fantastic. When did you guys get started?

08:05 We started in 2010.

08:06 We talked a little bit about continuous integration. Maybe we could dig into that a little bit more, like what is continuous integration, what are the steps, like what is involved in that?

08:18 Continuous integration is a term that is heavily overloaded. At this point a lot of people, when they talk about continuous integration, especially in small to medium sized teams, are talking about automated testing and running your automated tests on a separate system. So the main goal is really making sure that whenever anything changes in your codebase, whenever you push anything anywhere, it's run by a separate system that makes sure everything works and notifies you that the whole build has been working fine and everything has been working correctly. That's the main goal of the CI system.

08:56 Right. I think a lot of people initially, back when you guys started around 2010, saw that almost as, here is an automated compile step. Maybe we'll run a couple of tests, depending on whether you are working with a group that's actually written the tests, right? And people have been adding more and more to it over time, right?

09:20 Yes, absolutely. I think continuous integration can be seen as more than just running your unit tests. The way I would define it, the important part is that at every point the team should know that your application works as expected for your customers. I think that's the whole goal, and that can be unit tests, it can be functional level tests, it can be deploying into a staging environment and testing against that staging environment, just to make sure it works there.

09:51 I think the end goal- the definition- is really less about the technical aspect of continuous integration. I don't really like defining continuous integration as a technical process; it's more like, this thing needs to make sure that whenever I want to change something in my system, I can actually do that and not break anything for my customers. It should be very customer or user driven. We want to make sure that we don't break anything, because that's actually why we build software: we want users to use the things that we have built, we don't want to break things for them constantly. And we want to move fast- move fast and break things, but it's even better if we just move fast and don't break things.

10:33 Right. Maybe you break them but maybe you don't send the broken pieces to your customers.

10:38 Yes, exactly. Get that information very quickly. And "break things" maybe in terms of features that we remove or when you want to change your product, but not break things as in a 500 error page- that shouldn't be something that's actually happening. I mean, all of us have experienced it: when features break, when things do not work as expected anymore, when there is downtime, you don't like the product as much anymore. It just doesn't feel good at that point.

11:13 And I think continuous integration, and the process of really making sure from a customer's perspective that your application works, is important and can really put a stop to a lot of unnecessary work that your team would otherwise have to do. I mean, we have all been in those situations where we broke something and now the whole team has to jump onto fixing that thing, and it takes so many resources from the rest of the team. If you can catch that early and make sure that it actually works and there is no issue, it puts a lot less stress on the team and there is a lot less extra work we need to do. So that is, I think, the really important part of automating all of this and making sure it works.

11:52 It basically lets you focus on the fun part of software development- writing cool new stuff, not fixing things that weren't broken before.

12:02 Absolutely. There is nothing worse than regressions that come in, where things that I am working on or have been working on are now broken for no particular reason, just because there wasn't anything in place to catch that. So it's really important to make sure that doesn't happen.

12:18 Yeah, I've noticed a lot of really interesting cultural and behavioral shifts that seem out of proportion to the effort of setting up continuous integration. So, for example, I've seen teams that release every 3 months, or every 6 months. To me, that sounds insane, but that is how they are working because they wanted to test it, they wanted to make sure, they wanted to sort of put it in a box and say, "Ok, now we are ready to send out this next thing." You know, maybe the next week somebody will create some great feature, but that's going to come out in 3 months because it's just too much effort to go through the release-test-deploy cycle to justify doing it for this one feature. But if all of that becomes "push a button", or push to a branch and it just happens- well, maybe all of a sudden when that feature is tested, it's ready to roll and you just push it out.

13:12 Yep.

13:12 And so I've seen teams that have sort of gone from releasing these big, painful, slow things- "Hey, we are all going to come in Sunday night when no customers are there, shut everything down, make sure the integration stuff works"- to just much more fluid work. And I think that's great for the software, and it's also great for the people that have to do it, right? It's way more fun to see the work that you do come right out, yeah?

13:38 Absolutely. I think team productivity and team happiness is a huge part of why you automate your whole workflow and everything around it. In the end, the best people on our team- and if you want to build a high class engineering team- they don't want to fix broken things, they want to build interesting, new, cool stuff. So if you have a process that is continuously bringing down the happiness in the team and continuously making the team shift resources between building new stuff, then fixing old stuff, building new stuff, fixing old stuff, that's just going to drive the best people on your team away. And I think that is a very good way to drive the whole engineering team into the ground.

14:15 So I think making sure that automation and our release processes really help strengthen team happiness, and putting those into the perspective of team happiness, is really crucial. That is something we have seen with customers and in our own team: when the tests get unstable or something like that, it's a huge drain on team productivity and team happiness, it takes so much mental energy from the team and it puts people in a bad position. And I think it's not just about the technology, it's not just about the customers- a big part of it is that, but it's also about asking how can I make my team happy by not having them do boring tasks that they really shouldn't have to do on a daily basis.

15:03 So those are some of the things that I think are really important when setting up all the automation. We have seen it ourselves, and we have seen teams- even small teams- whose release process was like a week of going through Excel sheets that had manual test cases in them. They took their main back end engineers and just had them run through those, because they didn't have any more people; the way they were parallelizing all of that was basically getting more people, more engineering people, in and clicking through the application.

15:36 And I mean, that obviously doesn't scale. So often that path is just so set in stone for some teams that they really need help to be taken out of it and shown a different path for how to automate this. I am sure the developers were aware of how this could be done differently, but if you only see it as more time spent by the team, and don't see it as, "Hey, if we do this two more times, these 3 people will just leave- they just won't put up with this anymore," it's much harder to change. Putting it in terms of making the team actually happy is a much easier way to actually do this.

16:13 Yeah, that is a really interesting point. It seems crazy- as soon as you said Excel, I started to get a little bit of a headache- and you don't want to do that, right? That is not what people sign on for. But some places seem to develop this culture where that is how software is built, and it probably comes from a good place, like "we can't break things"- maybe you are a bank or something and if you are down you will lose $100,000 an hour, so you obviously want to avoid that. But things like continuous integration and continuous delivery can really help there.

16:51 Yeah, totally. As we mentioned, we actually work with teams that had exactly that problem- they were contractually obliged not to release at specific times of day. They cannot release from 10 in the morning to 6pm or something like that; it's just in the contract that you cannot release at that time. You are not allowed to. So there is no continuous delivery for them, or at least not directly to production, because that is what the business requires.

17:21 But what they have done is simply release to a staging environment- continuously have something that is pretty much exactly what production is, release there on a continuous basis, and then run tests against that staging environment to make sure it actually works. So they can release during the night or in the very early morning hours and actually know that it works, and they are fine with that. So I think there are different ways to set that up and make it work that teams can use.

17:49 But it's really about how can I make my customers happy, how can I make my team happy, and how can I get the most productivity and focus out of my team. Because in the end, everything is basically software: Uber is taking over the taxi industry, Netflix and Amazon are basically taking over the TV industry. A lot of industries that in the past had a much stronger footprint in the real world are now being taken over by software companies that are just intermediaries between different people selling stuff. And since everything is software, competition moves a lot faster in all of those fields, because with AWS and all those other clouds it is so much easier and cheaper to get started with things.

18:40 And then we have so little talent- there are so few engineers out there, everybody is constantly hiring, everybody is constantly trying to find new people- that it's just crazy to spend your engineers' time on something that some workflow or automated system could do for you. Because you need to get faster; you cannot slow down, you are just not allowed to slow down in the software world anymore. Otherwise somebody else will come in and just eat your lunch.

18:40 [music]

18:40 This episode is brought to you by Hired. Hired is a two-sided, curated marketplace that connects the world's knowledge workers to the best opportunities.

18:40 Each offer you receive has salary and equity presented right up front and you can view the offers to accept or reject them before you even talk to the company. Typically, candidates receive 5 or more offers in just the first week and there are no obligations, ever.

18:40 Sounds pretty awesome, doesn't it? Well did I mention the signing bonus? Everyone who accepts a job from Hired gets a $2,000 signing bonus. And, as Talk Python listeners, it gets way sweeter! Use the link hired.com/talkpythontome and Hired will double the signing bonus to $4,000!

18:40 Opportunity is knocking, visit hired.com/talkpythontome and answer the call.

18:40 [music]

20:17 Yeah, that's for sure- speed is definitely a foundation of your business itself, right? You mentioned Netflix- those guys have tremendous amounts of software and cloud deployment and whatnot driving their whole business, and they have the policy that as an engineer you can release whenever you want. You decide, "yes, it's done," out it goes, and they have all the processes in place for that. That kind of stuff becomes much more doable with continuous integration and delivery, right?

20:53 Totally agree. I think it's about having those processes in place- for one, putting trust into engineers, but also having things in place where maybe you just want to release something to 1% of your customer base to try it, and giving those tools to your engineers. At Netflix's scale, where you can divert 1% of your customers, or your requests, to that specific new release, that's obviously a lot more doable than at other people's scale. But in general, teams should really think through how to support their team in testing and experimenting with new things while moving faster, and continuous integration and continuous delivery are definitely an important part there- to make sure you can actually release, and on the other hand that you don't break stuff for your customers, and you don't break the buy buttons, so customers can actually still give you money.

21:42 That's right. I think the other thing that really helps people have confidence in moving faster and just going with it is the ability to undo something, right? The ability to say, "Oh, that's bad- something broke, we thought we had this system in place, turns out not quite everything was caught," and just push a button and roll it back, rather than, "Oh, we are going to be down for a day if we break this thing."

22:08 Yeah, absolutely. What we have seen often with continuous delivery, once people get to the point where they can release so quickly, is that when you release small patches at a time, oftentimes people don't really roll back- they just push forward. Because you just released something that is incredibly small, so if there is a bug out there you find it immediately, it's typically pretty quick to fix, and you push that new fix out there. So that is something we have seen a lot of teams move toward in their process. You definitely need a way to roll back, that definitely needs to be in place, but we have seen a lot of teams that just push forward and deploy a new version very quickly, because you only deployed such a small thing that it's really easy to find what the actual problem is. You didn't deploy 3 weeks of work where, if something breaks, you have no idea where it happens- a combination of 5 different things coming together and breaking your application- when it's actually just a small thing. So because you are pushing so often and so fast, you can actually push even more often and faster, because you can find stuff so quickly and so easily.

23:16 Yeah, I think that's a really good point. You talked about releasing in shorter cycles and catching bugs early. I think the way people run into trouble, the way deployment becomes not fun at all, is you save up all of your work for 3 months, then you put it out on the real system, and then you figure out what works and what doesn't, and when you find these problems you have to go back and completely analyze the whole system, right? What did you do the last 3 months? Well, let's start there. If I've been working for an hour and I put something out right then, you would probably know exactly what it is- just like you said, it's super easy to fix, right?

23:56 Yep, totally. And that applies even to teams that have those 3 month release cycles, where maybe the business for some reason can only ship every 3 months- then you need to find a way to push it to a staging environment, push it to some environment like that, push continuously somewhere, make sure that that somewhere runs correctly, and have some sanity checks: write a few test cases that just go through the happy paths and one sad path of the application. A happy path and a sad path are basically a test of something that should work and a test of something that shouldn't work.

24:32 And even though you know you are only releasing this in three months, you know that all the changes that have been happening over those three months have been pushed somewhere, they have been tested somewhere, somebody has looked at them on a continuous basis- so once the release comes there is no surprise anymore, there is nothing new coming, it's all been running somewhere. And I think that is something every team can do. Maybe somebody has some exotic case where this is absolutely not possible for some reason, but for the vast majority of teams it should not be too hard to do, and most people should be able to do it in some way or another. Just get that feedback as early as possible, make that feedback loop as small as possible, and keep getting better.
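
For listeners who want to try the happy path / sad path checks Flo just described, here's a minimal sketch using pytest and requests against a staging environment. The staging URL and the /login route are invented for illustration; they aren't from the show or Codeship.

```python
# sanity_checks.py - a tiny sketch of "one happy path, one sad path" checks,
# meant to run against a staging environment after every push.
# The staging URL and the /login route are placeholders.
import requests

STAGING = "https://staging.example.com"

def test_happy_path_login_works():
    # The thing that should work: a valid login comes back successfully.
    resp = requests.post(f"{STAGING}/login",
                         data={"user": "demo", "password": "demo"},
                         timeout=10)
    assert resp.status_code == 200

def test_sad_path_bad_login_is_rejected():
    # The thing that shouldn't work: a wrong password is refused, not accepted.
    resp = requests.post(f"{STAGING}/login",
                         data={"user": "demo", "password": "wrong"},
                         timeout=10)
    assert resp.status_code in (401, 403)
```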

25:19 I find that when people say it's not possible, that's usually some kind of sign that something is not quite put together right in the app. "There is no way to test this part because it calls into this thing which goes into this whole infrastructure which does that"- maybe you need some way to break your app up: some design patterns, dependency injection, things like that which actually allow you to break your app into small pieces that you can test and should be testing.

25:47 I agree. If it's too complex to test it, then it's probably just too complex for anything.
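
As a rough illustration of the dependency injection idea Michael mentions, here is a tiny, self-contained sketch; the ReportService class and its data source are invented for this example, not taken from the show.

```python
# A minimal sketch of dependency injection making a component testable in isolation:
# the data source is passed in, so tests never touch the real database or API.
class ReportService:
    def __init__(self, fetch_orders):
        # fetch_orders: any callable returning an iterable of order dicts
        self._fetch_orders = fetch_orders

    def total_revenue(self) -> float:
        return sum(order["amount"] for order in self._fetch_orders())

# In production you would inject something that talks to real infrastructure;
# in a test, a plain in-memory function is enough.
def test_total_revenue_adds_up():
    service = ReportService(lambda: [{"amount": 10.0}, {"amount": 5.5}])
    assert service.total_revenue() == 15.5
```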

25:52 Yeah, that's a great way to put it. I have my own opinions on this, but I'll get yours- is there a size of team where you should start doing continuous integration? If there are only two of us, should we do this, is it worth it? And is there an upper bound, like if you have 200 people it's just too crazy, too complicated?

26:14 I'm obviously totally biased in that regard, but I think you should do CI at every level, at every team size. It's just that the approach to CI becomes different at different team sizes. Especially if you are one or two people with an early product, just a proof of concept or something, you might just want something where you write a few tests- and then it depends on whether you have customers or not- but in the end you don't want to break stuff for your customers, especially not your early customers. Because even though they expect it to break, they are the ones that can be the most committed evangelists for your product, so if you break it for them you basically lose your whole marketing team in the early days. The evangelists, the outside users and customers, your early users- they are your marketing team.

27:04 They are the ones who are going to talk to everybody else about your product, so you don't want to break it for them constantly. So what you can do there, instead of writing a million small unit tests, is make sure that you have like 10, 15, 20 very high level tests that make sure the most important workflows of your application aren't broken. So people can sign up, people can do whatever the main business function of your product is- the main things. Those can be easy to write, and they can be at such a high level that they don't break constantly even as you change your application; they just make sure that your customers, especially early users, can do at least the most important stuff on a constant basis.

27:43 If some small feature on the site gets broken at some point, they will miss it and they will tell you, but it's probably not going to break everything for them, and they are probably not going to leave the product or stop evangelizing for you that quickly. But you should write enough so that at least the most important, main paths of your application are covered, so that you can push faster. In your early days you want to move so fast, you want to push your product out the door as quickly as possible- and then if you push something out the door that breaks other features and you don't know it, you only find out after a couple of days and you don't know exactly what broke, and it takes so much time away from doing something productive. So spending a little time on writing good tests, from the right perspective of the user, at a very high level, actually makes a lot more sense.

28:36 I think the larger the team gets and the larger the codebase gets, the lower in the stack you want to go. Maybe in the beginning you just write tests with Selenium or something, where you point a browser at your application and click through it automatically to verify that things work. At some point you want to do more unit test level stuff, more low level tests, more integration of different components, so you actually know that these things work at a relatively low level as well and not just at a high level. But you still want to keep that high level tested very well, because in the end that is what customers care about- a customer doesn't care whether the unit test for a validation on a model passes or not. That's not important to them. The important part is: does the workflow work, does the product work, does the feature work. So that needs to be tested first, in my opinion.

29:30 And once you have tested those few happy and sad paths at the upper level, then you can go, "Ok, I've tested that the validation generally fires and an error message gets shown in the web UI, so now let me write ten more tests for all the different validations that we run on the model before we save it." Because I don't want to test all of them at a very high level- that would be really slow. So you want that mix of slow, very high level tests to make sure that from a customer's perspective it works, and then lots and lots of small, low level tests that verify specific things that you wouldn't want to verify at the high level because it would just take too long. That is at least the mix that we found works pretty well and that really catches a lot of problems early on that might impact your customers.
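
Here's a small, self-contained sketch of that mix: one workflow-level test from the user's angle plus a batch of fast validation tests. The toy signup code is invented purely so the example runs on its own; it isn't Codeship's code or anything discussed on the show.

```python
# test_mix.py - the "few high level tests, many fast low level tests" idea with pytest.
import re
import pytest

# --- toy application code, just enough to test against -----------------------
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_email(email: str) -> bool:
    return bool(EMAIL_RE.match(email))

def signup(email: str) -> dict:
    """Pretend view: returns what a signup page would render."""
    if not validate_email(email):
        return {"status": 400, "body": "Please enter a valid email"}
    return {"status": 200, "body": "Welcome!"}

# --- one high level test: the signup workflow from the customer's perspective -
def test_signup_rejects_bad_email_and_tells_the_user():
    response = signup("not-an-email")
    assert response["status"] == 400
    assert "valid email" in response["body"]

# --- many fast low level tests: each validation rule on its own ---------------
@pytest.mark.parametrize("email", ["", "no-at-sign", "a@b", "two@@ats.com"])
def test_validate_email_rejects_bad_input(email):
    assert not validate_email(email)

@pytest.mark.parametrize("email", ["flo@example.com", "a@b.co"])
def test_validate_email_accepts_good_input(email):
    assert validate_email(email)
```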

30:24 That's a really pragmatic and powerful way to look at things. I guess the way I would summarize that is: do not let the perfect be the enemy of the good. Don't let the goal of saying "we must have 100% test driven development, we must have 95% code coverage" push you to the point of saying, "well, we have to go faster, so we are just going to do zero."

30:46 Yeah, totally. It's a misconception that not writing tests makes you faster. It only makes you faster if you don't count the time where you actually have to fix things; if you do count that, it is just not going to make you any faster- it's going to make you slower, in my opinion and in my experience. That's the experience we've had in the teams I've worked in and with many different customers that I have talked to and worked with: you have to do it the right way. And I have seen teams that have grown out of early teams and written thousands and thousands of low level unit tests, but their applications still broke constantly. That's not really the right way to go. In my opinion the right way is to really see it from a customer's perspective- whatever the customer needs, make sure that it's validated well and works well; that's what you should focus on.

31:41 Yeah, that's a good point. My rule is, when I look at the part that I am trying to test, or think about whether I should write a test for it, I ask: is the customer actually paying me for this part of my application? Is this my core business- is this the trading, if I work at a stock company, or the continuous integration execution part if I work at Codeship- or is it something like, does the "about our team" page work or not? You know, I don't really care about that and I am not going to put energy into it, but if people are paying me, my company, for that part, then it probably should have a test.

32:19 Yeah, exactly, and for a lot of things it can be as simple as opening up the page and seeing if it has an error. That is something we do, for example- when we deploy something new, we deploy to a staging environment and just simply open up the application. And that has captured problems- not a lot, but from time to time it catches something where maybe we have done something wrong in the configuration and the whole web app just doesn't load. It really pays off to just load one single page of the application, just to make sure this thing actually boots when we push it somewhere. And that has definitely caught a couple of cases where we would have pushed something up that would have gone bad.

32:59 And it was very easy to catch, very easy to fix and deploy, and the customer never saw it. I think the about page is a good example, because you wouldn't put a lot of effort into setting up infrastructure just to test your about page- that infrastructure should already be there because you need it for many other things as well. Making sure that your about page actually loads and can be shown is something that shouldn't take more than a few seconds to implement in a test, but it's critical: if somebody looks up your website and the about page breaks, that's not a good sign, especially for a tech product- I would at least expect the about page to load. It should be easy and fast to test that, and there are a lot of different ways to do it, but your developers should not have to spend a lot of time on it.

33:56 Yeah, I agree, and like you said, just making a request to the page and seeing that a 200 rather than a 500 comes back can actually do quite a bit.

34:05 Yeah, that catches a lot of stuff.
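
A post-deploy smoke check like the one they describe can be just a handful of lines. This sketch uses the requests library and made-up staging URLs.

```python
# deploy_smoke_check.py - does the freshly deployed app boot and answer with a 200
# instead of a 500? The URLs below are placeholders.
import sys
import requests

PAGES = [
    "https://staging.example.com/",        # home page
    "https://staging.example.com/about",   # the "about our team" page
]

failed = False
for url in PAGES:
    resp = requests.get(url, timeout=10)
    if resp.status_code != 200:
        print(f"Smoke check failed: {url} returned {resp.status_code}")
        failed = True

sys.exit(1 if failed else 0)
```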

34:05 [music]

34:05 This episode is brought to you by Digital Ocean. DigitalOcean offers simple cloud infrastructure, built for developers.

34:05 Over half a million developers deploy to DigitalOcean because it's easy to get started, flexible for scale, and just plain awesome.

34:05 In fact, Digital Ocean provides key infrastructure for delivering Talk Python episodes every day. When you (or your podcast client) download an episode, it comes straight out of a custom Flask app on Digital Ocean and it's been bulletproof.

34:05 On release days, the measured bandwidth on the single $10 a month server jumps to 900 Mbit/sec for sustained periods with no trouble. That's because they provide great servers on great hardware at a great price.

34:05 Head on over to digitalocean.com today and use the promo code TALKPYTHON to get started with a $10 credit.

34:05 [music]

35:21 In the early days, back in 2008, 2009, 2010, it was all about continuous integration and making sure that we can check in our code, that if we are using a compiled language it compiles, and if we are writing tests, the tests run in an automated sort of way. But since then people have been moving on to turning this whole system into an actual deployment mechanism, right? Like, I can check in and magically stuff is in production- and that's called continuous delivery, right?

35:48 The main idea with continuous delivery is that you keep your application in a state where it either can be or is deployed somewhere constantly. And I think that can be through CI, that can be by deploying it to a staging environment, but that can also be by just pushing it out there on a continuous basis.

36:07 How do I go about that? I totally know how I check in to git and set up a continuous integration server to pull that and run my tests, but if I wanted to do continuous delivery, how would I actually make changes to my server on AWS, for example, or Digital Ocean, something like that?

36:26 To take one quick step back- something that we find really important for continuous delivery in general is how you trigger those changes. The whole deployment part of continuous delivery really needs to be fully automated and done by the CI / continuous delivery system for you, because in the end, even if you deploy all the time, you don't actually want your developers to think, "ok, now I actually have to deploy this."

36:58 What you want them to do is: here is code, I am merging code into master, now do stuff with it. You want them to focus on the code all the time; on a daily basis you don't want anybody to have to think about deployment strategy, which script runs which way, where do I get answers- all that kind of stuff. So the important part, the way we see it, is to trigger it through the repository. We think the repository is a really great way to capture the intent of releasing something. If you merge something from a feature branch into the master branch, there is an intent to test it but also to put it somewhere. So you can use that and say: whenever you are on this specific branch, run all the test commands and then also release to production.

37:49 But at that point the developer can already be working on something completely different, already on the next task, working on the next thing- they don't have to think about this at all. I think that is a core part of continuous delivery, because you really only want your developers to focus on code on a daily basis. That also helps with a couple of other things, because it makes it easy to move between different parts of the infrastructure. For example, it doesn't matter if you want to release changes to your DNS system or your main web application- if all you need to do is merge code from a feature branch into the master branch and then it gets released, that's easy to understand for anybody, and anybody can go into any part of the infrastructure or application and actually do it.

38:36 And from there, it depends on the specific technologies- whether you want to use a platform as a service like Heroku or AWS Elastic Beanstalk, or your own servers: different ways, different tools, different abilities. But the important part is really capturing it in an automated fashion, in a way where it just happens automatically for developers. They know it happens, but they don't have to consciously think about it; at some point that merge button on GitHub in the pull request just means you are merging code, and then you are off to the next thing and you don't actually think about the deployment at all anymore.
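
To make the "merge to master means release" idea concrete, here is a rough sketch of a single pipeline step. The CI_BRANCH environment variable and the deploy.sh script are assumptions for illustration, not a description of Codeship's actual setup.

```python
# ci_step.py - every push runs the tests; only the master branch gets deployed,
# so developers just merge and move on to the next task.
import os
import subprocess
import sys

branch = os.environ.get("CI_BRANCH", "")  # assumed to be provided by the CI system

# Run the whole test suite on every branch; a failure stops the build here.
subprocess.run([sys.executable, "-m", "pytest", "-q"], check=True)

# Only merges into master go out to production.
if branch == "master":
    subprocess.run(["./deploy.sh", "production"], check=True)  # hypothetical deploy script
```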

39:13 Yeah, that's really cool. Deploying modern software can involve a lot of infrastructure and a lot of pieces, especially if you are doing microservices- it can involve a whole bunch of them. So if you can automate that, even if it takes you a week to automate something that normally takes you half an hour per deploy, the speed you get on the other side- the ability, like you said, to merge to a branch and just magically have new software out- that's really cool, right?

39:40 Totally. I think it's more about that focus and productivity for your engineering team- that's something that's always important. It's really important that the engineering team can focus as much as possible on getting new stuff out there and building features and product for customers. Anything else that is in the way- manual testing, thinking about the release process in terms of how do I actually get this out the door- that's all wasted time.

40:08 That just shouldn't be. And it's not just the wasted time for existing engineers, who already know all of that- when you onboard a new engineer and you have to walk them through every single way to deploy all the different systems, that's a huge time investment. Instead you can just tell them, "You open up a feature branch and then merge it into the master branch," and technologically it's just a Django or Rails application, you know how to work with that. Our infrastructure might be a bit different from what you have seen in the past, but you don't really have to worry about that at all- just write your code and merge it in, and over time you will learn more about how the whole system actually works.

40:52 And I think it takes away some of that stress of onboarding new people. A lot of companies have this wish that somebody new should release something on day number one- which I'm not the biggest fan of, because it puts a lot of unnecessary pressure on everybody to release on day one, but at least within the first week. It removes a lot of the problem of showing somebody how all that stuff actually works; instead you tell them in 5 minutes how the workflow works, and then they can do it anywhere- from, as I mentioned, the DNS system to your main application to your back end application to whatever.

41:31 Yeah, it makes a lot of sense. You talked about having different levels of tests and running them at possibly different times- obviously you would want to run your tests before you push through the continuous delivery path. Do you see people having a different set of tests for a regular feature branch check-in versus this final step- would people run more tests there if their tests were slow or something?

41:57 Sometimes. I think that's mostly due to time constraints, though. That's one of the challenges we try to solve for them- basically giving them a system that is so fast that they can run everything in parallel. But yeah, we definitely see that, where people run maybe their unit tests on every feature branch and run the full integration or functional testing suite only on the master branch; that definitely happens. But from what I have seen, I haven't seen a team that says "I don't want to run all the tests on my branch"- it's more "I can't run them there because it's just too slow." So a lot of it just comes down to optimization in the test suite: making it faster, making it more parallelized, all that kind of stuff.
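
Here's a rough sketch of that split- fast tests on every feature branch, the full parallelized suite on master. The pytest "slow" marker and the branch variable are conventions chosen for this example, not anything Codeship-specific.

```python
# select_tests.py - run only the fast tests on feature branches, everything on master.
# Assumes slow integration/functional tests are tagged with @pytest.mark.slow.
import os
import subprocess
import sys

branch = os.environ.get("CI_BRANCH", "")

if branch == "master":
    # Full suite, spread across CPU cores with pytest-xdist (pip install pytest-xdist).
    cmd = [sys.executable, "-m", "pytest", "-n", "auto"]
else:
    # Feature branches: skip anything marked slow so feedback stays within a few minutes.
    cmd = [sys.executable, "-m", "pytest", "-m", "not slow"]

sys.exit(subprocess.run(cmd).returncode)
```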

42:44 Yeah. If it takes two hours to give you feedback even on your feature branch, maybe it's better to save some of that for later, so you know right away whether it's mostly good, right?

42:52 Yeah, exactly. What we have seen is that a team should try to get their whole build on the feature branch- not including deployments on master, just running their tests- down to like 2 or 3 minutes. That's below the level where people actually have to think about it. When I think about it as a developer: I code something locally, I commit it, I push it into the repository, and if I get an answer about whether that works or not in 2 or 3 minutes or less, I don't actually think about it again. I'm already coding on the next thing, and at some point a notification pops up and tells me, "Hey, everything worked on that thing you just pushed," or a notification pops up and says, "It didn't work," and you have to fix something you broke on that feature branch.

43:37 While I'm working on the other thing I'm not thinking, "Hey, did this already finish, or is this still running?"- that's the 2 or 3 minute level. Anything beyond that and the problem is that you are sitting there, coding on your stuff, thinking, "Hey wait, did my build already finish? Let me just look that up." So you go to a browser, you open up the build page, and you've just lost a little bit of time, you've lost focus from your development effort. And over time, if you have a large team with a lot of people who have to do this constantly, that is a lot of focus lost from actually developing.

44:13 So our goal is really to get any team of any size to a point where you can run the builds in this 2 or 3 minute window, where you are not actively thinking about the results of your build- you just constantly keep working and get notified, "hey, this thing worked, and this thing worked, and this thing worked," without having to go back. I think that's something people really should strive for. Again, it all comes down to productivity and focus for your engineers: anything that takes even a little productivity and focus away from your engineers is something that actually hurts your bottom line. And so you have to get rid of that.

44:55 Yeah, there is a lot of talk in programming about flow and focus and getting into the zone, and these things can really kill that if you are still half thinking, "Well, the thing that I did- is it actually ok, or can I go on assuming it's ok?"- versus knowing, because the build finished, that you are fine, right?

45:11 Exactly. And if it takes 15 minutes, you are just going to open up the page. And then you open the page, you wait for the build to finish, you open up reddit or something- all that kind of stuff is just taking away that focus, and I think it's really important to keep that flow, to keep that focus going.

45:28 Sure. So what's the longest, or the craziest build you have seen running?

45:35 We've had builds run through the night, and at some point we just stopped them because they took too long. There are definitely builds that take hours and hours, and a lot of it- especially with the teams we work with, which are not huge enterprises running 48 hours straight at 100% CPU usage- often just comes down to optimization. Oftentimes teams build their test suite and it gets progressively slower; no team starts out with a 2 minute build and then 3 weeks later it's at an hour. It just happens over a longer time: it's 2 minutes, then it's 3 minutes, then 5 minutes, 8 minutes, 15 minutes, and it gets slower and slower and slower, and the team never really makes the decision to look into it and make it faster.

46:25 So I've certainly seen teams where the tests have been running slow, but they've only been running slow because nobody really looked into it. That was the case with us about a year ago: one of our engineers joined who is really strong at optimizing test suites- he's been doing it for a while- and he got our test suite down from 15 minutes to 3 or something. And we are pretty good with test suites. He just invested the time, the experience and the knowledge. But that really helps: spending time on actually making a test suite faster, seeing where it's slow and how to make it faster. I think as a team you should set an SLA on your test suite, and if it takes longer than that SLA, you put resources on it to make it faster.
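
One lightweight way to keep an eye on that SLA, sketched here with pytest hooks; the 180 second budget and the warning-only behavior are choices for this example, not a prescription from the show. Running pytest with --durations=10 also helps find the slowest tests to attack first.

```python
# conftest.py - flag the run when the whole test session blows the agreed time budget.
import time

SLA_SECONDS = 180  # the "2 or 3 minutes" budget discussed above; pick your own number

def pytest_sessionstart(session):
    session._sla_start = time.time()

def pytest_sessionfinish(session, exitstatus):
    elapsed = time.time() - session._sla_start
    if elapsed > SLA_SECONDS:
        # Loud warning; a team could also force a non-zero exit here to fail the build.
        print(f"\nWARNING: test suite took {elapsed:.0f}s, over the {SLA_SECONDS}s SLA")
```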

47:20 Because otherwise, if you just progressively let it get slower and slower and slower, it will stall the process. Continuous delivery really relies on your test suite being fast, because if it's not, your developers constantly have to wait for results. There is nothing worse than having to wait for the build result to come in before you can open up a pull request and ask other people for feedback- or maybe somebody has already looked at it and given you feedback, but then the build comes back half an hour later and tells you it actually failed, you have to re-code it in some way, and then they have to do another pass and another code review. That just takes so much productivity away from you.

48:05 Yeah, and you wait again, that 15 minutes.

48:07 Yeah, exactly. Having a strict rule that our build should be that fast, and then dedicating resources to it, is so much cheaper than just putting the resources on features and waiting- it's not going to magically become faster, it's just going to block your team like crazy all the time, and it's so frustrating when the test suite is really slow that you really should dedicate resources to it. Developer productivity is really important. Developers are really expensive and it's really hard to get new ones, so you should make sure that your team is productive, your team is happy, and they can actually work on the most important tasks. And yeah, making test suites fast is definitely something that pays off very quickly.

48:53 Yeah, totally agree. When you are talking about continuous delivery, putting the new version of the source code or the compiled bits on the server and making that run is pretty straightforward. But it usually gets really tricky when you are talking about databases. What do you see people doing there?

49:11 Putting a lot of it into services. That is a relatively consistent theme we have seen- maintaining databases is really hard. Whenever you store state, that's really hard to continuously deliver, and so a lot of teams, and that is certainly us as well, just push that off to a service and let Heroku in our case, or Amazon, deal with it, because they have the infrastructure and the knowledge to actually do that. If you are running your own database system it's definitely possible- with a follower, doing some level of replication and then switching that over- but it's generally really hard.

49:54 And when I look at how often we deploy our web application versus how often we would have to upgrade our database, it's not really that big of a deal- you don't have to deploy the infrastructure parts and the parts where you store state that often, compared to the things that don't store data, which you can push constantly. So a lot of teams don't even have to spend a lot of cycles on "how do I continuously deploy my database," because oftentimes it's fine to do it on a Sunday morning in a one-hour window where you just do a switchover and then stop the old one.

50:35 So again, don't let perfect be the enemy of good. Really focusing the continuous delivery on the parts that you can control very easily- mainly everything that doesn't hold state- makes sense, and then putting the other systems into a service, like Heroku or RDS. At least for us, I'm not in the business of maintaining databases, that's not my product; I just need a database, so if somebody else can provide a really good product around the database, that's totally worth it for me so that I don't have to do it, and again, I can put my developers on a path that is much more productive for customers and for the business. So that's how we typically try to deal with that.

51:23 But for most companies, if you really have to do zero downtime database updates, you are probably in a position where you have enough resources to do that. Because if you are really small but you have to do zero downtime database deployments, something about the business sounds a little bit off- that's really hard to do. If you have few customers and you are a small team, and the customers don't pay you a lot but still require really hard technical things like zero downtime deployments of your databases, that's not easy- you should totally get more money from them. On the other hand, don't worry too much about it if you can get away with an early Sunday morning update. It's just really hard for small teams to do that kind of thing.

52:16 Yeah, it definitely is. So one of the things I saw you guys talking about is Docker. And I think Docker is going to become increasingly important in the whole way that we host and maintain and evolve our applications. What's the story with Docker?

52:31 So generally, I fully agree. Docker, and containers in general- very low level, easy to use virtualization- are a big part of how we are going to build infrastructure in the future. For me, a lot of what's important is that the system can maintain itself: you easily push out a container that contains everything needed to actually run your application, which also makes the separation between developers and operations much easier, because the developers can fully control what's actually running as part of their application.

53:08 But it's still fast, and that's really important. On the test and continuous integration side it's really key for us in the future- we have just rebuilt our whole system to run on top of Docker, and the main idea there is that developers should be able to fully control their build environment. Each repository is different; each repository and each application needs a different setup, and Docker can definitely provide that. And that's not limited to Codeship- many people are using it in many different ways in their test and CI setups- but it's really about giving the development team a lot more power over the infrastructure that the code is running in.

53:50 There is a clear separation for the operations team: how to run the whole container is just a clear interface, and whatever is inside of it, the development team has full control over. And I think that separation of concerns- that you don't just throw code over the wall and let the operations team or somebody else deal with it- is really key. Then you can do a lot more automation around it: a lot more health checks, a much easier way to run several instances and get things up and running again when something goes down, really making the most of the resources in your machines. So I think it's less about just the technology of Docker, or Docker as a company- I think they are doing great, and they are providing a great product.

54:36 The more important part is that it really enables developers and the whole team to take full control over their environment and work in a different way, where it's not just "let's update those 3 servers that we are running" but "let's push this artifact into our environment, and then our environment decides where, when and in how many instances this artifact runs." That's really critical: you have a lot more separation of concerns, you don't have to think about everything all the time, there is a system taking care of a lot of this, and nice abstractions in place to make it run and work.

55:17 Yeah, it's cool. It seems like Docker and continuous delivery go well together, because now you are almost able to deliver the infrastructure as part of this, right? Here is the Docker container for this version plus my code, right?

55:30 Yeah, absolutely. You just have a bunch of servers running, and those can be in the cloud or on your own infrastructure, and all that layer provides is a way to run arbitrary Docker containers. I think that's really powerful because, again, the development team can take full charge. No operations person has to know which version of Java runs in there, and they don't have to be involved in your Python or Ruby or whatever. They don't have to be involved in updating that, or in making sure that all the different projects running as part of the company use the same Java version because otherwise stuff will break all the time. It totally frees every team, every repository, every developer to use whatever is best for that specific problem, and not have to deal with all the different complexities of running many different applications on the same kind of infrastructure.
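To make that concrete, here is a minimal, hypothetical sketch of the kind of per-application container definition being described; the base image, file names, and start command are illustrative assumptions, not anything specified on the show.

    # Hypothetical Dockerfile: the application pins its own runtime version,
    # so no shared server needs a company-wide Python (or Java) install.
    FROM python:3.5-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    COPY . .
    CMD ["python", "app.py"]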

56:20 And virtualization in the past has definitely gone a long way there, but it takes a lot more resources to do that, and with Docker and containers it takes a lot less. The underlying technology is proven: Google runs their systems on it, Facebook runs their data centers on it, so I think we can say that it scales to whatever we need. I think we just need the tools to make it scale down to small teams, and that's what Docker is providing great tooling for, so that we can use containers for a lot of different stuff, a lot of different workflows, a lot of different processes, in our production systems but also in our whole build environment. So I think that's where it's really about control: giving the developers the utmost level of control over everything they are running as part of their application, and not having other people either take away that control or have to worry about it at all.

57:18 I think you are absolutely right. Docker moves a little bit more of that control back to the software team and moves it into this continuous integration story, right? So we can build out these Docker containers and say, "Look, if we are going to switch from Python 3.4 to 3.5 and use some of the new features there, I don't have to coordinate with the infrastructure team to make sure the right version of Python is on the servers we are going to push to when we do continuous delivery." We just say, "We are going to push a different version of the container and boom, it works." Right?

57:52 Yeah, exactly. And maybe on a different branch you want to test against the new version of Python: on master you are still running against the current version, but on the feature branch, because you want to upgrade, you are running against the new version there, so you want the whole build environment and the whole deployment environment to run on that different version. Just change it in the config file and use a different one.
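A rough sketch of how that per-branch switch could play out, assuming the runtime version is pinned in a Dockerfile checked in next to the code (the branch and image names here are hypothetical):

    # master's Dockerfile says "FROM python:3.4"; the feature branch's says "FROM python:3.5".
    git checkout master
    docker build -t myapp:master .        # build environment uses Python 3.4

    git checkout feature/python-3.5       # hypothetical upgrade branch
    docker build -t myapp:py35 .          # same command, now builds against Python 3.5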

58:14 That's a really interesting component to it as well, because as you move forward you are sort of moving forward in time on your infrastructure and all the other pieces as well, on your branch. So if you need to go back one month and run code from back then, maybe to fix a bug that someone reported on an older version of your code or something, to really do that genuinely you have to somehow roll your system back to that level, right? But if the Docker container specification is part of that check-in history, you can literally roll back to what you built it on, right?

58:47 Yeah, absolutely. You can have the exact same environment. That is why we built the new system on top of Docker, to be able to support exactly that; it should be easy. You should be able to roll back your build workflow, your code, the environment that you built in, you should be able to roll back everything and have a complete history of everything that happened as part of your build and even your production system. Because only then, as you mentioned before, if you want to do a rollback, if you want to fix something, look into something, debug something, there isn't really any other way to do it than to go back historically and really look up what exactly happened there.
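A minimal sketch of what that kind of rollback might look like in practice, assuming the Dockerfile lives in the repository (the tag and image names are hypothetical):

    # Check out last month's code; the environment specification comes with it.
    git checkout v1.8.2                   # hypothetical tag from a month ago
    # Rebuild the image from the Dockerfile exactly as it was checked in then.
    docker build -t myapp:v1.8.2 .
    # Run that old version in its own environment to reproduce and debug the bug.
    docker run --rm myapp:v1.8.2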

59:25 Yeah, yeah, very cool. We'll see more Docker in the future, I'm sure.

59:29 I'm very sure, yeah.

59:31 Yeah, definitely. Ok, so we are getting kind of to the end of the show, do you have a final call to action, or something you would like to make sure people go out and try or do?

59:41 So, two things. If people are interested in a Docker based CI system, give us a call, we are happy to show you guys what we are doing. But beyond Codeship, I think we really need to make sure that as a community we focus more on the productivity and focus of our engineering teams; that needs to be number one. Customer success obviously, but that is driven and achieved through the productivity and focus of our engineering team. So I think we need to deeply analyze, on a daily basis, how our engineering team is doing things: what can we change, what is in their way, how can we improve the processes, what can we automate, how can we automate operations more, how can we automate health checks so that there is no downtime and no issues or problems? How can we do all of that so they can actually focus on building stuff for our customers, the stuff that actually makes us money and is interesting new technology? So I think that is really key and really important, and that would be my main takeaway for people: really focus on those processes so that your team can focus as much as possible on building stuff for your customers.

01:00:58 That's great, I really love the focus on team happiness because I think it often gets lost in the technical bits of this story, right.

01:01:08 Yes, absolutely.

01:01:10 Flo, it's been great talking to you, thanks for being on the show.

01:01:13 Thanks for having me, it was great.

01:01:14 You bet, talk to you later.

01:01:14 This has been another episode of Talk Python To Me. Today's guest was Florian Motlik and this episode has been sponsored by Hired and Digital Ocean. Thank you guys for supporting the show!

01:01:14 Hired wants to help you find your next big thing. Visit hired.com/talkpythontome to get 5 or more offers with salary and equity right up front and a special listener signing bonus of $4,000 USD.

01:01:14 Digital Ocean is amazing hosting blended with simplicity and crazy affordability. Create an account and within 60 seconds, you can have a Linux server with a 30 GB SSD at your command. Seriously, I do it all the time. Remember the discount code - TALKPYTHON

01:01:14 You can find the links from the show at talkpython.fm/episodes/show/38

01:01:14 Be sure to subscribe to the show. Open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes and direct RSS feeds in the footer on the website.

01:01:14 Our theme music is Developers Developers Developers by Cory Smith, who goes by Smixx. You can hear the entire song on our website.

01:01:14 This is your host, Michael Kennedy. Thanks for listening!

01:01:14 Smixx, take us out of here.
