#43: Monitoring high performance Python apps at Opbeat Transcript

Recorded on Wednesday, Jan 13, 2016.

00:00 What does it take to track detailed analytics and errors from literally thousands of web

00:04 applications at once? Could you build such a system entirely in Python? The answer is yes.

00:10 And we'll hear from Ron Cohen from Opbeat about how they do it for Django, Flask,

00:15 and even Node.js apps. This is episode 43, recorded January 13th, 2016.

00:21 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the

00:50 ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter,

00:54 where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm,

00:59 and follow the show on Twitter via @talkpython. This episode is brought to you by Hired and SnapCI.

01:06 Thank them for supporting the show on Twitter via @Hired_HQ and @SnapCI.

01:12 Hey, everyone. Do you remember that t-shirt Kickstarter I did last summer to create a cool

01:18 Talk Python to Me podcast t-shirt? It was super successful reaching its funding goal within

01:22 just two hours. Well, the shirt is back. I've worked with our friends at pythongear.com to make

01:28 the shirt available on demand for just $25. Visit talkpython.fm/shirt and get yourself one

01:36 while they're hot. The proceeds support the show and the shirt helps spread the word about the podcast.

01:42 Now let's get right to the conversation with Ron Cohen, the CTO and co-founder at Opbeat.

01:47 Ron, welcome to the show.

01:49 Thank you. Thank you so much, Michael.

01:51 You're welcome. I'm a big fan of Opbeat, and I know you guys do a ton of stuff with Python.

01:55 So it's going to be a really interesting conversation. And before we get into all that,

01:59 though, as usual, how did you get started in programming in Python?

02:03 I got started programming initially because one of my friends borrowed some books at the library.

02:09 This is many, many years ago. And it was about the programming language,

02:14 BASIC, as I'm sure a lot of people got started with BASIC. I immediately found it very intriguing to be

02:20 able to tell a computer what to do and sort of interact with it. And my friend quickly lost interest,

02:25 but I sort of kept going. And my dad was actually also programming and working as a programmer.

02:31 So a lot of people think I got into programming because of him, but it turns out it actually was

02:38 in spite of him also programming. And Python, the way I got started with Python was another buddy of mine

02:46 who really thought I should take a look at this Django thing, this thing called Django.

02:51 Because at the time I was working as a consultant doing web applications,

02:56 and I had done web applications in Rails and PHP and all sorts of different languages and frameworks.

03:04 But he really insisted that I try out Django. And then I had to learn Python to actually

03:08 work with Django. And I've pretty much stuck with it ever since.

03:12 Oh, that's excellent. When was that? What time? What year?

03:15 Oh, good question. I think it was like three or four years ago.

03:19 Yeah, Django is excellent. Python is just such a fun programming language to work with.

03:25 It's such a fun ecosystem. It's hard to not love it.

03:27 Yeah, it really is. And so the thing I like about it is that it's so explicit.

03:33 There's no surprises.

03:33 Yeah, there's very few gotchas. You often hear about JavaScript gotchas and all the things you've got to be careful of. You don't typically hear about the Python gotchas.

03:41 I love that.

03:42 Yeah, JavaScript is interesting.

03:45 Yes, it is. I'll ask you more about that later, actually.

03:48 You like Python so much, you started a company called Opbeat, right?

03:51 Yeah, yeah, exactly. So I helped start Opbeat with a friend.

03:54 In the beginning, it was just me and him. He's called Rasmus.

03:57 And we had been working as consultants building web applications for a while together.

04:03 And every time we sort of built something, we found that we were also the people that ended up running it.

04:09 So maintaining it and making sure that it actually worked.

04:12 What we found there is that all the tools were sort of targeted at people who were technical.

04:18 So ops people, basically.

04:20 Developers just really think in a different way than ops people do.

04:24 And they also need to know about different data.

04:29 So we thought we'd build a product that helps developers operate the applications they build, something targeted at developers directly.

04:38 And that's how we sort of got started with Opby.

04:41 At the time, I was really into Python.

04:42 So, and I mean, I still am, obviously.

04:45 So we started building in Django and Python.

04:49 And today, it's pretty much all Python.

04:51 That's awesome.

04:52 So in sort of the whole back end of what you guys have going there, not just what plugs into the app, it's all Python, huh?

04:59 Mostly Python, let's say.

05:00 Yes.

05:01 We've now started working on support for Node.

05:06 So Node.js, because that's also a big opportunity for us.

05:11 That's going to be really interesting to see how that goes.

05:13 Yeah.

05:14 I just saw today that you guys have a beta program that people can go sign up for.

05:18 So that's cool.

05:19 So let's talk about monitoring apps in general.

05:22 You said there's sort of the infrastructure side of folks, and they have one set of things they want to know about.

05:28 We've got the developers.

05:30 And then also we've got this, you know, this growing sort of DevOps continuous delivery set of people that maybe have kind of a mix of those things.

05:39 But what do people want to know about in their apps?

05:41 Like what kinds of things can be tracked with services like Opbeat, things like that?

05:46 How does that all work?

05:46 Right.

05:48 So you're absolutely right.

05:49 There are different sort of disciplines within this DevOps ideology or terminology or whatever you want to call it.

05:57 So if we start with monitoring, that's really sort of focused on delivering actionable metrics for developers.

06:09 And for us, that means that all the data that we give you, it has to sort of relate to your application.

06:14 So it has to be something that you can use to improve the performance of your application.

06:18 That means we don't deal so much with the machines that your application runs on; it's more about the code that you've written.

06:27 And that's something we hear is really attractive to developers because that's sort of also their angle of attack, if you will.

06:36 Basically, it all depends on the code and deals with the code.

06:40 And things like response times for an application, that's, of course, really interesting.

06:46 And then also sort of delivering it to developers in a way that they are used to thinking about stuff.

06:54 So some tools deal with something called Apdex, which isn't really a metric that people are very used to working with.

07:06 It's probably something that ops people are more used to, or even just business people who need to deliver some kind of SLA.

07:12 Developers are more familiar with averages and percentiles.

07:17 So that's, for example, a choice we've picked instead of Apdex.

07:22 Then there's error logging.

07:23 So there's a lot of ways that you can be logging errors.

07:28 The most basic thing is write them to a file.

07:31 A lot of people do that, and then they ship those lines in the file off to some service.

07:36 And then they search through the lines of the files to see what kind of errors they had.

07:42 And that's also, I feel like, something that caters to ops people more than developers.

07:47 Developers are used to dealing with code.

07:50 So when an error happens, they would really like to see a proper stack trace.

07:55 They'd really like to get an email or get notified on a mobile phone that something is now broken.

08:01 They don't want to just tail some log and sit there and watch it go by.

08:04 They want detailed stuff that they can go back and actually track down the bug with.

08:08 Not just knowing there's a problem, but all the details, right?

08:11 Which maybe don't necessarily fit on a line in a log file.

08:14 Exactly, exactly.

08:16 And when you have the details sort of in a structured way, there's a lot of stuff you can do to make it more useful, more actionable.

08:23 Yeah.

08:24 Yeah.

08:25 I guess you could say, like, what are the most common exception types?

08:28 How many exceptions do we have per hour?

08:30 Right?

08:31 Yeah.

08:31 Whose check-in has caused the most exceptions?

08:34 Yeah.

08:35 And we actually do that.

08:36 Or we do something called automatic assignments.

08:39 So, based on who checked in the code, we'll automatically assign that error to the person who checked in the code.

08:44 That's awesome.

08:46 Thanks.

08:47 That's something you can only do if you have the data in a structured way.

08:50 Yeah.

08:51 Just one example.

08:53 Yeah, yeah.

08:54 I love the blame feature of source control.

08:58 You can say, all right, this looks ridiculous.

08:59 Who wrote this?

09:00 And that's kind of the error equivalent of blame, right?

09:03 Yeah.

09:04 And it's, you know, it's not a good way to think about it, blame.

09:08 But it's also not very effective for you to sit and work on something that someone else on your team knows much more about.

09:17 Yeah, well, there's the negative way of looking at, like, blame, right?

09:21 Whose fault is this?

09:22 But then there's also the sort of the positive perspective of, like, whoever wrote that code and probably just checked it in, they're more likely to be able to quickly fix it, quickly go, oh, geez, yes, I understand.

09:34 Let me do this, da-da-da-da-da, right?

09:36 If you just give it to somebody out of the blue, go, here's a random problem with some app.

09:39 You probably didn't write it, but fix it, right?

09:41 That's much harder to get it fixed quickly.

09:43 Yeah, absolutely.

09:44 Absolutely.

09:44 It's also about accountability in some sense.

09:47 We can talk about it a little bit later.

09:48 But in my experience, good developers, they like to be held accountable.

09:53 So they like to know when they actually made a mistake.

09:55 And this is sort of a way to complete the circle, if you will, after you write something.

10:02 You will also be assigned to the errors that your code caused.

10:07 Yeah, I think, especially when you're learning the program, when you're a fairly junior developer,

10:14 the whole error handling, dealing with malformed data, unexpected things, and so on, that's much harder to get your head around.

10:23 It's much easier, I think, to write the code so it's supposed to take this and it's supposed to do these things and it's done, right?

10:29 Just like the happy path, if you will.

10:32 But knowing to be aware of all these other errors, right?

10:36 That's something that takes more experience, I think.

10:38 And so if you can help catch those sooner and maybe learn those lessons sooner, that's also good.

10:43 Yeah, absolutely.

10:44 Nice.

10:45 So you talked a little bit about performance and you talked about errors.

10:49 You guys also talk about, like, deployment and workflow.

10:52 What's the story of that?

10:52 Right.

10:53 Again, coming back to the DevOps paradigm, if you will, what we found is that developers now are more and more empowered to deploy their own code whenever they feel like it's finished.

11:06 And whenever the CI tests pass, it's usually the case nowadays that developers actually have the power to deploy their code.

11:15 Now, that's really cool.

11:17 But it also means that it can get pretty difficult to figure out what was actually deployed at what time.

11:21 Because you probably used to have an ops department that would make a little note in a changelog somewhere to sort of keep track of what was deployed.

11:31 But developers don't really do that.

11:33 So we help them do that by what we call release tracking, which is basically a list of releases.

11:40 And each item in the list will contain the commits that went into a specific release.

11:45 That makes it really easy for you to go back and see exactly what you deployed at what time.

11:49 Yeah.

11:50 So maybe you can link those back to a series of GitHub issues that have been closed or something like that, right?

11:55 Yeah, exactly.

11:57 Or errors that started happening after a specific release, etc.
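
In practice, release tracking like this is usually driven by a small HTTP call made from the deploy script. The sketch below uses the requests library; the endpoint path, field names, and auth header are recollections of Opbeat's documented release API and should be treated as assumptions to verify against the current docs rather than a guaranteed interface:

```python
import subprocess
import requests

# Placeholders: these come from your Opbeat app settings.
ORG_ID, APP_ID, SECRET_TOKEN = "your-org-id", "your-app-id", "your-secret-token"

# Record the commit that is about to go live so errors and metrics can be tied to this release.
rev = subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()

resp = requests.post(
    # Endpoint shape is an assumption based on Opbeat's release API documentation.
    "https://intake.opbeat.com/api/v1/organizations/{0}/apps/{1}/releases/".format(ORG_ID, APP_ID),
    headers={"Authorization": "Bearer {0}".format(SECRET_TOKEN)},
    data={"rev": rev, "branch": "master", "status": "completed"},
)
resp.raise_for_status()
```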

12:01 Yeah, it's another sort of case where the tools and the sort of workflows that you used to have don't really fit anymore.

12:11 And that's why we did this release tracking.

12:14 Yeah, that's really cool.

12:15 I mean, it definitely ties together with continuous integration and continuous delivery and services that companies like SnapCI and those guys build, right?

12:25 To sort of do the checking before it goes out.

12:28 But you guys are kind of on the other end, right?

12:30 Once it hits production, if something happens, you can sort of say, after this release, these errors started happening.

12:37 Is that right?

12:38 Yeah, yeah, exactly.

12:39 And CI is obviously still a really important part of the modern workflow.

12:45 And yeah, it's definitely not a replacement.

12:49 Yeah, absolutely.

12:50 But, you know, the thing is, there's the unit test you write and the scenarios you test for and look for.

12:57 And then there's the real world, right?

13:00 Yeah, absolutely.

13:01 You know, no matter how good your CI system is or your tests are, chances are, on any major application, something is going to happen that you just didn't account for.

13:13 Like, why are there browsers on my page that have no user agent?

13:17 I didn't plan for this, right?

13:18 You know, just weird stuff like that, right?

13:20 This episode is brought to you by Hired.

13:33 Hired is a two-sided, curated marketplace that connects the world's knowledge workers to the best opportunities.

13:38 Each offer you receive has salary and equity presented right up front, and you can view the offers to accept or reject them before you even talk to the company.

13:46 Typically, candidates receive five or more offers within the first week, and there are no obligations, ever.

13:51 Sounds awesome, doesn't it?

13:53 Well, did I mention the signing bonus?

13:55 Everyone who accepts a job from Hired gets a $1,000 signing bonus.

13:58 And as Talk Python listeners, it gets way sweeter.

14:01 Use the link hired.com/talkpythontome, and Hired will double the signing bonus to $2,000.

14:06 Opportunity's knocking.

14:09 Visit hired.com/talkpythontome and answer the call.

14:12 Yeah, absolutely.

14:19 And it turns out that users are really creative in what they will enter into a form, and you basically have no chance to guess what all the different scenarios are going to be.

14:29 Yeah, yeah, absolutely.

14:31 So right now you guys support Django, and that's where you started.

14:34 And recently you added Flask support, and you also are about to add node support, or you're beta testing it.

14:41 What about other apps?

14:42 Like a lot of the apps that I work on, the web apps, are pyramid apps.

14:47 Is there a way to add tracking to apps that are not one of those three?

14:52 Yes, there is, actually.

14:53 And we've not been very good at documenting this, but the Opbeat module has a very simple API, and it comes down to calling begin transaction whenever you sort of start a new request or a background job starts.

15:07 And then you call end transaction whenever you've sent back the response or your background task has finished.

15:15 And the Opbeat module will automatically pick up all the information it needs in between those.

15:20 That's performance metrics, performance monitoring.

15:23 And error logging, usually there's a way to look into the framework's sort of unhandled exception signal or something like that.

15:31 So it should definitely be doable.

15:34 We just haven't really had the time to look at it yet.

15:37 Okay, that's interesting.

15:39 So if I was able to sort of trigger that, like do a begin transaction and an end transaction, all the calls I'm making, say, to SQLAlchemy or out to other web services, those would get tracked?

15:50 Actually, what you need to do is just call begin transaction whenever your request starts.

15:56 And we already instrument most of the modules you use, I hope.

16:01 So that data should automatically show up, actually.

16:03 What you need, yeah, the only thing you need to do is call begin transaction when the request starts and then end transaction when the request ends, the web request, let's say.
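
As a rough sketch of what that looks like for a framework without an official integration (Pyramid, Bottle, or anything else that speaks WSGI), here is a generic middleware built around the Opbeat client's begin/end transaction calls. The constructor keywords and the begin_transaction, end_transaction, and capture_exception signatures are based on the opbeat module's documented API at the time; double-check them against the docs before relying on this:

```python
from opbeat import Client

# Placeholders: taken from your Opbeat app settings.
client = Client(
    organization_id="your-org-id",
    app_id="your-app-id",
    secret_token="your-secret-token",
)


class OpbeatTransactionMiddleware(object):
    """Wrap any WSGI app in an Opbeat transaction (illustrative sketch)."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        client.begin_transaction("web")  # start timing when the request comes in
        status_holder = {}

        def traced_start_response(status, headers, exc_info=None):
            # Remember the response code, e.g. "200 OK" -> 200.
            status_holder["code"] = int(status.split(" ", 1)[0])
            return start_response(status, headers, exc_info)

        try:
            return self.app(environ, traced_start_response)
        except Exception:
            client.capture_exception()  # error logging alongside the metrics
            raise
        finally:
            # Name the transaction and report how it ended.
            name = "{0} {1}".format(environ.get("REQUEST_METHOD"), environ.get("PATH_INFO"))
            client.end_transaction(name, status_holder.get("code", 500))
```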

16:11 Oh, sweet.

16:12 Well, I may have to go play with this after.

16:14 Cool.

16:14 Let me know how it goes.

16:16 Yeah, very cool.

16:17 So what's the craziest sort of monitoring example you've seen?

16:22 Like there's got to be some company or some piece of software that's just done something way crazier than you've expected.

16:32 That's a really good question.

16:33 Well, I can tell you the first time your main database server just drops off the face of the net.

16:39 It's a very unpleasant experience.

16:43 But that was a bit of a rough night.

16:46 We had to fail over to the replica database and the site was down while we did it.

16:52 So we did it pretty quickly.

16:54 I would say it was like 15 minutes, but it was still not a very nice experience.

16:59 Yeah, I guess so.

17:00 Because, I mean, you guys are running real-time data collection from many potentially popular apps.

17:06 And so you're sort of, you guys must have a lot of load, a lot of requests, huh?

17:11 Yeah, yeah.

17:12 We have quite a lot of load, quite a lot of requests.

17:15 If we're down, then people will get a little notification in their log whenever they try to send something to us.

17:20 And that's really not cool, right?

17:23 You want your monitoring service to be up all the time.

17:26 Otherwise, it's sort of useless.

17:28 So we spent a lot of time trying to make sure that we can't go down.

17:33 And so what we recently did was we changed it so that we should still be able to receive data, even if the main database server is down.

17:41 So that data is just going to keep being accepted by us.

17:44 And then whenever the server is back up, it'll start processing the data.

17:48 What infrastructure are you guys running on?

17:51 Is it like Amazon Web Services or something else?

17:54 Yeah.

17:54 Yeah, so it's all Amazon Web Services.

17:56 We've set it up ourselves.

17:58 So it runs on EC2.

18:00 We don't use too many of the sort of Amazon services on top of it.

18:04 We use a bit of S3, but mostly EC2.

18:07 Yeah, EC2, S3.

18:09 Those are the main ones, right?

18:10 Yeah, exactly.

18:11 You talked about performance and collecting data, even if it can't necessarily be processed.

18:18 And one of the really nice ways to do that is to add some sort of queue, asynchronous queuing mechanism to the whole process, right?

18:26 Yeah, exactly.

18:27 And we do that a lot.

18:29 So whenever some data comes in, it immediately gets put into a queue.

18:32 We use RabbitMQ, which is a very, very powerful piece of software.

18:36 That gives us a lot of freedom in scaling out, handling failures, et cetera.

18:42 Right, because it's much easier to keep a queue alive than maybe a complex database where the schema could change and it can no longer insert into it or something like this, right?

18:51 Yeah, exactly.

18:54 Maybe talk me through what pieces are involved.

18:57 What does it look like from some web app external to you guys sending some piece of data until it actually gets totally stored in some database?

19:07 Right.

19:08 So we have a separate service called the intake, which is responsible for basically accepting data and putting it into a queue.

19:15 It's also what does authentication and authorization of the data.

19:22 So whenever data comes in, we need to make sure that it has the right tokens, et cetera.

19:26 We also rate limit you there.

19:29 So if you send us a lot of data, we will rate limit you right there.

19:34 We need to validate that the structure of the data is actually correct.

19:39 So we also do that.

19:41 And then we put it into a queue.

19:43 So that service is very sort of simple in the sense that it just needs to accept the data and put it into a queue.

19:50 And that makes it easier for us to scale up when we have a lot of data coming in.
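
The intake pattern Ron describes, validate quickly and hand the payload to RabbitMQ rather than processing it inline, might look something like this minimal sketch using the pika client. The queue name, payload shape, and the auth check are invented for illustration; this is not Opbeat's actual code:

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="intake", durable=True)  # survive broker restarts


def is_valid_token(token):
    # Stand-in for a real, heavily cached auth lookup.
    return bool(token)


def accept(payload, token):
    """Validate and enqueue an incoming metrics/error payload; no heavy processing here."""
    if not is_valid_token(token):
        raise PermissionError("bad token")
    if "events" not in payload:  # minimal structural validation
        raise ValueError("malformed payload")

    channel.basic_publish(
        exchange="",
        routing_key="intake",
        body=json.dumps(payload),
        properties=pika.BasicProperties(delivery_mode=2),  # persistent message
    )
```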

19:56 Yeah, but it's almost stateless, right?

19:58 Other than knowing the authorization part, it's like entirely stateless, right?

20:02 Yeah, exactly, exactly.

20:04 So a lot of the, you know, we can cache the authentication stuff really heavily.

20:09 And it's also very resilient in the face of failures because it's read-only from the database, like the authentication stuff.

20:17 It just needs to be able to put data into a queue.

20:21 Yeah, excellent.

20:22 So then there's something else that gets the data back out and really does the processing, right?

20:25 Right.

20:26 So we have a separate service that pulls the data out and then processes it.

20:29 And that's also very convenient for us because it means that we can scale that out very easily.

20:36 We have more freedom in sort of, let's say, we need to do some maintenance.

20:39 We can stop that for a short period of time and then keep going.

20:44 And things will just be in the queue for waiting for us to process them.
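
And a matching worker sketch, again illustrative only: it pulls messages off the same queue, processes them, and acknowledges only after the work succeeds, so stopping the worker for maintenance simply lets messages wait in the queue (pika 1.x call signatures assumed):

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="intake", durable=True)
channel.basic_qos(prefetch_count=16)  # don't grab more work than we can handle


def process(payload):
    # Hypothetical processing step: aggregate metrics, store errors, and so on.
    print("processed {0} events".format(len(payload.get("events", []))))


def handle(ch, method, properties, body):
    payload = json.loads(body)
    process(payload)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # only ack once safely processed


channel.basic_consume(queue="intake", on_message_callback=handle)
channel.start_consuming()  # stop this process during maintenance; messages simply wait
```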

20:47 Yeah, I think queues are somewhat underused.

20:50 They're so easy to use, and yet they provide so much architectural flexibility and response time flexibility and so on.

20:58 And even, as you say, from sort of a deployment and infrastructure management perspective, as long as you don't leave things offline until the queues can't take any more, then you're kind of golden, right?

21:07 Yeah, yeah.

21:08 We do have some requirements for processing time.

21:11 So we can't leave stuff around forever, but it does give us a lot of flexibility in switching out things while everything is running.

21:19 So that's great.

21:20 And I agree on the point that queues are undervalued and also probably not that well understood by the vast majority of developers.

21:28 There's also some, as you mentioned, architectural benefits.

21:31 It sort of forces you to decouple a lot of systems, which is always a good thing.

21:37 Yeah, you know, there's a lot of talk about microservices and building more smaller pieces of software and putting a queue in between those two pieces makes it real easy rather than having a monolithic thing that does all the intake, all the processing, all the reporting, and on and on and on, right?

21:52 Yeah, absolutely.

21:52 And there's also a pattern emerging where people use the queue as a sort of bus to talk between the services.

22:00 So it basically becomes a sort of communication medium instead of, for example, using HTTP, people will put a request in the queue and then expect a response on some other queue.

22:14 And that's also quite useful because it gives you some additional architectural advantages when it comes to timeouts and things like these.

22:25 Yeah, that's really cool.

22:26 All right, so I have a big question for you.

22:28 Hit me.

22:29 Python 2 or Python 3?

22:30 Python 3 for the sake of progress.

22:33 Oh, beautiful, beautiful.

22:35 I know a lot of people are on Python 2, but any chance we can get to kind of move forward, we should take that chance, right?

22:40 Yeah, I agree.

22:42 It's a bit of a cost that we have to pay now, sort of upfront, but it's the right thing to do, in my opinion.

22:49 Yeah, excellent.

22:50 I agree.

22:50 A lot of the systems that people are writing that are kind of in the realm of what you guys are doing, they're maybe choosing languages like Go and Rust.

23:01 Do you know sort of what the advantages of those are, like what the disadvantages are?

23:07 Have you guys considered those?

23:08 Not that I'm necessarily encouraging you to do so, but I know a lot of people are thinking about that.

23:14 Yeah.

23:14 So I've written some Go and a little bit of Rust, and I think they're really interesting.

23:20 I think what still is very clear to me is that Python helps me get things done very quickly and with very little code in a very robust way.

23:29 If we start with Go, for example, I think Go is mostly interesting in the way that concurrency works in Go.

23:36 I think the main reason why you would write something in Go instead of Python is the concurrency primitives that exist in Go.

23:43 So Python has a really sad concurrency story, in my opinion.

23:48 And usually, like if you do Django, it's not a big problem.

23:53 But as soon as you have to write a service that talks to the outside world and you want to talk to many different web services at the same time or something like that, then that becomes kind of difficult in Python.

24:03 And it basically comes down to the event loop, in my opinion.

24:06 Go is sort of built on top of an event loop that is seamless to you when you program.

24:13 Getting an event loop into Python usually involves some kind of monkey patching, for example, with gevent, or some really sort of strange, at least if you're coming from the Python world, some strange modifications that you must make to your application to get this kind of event loop concurrency.
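
For reference, the gevent monkey-patching Ron mentions looks roughly like the sketch below (the URLs are placeholders, and the requests package is assumed to be installed). It does give you event-loop concurrency with ordinary-looking code, but the patch_all() call is exactly the kind of global modification that can feel strange if you're coming from plain Python:

```python
from gevent import monkey
monkey.patch_all()  # must run before anything imports the standard socket module

import gevent
import requests  # now uses gevent's cooperative sockets under the hood

URLS = ["https://httpbin.org/delay/1"] * 5  # illustrative endpoints


def fetch(url):
    return requests.get(url).status_code


# The five requests run concurrently on one thread via the event loop.
jobs = [gevent.spawn(fetch, url) for url in URLS]
gevent.joinall(jobs)
print([job.value for job in jobs])
```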

24:31 And then there's Rust, which I think is also super interesting.

24:35 It has a much more interesting type system than Go.

24:38 But at the same time, Rust feels more like a replacement for C++.

24:42 So some of the things that you will typically use Python for, it also makes sense to use Go for.

24:50 But I would say Rust is in a different category.

24:52 And I feel like a lot of people are comparing.

24:54 So there was recently a lot of people talking about if they should use Go or Rust.

24:59 But in my opinion, they are applicable to different use cases.

25:02 So Rust is more low level.

25:04 You have to deal with memory management yourself.

25:07 And that's important.

25:08 But it's sort of a different kind of application that you write in Rust than in Go or Python.

25:14 I see it almost more as a replacement for things like, I would have done this in C.

25:18 So now I'll do it in one of these languages.

25:20 But that could just be my lack of experience with them, right?

25:24 No, I think you're right.

25:25 Especially when it comes to Rust.

25:26 I think Go is somewhere in between.

25:28 It's statically typed.

25:30 It feels a lot more high level, I would say, than C or Rust.

25:54 This episode is brought to you by SnapCI.

26:00 Simply do a git push and they auto-detect and run all the necessary tests through their multi-stage

26:05 pipelines. If something fails, you can even debug it directly in the browser. With a one-click

26:10 deployment that you can do from your desk or from 30,000 feet in the air, Snap offers flexibility

26:15 and ease of mind. Imagine all the time you'll save. Thanks SnapCI for sponsoring this episode

26:21 by trying them for free at snap.ci/talkpython.

26:25 So let's talk about shipping software a little bit more. Yes. You said you had some recommendations

26:41 for sort of how to make your team a high-performance shipping machine. What's the story there?

26:45 Right. My role here at Opbeat has transitioned from coding every day to more and more trying to get my

26:55 team to be efficient managing, if you will. And along that path, different things sort of became

27:01 clear. Yeah, there's some different things that you should be aware of, I think, when you are managing

27:08 a team of developers or even if you're just a single developer building applications, especially for

27:14 the web. That's sort of what we've been focusing on, building applications that live on the internet.

27:19 That's really interesting. You know, I think a lot of people who are working by themselves

27:25 don't necessarily adopt some of the sort of what you would think of as best practices and tooling that

27:30 maybe teams would automatically adopt. Things like continuous integration, things like, you know,

27:36 sometimes even source control. But, you know, things like application monitoring and so on.

27:42 So you think even if there's one person working on a project, maybe you should put this stuff in place?

27:47 Yeah, absolutely. Absolutely. Especially like things like CI. I think you should definitely

27:52 have CI even if you're just a one person team.

27:55 So if I have like my files on the hard drive and I just zip them up periodically and put a date on it,

28:00 that's probably not enough?

28:01 I've seen that by the way.

28:05 I've seen that before.

28:08 No, really, that's not. Okay. Well, let's talk about first source control.

28:12 Yeah. Oh, that's horrible.

28:14 It's been a few years, but still.

28:16 Right.

28:18 So maybe you could make it concrete. Like, what do you guys do to ship software like

28:22 at Opbeat to sort of push out new versions and so on?

28:26 Yeah. Good question.

28:26 So one of the things we really focus on is getting things shipped early in the sense that

28:32 whenever there's something that is an improvement to what we have today, we'll generally try to ship it.

28:38 And what typically happens when you are sitting and programming and working on some feature is that

28:45 you sort of got started on this feature and it's going well, but your feature relies on something else,

28:50 depends on some other code. And you sort of take a peek into that code, and it feels like

28:57 it has some of those bad code smells that we as developers are familiar with and are sort of trained to recognize.

29:04 So you consider whether you should just quickly refactor that other thing that your feature is going to depend on.

29:11 What usually ends up happening is that you end up actually spending time both working on your feature

29:17 and refactoring that other thing that you found.

29:20 And then that thing relies on some third thing and you think, oh, well, while I'm at it,

29:26 I might as well just refactor that thing too.

29:28 And then it ends up being a huge release when you finally get it shipped.

29:31 And that's sort of a big red flag for us.

29:34 Big releases are a problem.

29:36 They're a problem for multiple reasons.

29:38 First of all, they are much more cumbersome to review.

29:42 So everything we do gets peer reviewed.

29:45 And if you have a big release, that just takes much more time.

29:47 And it's much harder to get an overview over the impact of this particular release when it's huge.

29:53 And it also takes much more time before your stuff actually comes out to the users.

29:59 The sad thing about that is that we know as developers that, you know, you can think of a lot of scenarios on how you have some idea about how your users are going to use a specific feature.

30:10 And you obviously think that the feature is really valuable to them.

30:14 But it turns out that our assumptions are often wrong about this kind of stuff.

30:18 So getting new features into the hands of users is really important because then you will learn how they are actually going to use it.

30:25 It's really hard to predict what people will find valuable, what they won't.

30:29 Yeah, exactly.

30:30 Like you said, how they're going to use it.

30:32 Speed is an advantage in the software business, right?

30:36 Yeah.

30:36 Small releases actually give you a lot of speed, in my opinion.

30:41 There's also the things like when something breaks.

30:44 If you have just released a huge change, it's really difficult to figure out what part of that change actually made things break.

30:53 But if you're releasing small releases all the time, it's much, much easier to go back and see exactly what caused a specific problem.

31:00 So we're huge proponents of small incremental releases at Opbeat.

31:04 And it's something to reiterate on, sort of talk about often.

31:08 The other side of that story is if everything you are releasing is a small little feature or a small piece and something goes wrong, it's pretty painless to just say, whoops, we're just going to roll it back to the way it was before.

31:21 Yeah, yeah, exactly.

31:22 But if you've had to do some massive database migration to like roll out a huge new thing and then you're kind of stuck, right?

31:29 Not only is it speed going forward, it also enables you to go, oh, no, roll it back, roll it back.

31:34 Hopefully nobody saw that.

31:35 Yeah.

31:37 Another thing that is really useful when you're building applications is to make sure you try to break down silos.

31:45 For us, that means that getting something shipped is a sort of cross-discipline process.

31:52 We have a product designer that works with a visual designer that works with a developer and finally with marketing.

32:00 And that means everybody is sort of aligned on shipping this feature.

32:06 Of course, there's a lot of talk about DevOps, where one of the points is that your operations people should be aligned or should have the same goals as developers.

32:17 And that makes for a much better process and a much nicer product in the end.

32:23 That's really interesting.

32:24 That's really interesting.

32:24 So, you know, I used to, I feel like teams used to be structured horizontally, like here's the data guys.

32:30 Here's the middle tier service guys.

32:33 Here's the front end people, right?

32:35 Yeah.

32:36 That doesn't seem like a great workflow.

32:38 So you're saying a vertical slice through that is a much better way to group people and features and work, right?

32:45 Yeah, exactly.

32:45 And even including the marketing people and designers and product designers in that slice, so that you have a wide range of capabilities on a specific team, all aligned together to ship a feature.

33:02 My point was that, you know, people talk about DevOps, but really we should be talking about collaboration between all the sort of different roles or disciplines that are involved in shipping a feature.

33:14 So DevOps, the collaboration between ops and developers is just the very start.

33:20 You should have marketing and product design, et cetera, in there as well.

33:25 Yeah.

33:25 So you're proposing like a DevOps mark prod team.

33:29 So it's going to be a new buzzword.

33:32 But I think you're right.

33:34 That makes a lot of sense.

33:35 And it's really easy as software people to forget once you write the software, unless you have a very established business,

33:43 or you're writing internal software.

33:45 That's only part of it, right?

33:46 You've got to have a marketing effort and a product development.

33:49 Like you said, it's a whole team effort and software is only part of it.

33:52 Yeah, exactly.

33:54 Another thing that's important for us when we ship code is that the person who wrote the code is also the person who actually presses the button to get that code into production.

34:03 That's something we've insisted on from the very beginning, because that means that if something breaks, that developer who wrote the code will be around to fix it.

34:09 So he didn't go home.

34:10 And the developer who actually presses the button knows exactly what code is going to go out.

34:16 And it's basically about accountability.

34:18 So you want to make sure that the developers you're working with, and you yourself, understand that if you've built something, if you've written some code, you're the person who ships it.

34:28 And then if it breaks, you're the person who fixes it.

34:30 That's something we've really sort of been adamant about from the very beginning.

34:35 And it seems to work very well.

34:36 Yeah, I think that's great advice.

34:37 You know, there are always people on teams who embrace things like continuous integration more, and there are people who embrace it, let's say, less.

34:46 But making everyone be actively part of the shipping puts the accountability on them, which means maybe they'll rely on the process a little more.

34:56 Maybe they will run those tests before they send it out, right?

34:59 Something like this.

35:00 Exactly, exactly.

35:01 Let me ask you a few more questions.

35:03 We'll kind of get into the end of the show here.

35:05 What's your favorite editor?

35:08 If you're going to write some Python code, what do you open up?

35:10 These days, it's actually mostly PyCharm.

35:14 I found it to be really useful, and it has a lot of interesting features.

35:19 It helps me find the stuff that I need very easily.

35:22 It's a bit heavy, but I still feel like the JetBrains people have spent a lot of time making it fast.

35:29 So I'm pretty happy with PyCharm.

35:32 Yeah, that's cool.

35:33 Like I've said a bunch of times on the show, that's the one I use as well.

35:36 And like you said, it's a bit heavy.

35:38 But if you're willing to wait five seconds, the whole next few hours are a lot nicer.

35:42 So that's worth five seconds in my opinion.

35:46 That's awesome.

35:46 I agree.

35:47 And there's thousands and thousands of packages out there that you can use in Python.

35:52 What are some of the ones that maybe not everyone knows about that you're like, oh, this thing is awesome.

35:56 You should know about X.

35:58 Oh, good question.

35:59 Of course, I'm going to have to say the Opbeat module.

36:02 But apart from that, some of the modules we've used that are really useful are, for example, something called docopt, which is a way to...

36:13 It basically helps you write command line applications.

36:16 And it helps you parse command line arguments.

36:18 And the way it works is opposite of how all the other ones work.

36:21 So in here, you actually write your usage document.

36:26 So the stuff that comes out when you do -h.

36:29 And then it'll parse that.

36:30 And from that, it'll know how to parse the arguments to the application.

36:34 So that's really useful, in my opinion.

36:37 The modules that come with Python, optparse and argparse, are really not very good, in my opinion.
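
A minimal docopt example for reference; the program and options here are invented for illustration, but the pattern is exactly as described: the usage string is the specification, and docopt builds the parser from it.

```python
"""Ship logs to a collector.

Usage:
  shiplog.py <path> [--host=<host>] [--dry-run]
  shiplog.py (-h | --help)

Options:
  -h --help        Show this screen.
  --host=<host>    Collector host [default: localhost].
  --dry-run        Parse the file but send nothing.
"""
from docopt import docopt

if __name__ == "__main__":
    args = docopt(__doc__)
    # docopt returns a plain dict, e.g. {'<path>': 'app.log', '--host': 'localhost', '--dry-run': False}
    print(args)
```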

36:44 Yeah, that's a really cool package.

36:46 I've heard of that one before.

36:48 And what surprised me was that there's actually a specification for the way that you write that help documentation that's, like, well-structured.

36:56 And so this thing just looks at that and builds the actual command line for you, right?

37:00 Oh, yeah.

37:00 Exactly.

37:01 Man, that's really awesome.

37:02 Yeah.

37:03 I think another one of the things we're using is requests, which is always a good module.

37:09 Yeah.

37:10 I'm sure most people know that.

37:12 Requests is amazing.

37:13 You can't get away without requests, right?

37:15 Right.

37:15 Exactly.

37:16 All right, Ron.

37:18 So how about a final call to action?

37:20 How do people get started with Opbeat?

37:22 What should they do?

37:23 It's really easy to get started with Opbeat.

37:25 You go to opbeat.com and you sign up, create your first Django application or Flask application, and the instructions are right there.

37:33 It takes, like, five minutes to set up, probably.

37:35 And it's free to get started.

37:36 Everyone can go and check it out.

37:38 Please let us know if you run into any issues or get any feedback.

37:41 We're always trying to improve it.

37:43 So, yeah, looking forward to hearing the feedback.

37:46 All right, very cool.

37:48 Thanks for the look inside of what you guys are doing there.

37:50 There's a lot of cool stuff happening at Opbeat.

37:52 Thanks for having me, Michael.

37:53 Yeah, you bet.

37:54 Thanks for the advice on shipping software.

37:55 That's great.

37:56 Talk to you later.

37:57 Bye.

37:57 Bye.

37:58 This has been another episode of Talk Python to Me.

38:01 Today's guest was Ron Cohen, and this episode has been sponsored by Hired and SnapCI.

38:05 Thank you guys for supporting the show.

38:07 Hired wants to help you find your next big thing.

38:10 Visit hired.com/talkpythontome to get five or more offers with salary and equity

38:15 presented right up front and a special listener signing bonus of $2,000.

38:19 SnapCI is modern, continuous integration and delivery.

38:24 Build, test, and deploy your code directly from GitHub, all in your browser with debugging,

38:29 Docker, and parallelism included.

38:31 Try them for free at snap.ci/talkpython.

38:34 You can find the links from today's show at talkpython.fm/episodes/show/43.

38:41 Be sure to subscribe to the show.

38:44 Open your favorite podcatcher and search for Python.

38:46 We should be right at the top.

38:47 You can also find the iTunes and direct RSS feeds in the footer of the website.

38:52 And don't forget to check out the podcast t-shirt at talkpython.fm/shirt.

38:57 Our theme music is Developers, Developers, Developers by Corey Smith, who goes by Smix.

39:02 You can hear the entire song on our website.

39:05 This is your host, Michael Kennedy.

39:06 As always, thank you so much for listening.

39:09 Smix, take us out of here.

39:11 Stay tuned.
