
#143: Tuning Python Web App Performance Transcript

Recorded on Monday, Dec 11, 2017.

00:00 Do you run a web application or web service? You probably do a couple things to optimize the

00:04 performance of your site. You make sure that the database responds quickly and more. But did you

00:09 know a wealth of performance improvements actually lives inside your web servers themselves?

00:13 Join Ben Kane and me to discuss how to optimize your Python web application as well as

00:19 uWSGI and Nginx. This is Talk Python to Me, episode 143, recorded December 11, 2017.

00:26 Welcome to Talk Python to Me, a weekly podcast on Python, the language, the libraries, the

00:45 ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where

00:50 I'm at mkennedy. Keep up with the show and listen to past episodes at talkpython.fm and follow the

00:56 show on Twitter via at talkpython. This episode has been sponsored by Rollbar and GoCD. Thank them

01:03 both for supporting the podcast by checking out what they're offering during their segments.

01:08 Hey everyone, before we get to the interview, I want to share a quick update about our Python

01:12 courses with you. Do you work on a software team that needs training and could really use

01:16 a chance to level up their Python? Maybe your entire company is looking to become more proficient.

01:21 We have special offers that make our courses here at Talk Python the best option for everyone

01:26 you work with. Our courses don't require an ongoing subscription like so many corporate

01:30 training options do. And they're roughly priced about the same as a book. We're here to help

01:36 you succeed. Send us a note at sales at talkpython.fm to start a conversation. Now let's get to the

01:42 interview. Ben, welcome to Talk Python.

01:44 Hey, thanks, Michael. Thanks for inviting me.

01:46 It's great to have you here. I love it; I'm just a sucker for a good performance talk. So I'm really

01:51 excited to talk about all the different layers of Python web apps and performance tuning, mostly

01:57 outside of the Python code itself, right? Yeah, well, I would say one of my keys is you have to tune

02:04 pretty much everything, the entire stack. So Python code, absolutely. But it's not just that. And

02:10 that's a really important thing to remember: it's everything, because everything matters.

02:15 Yeah, absolutely. So it's gonna be great to dig into that. But before we do, let's get to your story.

02:20 How'd you get into programming in Python?

02:21 In the 90s, when I was a teenager, I was making anime fan sites, just for fun, you know,

02:28 using HTML, CSS, and I wanted to really kick them up a notch. So I learned PHP. And that was,

02:35 you know, kind of the web framework at the time. And I did a little bit of that, mostly for fun, some for

02:43 money. But it's funny, in my adult life, I didn't actually make a career out of it, I went into like

02:50 retail. And then it wasn't until a friend said, dude, what are you doing? You can make these awesome

02:56 sites. Why don't you make a career out of this? And I kind of thought about it. I was like, Oh, he's

03:00 right. I should make some money out of this.

03:03 This is way more fun than what I'm doing.

03:04 Exactly, exactly. So fast forward, and here I am.

03:08 Yeah, that's awesome. Cool. And what are you doing day to day these days?

03:12 I'm a staff engineer at American Express, working on what's essentially the payment network

03:17 for American Express. So the way I like to summarize it is it's really the yes/no

03:23 machines. So when you swipe your card, that transaction has to get somewhere, and that

03:29 somewhere has to say yes or no. All those systems that it goes through in order to get to that yes or no.

03:34 That's essentially what I work on every day.

03:36 Wow. So a lot of services, a lot of low latency demands, right? The person is standing there with

03:41 their card, like, yeah, you know, oddly, uncomfortably looking at the cashier who's also

03:47 uncomfortably trying not to look at them, just waiting for your systems, right?

03:51 Yeah, exactly. And a whole bunch of them too, right? Because it's not just that one person.

03:55 It's millions of people around the world.

03:58 Yeah. When you think of the number of people who are shopping and swiping cards and just using

04:02 payment at any given moment, it's huge, right?

04:04 Yeah. And then when you add things like Black Friday, Cyber Monday, all those big shopping

04:09 holidays, it's huge, right? You have to...

04:12 Does that make a big dent for you guys?

04:13 Oh, yeah. The more people are shopping, the more they're swiping the card,

04:16 the more our systems have to be there to say the yes and the no.

04:19 A lot more yes/no questions, huh?

04:23 Exactly.

04:23 Yeah, nice. So when you talked about the PHP stuff, I kind of distracted you from the Python

04:31 part of it.

04:31 With Python, I was actually working at a web hosting company as a sysadmin in about 2005.

04:38 Back then, Perl was really the language of choice for sysadmins. And my buddy, Marcel,

04:44 introduced me to this new language called Python. And from then, it was like love at first sight.

04:49 Like I just, I loved it. I love the syntax of it. I loved how easy it was to work with.

04:55 But then, of course, I moved to a Perl only sysadmin shop. So I was the only one who knew

05:02 Python. Everyone else knew Perl. So I was kind of like, all right, well, I guess I'll just write

05:06 stuff in Perl if I have to. I was still a big fan of Python at the time, but I really didn't do much

05:12 with it then. And then about 2013, 2014, I realized, I was at Amex at that time, I realized that the

05:21 traditional sysadmin role was kind of dying off. And I would say at this point, it's pretty much dead.

05:27 It's just that not everyone knows it yet. Yeah, I think there's been that much change lately.

05:32 Yeah. So like DevOps and stuff like that, it's kind of replacing it, with Ansible and SaltStack.

05:37 Oh, yeah, exactly. And if you want to stay relevant, you know, I figured you got to learn

05:42 how to program. And that's when I really buckled down and learned how to program again, beyond just

05:49 scripts, right? I've been writing scripts forever, but this was more than just a script. I mean, there's a big

05:54 difference between the two. And of course, Python was my language of choice to do it.

05:58 That's really cool. So how did you get interested in performance tuning and that kind of stuff?

06:04 It's really part of my job. With things like authorizations and that yes or no, like you said,

06:10 people standing at a terminal waiting for an answer. You know, we get a lot of requests,

06:15 and these requests have to be very, very fast. So one of the things that we're very acutely aware of

06:22 is performance. And we actually hold ourselves to certain performance benchmarks. And we constantly

06:30 test, do performance testing during the development cycle, just to make sure that we're meeting those

06:35 benchmarks.

06:35 Do you have like performance requirements or measures in say an automated build or anything like that?

06:42 Yeah, but that's more for developer satisfaction, right? But for us, it's more about how fast those

06:49 transactions get processed. When I think performance tuning, that's my immediate thing.

06:54 Of course, everything else that goes along with it is important. But to me, like that's the thing that

07:00 shines so bright in that. The cool thing is you can take what you learn from that and apply it to all

07:07 sorts of different tools.

07:08 Yeah, absolutely. And as a sysadmin DevOps person, you see these as the whole system,

07:14 right? Like Linux plus the servers and services plus the app. How does that thing work?

07:21 Right. Or maybe even across machines, right? Like, how do the front-end web servers plus the load balancer plus

07:27 the back-end services perform together? That's really more what you might want

07:31 the final measure to be, right?

07:33 It all matters, right? So you have to really look at the whole picture in order to see how things

07:39 interact or how things change the performance of a system.

07:43 Absolutely. So one thing I kind of wanted to touch on just a little bit at the beginning here is there's

07:48 a little bit of a difference between scalability and straight performance, right?

07:53 If you kind of think about scalability and performance, a lot of times they're really two

08:00 separate problems, but I do think they're very closely related. With scalability,

08:06 you might say like one request versus a million requests. And with performance, it's not just

08:13 necessarily one request, right? You might also have a whole bunch of requests. Maybe it's not a million,

08:18 maybe it is in one instance of an application. And you need to be able to see how many concurrent

08:25 requests can I handle? Because that impacts performance overall.

08:28 Yeah. One of the things I think is kind of funny to think about is you could have an app that responds

08:33 like it takes 10 seconds to process a request, which sounds like performance sucks. But maybe if you throw

08:39 a million requests at it and it only goes up to 11 seconds per request, that's a pretty scalable app.

08:44 It just doesn't perform very well, right?

08:46 Yeah, exactly. You got to make that thing faster, right? But scalability is also,

08:51 you can talk about scale up and scale out too, right? You can scale out the number of instances. So,

08:57 you know, if let's say your application can only handle a couple of thousand requests at a time,

09:04 you know, by adding additional instances, you can add that many number of requests at a time. But then

09:10 there's other trade-offs with that too. You add complexity to the application, which that's a

09:14 trade-off not only in performance and scalability, but also with availability. And that gets a little

09:21 tricky as well.

09:21 Yeah. And some people can get away with small amounts of downtime and reduced availability.

09:28 I suspect you guys not so much.

09:30 Yeah, not at all.

09:31 It's frowned upon to have the yes/no machine down.

09:33 I don't know why. I mean, yeah, no, it's highly frowned upon to have that down. We go through

09:39 a lot of effort just to make sure that it's up all the time.

09:42 Yeah, I'm sure. But it totally adds more complexity, which we'll talk about. So,

09:47 I guess one of the things to think about is the difference between performance testing and

09:53 performance tuning. Can you maybe compare those for us?

09:56 Performance testing, at least in my mind, although I think many would agree with me,

09:59 performance testing is really just the execution of tests that measure performance.

10:05 Performance tuning is more of a concerted effort to improve performance. So, you know,

10:12 to give an example, in our development process, we have automated performance tests that run

10:17 all the time. We know whether that benchmark is being met or not. But really, performance tuning

10:24 is adjusting that benchmark. Is that benchmark now a higher benchmark? Does it need to necessarily go

10:32 lower? Although generally, we never go backwards, we always go forwards. You only go backwards if you

10:37 really have to.

10:37 Right. Maybe you add some major feature and it's worth it, but it doesn't actually,

10:42 it does make it go slower because it's doing more or something.

10:44 Yeah. All of these things are about trade-offs. You gain in one area, but you trade in another

10:48 area. So, yeah, it's very important to kind of know the difference. Really, the concerted effort

10:55 with performance tuning, that's important. If your goal is to really squeeze every microsecond out of

11:01 an application, it's important to have that concerted effort. And it's not just about what's the measurement

11:07 tool telling us.

11:08 And if you have graphs or some other kind of reporting, it's really nice to actually go,

11:12 wait a minute, when we deployed that new version yesterday, what's the response time now versus what

11:17 it was before, or memory usage now, or whatever?

11:19 Yeah. Measurement from production is huge, right? Because you can run all the tests you want in

11:26 kind of pre-production environments, but production is where things get crazy. And if you start seeing

11:32 differences in performance there, it really gives you an indication of where you need to start looking

11:38 as well. And it's a really good idea to just measure both. Measure your pre-production

11:44 environments, compare that with your production environments, and see where the differences lie

11:48 and why they are different.

11:50 Yeah, absolutely. Production, that's where reality lives, right?

11:53 Exactly. That's what really matters at the end of the day.

11:56 Exactly. You're building this stuff to run it for millions of people in production. It doesn't matter

12:01 what your tests say. Like that's the final arbiter of how it is.

12:04 Yeah.

12:04 Yeah. So before we get farther into it, let's maybe take a moment and just talk about what a typical

12:10 web stack in Python looks like.

12:13 Really, if you kind of start at the top, you have something like Nginx, which is a web server,

12:19 and I'm actually skipping a whole bunch of layers, but we'll kind of talk from the web server down.

12:25 And this is in one web server, maybe? We're not talking about the scaled-out architecture with all the machines

12:29 working together, right?

12:30 Exactly. Exactly. So in your typical one stack kind of approach, you have your web server,

12:37 you have things like Nginx, Apache, there are several others. Nginx being known for being very

12:44 performant out of the box. And really what those do is they serve those HTTPS requests,

12:49 they serve kind of the static content. And you can even do some like caching with them as well,

12:55 which is interesting. And then you go to your application server, you have uWSGI,

13:01 Gunicorn, things like that. And those are really there to be the worker processes for your

13:08 running application. So they'll start the application, they'll manage the application,

13:12 make sure it's running, and really make sure there's enough workers of that application to handle

13:19 those requests. And then, of course, you have your app framework as well, your web app framework. So

13:24 Flask, Web2Py, and then Pyramid, Django, there's a whole bunch of those. All of them are kind of a

13:31 little bit different. Some have different areas of expertise, and some have more features, some have

13:36 less features. One of the interesting things with performance, if performance is a big factor for you,

13:43 one of the kind of caveats, or one of the more golden rules I have is: the fewer features, the more

13:49 performant it's going to be. That's not always true, but it's a good general rule of thumb, at least.

13:53 Yeah, the less you're doing, the more you can do of it. Yeah, that's for sure.

13:57 And then, you also have the database, and that's another whole factor to think about, and whether

14:02 it's SQL or NoSQL. Not every web app is going to have that, but a good chunk of them will.

14:08 Yeah, most of them will have some kind of data store, and usually the choice of the database. You

14:13 actually have some really interesting things to say about that. We'll talk about that. One thing I do

14:18 want to maybe take a step back and talk about, because I think it's important to understand for

14:22 people who don't live in web deployments all the time, is you talked about two web servers. You

14:30 talked about Nginx, and you talked about uWSGI, or "micro-whiskey," and why do we need two?

14:35 That's actually kind of important, and I think I will simplify it: Nginx is really good at what

14:45 it does. It's really good at handling proxying. Let's take the example of HTTPS. In order to do that

14:54 SSL handling, right, the SSL handshakes, the decryption, all of that, leveraging Nginx for that

15:02 is very fast. It's good at that. It does that very well, and it does that very fast, and it's tuned

15:09 specifically for that type of task. It also is really good for serving static content. So if you

15:17 take an application server like uWSGI, that's running your Python web app. Well, if you have static content

15:25 with that, offload that kind of workload to Nginx. Let Nginx do the static content, because that's very

15:31 static. It doesn't need to talk to your Python app. If there's no need, then don't do it. But

15:36 uWSGI is more for executing the requests against your Python application, and that's really

15:45 kind of your worker process. Now, there are many ways to set up this type of stuff. You can

15:52 do things in many different ways. Some things work better. Some setups work better for certain

15:57 environments, but your typical deployment is going to have kind of all three, web server, the application

16:02 server, and then that web app framework as well. Yeah. And one of the things, I don't know how much

16:08 uWSGI suffers from this, but certainly the Python web servers themselves, the pure Python ones,

16:15 can suffer from the fact that you only have the global interpreter lock, the GIL. So you can really only do so

16:22 much serving on any one thread at a time. And if you're busy serving up like a large image file,

16:30 you know, you're not processing requests, right? So putting something in there to offload

16:37 everything except the true application requests, like Nginx, is, I find, pretty awesome.

16:43 That's what Nginx is good at. So let it do its job. Yeah. And it can be like a load balancer or proxy

16:49 server. It's really quite advanced what you can do with Nginx. You know, you mentioned run it all in

16:54 kind of one server, one kind of instance, but that's exactly right. You can put Nginx up one level and

17:02 have it do the load balancing across multiple backend applications. And that's really powerful as well.

17:08 Yeah, that's awesome. Another thing that I do at the Nginx level, at least on my site,

17:13 is that's where all the SSL exchange happens, right? Beyond that, like the uWSGI stuff,

17:20 it doesn't even know that it's encrypted. Well, I guess because it's not, but it's in the data center,

17:24 right? Exactly. Although even today, sometimes you're starting to see even that layer

17:31 get encrypted as well. You know, things change over time. Yeah, I can see that the more machines you

17:36 involve, the more you'd want it encrypted. I guess in my setup, I have uWSGI and Nginx on the same

17:41 machine, so it's just loopback, so encryption doesn't make as much sense.

17:45 This portion of Talk Python to Me has been brought to you by Rollbar. One of the frustrating things

17:51 about being a developer is dealing with errors, relying on users to report errors, digging through

17:57 log files, trying to debug issues, or getting millions of alerts just flooding your inbox and

18:02 ruining your day. With Rollbar's full stack error monitoring, you get the context, insight and control

18:07 you need to find and fix bugs faster. Adding Rollbar to your Python app is as easy as pip install rollbar.

18:13 You can start tracking production errors and deployments in eight minutes or less.

18:17 Are you considering self hosting tools for security or compliance reasons? Then you should really check

18:23 out Rollbar's compliant SaaS option. Get advanced security features and meet compliance without the

18:28 hassle of self hosting, including HIPAA, ISO 27001, Privacy Shield and more. They'd love to give you a demo.

18:36 Give Rollbar a try today. Go to talkpython.fm/Rollbar and check them out.

18:41 If you look at uWSGI, good chunks of that one in particular are written in C, right? And that can

18:50 also help with performance because, you know, it's C. And C is very fast. C is pre-compiled. It's got performance

18:58 in its nature, right? So being able to leverage that is also very useful. And Nginx is also

19:04 written in C. And, you know, your image example, that's a really good

19:09 example, right? That's where you can leverage that aspect of Nginx to really get that boost

19:17 of performance.

19:18 Right. The threading and parallelism, that all just runs over there in C. And I'm,

19:22 you know, I'm glad I don't have to maintain that.

19:24 Yeah, I would agree with you on that one.

19:26 Let's start by thinking about how you might approach performance tuning. Like, I've got

19:30 an app. It's kind of working pretty well, but certainly it could be better maybe under times

19:36 of load. It's like too slow. Or I'm just thinking, you know, 300 millisecond response time is fine,

19:41 but could we do 25 instead? But how do you think about this tuning problem?

19:46 I like to think of performance tuning as if it's a science experiment. So first step is put

19:52 on a lab coat. And then kind of after you got your lab coat established, really, you know,

19:58 kind of start with the observation. Now, I would say one of the keys here and kind of the next step,

20:04 which is creating questions, you know, with your observations and your questions, it's really good

20:09 to have as many perspectives as possible. You mentioned the Linux stack: you have web

20:15 servers, you have application servers, you have the actual Python code itself. Many times in many

20:21 areas, you know, some of these things are managed by different people and bringing those people in to

20:27 kind of add their input into observations and what kind of questions can be asked, you know,

20:33 what kind of knobs can be turned, you really start to get multiple perspectives. And that's where things

20:39 get very interesting. Now, the same as with science experiments, you only want to change one

20:46 thing at a time. That's important as well. So as you're testing and you're validating,

20:52 and you're kind of adjusting as necessary, only making one change at a time is very important,

20:59 because otherwise, if you make too many changes at a time, and this is a real common mistake I see,

21:04 you get a difference, but you don't know which thing caused

21:09 that difference. And sometimes that leads you down like a rabbit hole, right? You start chasing something

21:14 that you thought made a big difference. But in reality, it was something completely different.

21:18 And that's really important.

21:21 Yeah, or one change made it faster and one change made it slower, but you did them at the same time.

21:25 So it looked like it had no effect.

21:26 And another key piece is really establishing your baseline. And that's one thing I talk about a

21:34 lot when I'm telling people about performance tuning: the first thing you do,

21:39 before you make any changes, is establish a baseline. And then you also establish a baseline between changes.

21:47 So usually when you have a big performance tuning effort, you're not just changing one thing and then

21:51 everyone goes about their day; you want to change multiple things, you want to have some fun with it,

21:55 you just want to see all the little knobs you can turn to make this thing go faster.

22:00 So being able to stop and baseline between each iteration is important, and also being able to go back to a previous state. It's important to test things individually and together, in separate tests. And that can take longer, and it's complicated, and people tend to want to rush through it when they're first getting started with performance tuning.

22:27 And you just got to take your time. And it's key to really measure it very well.

22:31 Well, and with these deployment stacks, or what do you want to call them, you know, you've got

22:36 Nginx, and you can tune Nginx, you've got uWSGI, or Gunicorn, or whatever, you can performance tune that. And you've got your Python code. And so measuring them

22:46 separately, I think can be a little bit challenging. While you're talking, it occurred to me that

22:50 it's pretty easy to measure Nginx directly, maybe against a static file. If you have your app running in

22:58 uWSGI, you could just start hitting it. You showed, for both of those scenarios, ab, the Apache benchmark tool, right?

23:03 Apache Benchmark, ab, actually comes with the apache2-utils package. And it's a very common benchmarking tool. I would say it's got its own problems; there are some things it does really well, some things it doesn't do, but it's a good general-purpose tool. Another one I'm a big fan of is GoBench. It's written in Go, and it approaches benchmarking web requests a little bit differently. In some cases, I've seen it be faster. And that's actually an

23:33 interesting problem too: these benchmarking tools are applications in themselves, and they're running in environments themselves. So often, you can run into situations where your application is tuned so well that your benchmarking tool is actually where your bottlenecks are. And that gets into a real interesting problem.
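
(For reference, a hypothetical ab invocation along the lines of what's being described; the -n and -c flags are ab's standard ones, but the URL and numbers here are made up:)

```bash
# 10,000 total requests, 100 concurrent, against a local static file.
# ab ships with the apache2-utils package on Debian/Ubuntu systems.
ab -n 10000 -c 100 http://localhost/index.html
```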

23:56 Yeah, so do you do you recommend running the benchmarking tools on a separate machine with like a super low latency connection, like in the same data center or something?

24:04 Yeah, sometimes we've just daisy-chained servers to get that low-latency connection. But yeah, absolutely, if you can run it on a different machine, that's great; you should do that. Sometimes, though, in order to really cut out network latency, we've had to either run it on the same machine or, like I said, daisy-chain some servers so that they don't go through any switches on the way.

24:27 Yeah, definitely. It's a complicated problem, right?

24:57 There are a lot of things that happen when a request goes through your infrastructure. Because it's easy to go, yeah, this is the config file and I put it up and then it works. But, you know, knowing more about the actual steps before it hits your code is actually pretty interesting.

25:10 Yeah, and that's important for 3am calls as well, when you have a problem, right? One of the benefits of really performance tuning your application is that you know why it works and how it works, so finding problems is a lot easier. But that also really plays into getting even more performance out of it. The more you know about your application, the more you come up with ideas on what to experiment with.

25:39 And where to make those changes and how they would affect it.

26:09 There have been times where we had an idea of one thing, and we're like, oh yeah, this one thing is going to make our application just scream. And it had the complete opposite effect.

26:18 Yeah, yeah. So that's a really good point, thinking about how good our intuition around performance really is. And one thing I wanted to sort of wrap that up with: you have ab and you have GoBench, and you can test your server-level things. But if you actually want to test your app in isolation, you maybe don't even want to use the development server; you want to just call it directly. And so you could do things like time, say, a unit test, or profile a unit test.

26:48 Or something where it's literally just your code running.

27:18 are occurring, and how long they take to occur. Now, an interesting thing is, when you do profiling, sometimes that itself changes the performance as well. So you have to take some things with a grain of salt, but it is a really good way to look at the execution of what's happening underneath the covers of all that code. And it helps you to really isolate where, within the actual application itself, you might make some performance improvements.
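
(A minimal sketch of profiling just your application code with the standard library's cProfile, outside any web server entirely; the handler function here is a made-up stand-in for whatever view or unit test you want to measure:)

```python
import cProfile
import pstats

def handle_request():
    # Hypothetical stand-in for the code path you want to measure.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Print the 10 most expensive calls by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```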

27:48 Right, you do have to be a little bit cognizant of the observer effect, sort of quantum mechanics style, right? Like, it was doing one thing until I observed it, and then it did another thing. Darn it.

28:00 Yeah, and the same is true with monitoring in production as well. I've seen several times where maybe you add some method of monitoring performance statistics, but in doing so, you actually create load on the system, and that load then starts potentially affecting performance itself.

28:20 So it's all about balance. That's kind of like a key thing. And, you know, you have to look at these things. Sometimes it's worth it, and sometimes it's not worth it. And you really kind of have to take a look at your application, what you're running, and what those trade-offs are. And there's no real hard-line way to say this is worth it, or this is not worth it. It's all very situational. It depends on the application. It depends on the environment.

28:46 And what's actually happening.

28:48 For sure. One final thing on this profiling bit that I wanted to throw out there is, I'm pretty sure some of the other frameworks have something very, very similar. But Pyramid has this thing called the debug toolbar, which lets you analyze the request and see what's happening. And it has a performance tab. And you can check a box, and it'll actually collect the cProfile performance data as you click around the site on a page-by-page basis. And that's really nice to just drop in and see, okay, this page is slow because of what?

29:15 Just go request it, and then flip tabs over to the other thing.

29:19 That sounds pretty cool. I'm a big fan of Flask, so I haven't really given Pyramid a try, but that sounds very interesting. I'll have to check that out.

29:26 Yeah, it is pretty cool. I'm a fan of Flask as well. And I think Flask has some kind of debug toolbar, but I don't know if it has the profiling built in, because I just haven't done enough with it.

29:34 Yeah, I haven't looked at that.

29:36 Yeah, another thing that I feel like is often, maybe this is my perception from the outside, and it's just like, I'm looking at this like, I know the database for the site sucks. I know it. That's why it's taking five seconds to load this page. I just do. And so I feel like a lot of people skip the real optimization around the database stuff.

29:54 Like indexes: if you have a query that doesn't use an index, you need a really good justification for that, in my mind.

30:01 Yeah, absolutely. When you talk about your traditional SQL databases, although some NoSQL databases have indexes as well, indexes are incredibly important. And think about what they do, right? This kind of goes down to knowing what is underneath the covers of every little piece.

30:23 If you think about what an index is: a database is an application itself, right? It sounds like this ominous thing, but at the end of the day, it's just an application.

30:33 And really, indexes are a way for the database application to know where on disk it is most likely to find this data.

30:44 It's a very fast way to find one little piece of data that then leads you to get all of the different data you need.

31:02 And, you know, in SQL terms, it's really: where's this index key? Let me find that key, and then boom, here's my whole row of data.

31:23 It really helps with performance. I would say indexes are very important. But sometimes it's also queries. Sometimes queries can be very, very complicated, indexes or no indexes. And simplifying some of your queries, simplifying some of your database structure, can really help out with performance as well.
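
(A toy illustration of the point, using Python's built-in sqlite3: the same lookup goes from a full table scan to an index search once the index exists. The table and column names are invented:)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

query = "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?"

# Without an index on email, SQLite scans the whole table.
print(conn.execute(query, ("a@example.com",)).fetchall())

conn.execute("CREATE INDEX idx_users_email ON users (email)")

# With the index, it jumps straight to the matching row.
print(conn.execute(query, ("a@example.com",)).fetchall())
```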

31:35 Right. And maybe your queries are terrible because your models are not quite right in your database. You know, there are all sorts of possibilities. We can't go too far down that path here, but I definitely think optimizing the database is something to consider, right?

31:46 Yeah, exactly. And one thing to remember is, you know, the database was modeled at the beginning of this application. But the reality is most applications grow over time, and the usage grows over time.

31:46 So sometimes those queries get the way they get because, well, we wanted to change and have this ability to pull this data over here, but we didn't want to make major changes to the database model. And sometimes it's just necessary.

31:59 Yeah, for sure.

32:00 So you wrote a really interesting article called Eliminate the DBA. I don't want to go too deeply into it, but you had some really interesting ideas. I definitely want to point people at it. Maybe give us a flyover on that.

32:12 Really, it's Eliminate the DBA for higher availability. And it's a bit of a trolling title, to be honest. A lot of DBAs internally did not like me for that post.

32:24 But really what my point is, is when you're creating a highly available application and highly performant application, the goal is to minimize complexity because complexity leads to problems.

32:41 The more complex an application is, the harder it is to troubleshoot. And not just an application, but an environment in total.

32:48 The harder it is to troubleshoot, the more opportunities for failure are there.

32:53 If you just have an application and there's no database, a database going down doesn't affect that application.

32:59 But if both are there, you have two failure points versus one.

33:04 And that's, you know, really kind of what the article is all about is kind of calling out that if you're going to use a database, make it worthwhile.

33:14 Don't just use a database just to use a database because it's easier.

33:17 That's really what the design pattern is all about: only use a database if it's absolutely necessary.

33:26 And that's really only when you're talking about super high availability environments.

33:31 Some environments, it doesn't really matter.

33:34 If it's easier to use a database, then use it.

33:36 You've got that web app.

33:38 You've got the database.

33:38 They talk to each other.

33:39 Maybe they're even on the same machine.

33:41 Maybe.

33:42 Exactly.

33:42 Exactly.

33:43 And sometimes it's fine, but other times it isn't.

33:47 And really what that article was all about is knowing when to think about should I or shouldn't I include a database in it.

33:56 Yeah, I liked it because it made me think like at first, like, no, that's not possible.

33:59 And then I'm like, all right, so how is it possible if I think the answer is that you can't do it?

34:04 Do you know what I mean?

34:05 Yeah, it's pretty cool.

34:07 And it all depends on use case.

34:08 Some applications, it's completely not possible.

34:11 In other applications, it is.

34:13 And I'll give you a really good common example, not even card related: you know, my personal blog is actually statically generated HTML.

34:24 Now, that doesn't mean I write in HTML.

34:27 I write my blog in Markdown, and then I use Python to take that Markdown and generate HTML.

34:33 Some blogs, like if you look at like WordPress, for example, that's got a database backend.

34:39 Now, my static HTML is definitely going to be a lot less complicated to run than a whole web stack just to write a blog.

34:48 Right.

34:48 You might not even need a server.

34:49 You could potentially drop it on like S3 or something.

34:52 Yeah, exactly.

34:53 Exactly.

34:53 You can get real interesting once it's static pages.

34:55 Right.

34:56 Absolutely.

34:57 All right.

34:57 So there's a lot of stuff we can do at the architectural level, you know, caching, queuing, asyncio, changing the runtime to say PyPy or Cython or something.

35:06 But I want to make sure we touch on all the stacks at a pretty good level.

35:09 So maybe let's move up one level into uWSGI and say, this is the thing that runs your Python code.

35:18 What are the knobs and levers that we can turn here?

35:20 There's quite a few.

35:22 One of the simplest ones is actually enabling more threads.

35:27 So threads are interesting because you can go too far and have too many threads, or too few as well.

35:38 And really what that is, if you look at the configuration, is processes equals a number.

35:46 So one of the things that you kind of want to look at is how many CPUs does my actual machine that I'm running this on have?

35:55 And that's your production machine, not your development machine, because those are two different things.

36:00 And sometimes you have to also adjust for the environment as well.

36:05 And that's something to kind of think about when you're thinking about performance tuning things is what's it run on my laptop is going to be very different than how things run in production.

36:14 In production, you might have a machine with a whole ton of CPUs available.

36:19 And on your laptop, you only have, you know, maybe four or eight, right?

36:24 Depending on your machine.

36:25 And then another thing.

36:27 So kind of the golden rule there is try not to exceed one process per CPU.

36:34 But I have found in some cases, with some workloads, you can actually go up to twice that and still get a performance increase.

36:43 It's kind of interesting.

36:45 It's really one of those things where you've got to adjust the number and slowly adjust it as you go.

36:51 Start from the lowest and work your way up or potentially work your way down if you've already got something deployed and it's starting to hit some interesting areas.
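(For reference, a minimal uwsgi.ini sketch of the knob being discussed; the module name is hypothetical, and the value is just a starting point to adjust as you measure:)

```ini
[uwsgi]
; hypothetical WSGI entry point
module = myapp:app
; roughly one worker per CPU to start; measure, then adjust
processes = 2
```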

37:01 Yeah.

37:01 It gets really interesting, too, because basically the parallelism of that is tied to the parallelism of Python, which has its own interesting mixes.

37:10 Right.

37:10 And so if you're doing something that's computational, that takes really long.

37:14 Right.

37:15 Like, I mean, it doesn't have to be science.

37:17 It could be generating a really large RSS feed.

37:19 For example, some of us have experience with that.

37:21 And, you know, the RSS feed for Talk Python is like 700K.

37:27 And it's quite a bit to generate it at this point.

37:29 At some point, I may have to do something about it, but it's hanging in there just fine.

37:32 But, you know, that kind of stuff locks that process up, even with the threads.

37:37 Right.

37:44 But if what you're doing is basically: a request comes in, I process it, I call a database, I wait, I call a web service, I wait, and I give it back.

37:44 That one can keep flying because those network I.O. things kind of break it free.

37:48 Right.

37:48 They release the GIL.

37:49 And so I think it also depends on how your app is working.

37:54 What kind of app are you running there?

37:55 You're exactly right.

37:56 One thing I actually want to call out just to circle back a little bit, sorry, is there's kind of two adjustments you can make.

38:04 You know, by default, uWSGI processes all have the same CPU affinity.

38:10 So if you do have like a two CPU machine, for example, just changing the processes to four will actually lock all four of those to the same CPU.

38:20 But if you set threads equal to two, or enable-threads equal to true, what that'll actually do is split the processes across multiple CPUs.

38:30 And that's actually a very common problem people run into when doing multithreading: they think, oh, well, I'll just add some threads and we're good to go.

38:42 But how it actually lays out in the stack is a little bit different.

38:46 Linux tries to get things running on the same CPU as much as possible to really leverage things like L2 cache.

38:53 But there are ways to split it out to multiple CPUs.

38:58 And sometimes that can really be a big benefit.

39:00 But yeah, depending on what the application does, the reverse can be true as well, where running on that same CPU can give you that performance benefit as well.

39:10 And like your RSS example, I would say, because you're kind of looking all at the same data and generating it, you might even get a benefit running that on one CPU versus two, right?

39:21 Yeah, for sure.
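
(Continuing the hypothetical uwsgi.ini sketch from above, these are the two thread-related options being described; as always, the values are illustrative:)

```ini
[uwsgi]
processes = 2
; threads per worker process
threads = 2
; allow the application to spawn its own threads as well
enable-threads = true
```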

39:23 This portion of Talk Python to me was brought to you by GoCD.

39:26 GoCD is an on-premise, open-source, continuous delivery tool to help you get better visibility into and control of your team's deployments.

39:35 With GoCD's comprehensive pipeline modeling, you can model complex workflows for multiple teams with ease.

39:41 And GoCD's value stream map lets you track changes from commit to deploy at a glance.

39:47 Say goodbye to deployment panic and hello to consistent, predictable deliveries.

39:51 We all know that continuous integration is super important to the code quality of your applications.

39:55 Choose the open-source, local CI server, GoCD.

39:59 Learn more at talkpython.fm/gocd.

40:03 That's talkpython.fm/gocd.

40:06 You have an example of this; there's an article that you wrote about optimizing uWSGI, and you have one for Nginx that we'll talk about as well.

40:14 And your baseline in this one starts out at 347 requests per second, and just that change knocked it up quite a bit to 1,068, which is quite the improvement.

40:24 Yeah, it is.

40:26 And that's as simple as going from one CPU to two.

40:28 Yeah, exactly.

40:31 It seems like, you know, math would say, well, if I have two CPUs and I'm getting 347, shouldn't I get around 694?

40:37 So, you know, the 700 range.

40:41 You know, sometimes you can even go a little bit higher, right?

40:44 When you have less contention.

40:46 Another thing to kind of think about with Linux is there's a task scheduler, right?

40:53 And this task scheduler is figuring out which processes should get priority for CPU time.

41:01 And when you're all running on a single CPU, you also have other processes that are running against that single CPU.

41:09 So you're going to have conflicts over CPU time, and the task scheduler's job is to figure all that out.

41:15 So having two kind of allows you to reduce some of those task scheduler conflicts as well.

41:21 Yeah, it's pretty interesting.

41:22 So the two other major things, one of them I think is somewhat obvious.

41:28 One of them is sort of counterintuitive.

41:30 One, you say, is to disable logging, which you may or may not want to do, because you might want to have logs for certain reasons.

41:37 But if you can, disabling logging actually has a pretty significant performance change.

41:42 Yeah, because that's disk IO, essentially.

41:45 So it's every log message. You know, there are many ways to solve this problem.

41:53 In my article, I kind of took the easy approach by just disabling it because it was an article and it was easy.

41:58 But really what the root of that is, is by disabling logging, I'm telling, you know, uWSGI to stop writing to disk, essentially, for every request.

42:12 So by default, for every request, it's going to write details about that request to disk.

42:17 And whether that's asynchronous or synchronous matters a lot.

42:22 You know, some platforms will default to kind of synchronous logging.

42:26 And what that is, is it makes sure that that data is written to disk before kind of going to the next step.

42:33 And asynchronous is more like, well, let's kick off a, let's throw this in a buffer.

42:37 Let's kick off a thread to write these to disk and let things kind of continue.

42:41 Those can be huge.

42:44 Just going from synchronous to asynchronous can be a big performance increase, but nothing will give you better performance than just disabling it.

42:52 But there's some trade-offs with that, like you said.

42:54 It's hard to optimize faster than doing nothing.

42:57 Yeah.

42:57 One little tip I tend to like is actually using syslog with UDP for logging instead of going to disk.

43:05 So syslog is a very well-established protocol.

43:11 When you're using UDP, you don't have to worry too much about like TCP handshakes.

43:15 It's kind of fire and forget.

43:17 So pushing that to a network place versus disk, which, you know, disk is traditionally slow, although solid state drives have made it a lot faster.

43:27 It's still slower than, you know, memory and going to kind of that network stack can make a big difference.

43:33 And that's kind of an interesting trick.

43:36 You don't necessarily lose your log data, but you also, you know, don't have to go to disk.

43:41 But another kind of key one there is make sure you're not writing too many log entries.

43:49 Kind of finding the right balance of how much logging is there is really important.

43:53 Now, when you're using a framework, the framework is going to do what it's going to do.

43:56 But even within your application, the less logging, the better.

44:00 But at the same time, you have to have this kind of minimum amount of logging in order to support it.

44:04 Yeah, absolutely.

44:06 So the last one that I said was non-intuitive is you can tell the worker process to live only a certain amount of time.

44:15 And after that time to just be killed off and basically start fresh again, which there's some, you know, startup costs.

44:21 And there's like some cost to having this new process come up and it reads like your template files potentially or whatever.

44:27 So it seems like that would be slow, but you actually flipped it to something like restart the worker process every 30 seconds.

44:33 And it was quite fast.

44:35 That really depends on the application.

44:37 If your application is going to hold a lot of data in memory, that could be good or bad, right?

44:44 Depending on how things start up.

44:45 If you have to kind of load things in memory before you can really start serving requests, then that startup time really, that's a hit, right?

44:53 But if your worker process is really just executing something very fast, restarting it, as long as it kind of starts up very quick and you have a very low startup time within the actual application itself, you can get a big benefit.

45:09 The example I had was a very simple, simple application.

45:12 There wasn't a whole lot of kind of data stored in memory or anything like that that you had to kind of build up.

45:17 So the start time was really small and that's where I kind of got that benefit.

45:21 But again, it all depends on the application.
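(The worker-recycling idea, expressed as uwsgi.ini options, assuming a reasonably recent uWSGI; max-worker-lifetime is in seconds, and max-requests recycles by request count instead. The values are illustrative:)

```ini
[uwsgi]
; restart each worker after 30 seconds, as in the example discussed
max-worker-lifetime = 30
; alternatively, recycle a worker after it serves N requests:
; max-requests = 5000
```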

45:24 The benefit can be the lack of memory-management overhead, or simpler allocation, because the memory is not fragmented.

45:30 You know, one of the things that Instagram, pretty sure it was Instagram, did, that was super counterintuitive for performance,

45:36 but worked at their scale: they turned off the Python GC.

45:39 They left only the reference counting bit, but that doesn't handle cycles and stuff.

45:44 So there's definitely memory leaks when you do that.

45:46 Yeah.

45:46 But if you restart enough.

45:47 Exactly.

45:48 You just go, it's our workload, our data.

45:50 That means we can run for six hours before we run out of memory.

45:53 So let's just restart every hour.

45:55 Something like that.

45:56 And they got like 12% improvement or something.

45:58 I mean, it was really significant.
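
(The trick being described boils down to something like this; it's only reasonable if workers really do get recycled, since cyclic garbage now leaks:)

```python
import gc

# Reference counting still reclaims most objects immediately;
# only reference cycles leak, until the worker process is restarted.
gc.disable()
```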

45:59 That goes into kind of the overall architecture, right?

46:02 Sometimes it's okay to kill off an application, as long as you're doing it gracefully and as long as you have others to take its place.

46:11 Right.

46:12 And that's kind of where that whole microservices approach really lends a hand.

46:17 Because if you break things down really small, then you can run, it's a lot easier to run multiple of them.

46:23 So you can actually distribute the load to other processes when you want to start taking these ones down.

46:32 Right.

46:33 And that's where having Nginx up front, doing some of that load balancing, really plays a big hand.

46:39 For sure.

46:39 So let's talk about Nginx.

46:41 We've sort of said what it is.

46:43 But just like before, you had this baseline analysis; in this case, a little under 3,000 requests per second, and this is to, like, a static HTML file or something like that.

46:53 Right.

46:53 Exactly.

46:54 Yeah.

46:54 So it gets all the other stuff out of the way.

46:56 It's not to the app.

46:57 It's just to serve up a thing.

46:58 And so the first thing that you said that you might want to look at is worker threads.

47:03 Again, that's just like with uWSGI, right?

47:07 That's really the number of processes on the system.

47:11 So a really interesting thing about Nginx is by default, it's actually at auto, which tells Nginx to create one worker thread for every CPU available to the system.

47:25 Now, what I actually did was I changed it to two.

47:30 So actually, no, I changed it to four.

47:32 I had two CPUs on the system, so that's basically two worker threads per CPU.

47:38 And I actually got a pretty good boost out of that.

47:41 And it wasn't too bad.

47:42 But then I changed it to eight, when I was kind of tinkering, and this goes into experimentation, right?

47:50 Measure, measure, measure.

47:51 When I changed it to eight, performance dropped quite a bit.

47:54 So back it goes.
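
(The Nginx knob in question, at the top level of nginx.conf; 'auto' matches workers to CPUs, while a fixed number lets you experiment as described. The value shown is illustrative:)

```nginx
# e.g. two workers per CPU on a two-CPU box; measure before and after
worker_processes 4;
```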

47:56 Yeah.

47:56 So it's a very close balance.

47:59 It's walking a very thin line.

48:00 You know, you can optimize things to work really well in these situations.

48:05 But once you go a little too far, then you start hitting other contention, right?

48:10 And that really breaks down to like that CPU task scheduler.

48:14 And that's actually why I did that in the article: to show that things don't always work out when you just add bigger numbers.

48:20 Yeah.

48:21 So by messing with the worker threads, you were able to get it to go from a little under 3,000.

48:26 You added another 2,250.

48:29 So not quite doubling, but still quite good.

48:32 And then the other thing you said is like, maybe some of these connections are going to last for a long time or they're sort of backed up.

48:39 Like, what if we let it accept more connections?

48:41 And this is important for requests, but it's super important for, like, really long, large files, I would imagine, or lots of

48:47 people downloading them, or even like WebSockets, these persistent types of things.

48:52 Yeah, exactly.

48:53 And that's exactly it, right?

48:55 Sometimes, and actually in this case, it was, I believe it was a simple REST API.

49:00 So there wasn't really a whole lot of like static connections, but you're right.

49:05 That is a big performance increase when you have those long live connections.

49:09 And sometimes, you know, that's letting Nginx do some of the work, let Nginx kind of handle that connectivity.

49:16 By increasing the number of connections per worker, if they're very fast requests to the downstream application,

49:24 you can actually leverage Nginx handling connectivity with the end client and get through the process very quickly.

49:32 So by making some changes, I think in that case, like I changed it to like 1024, for example, it went up to like 6,000 requests per second, which was a huge improvement.

49:42 It's really cool.
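
(The connection setting from this example lives in the events block of nginx.conf; the value shown matches the one mentioned:)

```nginx
events {
    worker_connections 1024;   # connections each worker process may hold open
}
```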

49:43 Now, another thing in that kind of worker space is to look at the number of open files, which is a very common Linux limitation that people run into.

49:55 In Ingenix, as it's serving static content, for example, or even just the fact that it has logging enabled as well, it's going to have open file handles.

50:05 And by default in Linux, there's a limitation.

50:08 I want to say these days it's 4096, but actually, I think the Nginx default limitation is much smaller.

50:16 I forget what it is exactly.

50:17 I mean, I think in that example, all I really did was just up it to 4096, which allowed it to have even more files open. That gave a little bit of a boost, but not too much.

50:29 Yeah, it was still really nice.

50:31 I think it gave it like 300 more requests per second or something like that.

50:35 Yeah, which is cool.

50:36 But the other thing you can do is most web workloads are hitting a number of files, 10, 50, 100, or 1,000.

50:44 But after that, how many unique static files do you have in most situations?

50:49 I know there are some that have tons, but most sites, they've got their CSS and their images and whatever, and it's mostly shared, right?

50:55 So you can also tell it to cache that stuff, right?

50:58 Yeah, you can.

50:59 And I would say one caveat, though, is just because you only have a certain amount of files doesn't mean that that process isn't opening other files.

51:08 Sometimes there's things like shared libraries that it opens, and all of those count as well.

51:12 And even like sockets and things like that, they all kind of count towards the limitations in the OS.

51:19 But in regards to caching, that's actually pretty cool.

51:23 There are some options with Nginx to do an open file cache, which allows you to increase the default amount of caching for open file handles.

51:35 So Nginx will open up those CSS files and those HTML files, like you were saying, and it will actually keep them in memory.

51:42 So that way, when you get a request, even though it's a file on the file system, since Nginx has it open and it's cached in memory, it doesn't have to go to disk for access to that, which makes it a lot faster.
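
(The two file-handle ideas above, as nginx.conf directives; values are illustrative. One note: open_file_cache caches open descriptors and file metadata, which is what saves the trip to disk:)

```nginx
worker_rlimit_nofile 4096;   # raise the per-worker open-file limit

http {
    open_file_cache          max=1000 inactive=20s;  # cache up to 1000 open handles
    open_file_cache_valid    30s;  # revalidate cached entries every 30 seconds
    open_file_cache_min_uses 2;    # only cache files requested at least twice
    open_file_cache_errors   on;   # cache failed lookups too
}
```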

51:56 Yeah, just serve it straight back out of memory.

51:58 That's awesome.

51:58 So in the end, you were able to tune Nginx, just on its own, from 2,900 up to 6,900 requests per second.

52:08 That's a serious bit of change by just tweaking a few config settings.

52:18 And then you did the same thing, you know, something similar at the uWSGI level.

52:18 And then, of course, you could, you know, tweak your architecture as well.

52:21 But just making the stuff that contains your app go that much faster, that's pretty awesome.

52:26 There's another article that I wrote about kind of benchmarking Postgres.

52:30 And I had a similar experience there: all I really did in that article was just adjust the shared_buffers configuration.

52:38 And what that is, is essentially Postgres's shared data cache.

52:42 And that change alone, for the example I was giving, had a big performance increase.

52:49 So, you know, sometimes I kind of call these a little bit of low-hanging fruit because they're just little knobs you can change in existing systems.

52:58 And sometimes those low-hanging fruit can be a really good first step.
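
(The Postgres knob mentioned is a one-line change in postgresql.conf; the value here is illustrative, with a common rule of thumb being around a quarter of system RAM:)

```ini
# Postgres's shared data cache; the stock default is far smaller
shared_buffers = 1GB
```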

53:02 None of those seem super scary, right?

53:04 You just change some config files.

53:05 I mean, if you mess up your config, you will take your website down.

53:08 But, you know, you just put it back.

53:10 It's not super complicated, right?

53:12 The key is test before you put it in production, right?

53:15 So as long as you're testing before production, then, you know, if you make a change and it doesn't work, then, oh, well, who cares, right?

53:23 But really, that's kind of getting into the measuring, establishing your baseline and measuring each little change and how they interact.

53:32 That's real important.

53:34 So, yeah, that way, when you go to production, you know exactly what you're changing.

53:38 Yeah, that sounds good.

53:39 So we're just about out of time for our conversation here.

53:43 But I did want to just ask you, like, how do things like Docker and Kubernetes change this?

53:48 Do they just make it more complicated?

53:49 Do they simplify it?

53:51 Like, what do you think containers mean around this conversation?

53:53 They simplify some things.

53:55 Like, when we're talking like scale-out type approach, it makes it real easy to spin up a new one.

54:00 When you asked that question, I was thinking of some of the challenges I've run into with those.

54:07 One thing that I've run into with, like, Docker is if you just pull down a service and use that service out of the box without changing it, you're going to get kind of a default performance.

54:19 So by using a Docker image, a lot of times you kind of forget about all those little knobs that you have to turn in order to get it fast and performant.

54:30 And then another area that kind of goes along with Docker is that it has some services as well that things run through.

54:40 And Kubernetes is a big example of this.

54:43 So with Kubernetes, you have a software-defined network, right?

54:46 So if you have, like, a cluster, and let's just give kind of a scenario.

54:52 For whatever reason, you had, you know, half your cluster on one side and half on another, and there's network latency in getting to that other half.

55:03 With Kubernetes and services, you would have your traffic land on any host within that cluster.

55:10 And then that software-defined networking is responsible for moving that request to the appropriate hosts that might be running that service.

55:17 So if there's some latency in there, you can actually start seeing that.

55:21 And that's actually very much obscured once you start getting into that area.

55:26 It's hard to kind of pin that down because that's so far removed from what you're doing in your application that that can actually be pretty tricky.

55:34 Plus the fact that, you know, you have to go through that software-defined networking means you're taking some penalties there.

55:40 But again, it's probably worth it.

55:43 Yeah, probably.

55:44 Very, very interesting.
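As a side note, newer Kubernetes releases (from well after this recording) added a Service-level knob aimed at exactly this kind of problem, keeping traffic on the node it lands on; a hypothetical sketch:

    # service.yaml -- the app name is hypothetical; requires a recent Kubernetes release
    apiVersion: v1
    kind: Service
    metadata:
      name: my-app
    spec:
      selector:
        app: my-app
      ports:
        - port: 80
          targetPort: 8080
      internalTrafficPolicy: Local   # only route to endpoints on the node that received the traffic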

55:46 All right, before I get to the final two questions for you, we were talking before we hit record about an open source project that you're working on as well.

55:54 You want to give a quick elevator pitch for what that is so people know about it?

55:57 Yeah, absolutely.

55:58 So Automatron is the open source tool that I've created.

56:03 It's kind of a second version of something called Runbook.

56:07 I launched an open source project called Runbook, and things happened where I was like, I need to redo this and start fresh.

56:15 And that became Automatron.

56:17 And really what it is, is it's kind of like if Nagios met IFTTT.

56:22 So you have these health checks, and they monitor, you know, whatever you tell them to monitor, and the health checks are really just executables.

56:31 What it will do is it will SSH to the remote server, to the monitored system, run that health check.

56:39 And based on kind of the Nagios return codes, it's either good or bad.

56:44 Right.

56:44 Like SSH in and ask for the free memory or something like that.

56:47 Yeah, exactly.

56:48 Exactly.

56:48 And if it's bad, you know, if you're beyond a threshold.

56:51 If it's non-zero.

56:53 Yeah, yeah, exactly.

56:54 Then really the exit code would indicate a failure.

56:57 And that exit code will actually trigger an action to take place, which is, again, SSHing out to a system and then executing either a command or a script or something like that.

57:09 And this is all built in Python.

57:11 And it actually uses Fabric very heavily, which is a really cool SSH command execution wrapper.

57:18 It's very cool.

57:19 I'm a big fan of it.

57:20 And I used it very heavily there.
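The check-then-act pattern described here can be sketched in a few lines of Python. This uses the newer fabric.Connection API rather than the Fabric version Automatron was actually built on, and the host and commands are hypothetical:

    from fabric import Connection

    NAGIOS_OK = 0  # Nagios convention: 0 = OK; non-zero = WARNING/CRITICAL/UNKNOWN

    def check_and_heal(host, health_check, action):
        conn = Connection(host)
        # Run the health check on the monitored system over SSH; warn=True
        # keeps Fabric from raising an exception on a non-zero exit code.
        result = conn.run(health_check, warn=True, hide=True)
        if result.exited != NAGIOS_OK:
            # The check failed, so trigger the corrective action on the same host.
            conn.run(action, warn=True)
        return result.exited

    # e.g. check_and_heal("web01", "systemctl is-active nginx", "sudo systemctl restart nginx")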

57:22 And it's kind of a cool little side project.

57:25 And really what it's there for is I hate on-call and I really wish it would go away.

57:31 And this is one of my ways to hopefully help people make it go away.

57:37 Could a machine just go restart the web server process and just not call me?

57:41 Exactly.

57:42 Or, you know, in some cases when you've kind of designed your environment well enough, you could just maybe reboot the box and who cares?

57:49 Yeah, true.

57:50 At a certain scale, is it worth finding the root cause of a single issue or is it more worth finding the root cause of continuous issues?

57:59 And that's kind of one of the real philosophy changes that that kind of project brings is, in my opinion at least,

58:06 it's worth fighting, you know, more frequent and recurring problems.

58:12 And just one-off problems are not worth it.

58:15 Just restart the thing.

58:16 Yep.

58:17 Sounds awesome.

58:18 All right, cool.

58:18 So people check that out.

58:19 We'll put it in the show notes.

58:20 All right, last two questions before we go.

58:22 If you're going to write some Python code, what editor do you use?

58:25 Right now, it's Vim, screen, and syntax highlighting.

58:28 That's it.

58:29 I'm kind of weird, I believe.

58:31 So, but that's where I feel comfortable.

58:33 And I think all those years in kind of operations has led me to that.

58:37 Yeah, cool.

58:38 And then notable PyPI package.

58:40 You already called out Fabric, right?

58:42 That's pretty awesome.

58:43 Yeah, Fabric is an awesome one.

58:45 I actually wanted to call out, since we're talking performance tuning, Frozen-Flask.

58:49 Now, it's not one that I've used personally quite a bit, but the whole concept is it allows

58:56 you to pre-generate static pages from a Flask application.

58:59 So it's really cool.

59:01 And the real awesome thing that you can do is you can kind of combine that with certain

like NGINX rules to where it'll pre-generate the static HTML.

59:11 And then based on, you know, regular expressions in the paths and things like that, you can have

NGINX serve that static HTML without having to go down further in the stack.

59:22 And that's real cool because you can use Frozen-Flask to still kind of keep that all within

59:27 your Python application.

59:29 And it's really just at runtime, it'll generate that static HTML.

59:32 It'll freeze it.

59:34 And from there, the NGINX configuration just takes it away.

59:38 Yeah, that's pretty awesome.

59:39 So you get the dynamic sort of data-driven bit, but then you could just freeze it, if you

59:44 will, like turn it to static files and then serve it through that way.

59:47 That's cool.
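To make that concrete, here is a minimal Frozen-Flask sketch plus the kind of NGINX rule being alluded to; the app module, paths, and socket name are all hypothetical:

    # freeze.py
    from flask_frozen import Freezer
    from app import app  # your existing Flask application

    freezer = Freezer(app)

    if __name__ == "__main__":
        freezer.freeze()  # walks the app's URL map and writes static HTML into ./build/

    # nginx.conf: serve a frozen file if one exists, otherwise fall through to the app
    location / {
        root /srv/myapp/build;
        try_files $uri $uri/index.html @app;
    }
    location @app {
        include uwsgi_params;
        uwsgi_pass unix:/run/myapp.sock;
    }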

59:47 Most web applications, you know, you have static pages and dynamic pages and really using it

59:53 to establish those static pages and pre-generate them can be a big benefit in kind of production

01:00:01 workloads, not just from a performance perspective, but cost of what it takes to run it as well.

01:00:06 For sure.

01:00:06 That's really interesting.

01:00:07 And sometimes those sort of landing pages and main catalog pages, that's where like really

01:00:12 the busy traffic is anyway.

01:00:14 That's pretty much my use case that I've used things like that.

01:00:17 I haven't used that one in particular, but I've done some things like that where I just

01:00:21 pre-fetch pages and save the HTML. Yeah, it was hacky, but you know what?

01:00:27 It really helped keep things slim.

01:00:29 So it works.

01:00:31 That's awesome.

01:00:31 Cool.

01:00:32 All right, Ben, final call to action.

01:00:34 People are excited.

01:00:35 They realize like there's a few knobs that make their code much faster.

01:00:38 What do you think?

01:00:39 They should start by reading your two articles about optimization?

01:00:42 And really just don't be afraid to just jump right into it.

01:00:46 Even if you don't know how something works, you know, sometimes just turning that knob and

01:00:50 then figuring out why it works is a real benefit.

01:00:52 But for kind of self-promotion, for sure, check out those blog posts.

01:00:56 You know, you can kind of follow me on Twitter.

01:00:58 madflojo is my handle.

01:01:00 I have quite a few posts out there and lots of different stuff in lots of different kind

01:01:05 of areas.

01:01:06 But that's definitely a good start.

01:01:08 Yeah.

01:01:08 People should check out bencane.com slash archive.html because you have a ton of awesome

01:01:13 articles there.

01:01:13 We just chose a few to speak about.

01:01:15 Awesome.

01:01:16 Thanks.

01:01:16 Yeah.

01:01:17 I have tons of stuff, whether it's, you know, Docker-related performance tuning.

01:01:21 One article I wrote is kind of about building self-healing environments using things like Salt and just

01:01:28 some Python code, which is...

01:01:30 Very cool.

01:01:30 Well, thanks so much for sharing your experience.

01:01:32 It was great to chat with you.

01:01:33 Awesome.

01:01:33 Thank you.

01:01:34 Thank you for having me.

01:01:35 You bet.

01:01:36 This has been another episode of Talk Python to Me.

01:01:39 Today's guest has been Ben Kane, and this episode is brought to you by Rollbar and GoCD.

01:01:44 Rollbar takes the pain out of errors.

01:01:47 They give you the context and insight you need to quickly locate and fix errors that might have

01:01:52 gone unnoticed until your users complain, of course.

01:01:54 As Talk Python to Me listeners, track a ridiculous number of errors for free at rollbar.com slash

01:02:01 Talk Python to Me.

01:02:03 GoCD is the on-premise, open-source, continuous delivery server.

01:02:07 Want to improve your deployment workflow but keep your code and builds in-house?

01:02:11 Check out GoCD at talkpython.fm/gocd and take control over your process.

01:02:17 Are you or a colleague trying to learn Python?

01:02:20 Have you tried books and videos that just left you bored by covering topics point by point?

01:02:24 Well, check out my online course, Python Jumpstart by Building 10 Apps at talkpython.fm slash

01:02:30 course to experience a more engaging way to learn Python.

01:02:33 And if you're looking for something a little more advanced, try my Write Pythonic Code course

01:02:38 at talkpython.fm/pythonic.

01:02:40 Be sure to subscribe to the show.

01:02:43 Open your favorite podcatcher and search for Python.

01:02:45 We should be right at the top.

01:02:46 You can also find the iTunes feed at /itunes, Google Play feed at /play, and direct

01:02:52 RSS feed at /rss on talkpython.fm.

01:02:56 This is your host, Michael Kennedy.

01:02:57 Thanks so much for listening.

01:02:59 I really appreciate it.

01:03:00 Now get out there and write some Python code.

01:03:02 I'll see you next time.
