Debugging Python in Production with PyStack

Episode #419, published Wed, Jun 14, 2023, recorded Tue, Jun 6, 2023

Here's the situation. You have a Python app that is locked up or has even crashed completely, and all you're left with is a core dump on the server. Now what? It's time for PyStack! You can capture a view of your app as if you'd set a breakpoint, and even view the call stack and locals across language calls (for example, from Python to C++ and back). We have the maintainers, Pablo Galindo Salgado and Matt Wozniski, here to dive into PyStack. You'll definitely want to have this tool in your toolbox.

Episode Deep Dive

Guests Introduction and Background

Pablo Galindo Salgado and Matt Wozniski are seasoned Python developers from Bloomberg’s Python Infrastructure team. Pablo is a Python core developer heavily involved with Python releases (release manager for Python 3.10 and 3.11) as well as part of the Python Steering Council. He also contributes to the Faster CPython project, the garbage collector, and other internals of CPython. Matt focuses on developing Python infrastructure tools at Bloomberg, maintaining the interpreter, and moderating the Python Discord community. Together, they co-maintain PyStack, a production debugging tool for Python and C-level code.

What to Know If You're New to Python

Here are a few essentials to help you follow the conversation on debugging:

Familiarize yourself with Python’s runtime and the idea of stack frames. Even a simple understanding of how Python’s call stacks and exceptions work will help you see what PyStack is capturing.
Recognize that Python can call (and be called by) external C/C++ modules, meaning Python code often interacts with low-level native code.
Know that a core dump is a file capturing a program’s memory at the time of a crash. Debuggers and tools like PyStack can read this file to see what happened during a crash.
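
A core dump only exists if the operating system is allowed to write one. As a minimal sketch, assuming a Linux or other POSIX system, a process can raise its own core-file size limit with the standard library's resource module. The helper name enable_core_dumps is hypothetical, and where the file ends up still depends on system settings such as /proc/sys/kernel/core_pattern on Linux.

```python
import resource


def enable_core_dumps():
    """Raise this process's soft core-size limit to its hard limit.

    Hypothetical helper for illustration. It cannot exceed the hard
    limit, so if an administrator set that to 0, no core will be written.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Raising the soft limit up to the hard limit needs no extra privileges.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print("core size limits (soft, hard):", enable_core_dumps())
```

Calling this early in startup is roughly the in-process equivalent of running ulimit -c in the shell before launching the app.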

Key Points and Takeaways

Why PyStack Exists: PyStack addresses the challenge of debugging Python applications that mix Python and native extensions (e.g., C/C++ or Rust). Traditional Python debuggers like pdb show only Python-level information, and tools like GDB show only the C call stacks. PyStack combines both, allowing developers to diagnose locked or crashed processes across language boundaries.
- Links / Tools:
  - PyStack GitHub
  - GDB
Debugging Frozen or Crashed Python Apps: A major use case is diagnosing production deadlocks or crashes. When a Python app becomes unresponsive (CPU at 0% or stuck on a deadlock) or abruptly crashes and leaves behind a core dump, PyStack can snapshot the state of each thread and reveal whether the freeze is in Python or in the underlying native code.
- Links / Tools:
  - Core dumps documentation (varies by OS / man core)
  - PyStack ReadTheDocs (unofficial user guide if searching online)
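
As a rough sketch of the two workflows described here (a live but frozen process, and a core dump), the helpers below only build the command lines. The function names are hypothetical. The remote and core subcommands and the --native and --locals flags are written as I understand them from PyStack's documentation, so confirm with pystack --help on your installed version before relying on them.

```python
import shlex


def pystack_remote_cmd(pid, native=False, show_locals=False):
    """Build a command line for inspecting a running process by PID."""
    cmd = ["pystack", "remote", str(pid)]
    if native:
        cmd.append("--native")  # assumed flag: merge native frames into the stack
    if show_locals:
        cmd.append("--locals")  # assumed flag: show local variables per frame
    return cmd


def pystack_core_cmd(core_path, executable=None, native=False, show_locals=False):
    """Build a command line for analyzing a core file after a crash."""
    cmd = ["pystack", "core", str(core_path)]
    if executable is not None:
        # Assumption: the interpreter that produced the core is an
        # optional second positional argument.
        cmd.append(str(executable))
    if native:
        cmd.append("--native")
    if show_locals:
        cmd.append("--locals")
    return cmd


if __name__ == "__main__":
    print(shlex.join(pystack_remote_cmd(12345, native=True)))
    print(shlex.join(pystack_core_cmd("./core.12345", "/usr/bin/python3", show_locals=True)))
```

Either list can be passed to subprocess.run on a machine where PyStack is installed.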
Hybrid Stacks, Seeing Python and Native Calls Together: Python often calls C/C++ libraries (like NumPy or custom extensions). These libraries might then call back into Python, creating a “Python → C → Python” nested flow. PyStack visualizes this interleaving so you can see exactly how your code entered and exited the native layer.
- Links / Tools:
  - NumPy (example of a native-extended library)
  - PyStack GitHub issues (for discussion or questions)
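
A small way to produce a “Python → C → Python” flow yourself, assuming a Unix-like system where the C library's qsort can be loaded through ctypes: Python calls qsort, and qsort calls back into a Python comparison function. The names sort_with_libc, compare, and seen_stacks are illustrative only.

```python
import ctypes
import ctypes.util
import traceback

# Load the C library. On Linux, CDLL(None) also works if find_library fails.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# C signature: int (*compar)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None

seen_stacks = []  # Python-level stacks captured from inside the callback


def compare(a, b):
    """Python function called by C qsort for each comparison."""
    if not seen_stacks:
        # Record the Python view of the stack once. The C qsort frame that
        # sits between the caller and this function is not visible here.
        seen_stacks.append([frame.name for frame in traceback.extract_stack()])
    return (a[0] > b[0]) - (a[0] < b[0])


def sort_with_libc(values):
    """Sort integers using C's qsort with a Python comparison callback."""
    arr = (ctypes.c_int * len(values))(*values)
    libc.qsort(arr, len(arr), ctypes.sizeof(ctypes.c_int), CMPFUNC(compare))
    return list(arr)


if __name__ == "__main__":
    print(sort_with_libc([5, 1, 4, 2]))
    print(seen_stacks[0])
```

The captured stack lists only Python function names, which is the gap a hybrid view is meant to close: the native frames between sort_with_libc and compare exist, but Python-only tooling does not show them.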
Safe vs. Invasive Debugging Approaches: GDB can change process behavior because it can inject code into the running process, but PyStack aims to be a read-only tool. It freezes the process, inspects memory, and resumes execution without altering application state. This makes it safer for use in sensitive or regulated environments where GDB may be prohibited.
- Links / Tools:
  - GDB overview
  - Bloomberg’s Tech Blog (insights from the maintainers)
Working with Highly Optimized Python Binaries: Many production Python installations lack full debug symbols, making it harder for GDB alone to offer a clear Python stack. PyStack compensates by using heuristics and “forbidden magic” to read memory structures. It also automatically fetches debug info from distribution servers (on many Linux distros), providing inline function details and more clarity without complex user setup.
- Tools / References:
  - DebugInfoD servers
  - Linux distros’ debug symbol packages
PyStack and Other Profiler Tools (PySpy, Austin): While PyStack and profilers like PySpy or Austin share some underlying techniques, PyStack focuses on correctness over speed and works offline on core dumps. Profilers aim to capture many quick snapshots for performance analysis, but may not fully reconstruct complicated crashed states. PyStack’s purpose is a single, accurate snapshot, even from a dead process.
- Links / Tools:
  - PySpy
  - Austin
Local Variables and Deeper Inspection: PyStack can often retrieve local variable values from Python frames (e.g., seeing the contents of lists and dictionaries). It doesn’t call Python’s own __repr__. Instead, it manually interprets objects in memory. For built-in types like list or dict, it can show the real values, which is extremely helpful when diagnosing weird edge cases or parameter misuse.
- Links / Tools:
  - CPython internals docs: Dev Guide (devguide.python.org)
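
To give a feel for what "manually interpreting objects in memory" means, here is a same-process toy, not PyStack's actual implementation. It reads a list's header fields straight from memory with ctypes instead of asking the list for its length or items. It assumes a standard (non-free-threaded, non-debug) CPython build, where id() is the object's address and a list starts with a reference count, a type pointer, a size, and a pointer to the item array. The function names are hypothetical.

```python
import ctypes

PTR = ctypes.sizeof(ctypes.c_void_p)  # pointer size; Py_ssize_t has the same size


def read_list_header(lst):
    """Return (type address, length, item-array address) read from raw memory."""
    addr = id(lst)  # CPython implementation detail: id() is the memory address
    # Assumed layout: [refcount][type pointer][size][pointer to items]
    type_addr = ctypes.c_void_p.from_address(addr + PTR).value
    length = ctypes.c_ssize_t.from_address(addr + 2 * PTR).value
    items_addr = ctypes.c_void_p.from_address(addr + 3 * PTR).value
    return type_addr, length, items_addr


def read_list_item_addresses(lst):
    """Return the addresses of the list's elements without indexing the list."""
    _, length, items_addr = read_list_header(lst)
    if not length:
        return []  # an empty list may have a NULL item array
    return list((ctypes.c_void_p * length).from_address(items_addr))


if __name__ == "__main__":
    sample = ["alpha", "beta", "gamma"]
    type_addr, length, _ = read_list_header(sample)
    print("is a list:", type_addr == id(list), "length:", length)
    print("items match:", read_list_item_addresses(sample) == [id(x) for x in sample])
```

A tool working on another process or a core file has to do the equivalent reads from that process's memory image, and then decode each element's own layout in the same way, which varies between CPython versions.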
Bloomberg’s Python Ecosystem: Both guests highlighted that Bloomberg transitioned from a primarily C++ codebase to a massive amount of Python. Consequently, they created PyStack to handle the challenge of bridging Python with their custom native libraries at scale. This environment ensures that PyStack is tested in complex, real-world scenarios.
- Links / Tools:
  - Bloomberg Open Source Projects
Safe in Production, Yet Great for Testing: PyStack can attach to an unresponsive production process without injecting new code, making it a strong choice to keep downtime minimal. It’s also a valuable tool for QA or testing (e.g., via a plugin for pytest) to automatically capture debugging snapshots if a test run deadlocks or crashes.
- Links / Tools:
  - pytest plugin for PyStack (GitHub issues mention)
  - pytest
Future Enhancements (Subinterpreters, Async, and More): With Python 3.12 subinterpreters potentially on the horizon, PyStack may adapt to display each subinterpreter’s state. The maintainers are also exploring supporting asynchronous stacks, async/await flows, and further speed or reliability improvements so that PyStack remains relevant for modern Python challenges.

Interesting Quotes and Stories

On debugging in production: “It's a bit like being an emergency room doctor—nobody wants to visit one, but they're extremely glad it’s there when needed.”
On building PyStack: “We say it uses ‘forbidden magic.’ We poke into memory structures that CPython doesn’t officially expose, but that’s how we reconstruct the full picture when everything’s gone wrong.”

Key Definitions and Terms

Core dump: A snapshot of a program’s memory at the moment it crashes. Used for post-mortem debugging.
Frame: A call stack element representing a function call (in Python or native code).
GIL (Global Interpreter Lock): A mutex that prevents multiple native threads in Python from executing Python bytecodes simultaneously.
Inline Functions: A compiler optimization that replaces a function call with the function’s body to reduce overhead, potentially obscuring standard function frames.
Heuristics: Techniques PyStack uses to guess memory layouts in the absence of full debugging symbols.

Learning Resources

Here are some courses and materials you might find helpful.

Python Memory Management and Tips: Deepen your understanding of how memory and references work in Python.
Getting started with pytest: Explore structured testing in Python, an excellent complement to debugging with tools like PyStack.
Python for Absolute Beginners: If you are brand-new to Python, this can help you build a solid foundation before diving into advanced debugging.

Overall Takeaway

PyStack stands out as a powerful, safe, and production-ready tool to troubleshoot Python applications that interweave native code and Python logic. From capturing frozen states to analyzing complex crash scenarios with minimal risk, it fills a crucial debugging gap. Whether you’re scaling your apps, exploring concurrency bugs, or hunting elusive memory errors, PyStack offers a compelling blend of “just read memory” accuracy and user-friendly convenience.

Links from the show

Pablo Galindo Salgado: @pyblogsal
Matt Wozniski: github.com
pystack: github.com
Watch this episode on YouTube: youtube.com
Episode #419 deep-dive: talkpython.fm/419
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy

Episode Transcript

00:00 Here's the situation. You have a Python app that is locked up or even completely crashed,

00:05 and all you're left with is a core dump on the server. Now what? It's time for PyStack.

00:10 You can capture a view of your app as if you've set a breakpoint and even view the call stack and

00:15 locals across language calls. For example, from Python to C++ and back. We have the maintainers,

00:22 Pablo Galindo Salgado and Matt Wozniski, here to dive into PyStack. You'll definitely want to

00:27 have this tool in your toolbox. This is Talk Python To Me, episode 419, recorded Tuesday, June 6th, 2023.

00:36 Welcome to Talk Python To Me, a weekly podcast on Python. This is your host, Michael Kennedy. Follow

00:55 me on Mastodon, where I'm @mkennedy and follow the podcast using @talkpython, both on

01:01 fosstodon.org. Be careful with impersonating accounts on other instances. There are many.

01:06 Keep up with the show and listen to over seven years of past episodes at talkpython.fm.

01:11 We've started streaming most of our episodes live on YouTube. Subscribe to our YouTube channel over

01:16 at talkpython.fm/youtube to get notified about upcoming shows and be part of that episode.

01:23 This episode is brought to you by Sentry and their awesome error monitoring product. And

01:29 it's brought to you by Compiler from Red Hat. Listen to an episode of their podcast as they

01:33 demystify the tech industry over at talkpython.fm/compiler. Hey, Pablo. Hey, Matt. Great to have

01:40 you both here. I'm very excited to talk about some cool tools that give us a nice internal look inside of

01:49 our Python apps. Yeah, I'm really excited to talk about these tools as well. Yeah. It sounds like you've been

01:54 working on them for a couple of years, you two, and especially you, Pablo, and it's going to be great. So

01:59 that's looking at sort of debugging Python apps and even Python apps that have crashed, which is really,

02:06 really fantastic. Maybe some profiling as well. We'll see what we get to.

02:10 Crashing apps is not that fantastic, but debugging them maybe. Right.

02:15 Exactly. I'm so excited my app crashed because I get to use this PyStack tool that y'all worked on.

02:20 For a while, I actually was because that means that I could use it. But let's keep that secret.

02:26 Well, you need test cases, right? And you need examples.

02:28 Right.

02:29 About to say exactly that. It's a great way to see if we have any bugs. It's a great chance to try things out.

02:34 Meta debugging. If you're trying to debug the debugger.

02:37 Oh, we have learned that for sure. Yeah.

02:38 I bet you. How funny. Awesome. Well, let's just have a quick round of introductions from you guys

02:47 before we can jump into it.

02:48 Absolutely. Matt, do you want to go first?

02:50 Sure. I am Matt Wozniski. I'm a senior software engineer at Bloomberg. I originally joined in 2009,

02:58 I think. So I've been around for quite a while. I work on the Python infrastructure team at Bloomberg.

03:04 So our job is building tools and libraries and maintaining the interpreters for use by other teams at Bloomberg.

03:11 Right on. Sounds very fun. Pablo?

03:13 Hey, I'm Pablo Galindo. Apart from working at Bloomberg on exactly the same things as Matt, because he's my co-worker,

03:19 I do a bunch of things in the Python community. So let me see if I don't miss anything.

03:25 So I'm on the Python Steering Council. I think this is my third year. I'm also the release manager for 3.10 and 3.11.

03:31 3.10 is going to security fixes. And today I actually need to do the last one. So that's kind of exciting, I suppose.

03:37 You have a 3.10 release coming today?

03:39 Yes. With the 3.12 beta 1, I think we are also releasing 3.11 and 3.10. Like 3.11 is a bug fix. 3.10 is the first security release, which means it's source only.

03:48 And it's my first security release as well. But I think that makes it easier. Hopefully.

03:52 Yeah. And you were on the podcast before talking about the actual release of 3.11 and that whole process. So if people want to go back and see that in more detail, that was really fun.

04:02 We broke GitHub. That was an exciting time when we released 3.11. So yeah, I'm also on the Python core team, mainly working on the parser and the garbage collector. Therefore breaking the Black autoformatter a lot with new syntax.

04:18 Yeah. Excellent.

04:18 Faster CPython. You can't forget Faster CPython.

04:21 Right, right. Sorry. And I'm collaborating with the Faster CPython team as well. Yes.

04:24 Sorry. I just see. I always forget something.

04:27 As far as community stuff goes, I'd be remiss if I didn't mention that I am a moderator on the Python Discord too. So if anyone is not a member of Python Discord, they should join. It's a cool place to hang out.

04:37 Awesome. Yeah, we'll put a link to that in the show notes as well so people can check that out. So you all are busy is what I hear you saying.

04:44 Yes.

04:45 Yes.

04:46 Absolutely. All right. Excellent. Well, let's maybe set the stage for talking about PyStack here. And the first thing I want to talk about is just hold on.

04:56 I do want to point out, like you said, you both work on the infrastructure team at Bloomberg. I'm not sure people fully appreciate how much Python is happening at Bloomberg. So maybe we should just give like a big picture because it's germane to this conversation, like PyStack and other tools like Memray, which we may get a chance to talk about if we have time. Those are coming out of supporting this large community.

05:19 And I got a sense of how big Python is at Bloomberg when there were 60 engineers from Bloomberg at PyCon and everybody's booth duty was measured in minutes.

05:31 Right, right. And they didn't copy paste the same engineer 60 times.

05:36 Yeah, it's quite wild. I can start if you want to. It's very interesting, actually, that you point this out because Bloomberg, being a very old company, you know, when people talk about legacy and it's like, oh, we have legacy because we have this Python 2.7 script.

05:50 Well, this is just to see what a company started in the 70s can do about that. But it used to be a C++ house. And when I started, actually, the main language were C++ and JavaScript for the front end in Bloomberg terminal.

06:05 And now we are actually, I think, if I'm not mistaken, Matt can correct me, but I think we can say probably that now we are a Python house because I think we just surpassed for the first time the amount of lines of Python compared with C++.

06:18 And if that is not true, it's mostly there, which is quite exciting.

06:22 At the point where I joined the company, we had onboarding training for juniors that included a Fortran for C programmers section. And at this point, we have a C++ for Python programmer section instead. So that's how much it's changed over the last decade or so.

06:38 Wow, that's incredible.

06:39 So C++ is now legacy. Who said that? Certainly not me. The Steering Council member says that.

06:49 Okay, so yeah, but this is interesting, and this kind of goes into the topic itself. I think this is an interesting scenario because that C++ didn't go away.

06:57 Among other things is because it's not legacy, you know, like Python is not the fastest language on the planet, although we are trying to make it fast, but certainly not.

07:05 We are not trying to make it compete with C++. So that C++ is both there and needed because, you know, that you cannot just do everything in Python for sure.

07:13 So the situation that we have at Bloomberg, which is shared by many other big companies and finance companies as well, is that we have a lot of C++ under that Python.

07:20 So, well, C++ by itself, but let's talk about the actual part that is interesting for Python.

07:25 So you can write a lot of pure Python scripts, right? What most people at our company do is write some Python, and 99% of the time it ends up reaching some C++ underneath.

07:36 And this is not just because they are using NumPy or Pandas or any of these compiled very common scenarios as well, but also because they are using Bloomberg code underneath, which happens to be written in C++.

07:48 And this means that, you know, now we have a huge community of people that need to be aware of both languages at the same time.

07:55 So you have Python programmers that need to be aware of C++ as well as C++ programmers that need to be aware of Python, either because they just happen to switch to Python or because they use Python to run tests or things like that.

08:07 So it's a very interesting scenario.

08:08 Yeah. And I imagine, you know, leading up to this conversation, if something goes wrong, you might not be entirely sure.

08:14 Is it the Python code? Is it the C++ code? Is it the interaction of these things? Right? That can make it tricky.

08:22 What you point out is basically one of the things that made us work on this in the first place, because what normally happened is that when one of these hybrid applications crashed at Bloomberg, we used to have this, and many companies do this.

08:37 I mean, I always assume all of them, but what happened is that when an application crashes, we either get a core file, right?

08:44 Which is this kind of like file that is dumped by the kernel with all the memory dump of the process and a bunch of information and you can analyze it later.

08:52 Or a debugger basically attaches to the process and shows you the backtrace.

08:56 But the problem is that this debugger was GDB, which means that the backtrace that you get is a C backtrace.

09:02 It's not a Python backtrace, right?

09:04 And the backtrace is useless for everyone.

09:06 So it's like, it doesn't matter if you're a C++ programmer or a Python programmer, it's always useless.

09:10 It's useless for Python programmers because you see C functions.

09:14 And like a Python programmer, normally, it's like, what is this?

09:17 Like, what is that register, right?

09:18 But like, on the other hand, it's also useless for C++ programmers because the function that evaluates Python code, which is called _PyEval_EvalFrameDefault, is repeated 600 times.

09:28 And it doesn't make any sense.

09:30 Not only do they also expect to see their Python code, but now their C++ code is just buried among these mysterious calls.

09:37 So nobody could make sense of what's going on, which means that, for a long time, when a Python application crashed at Bloomberg, I mean, we're talking crash here as like hard crash, like segfault.

09:46 Right.

09:46 Which is common in C++, segfaults, or sometimes even Python can segfault, right?

09:51 In some situations.

09:52 So obviously a Python exception is reported as a Python exception.

09:55 We are not saying that our system invokes GDB if a Python exception is raised.

09:59 Obviously, that is fine.

10:00 It's when this kind of like hybrid setup crashes in very deeply, either because Python crashes or because NumPy crashes or because the Bloomberg code crashes.

10:09 So in those cases, nobody could make sense of what's going on because, you know, the backtrace is useless by itself.

10:15 And not just crashes, that it's also a problem for deadlocks as well.

10:20 One of the big differences between Python and C++, they've got very different models for how they represent the lifetimes of objects and things like that.

10:29 You can very easily get into a situation where like a C++ object that is owned by a Python object in our hybrid work environment is trying to communicate with a background thread to, I don't know, tell it to stop or something like that.

10:44 But that background thread winds up needing to pick up the GIL for some reason.

10:46 And if you have the foreground thread still holding the GIL, you've just introduced a deadlock.

10:51 And it's very difficult to track that down if the only tool at your disposal is GDB and you can't easily see what Python stuff is being called.
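
The shape of the deadlock described above can be sketched in pure Python, with an ordinary threading.Lock standing in for the GIL (it is only an analogy; the real case involves C++ destructors and the interpreter's own lock). The function name is hypothetical, and a timeout is used so that the demo terminates instead of hanging.

```python
import threading


def demo_shutdown_deadlock(timeout=0.2):
    """Foreground waits for a worker while holding the lock the worker needs."""
    shared = threading.Lock()  # stand-in for the GIL in this analogy
    finished = threading.Event()

    def background():
        with shared:  # the worker needs the lock before it can finish
            finished.set()

    with shared:  # the foreground thread holds the lock...
        worker = threading.Thread(target=background, daemon=True)
        worker.start()
        worker.join(timeout)  # ...and waits for the worker: neither can progress
        stuck = worker.is_alive()  # without the timeout this would wait forever

    worker.join()  # once the lock is released, the worker completes
    return stuck, finished.is_set()


if __name__ == "__main__":
    print(demo_shutdown_deadlock())
```

In the real scenario there is no timeout and no visible lock in the Python source, which is why a combined Python and native view of every thread is needed to see who is waiting on whom.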

11:00 Yeah, and Python, sorry, C++ multithreading is like critical sections and locks and like it's really explicit.

11:07 You enter a lock, you exit a lock and, you know, Python has that as well.

11:11 But in addition, it just has this implicit GIL that you're talking about, right?

11:16 And that there is a whole nother layer of potential deadlocks in there that I imagine is pretty tricky.

11:21 And some of them are not only difficult to spot, but also very difficult to reason about, just because most of these things are embedded.

11:27 So difficult, actually, that even if you go to some popular tools in the wild, they don't handle all the edge cases because it's quite hardcore.

11:35 Like the fact that you can just request the GIL in the middle of life.

11:39 For instance, we are not going, I don't know if we will end talking about memory profilers here, but most memory profilers that are measuring like native allocations in Python and C++, they need to see the Python stacks and how.

11:52 Like when an allocation is made, you say, oh, give me some memory.

11:54 You need to say, who called this function?

11:56 And then you need to see both the C sometimes.

11:59 So you only call Python, you just need to say, okay, what is the Python stack trace, right?

12:02 Like an exception, but like who called this function?

12:05 And for that, you need the GIL.

12:06 So you need to say, give me the GIL.

12:07 And then I will print the stack.

12:09 But like sometimes one of the situations that Matt is describing will introduce a deadlock.

12:13 It's very, very rare situations, but it's the kind of situations that will involve threads and like, you know, specific things.

12:20 But which means that it will happen very, very rarely, but it will happen or it can happen.

12:25 So it's going to be hard to debug because, you know, this is the kind of bug that is going to appear when, you know, the moon is high and it's a Wednesday, and you're going to say, well, only in these cases, it deadlocks.

12:38 And it's going to be very difficult to fix as well.

12:41 Because now you cannot just have the GIL.

12:42 So you need a complete redesign sometimes.

12:45 So it can be very challenging to fix, to find, to debug as well.

12:49 Those situations are so tricky.

12:51 You know, they often get the term Heisenbugs to indicate kind of the quantum mechanical uncertainty, right?

12:59 Like it could be in some state, but if you measure it, then it's in a different state, right?

13:03 It's kind of how do you actually track these down?

13:06 And they happen in a lot of times in production, but only under heavy load after like 12 hours.

13:11 So how do you use normal debugging techniques to step through that?

13:15 All these things are really, really tricky.

13:17 But it's also very interesting as well.

13:21 We can certainly discuss exactly what you just said, because it's just spot on.

13:24 And one of the reasons here is that, as you mentioned, the moment you touch a debugger, because the debugging process normally slows down the code, it means that reproducing the deadlock may be harder.

13:35 So sometimes the only way to properly debug this thing is letting it crash without the debugger, producing a core file, and then analyzing the core file.

13:43 But because the core file is dead, I mean, it's just basically a dump.

13:46 You cannot call functions now.

13:48 Then GDB can be useless if you don't know how to use like certain tools.

13:53 So PyStack actually is very useful here as well, because you can kind of reproduce the deadlock at full speed, generating a core or letting it crash, and then use PyStack.

14:02 But we can talk about that later.

14:04 Or maybe you don't know that you're trying to reproduce a deadlock.

14:06 Right.

14:07 And you're like, you know, it won't respond, but the CPU load on the server is zero.

14:12 That.

14:13 Yeah, that sort of, is it waiting for something, or is it ever going to respond, or you just don't know what it's doing yet?

14:21 Right.

14:21 Yeah.

14:21 Yeah, exactly.

14:22 And, you know, logging will help a little bit, but not, you've got to have some pretty intense logging to get that level of what's going on.

14:30 And usually, you don't want that in production either.

14:32 Yeah.

14:33 Yeah.

14:34 This portion of Talk Python To Me is brought to you by Sentry.

14:38 You know that Sentry captures the errors that would otherwise go unnoticed.

14:41 Of course, they have incredible support for basically any Python framework.

14:46 They have direct integrations with Flask, Django, FastAPI, and even things like AWS Lambda and Celery.

14:53 But did you know they also have native integrations with mobile app frameworks?

14:57 Whether you're building an Android or iOS app or both, you can gain complete visibility into your application's correctness both on the mobile side and server side.

15:08 We just completely rewrote Talk Python's mobile apps for taking our courses.

15:12 And we massively benefited from having Sentry integration right from the start.

15:17 We used Flutter for our native mobile framework.

15:20 And with Sentry, it was literally just two lines of code to start capturing errors as soon as they happen.

15:26 Of course, we don't love errors, but we do love making our users happy.

15:30 Solving problems as soon as possible with Sentry on the mobile Flutter code and the Python server side code together made understanding error reports a breeze.

15:40 So whether you're building Python server side apps or mobile apps or both, give Sentry a try to get a complete view of your app's correctness.

15:49 Thank you to Sentry for sponsoring the show and helping us ship more reliable mobile apps to all of you.

15:55 So let's start our conversation about just the spectrum of what the options are out there.

16:04 Now, you mentioned GDB, but I guess on one side of the spectrum, we have PyCharm and VS Code and friends where you press F5 or you press the little debug thing and you step through your code.

16:17 And that's helpful when you're actually developing your application.

16:20 But you're talking about running in production, or running applications, or applications that have crashed.

16:26 And here's a core file.

16:28 Those are not exactly the same situation, are they?

16:30 We are talking about the specific case.

16:32 We are centering ourselves on two main scenarios, although you can certainly use it in others, but just to have a good mental model of what these tools are going to be useful for and how they are different from PDB or the VS Code debugger, right?

16:45 We're talking about things that crash, crash as like hard crash, like, you know, or things that are frozen.

16:51 It could be a deadlock, but it could be also that your application is waiting for something and it never arrives and you want to know what is going on.

16:58 And specifically where these tools are going to become even more useful is when this code involves C++ or C code or Rust code, for that matter.

17:06 So it's native code, right?

17:08 And the reason this is going to be more useful is because PDB or the VS Code debugger or PyCharm normally attach Python debuggers, which means that these debuggers only know about the Python world.

17:18 And that normally is insufficient because you want to see both worlds at the same time.

17:22 And this is going to become very critical.

17:24 You don't want to see one world and the other separately.

17:27 You want some tool that understands both worlds at the same time and can show you what really is happening and how this works.

17:33 Like you're entering one world, leaving it, but then you're entering again or something like that.

17:37 Right.

17:37 So if you've got this mixed code, a Python profiler won't, they won't see the steps happening below that.

17:43 Right.

17:44 So for instance, at Bloomberg, a Python profiler will tell you that the application, let's say it's frozen and then you use VS Code, it's going to tell you that it's frozen on run, a function called run.

17:54 And then there's 600 layers of C++ underneath, which ones you don't know.

17:58 Or like, for instance, if you say, let's say for mysterious reasons, NumPy has a bug or something, right?

18:04 And then you're adding two NumPy arrays.

18:06 And it's telling you that the addition of the NumPy arrays is what is being frozen, but you don't know what is happening underneath.

18:13 And then, you know, you want to see what's happening in the NumPy C++ code or the SciPy code or the TensorFlow code or, you know, there are countless examples of heavy C++ or compiled code underneath.

18:26 Like for instance, Pydantic these days is running on Rust, which means that if something goes wrong there or it just crashes or is freezing or whatever, you're going to not see it either.

18:35 So we're talking about those cases.

18:38 We've seen deadlocks on the GC del of an object that's defined in an extension module.

18:45 And when that happens, all the Python debugger is going to tell you is absolutely nothing.

18:48 It doesn't see the call into the tp_del method.

18:51 It doesn't see, all it shows you is that there is a variable that's going out of scope or being reassigned or something like that on the last Python line that was run.

19:00 And anything happening under that is just totally opaque to it.

19:03 Right.

19:03 On the other hand, just talking about the other tools.

19:06 So on the other hand, things like GDB have the same problem that we're talking about.

19:10 GDB only understands C code.

19:11 So what it's going to show you is the NumPy code underneath or the C++ code underneath or the Rust code underneath.

19:19 But it's not going to show you the Python code.

19:21 So you don't know what called that.

19:23 So this is the other side of the coin.

19:25 You're going to see,

19:26 let's say,

19:27 the C action happening.

19:27 But how you reached the C, you don't know.

19:29 And the situation when this is really bad is when you kind of enter and exit the C realm multiple times.

19:35 Right.

19:36 So you have some Python code that calls some C code, and in turn it calls some Python code again and some C code again.

19:42 So that is kind of really hard because you're going to not see anything.

19:46 Like it's impossible because you don't know how you reach this situation.

19:49 And that's quite hard.

19:50 So like the classic debuggers won't be able to do so.

19:54 Interestingly, though, because we are going to see that there is some other tools that have some functionality close to what PyStack does.

20:01 So in Python, in CPython, we provide some like plugins, let's call it.

20:06 This is basically an extension files that you can kind of add to GDB that allows you to do some similar things.

20:13 So for instance, the Python files that you can put in GDB allow you to pretty print Python objects.

20:20 So even if you're in the C world, you can kind of print some objects.

20:23 And you can also print some kind of Python stack trace.

20:26 So you can ask, hey, can you show me where I am in Python?

20:30 But it's not a hybrid one.

20:31 So it will only tell you where you are in Python, which means that you're going to miss the C version.

20:36 Because in this GDB, you can also ask for the C version, but you're not going to see them together.

20:40 You're going to see either one or the other.

20:42 And then it's up to you how to kind of like merge them, which sometimes is very hard because there's very hard rules to know how to do that.

20:50 Especially, for instance, in Python 3.11, some of the optimizations we did in the Faster CPython project mean that the same evaluator loop can be reused for multiple Python functions.

21:01 So in 3.11, you are not even able to do it without extra metadata.

21:06 So it's quite hard.

21:07 And even if this, let's say, plugins are actually useful and can get you closer.

21:13 And for a lot of time, that was useful.

21:15 And it was the only way to do that, especially for core files, because if you have a core file, your only option is using GDB, or was, before PyStack.

21:23 You could somehow do it.

21:25 The problem is that GDB, to do this, relies on debugging information inside the core, which means that if you have Python interpreters that don't have the right information, which, by the way, is most Python interpreters that are shipped by distributions.

21:39 Or most Python interpreters that are used normally because you don't want to ship gigantic debugging information to production most of the time because this can be really big.

21:47 It means that GDB is not going to work because it relies on the fact that it can inspect local C variables inside the frames.

21:55 And as many, you know, C programmers will tell you, the most common thing that GDB will tell you is <optimized out>, which means that it cannot tell you anything here, which means this is not going to work.

22:06 There is ways to make it work, but, you know, we are entering the expert realm here.

22:10 Like, you know, making GDB kind of work here is the kind of thing where, you know, you need two staff engineers, or a CPython core dev that knows what is going on.

22:19 So this is certainly not for your average Python programmer, right?

22:22 Which is, let's remember, we are trying to debug something, which means we're probably under pressure and we don't want to read the, you know, the man page of GDB and Python and how to debug Python with GDB and like debugging, what is debugging information?

22:36 So, you know, if you're a Python programmer and just want to, you know, paste a stack trace in a bug report, for instance, because someone asked you to do that, or you just want to see what's going on, or you just want to tell someone where it's crashing.

22:47 You don't want to learn all of these things, right?

22:49 You don't even have time.

22:50 Maybe you need to fix it.

22:52 It's crashing on production, right?

22:53 So you cannot just tell your boss, oh, yes, wait here until I read the man page of GDB, right?

22:58 Like, it's not going to help.

22:59 Yeah.

22:59 I don't know how many people, I imagine a good number, but not all of the people listening have had an application or an API or something crashing in production when people are trying to get to it.

23:10 It's very stressful.

23:11 Right.

23:11 I should add that distributions do generally give you a way to get at the debug data for an interpreter.

23:19 It's not as though they strip it off entirely and it's gone forever.

23:22 It's just that the way that you get at it after the fact is different per distribution.

23:27 And it's not something that necessarily everyone who's firing up GDB knows how to do.

23:31 So it's not that it's gone forever.

23:33 It's just that it's not easily accessible to everyone, I'd say.

23:36 Sure.

23:36 Can you add it afterwards?

23:38 Like, if my app crashes and I'm like, oh, no, I didn't have the debug information.

23:43 You can, actually.

23:43 It'll either be, like on Ubuntu, it'll be a python3-dbg package that you install.

23:50 Or there's also a thing called debuginfod that can download the debug information from servers managed by your distribution as needed when a debugger requests it, which is pretty cool.

24:02 But the problem is that this still doesn't assure you that GDB is going to work.

24:05 It's just that it gives you more chances.

24:07 But still, if your Python interpreter is heavily optimized, this may be not enough.

24:12 And actually, for the sake of giving data, in most distributions, it's actually not enough.

24:17 Because these variables, particularly the frame variable in the Python evaluator loop, is extremely heavily optimized.

24:23 Among other things, because it has to be.

24:25 Because the Python evaluator loop is very hot.

24:27 So most of the time, these tools are, let's say, unreliable.

24:31 Let's introduce kind of like what PyStack does then.

24:33 And then let's talk about also like what other tools you can use that are not PyStack, maybe.

24:37 I think that's probably a good setup for PyStack.

24:40 Like, why does it exist?

24:42 Why do people really?

24:43 Why is it such a game changer, right?

24:45 It understands both of these worlds in a really nice way.

24:49 Tell us about it.

24:49 Exactly.

24:50 So the short version, because, and this is quite funny, because like when we were at PyCon,

24:54 we were presenting both projects that we maintain here, PyStack and Memray.

24:57 And Memray, everybody knew.

24:59 PyStack was the new one.

25:00 A lot of people didn't kind of catch exactly what it does.

25:03 And it's actually easier to explain than the profiler, which I think, you know, is quite funny that that was the other way.

25:09 So the PyStack, what it does is very simple.

25:11 So PyStack is a tool where you give it a Python program that is running or frozen, right?

25:17 But let's say it's alive.

25:18 Or a core file.

25:20 And it will tell you what it's doing.

25:22 So it will give you the stack trace, basically.

25:24 So it's going to tell you, okay, so this Python program has these many threads.

25:27 And for every thread, it's going to show you the stack trace, right?

25:30 So this function is calling this function, is calling this function, and it's running currently here.

25:34 Like a snapshot in time, right?

25:36 When you asked it, it's like, boom, what are all the threads doing?

25:39 Here you go.

25:39 Exactly.

25:40 So if the program is running, which you can absolutely run PyStack on a healthy running program,

25:44 it's going to tell you what it was doing at that time.

25:46 So by default, it's going to stop the program for a super small amount of time.

25:51 It's going to take a photo of what the program is doing, and it's going to tell you what every thread was doing.

25:55 So who calls who and what the program was actually running at that time.
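
As a rough in-process analogy for the "photo" described here, the standard library can already list what every thread is doing at one instant. This is only a sketch: PyStack works from outside the process by reading its memory, so this is not how PyStack is implemented, and the helper name `snapshot_threads` is made up for the illustration.

```python
import sys
import threading
import time
import traceback


def snapshot_threads():
    """Return {thread name: ['function (file:line)', ...]}, outermost call first."""
    names = {t.ident: t.name for t in threading.enumerate()}
    snapshot = {}
    # sys._current_frames() maps each thread id to that thread's current frame.
    for ident, frame in sys._current_frames().items():
        stack = traceback.extract_stack(frame)
        snapshot[names.get(ident, str(ident))] = [
            f"{entry.name} ({entry.filename}:{entry.lineno})" for entry in stack
        ]
    return snapshot


def worker(stop):
    # Stand-in for a thread that is stuck waiting on something.
    stop.wait()


if __name__ == "__main__":
    stop = threading.Event()
    thread = threading.Thread(target=worker, args=(stop,), name="worker-1")
    thread.start()
    time.sleep(0.1)  # give the worker time to reach stop.wait()
    for name, stack in snapshot_threads().items():
        print(name)
        for line in stack:
            print("   ", line)
    stop.set()
    thread.join()
```

Unlike PyStack, this only works while the interpreter is healthy enough to run more Python code, and it shows Python frames only.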

25:59 If the program is frozen because you have a lock or something like that, it's going to show you what is blocked, basically.

26:05 And if you have a core file because your program crashed or because you generate one on demand, because, by the way, you can absolutely generate one core on demand.

26:12 Just take a snapshot and it's there.

26:14 It will tell you what the program was doing when the core was generated.

26:17 That's something you can do in Linux just in the terminal.

26:20 You can just say, take a core dump of some running process.

26:24 Yeah.

26:24 For instance, you can do it with GDB or with a utility that shells out to GDB called gcore, which is installed by default when you install GDB.
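
The workflow just described (take a core on demand with gcore, then hand it to PyStack) can be sketched as a few command lines built from Python. The helper names are hypothetical, and the subcommands and flags used here (`pystack remote`, `pystack core`, `--native`, `--no-block`, and `gcore -o`) should be checked against `pystack --help` and `man gcore` on your system before relying on them.

```python
import shlex


def gcore_command(pid, prefix="core"):
    # gcore ships with GDB; -o sets the output prefix, giving "<prefix>.<pid>".
    return ["gcore", "-o", prefix, str(pid)]


def pystack_core_command(core_path, executable=None, native=True):
    # Analyze a core file; the executable that produced it may be given too.
    cmd = ["pystack", "core", core_path]
    if executable:
        cmd.append(executable)
    if native:
        cmd.append("--native")  # merge C frames into the Python stack
    return cmd


def pystack_remote_command(pid, native=True, block=True):
    # Analyze a live process; --no-block asks PyStack not to pause it.
    cmd = ["pystack", "remote", str(pid)]
    if native:
        cmd.append("--native")
    if not block:
        cmd.append("--no-block")
    return cmd


if __name__ == "__main__":
    pid = 12345  # placeholder process id
    for cmd in (
        gcore_command(pid),
        pystack_core_command(f"core.{pid}"),
        pystack_remote_command(pid, block=False),
    ):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`; the sketch only prints the commands so that nothing is attached to a real process by accident.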

26:31 And the cherry on the top here is that it will tell you both the Python code and the C code.

26:36 So it will tell you like, okay, so we are calling these Python functions, you know, main and main calls, you know, create dictionary and create dictionary calls add numpy array.

26:45 And then when it enters C code, the C realm, it's going to show you also the C code.

26:50 And then if it enters Python again, it's going to show you Python again and C again.

26:53 And it's going to tell you if it's Python or C.

26:55 And also it's going to show you code that is running.

26:58 So if the source code is available, which most time it is, it's going to show you like exactly what line, like the same thing as a traceback.

27:05 Basically, it's going to show you like what line was running.

27:08 And very cool as well, since Python 3.11, it's going to show you what sub-expression is running.

27:13 Because in Python 3.11, we have this better error project that I started.

27:18 And now we have line, sorry, column information.

27:21 So we know you have a very complicated expression and something crashes.

27:25 We can point you exactly to what part of the expression was generating the crash.

27:29 But now we can use the same information in this tool to show you what part of the expression was running.

27:34 So for instance, you were adding four NumPy arrays and the application crashes adding the second and the third one.

27:40 We can show you, okay, it's crashing adding the second and the third one.

27:43 So you can know exactly that was that operation and not the other one, which is quite cool.
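
The column information mentioned here is exposed on code objects through `co_positions()` from Python 3.11 on. The following is a minimal sketch of how a failing sub-expression can be located with it; the function name `failing_subexpression` is an assumption of this example, and it says nothing about how PyStack itself reads the same data out of another process.

```python
def failing_subexpression(source, namespace):
    """Run `source`; if it raises, return the text of the sub-expression that failed.

    Returns None if nothing fails or if no column data exists (before 3.11).
    """
    code = compile(source, "<demo>", "exec")
    try:
        exec(code, namespace)
    except Exception as exc:
        tb = exc.__traceback__
        while tb.tb_next is not None:  # walk to the innermost frame
            tb = tb.tb_next
        frame_code = tb.tb_frame.f_code
        if not hasattr(frame_code, "co_positions"):
            return None
        # One (line, end_line, col, end_col) tuple per 2-byte code unit.
        positions = list(frame_code.co_positions())
        line, end_line, col, end_col = positions[tb.tb_lasti // 2]
        if None in (line, end_line, col, end_col) or line != end_line:
            return None
        text = source.splitlines()[line - 1]
        # Column offsets are UTF-8 byte offsets into the source line.
        return text.encode("utf-8")[col:end_col].decode("utf-8")
    return None


if __name__ == "__main__":
    values = {"a": 1, "b": 2, "c": 3, "d": 0}
    print(failing_subexpression("result = a + b + c / d", values))
```

When column data is available, the printed text is the part of the expression whose instruction raised, rather than the whole line.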

27:48 And the same for C.

27:49 It sounds awesome.

27:49 I also like the description here.

27:51 PyStack is a tool that uses forbidden magic to let you inspect the stack frames of a process.

27:57 That's a funny trivia.

27:58 I got a funny conversation with Mark Shannon.

28:01 We work together on the Faster CPython project, because we say here, nasty CPython internals.

28:07 Which mostly, you know, like we went into what is a nasty CPython internal because we both do those nasty CPython internals.

28:14 But, you know, I think nobody really enjoys CPython internals, not even core devs.

28:19 So there you go.

28:20 Well, and I imagine that you're making your life harder with 3.11, all of you.

28:23 This is kind of weird because I'm making my own job harder every single time.

28:26 So I work on Python, I'm happy, and then I'm sad because I just make my own life harder on the other side of the pool, right?

28:33 This is a project that would be much, much harder to maintain if Pablo wasn't around to help on it.

28:37 Because, yeah, in order to keep this forbidden magic working, we do need to keep up with changes to the interpreter.

28:43 And it's a place where if we didn't have core devs telling us what changed in the interpreter, it would be very hard to keep up with those changes and figure out just what has changed.

28:51 Especially since the Faster CPython stuff has kicked into gear.

28:55 One thing here, which is also quite interesting, is that we are not the only people to enjoy this forbidden magic.

29:01 We have a coven, let's say.

29:02 This forbidden magic is shared in one way or the other with performance profilers.

29:07 So, for instance, some of the ones that use similar techniques are Austin, the Austin profiler, and also PySpy.

29:14 And it's quite interesting because there is a difference.

29:16 Both tools can actually do something similar.

29:19 So, both PySpy and Austin kind of take a snapshot and show you what the application is doing.

29:24 At the time of this podcast, they can do it for a live process.

29:28 They cannot do it for a core file.

29:29 So, if you have a core file, you are out of luck.

29:31 You can only use PyStack.

29:32 If you have a live process, you can use PyStack, but you can also use PySpy or Austin.

29:37 Both can do that.

29:38 The main difference here, even if we share functionality, is that we are not a profiler.

29:43 We are a debugger, which means that we try really, really hard to find that information, even in the most weird situation.

29:52 So, for instance, even if you have corrupted memory or your core file is corrupted or your process is really, really in a bad state, we can still give you the information.

30:03 Even if you don't have debug symbols. And we are slower than both profilers, because both PySpy and Austin need to basically take photos at a very high speed, because that's what a profiler does.

30:14 It basically takes a lot of photos very fast.

30:16 And then it's going to show you, okay, I took like one million photos in a second.

30:20 And most of the time, the photos show that you were in this function called very slow function.

30:25 So, that is going to tell you, well, you're spending most of the time in this function.

30:28 You should optimize that function.
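
The "many photos, very fast" idea can be illustrated with a toy sampling loop. This is a sketch only: real sampling profilers such as the ones named above read the target's memory from another process, whereas this version samples a thread from inside the same interpreter, and the names `sample_thread` and `very_slow_function` are invented for the example.

```python
import collections
import sys
import threading
import time


def very_slow_function(duration):
    # Busy-loop so the thread is always "inside" this function.
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        sum(range(1000))


def sample_thread(thread, interval=0.001):
    """Count which function is on top of `thread`'s stack at each sample."""
    counts = collections.Counter()
    while thread.is_alive():
        frame = sys._current_frames().get(thread.ident)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)
    return counts


if __name__ == "__main__":
    worker = threading.Thread(target=very_slow_function, args=(0.3,))
    worker.start()
    counts = sample_thread(worker)
    worker.join()
    # The function with the most samples is where the time is going.
    print(counts.most_common(3))
```

A debugger-style tool only needs one such sample, so it can afford to spend far more effort making that single sample correct.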

30:29 So, for them, it's really, really important to take those photos really fast, right?

30:33 Sometimes even sacrificing correctness in some cases.

30:36 Yeah.

30:37 The sampling profilers especially, yeah.

30:39 Right.

30:39 They have some options to control the correctness because they need to kind of sometimes guess.

30:43 And sometimes these photos, they take with the process running.

30:47 So, you know, you can be like half of the photo in one place and half of the other in the other.

30:51 And both have like options to control if you want that or not.

30:54 It's like frame tearing in video games.

30:56 Yeah, pretty much.

30:58 I always think about that particular metaphor when I explain this.

31:02 But most people don't know what that is.

31:04 So, I'm very happy to know that you're a connoisseur as well.

31:07 The main difference is that for them, they do a very good work, just to be clear here.

31:12 But for them, their main concern is speed.

31:14 And the whole thing is surrounded by this idea of doing this operation very fast.

31:19 And because they do it very fast, they can do it once.

31:21 So, you just ask for one photo, they can give you that photo.

31:24 In our case, our concern is not speed because we are not a profiler.

31:28 Our concern is correctness and getting the photo.

31:30 So, we try really, really, really hard to get the correct photo, and to get the photo at all, if it's possible.

31:35 So, that's kind of the main difference.

31:37 This portion of Talk Python is sponsored by the Compiler podcast from Red Hat.

31:43 Just like you, I'm a big fan of podcasts.

31:45 And I'm happy to share a new one from a highly respected open source company.

31:49 Compiler, an original podcast from Red Hat.

31:52 Do you want to stay on top of tech without dedicating tons of time to it?

31:56 Compiler presents perspectives, topics, and insights from the tech industry free from jargon and judgment.

32:01 They want to discover where technology is headed beyond the headlines and create a place for new IT professionals to learn, grow, and thrive.

32:08 Compiler helps people break through the barriers and challenges turning code into community at all levels of the enterprise.

32:14 One recent and interesting episode is their The Great Stack Debate.

32:19 I love, love, love talking to people about how they architect their code, the trade-offs and conventions they chose, and the costs, challenges, and smiles that result.

32:28 This great stack debate episode is like that.

32:30 Check it out and see if software is more like an onion or more like lasagna or maybe even more complicated than that.

32:36 It's the first episode in Compiler's series on software stacks.

32:40 Learn more about Compiler at talkpython.fm/compiler.

32:44 The link is in your podcast player show notes.

32:47 And yes, you could just go search for Compiler and subscribe to it, but follow that link and click on your player's icon to add it.

32:55 That way they know you came from us.

32:56 Our thanks to the Compiler podcast for keeping this podcast going strong.

33:03 This means that if you are already using, a lot of people are using PySpy, for instance, for this kind of, my Python application is frozen.

33:11 So in that case, you don't care if the photo is fast or slow because it's already frozen.

33:15 Like, who cares, right?

33:16 But on the other hand, if you have a crashing application, especially a core file, then you're kind of out of luck, because these projects don't work for core files at the time that we're speaking.

33:24 Yeah.

33:24 PyStack really seems to have a unique feature set, a special place in the ecosystem.

33:29 And most of the extra features that PyStack has, which are not the main functionality, are basically around this idea that we are a debugger.

33:38 So for instance, we can give you extra metadata.

33:40 Like, you know, some of this metadata actually also is shared with these tools, but some other is not.

33:45 So we give you things like which thread has the GIL at the time or like if the GC is running on the thread or not.

33:50 We also tell you like the column offsets and things like that.

33:54 So there is a lot of like extra stuff that we can provide to you so you can debug your applications more easily.

34:02 For instance, for the C code, if it's available, we can give you the column offsets as well of the source code that generated the binaries.

34:11 You need a modern compiler to do that.

34:14 Like only DWARF 4, I think, has this information.

34:17 DWARF is the debugging format for C, which is kind of funny because the binary format is ELF.

34:22 So, you know, it's ELF and DWARF. ELF stands for Executable and Linkable Format.

34:26 But DWARF doesn't have an acronym.

34:30 It's just funny.

34:31 They came up with this weird backronym, I think they call it, when you come up with the acronym after the fact.

34:36 So you just say, oh, DWARF is very funny.

34:38 Let's try to put it.

34:39 And I think now it stands for debugging with arbitrary format or something.

34:45 It's just really bad.

34:46 Try to fit the acronym into the thing.

34:50 Then we call it the Shrek service, and we'll see where it goes from there.

34:54 Exactly.

34:55 Yeah, literally.

34:55 So the idea is that we give you this extra kind of metadata around it.

34:59 Every time we do this, we try to do more.

35:02 For instance, we are now talking about with 3.12, we are going to release in Python subinterpreters.

35:07 So we are discussing the possibility of subinterpreters also in PyStack or maybe asyncio tasks.

35:13 I mean, these are not actual features that we are running right now, but the idea is that we are considering these things.

35:17 And that for a profiler, maybe it's just too hard because it means that you need to inspect a lot more memory.

35:22 And your photo is going to be prohibitively slow.

35:26 For us, it's not because we just take one photo and it just needs to be a very good one.

35:30 Yeah, excellent.

35:30 Yeah, see, you are making your life harder over and over.

35:33 Quick question from the audience in the live stream.

35:36 And Tony says, could this be utilized in something like AWS Lambda as error handling?

35:41 Grab the core dump if it bombs, since you wouldn't have access to the runtime after the Lambda executed.

35:46 I wouldn't.

35:47 It's not as a compiler engineer, I would say I don't know anything about AWS.

35:53 So I don't know if it's the best way to do it.

35:56 But yeah, absolutely.

35:57 This is something that you can do.

35:58 If you generate a core file, then PyStack can absolutely handle your core file.

36:01 No problem.

36:02 I think the only question I have there is if you can get the core file out after it is crashed.

36:06 But as long as there's some way to get the core file out, you definitely could inspect it with PyStack.

36:10 Oh, interesting.

36:11 Is there a Python code level API for working with PyStack or is it an outside only thing?

36:17 Not at the time.

36:18 It's just a command line application.

36:19 Gotcha.

36:20 Okay.

36:20 We could expose it.

36:21 Like PyStack is only a library with a lot of functionality.

36:25 So if there's people that want to use it for other things, we are quite happy to expose it.

36:29 Yeah.

36:30 I'm thinking things like for cProfile, you can turn off the profiling at startup and then turn

36:35 it back on with Python code or where you could set like an at exit callback potentially to like

36:42 those kinds of things.

36:43 We don't have this, but we do have some other cool thing that let's say it intersects 20%

36:48 with what you said.

36:49 Just to be clear, I'm not trying to answer your question fully, but I think it's right.

36:53 We have this pytest plugin, which basically you can install.

36:56 And if one of your tests crashes, or freezes, it's going to just run PyStack.

37:01 It's going to run PyStack on that and it's going to show you what happened.

37:04 We are talking also to have something similar to faulthandler, which is a standard library

37:09 module that you can activate.

37:10 And if your process crashes, it shows you the Python stack.

37:13 But again, you're missing the C stack.
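
For reference, this is roughly what the standard library `faulthandler` module mentioned here gives you today. A minimal sketch: the two helper names are made up for the example, while `faulthandler.enable` and `faulthandler.dump_traceback` are the real standard library calls.

```python
import faulthandler
import sys
import tempfile


def enable_crash_tracebacks():
    # On fatal signals such as SIGSEGV or SIGABRT, write the Python
    # traceback of every thread to stderr before the process dies.
    faulthandler.enable(file=sys.stderr, all_threads=True)


def dump_now():
    """Return the current Python-level traceback of all threads as text."""
    # faulthandler writes straight to a file descriptor, so a real file is needed.
    with tempfile.TemporaryFile(mode="w+") as out:
        faulthandler.dump_traceback(file=out, all_threads=True)
        out.seek(0)
        return out.read()


if __name__ == "__main__":
    enable_crash_tracebacks()
    print(dump_now())
```

The output lists Python frames only, which is the gap described above: frames inside C or C++ extension code do not appear.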

37:14 So we are going to allow you to also have this idea of like, oh, I want to just run my

37:19 Python application.

37:20 If it crashes, then I want PyStack to be run on the process so I can see what was happening

37:25 there.

37:26 By the way, this literally is something that was used.

37:29 For instance, urllib3, the project urllib3, I think is the most downloaded package on PyPI

37:35 or is close to being.

37:36 So they use Memray, which is our memory profiler.

37:39 And sometimes, you know, again, when the moon was full on Wednesdays, it was crashing on some

37:45 weird test.

37:46 And, you know, at the time we asked them, well, you know, Memray uses C++ code underneath.

37:52 No surprise there.

37:53 And you're using Python code that uses Memray.

37:55 So the crash is happening in some combination of both.

37:58 And we needed both stacks to debug what was going on.

38:01 And I thought, man, if only we had PyStack open source.

38:05 We could just tell them, like, run this thing on your test suite.

38:08 But at the time we didn't have it.

38:09 So we had to, like, ask them for a core file.

38:12 They tried.

38:13 But, like, so in the end we ended up having to try to reproduce it on our side, which was

38:17 really hard because this was a race condition, basically.

38:20 And the race condition.

38:21 So I did all the weird techniques, like running a Docker container with 0.001 CPU quota, like,

38:27 running hundreds of test suites at the same time.

38:30 It was not a fun afternoon, let's say.

38:32 If I recall correctly, that took us, like, a full 24 hours to reproduce, just running the

38:36 tests in a loop until we managed to catch it the first time.

38:39 Wow.

38:40 If you run into my room at the time, you see this meme when there's this guy with the

38:44 blackboard with a lot of threads, like, moving their hands, like, super crazy.

38:48 So that was me at the time, running, like, six tmux splits with the test suite, right?

38:54 So, yeah, this tool is, when you need it, it's really useful.

38:58 It's the kind of thing that a lot of times you don't need.

39:00 That's interesting.

39:01 And then when you do need it, amazing.

39:03 And this is key because what happens normally with debugging tools, like, GDB is a very good

39:07 example of this, is that they are very hard to use.

39:09 Like, the kind of, like, amount of knowledge that you need is quite high.

39:13 They are not very ergonomic, which means that, you know, it's not the easiest thing.

39:17 You need to get used to them and their language and what they can do and what they cannot do.

39:21 And, like, you know, it's the kind of thing that normally people learn when they need to,

39:25 which is the worst time to learn it because you need to solve the problem, not learn how to

39:28 use the tool.

39:29 You're already in a bad mental state.

39:30 It's very annoying.

39:31 What we are trying to do quite a lot, both on PyStack but also on our

39:37 memory profiler Memray, is to offer a really good UX around these tools.

39:41 So that's why we are offering this pytest plugin and thinking about doing this faulthandler thing because

39:47 it's not just, like, the tool itself that you can execute, but also, like, we want you to

39:51 not have to think about it.

39:52 So you don't, it's not the tool to reach, it's the tool that is backing you up.

39:56 So you set it once, you forget about the fact that it exists.

39:59 When something happens, you are really happy that you set that thing up.

40:03 And that's the experience that we want people to have.

40:05 This is the other kind of extra thing that we are trying to put into PyStack, that the UX

40:11 is really good.

40:12 So, you know, like, Matt, I think you want to...

40:14 And as far as the UX goes, I think it's helpful to keep in mind that when people are using these

40:18 tools, it's almost certainly not because they want to.

40:20 Like, no one is having a good time when they're using these tools.

40:23 They're using these tools because something has already gone wrong and stopped them from doing

40:26 what they wanted to be doing in the first place.

40:28 And now they need to backtrack and figure out why.

40:31 You're kind of like an emergency room doctor.

40:32 People don't ever want to meet the emergency room doctor, but they're happy they're there.

40:36 Exactly.

40:36 Yeah, that's the key.

40:38 Yeah.

40:38 Matt, I noticed looking at the GitHub repo for PyStack that there's a lot of languages involved

40:44 here.

40:44 We got a good chunk of Python, C++, Cython, C, only a little bit of C, I guess.

40:50 But you both have to keep a lot of technology interplay in mind just working on this, right?

40:55 Yeah, definitely.

40:56 My career has been as a C++ developer mostly.

40:59 So it tends to surprise people when we tell them that on the Python infrastructure team,

41:04 we spend most of our time working on C++ and relatively little time writing Python, comparatively

41:10 little.

41:10 If you actually looked at the way this code breaks down in the PyStack repo, you would see that

41:15 even though it's predominantly Python code, the Python code is predominantly in the test suite.

41:19 Most of the actual code for PyStack is in C++ or in Cython, not in Python.

41:26 I was going to guess that Python might be in there like reporting, CLI parsing.

41:31 Later is what it is, yes.

41:33 You're exactly right.

41:34 And then there are some parts where really C++ would be overkill or too verbose or too annoying.

41:40 Like I think parsing some stuff, I think is like preparing the input to the C++ code, let's

41:46 say.

41:46 Yeah.

41:47 Parsing proc maps and things like that.
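
As an illustration of why this kind of input preparation is comfortable to write in Python, here is a small sketch that parses text in the Linux `/proc/<pid>/maps` format. It is not PyStack's parser; the `Mapping` record, the `parse_maps` name, and the sample addresses are assumptions made for the example.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps(text):
    """Parse the text of /proc/<pid>/maps into Mapping records."""
    mappings = []
    for line in text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(start, end, fields[1], int(fields[2], 16), path))
    return mappings


# Made-up sample in the same format the kernel produces.
SAMPLE = (
    "55d0c8a00000-55d0c8a01000 r--p 00000000 08:01 1311 /usr/bin/python3.11\n"
    "7f3a1c000000-7f3a1c021000 rw-p 00000000 00:00 0\n"
    "7ffc0b1e0000-7ffc0b201000 rw-p 00000000 00:00 0 [stack]\n"
)

if __name__ == "__main__":
    for mapping in parse_maps(SAMPLE):
        print(hex(mapping.start), hex(mapping.end), mapping.perms, mapping.path)
```

On a Linux machine the same function can be fed `open(f"/proc/{pid}/maps").read()` for a process you are allowed to inspect.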

41:49 But that's the other thing.

41:49 Like one of the reasons there is so much C++ actually is not only because of performance,

41:53 but because these tools need to play quite heavily on systems programming techniques.

41:59 PyStack plays a lot of, let's say, quote unquote, dark magic.

42:04 It's not as dark as our other tool, the profiler, because the profiler is just in another level

42:09 of darkness that has like gone through many dark rituals already.

42:13 And here as well, because like at the end of the day, what these tools do, what PyStack

42:17 does is that it's reading memory from a different program.

42:20 That is quite complicated because when you read memory from a different program, there is nothing.

42:24 It's just bytes.

42:25 Here is some bytes.

42:26 And then you need to figure out what they are.

42:28 And most of the time, the bytes that you're reading are not backed by anything that you

42:33 can use to make sense of.

42:34 Because, you know, for instance, GDB, when it reads those bytes, it has the debug information.

42:39 So it knows that, oh, I'm reading bytes at this address.

42:42 But these bytes mean, like, oh, a PyInterpreterState struct.

42:46 So I know that, you know, the first eight bytes are this and that.

42:48 And I know where to locate things.

42:50 We do that if they're available because we don't want to make our lives harder just for

42:55 no reason.

42:55 But because we have to support the cases when that information is not there, we employ these

42:59 extra techniques trying to make sense of those bytes without knowing what they are.

43:04 So there is like a bunch of heuristics and checks on the heuristics.

43:07 And this can get like quite hardcore because like I think at some point we had like four

43:11 levels of checks just because of one heuristic.

43:14 You know, the kind of thing when you say, well, there is no way in the world if these things

43:19 are true is not what I'm searching for.

43:21 Well, I will tell you, yes, it will happen.

43:24 We have seen those when like I remember that we were having this discussion.

43:27 We have four checks for something.

43:28 Basically, we are reading some bytes and we are making some pointers.

43:32 And if a bunch of conditions are true, we are sure that we have located some important

43:37 piece of information.

43:38 The interpreter state, I think, or the thread state, whatever it is.

43:40 And Matt was saying, well, but you know, there is this case when these things can be true

43:44 and it's still not it because, you know, it just happens to have these properties.

43:48 And I said, okay, let me calculate mathematically the probability of that happening.

43:52 And it was 0.001%, right?

43:55 And then we said, cool, never happening.

43:56 I think it was three months until it happened.

43:58 So, you know, like, yes, we need to take care of a lot of these things to ensure that.
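
The idea of stacking several independent checks on raw bytes can be sketched like this. Everything about the layout is hypothetical: the three-pointer struct, the field names, and the specific checks are invented for illustration and do not describe any real CPython structure or PyStack's actual heuristics.

```python
import struct

# Hypothetical layout, for illustration only: three 8-byte little-endian
# pointers (next, owner, frame). Real CPython structs differ per version.
CANDIDATE = struct.Struct("<QQQ")


def plausible_pointer(value, mapped_ranges, allow_null):
    if value == 0:
        return allow_null
    if value % 8 != 0:  # struct pointers are normally 8-byte aligned
        return False
    # The address must fall inside memory the target actually has mapped.
    return any(start <= value < end for start, end in mapped_ranges)


def looks_like_candidate(raw, mapped_ranges):
    """Return True only if every independent sanity check passes."""
    if len(raw) < CANDIDATE.size:
        return False
    next_ptr, owner_ptr, frame_ptr = CANDIDATE.unpack_from(raw)
    return (
        plausible_pointer(next_ptr, mapped_ranges, allow_null=True)
        and plausible_pointer(owner_ptr, mapped_ranges, allow_null=False)
        and plausible_pointer(frame_ptr, mapped_ranges, allow_null=True)
    )


if __name__ == "__main__":
    ranges = [(0x1000, 0x2000)]
    print(looks_like_candidate(CANDIDATE.pack(0, 0x1008, 0x1010), ranges))
    print(looks_like_candidate(CANDIDATE.pack(0, 0x1003, 0x1010), ranges))
```

Each extra check lowers the chance that random bytes pass by accident, but as the anecdote shows, it never reaches zero.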

44:04 A lot of edge cases.

44:05 I'm starting to understand the black magic.

44:07 Right, right.

44:08 That's part of it.

44:09 The other part is just like all the shenanigans with like, you know, stopping the process,

44:13 making sense of CPython, trying to extract information from CPython in ways that CPython is not prepared

44:19 to.

44:20 So you need to know a lot about everything, a lot about systems programming, how to read

44:24 memory from processes, how to stop processes.

44:26 And also these tools, this is quite important to mention as well.

44:29 These tools are unlike GDB, because when you attach GDB to a process, GDB can do whatever

44:34 you want.

44:34 It can even inject code into the process.

44:36 It can call things in the process.

44:38 So many teams have GDB forbidden in production because GDB, attaching GDB can do arbitrary things,

44:44 right?

44:44 You don't want that a lot of the time, especially if you are under compliance or you have like

44:49 secrets.

44:49 Right, right.

44:50 If you're in a banking industry, say, or something like that, and you try to catch a problem and

44:55 it changes, now it makes decisions.

44:57 That might not be awesome.

44:59 And GDB can make your application crash.

45:00 Like you can absolutely do that because it can inject code.

45:03 Like you see, like I invite everyone interested on, like try to learn how GDB calls functions

45:08 in your process.

45:10 So you attach GDB, you can call a function in the process.

45:13 Just learn how that is done and you're going to cry.

45:15 And if you want to cry even more because you say, well, man, I still have some tears in my

45:19 eyes and it's not enough.

45:21 Just learn how LLDB does it.

45:23 Like this is the other debugger, like from LLVM, because that is just bananas.

45:27 That is just another level of craziness.

45:29 So these tools are very powerful, but they're also a bit dangerous.

45:32 So the other thing that we really, really put a lot of effort in is that these tools just

45:36 read memory.

45:37 They don't modify the process at all.

45:39 The only thing they do is stop it, which is always a safe operation.

45:42 Then they restart the process.

45:44 That's all.

45:44 And you can also choose not to stop it if you don't want to.

45:47 Because for instance, you have some super performance applications, so PyStack can still take snapshots

45:52 with the process running if you really need to.

45:54 Sacrificing, obviously, that the photo may be a bit blurry, let's say.

45:58 Most of the time it won't be, but it can be.

46:00 But you can also ask for that.

46:01 But the idea is that these are safe to use on running processes because we don't touch

46:04 the memory at all.

46:05 We just read it.
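As a rough sketch of the options described here, the helper below only assembles a `pystack remote` command line; it does not run it. The helper name `build_pystack_command` is hypothetical. The flags `--no-block`, `--native`, and `--locals` are taken from PyStack's command-line help as best understood, so check `pystack remote --help` on your installed version before relying on them.

```python
import shlex
from typing import List


def build_pystack_command(
    pid: int,
    *,
    show_locals: bool = False,
    native: bool = False,
    block: bool = True,
) -> List[str]:
    """Assemble (but do not run) a `pystack remote` invocation."""
    command = ["pystack", "remote"]
    if not block:
        # Leave the process running while reading; the snapshot may be "blurry".
        command.append("--no-block")
    if native:
        # Include native (C/C++/Rust) frames in the report.
        command.append("--native")
    if show_locals:
        # Include function arguments and local variables.
        command.append("--locals")
    command.append(str(pid))
    return command


if __name__ == "__main__":
    # 1234 is a placeholder process id.
    print(shlex.join(build_pystack_command(1234, show_locals=True, block=False)))
```

Passing the resulting list to `subprocess.run` would execute it; building the list separately makes it easy to log exactly what was run against a production process.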

46:06 I will say, you say that stopping a process is always safe.

46:09 That's not necessarily true.

46:11 It does change the behavior of syscalls that are in the middle of happening.

46:14 They get an EINTR.

46:16 They'll get an EINTR.

46:17 And that can change the behavior of the program.

46:19 What's supposed to happen is that the program detects that the syscall has been interrupted

46:23 and retries it.

46:24 But not all of them do that all the time.

46:26 Because it's C and your error handling is all manual.

46:29 It's very easy to miss a place where you needed to retry something.

46:32 CPython does it.

46:33 So unless you have your own custom code, most of the time it's safe to do.

46:37 That's true.

46:37 But this can happen also if you send signals to your process.

46:40 So for instance, you have any process and then it just happens to send a signal or someone

46:45 interrupts the process.

46:46 Like for instance, because you have it under a scheduler or you're using some very old

46:50 kernel, for instance, that sends SIGSTOPs.

46:52 Or you're running it in a cluster where it can put you in the freezer cgroup.

46:57 You will get the same situation.

46:59 And you will see, for instance, in CPython all the time, this loop that just checks if

47:02 for instance, you're reading bytes.

47:04 And then your read call finishes.

47:06 And then you normally assume, well, it has finished because I have read all the bytes that

47:10 I wanted.

47:10 Well, it may be not true because you may have been interrupted.

47:14 So you need to try again.
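The retry loop being described can be sketched in Python as follows. This is an illustration, not CPython's actual C code: `read_exactly` is a hypothetical helper name. Note that since PEP 475 (Python 3.5) `os.read` already retries on EINTR by itself in most cases, so the `InterruptedError` branch is mainly there to show the idea; the short-read loop is the part that still matters.

```python
import os


def read_exactly(fd: int, size: int) -> bytes:
    """Read exactly `size` bytes from `fd`, looping over short reads."""
    chunks = []
    remaining = size
    while remaining > 0:
        try:
            chunk = os.read(fd, remaining)
        except InterruptedError:
            # EINTR: the call was interrupted before it read anything; retry.
            continue
        if not chunk:
            raise EOFError(f"stream ended with {remaining} bytes still missing")
        # A successful read may still return fewer bytes than requested.
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    reader, writer = os.pipe()
    os.write(writer, b"hello ")
    os.write(writer, b"world")
    os.close(writer)
    print(read_exactly(reader, 11))
    os.close(reader)
```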

47:15 Sure.

47:15 And at a higher level, it could be I was calling an API and then it paused.

47:20 And that actually caused it to time out or something like that.

47:23 Right.

47:23 Or a database connection reset or something weird at that level.

47:27 But I think a big difference here is these are a single call went crazy while you paused

47:35 it.

47:35 Whereas when you talk about injecting code, you could have messed it up for the rest of the

47:40 life of the process.

47:41 Yeah.

47:41 In unknown ways.

47:42 Yeah, absolutely.

47:43 Technically attaching GDB is undefined behavior because you can modify arbitrary memory in ways

47:48 that you don't know what's going on.

47:49 Obviously, it's not going to be the case because GDB doesn't do that by default.

47:53 But just calling functions can alter, especially if you're calling to a C API.

47:57 If you just happen to have a pointer and then you want to print the pointer and you call

48:01 _PyObject_Dump.

48:02 Now you are like, who knows what happens?

48:04 You are just calling.

48:05 You need the GIL, for instance, to do that.

48:07 So it's very unclear what's going on.

48:10 So we don't do any of that.

48:11 Even in cases where you're able to successfully call a function with GDB, it manages to get its

48:16 stub injected and do everything that it needs to do to set up the call.

48:18 You can wind up in a situation where you don't satisfy some of the invariants for that call.

48:23 And that call winds up segfaulting in code injected by GDB.

48:27 And it tries to recover from that, but it can't always.

48:29 So you can very easily get yourself in a situation where you thought you were doing something read

48:33 only and managed to crash the process that you were trying to inspect.

48:36 You can see that we learned this the hard way for our other tool because one thing our memory

48:42 profiler does, this is Memray.

48:44 So not PyStack, this is the other tool.

48:46 So the other tool allows you to attach to a process, right?

48:49 You have a process that is happily running.

48:51 And then you say, now I want to profile this process that is already running.

48:55 So I just want to know every time it makes allocations, I just want to know that it's happening.

48:59 Or you just want to see it live.

49:00 And so what we do is that in that case, we inject Memray into the process to just prepare the

49:06 profiler and do all the stuff, and there we learned the hard way all the cases when you

49:10 cannot do that.

49:10 Like calling malloc under malloc.

49:12 Because, you know, like if your setup process requires memory and your process is already

49:17 allocating memory, then you're calling malloc under malloc.

49:21 And that is undefined behavior.

49:22 Well, most of the things that crash.

49:23 Yeah, I can't imagine how tricky that stuff is, the memory stuff.

49:27 All right, let's talk through some of the features.

49:29 We've touched on a lot of these, but I kind of just like got a great long list of amazing

49:33 things that PyStack can do.

49:35 I'll just breeze over the ones we've talked about already, but then potentially dive into

49:39 the others.

49:40 So it works on both running processes.

49:42 And it's one of the really unique aspects is on the core dump files.

49:47 That's very cool.

49:48 Just to complete this part, it works on all core dump files, which is a huge, like if you

49:53 are in the world of like how these things work is really hard because core dumps don't have

49:59 a specification.

50:00 So this is very important.

50:01 Like there is no document that will tell you how core dumps work.

50:04 This is the first surprise that you will have if you try to search for it.

50:08 So you will see how they normally work, but the amount of weird stuff that can happen is

50:14 just countless.

50:15 Like this is whatever the kernel is doing and whatever the version of the kernel is doing.

50:19 So you can see super weird stuff.

50:22 Not just the kernel either.

50:23 What gcore does doesn't go through the kernel.

50:26 So if you're using GDB to generate a core file, you might get something that's in an entirely

50:29 different format than what the kernel would have dumped.

50:31 And the core files can miss data.

50:33 So it's not really always a memory dump of the process, like a complete one.

50:37 Because for instance, imagine that you're in a system and you have like five Python applications

50:42 and then you generate a whole dump of the process.

50:44 Well, those five Python applications are going to have loaded a lot of libraries that are common,

50:48 like libc, I don't know, OpenSSL.

50:51 So a bunch of these libraries, so are you going to just include a lot of them?

50:54 Well, technically you should, because that's what was loaded in the memory.

50:57 But that's going to generate huge core files, like gigabytes in size.

51:01 So a lot of the optimizations that I've learned about say that, well, you know, if it's a shared library,

51:05 just go and read the shared library.

51:07 So I'm not going to include it.

51:08 Which means that tools need to know that this is happening.

51:11 And then when they see a pointer and they try to search in the core, they're going to find

51:15 that there's nothing there.

51:16 So they need to go to the library.

51:17 So there is a lot of layers that you need to go.

51:20 So the second part, when it says works on core files, it works on all of them,

51:24 which is quite a huge statement.

51:26 I guess we should touch on what platforms PyStack can run on.

51:30 Just Linux.

51:31 Linux, all right.

51:32 Because, I mean, it could work.

51:34 And this is an important fact.

51:36 Like, for instance, if you're running on Windows or macOS, you probably want to use the other tools that we mentioned, like py-spy or Austin.

51:43 I think both run on all platforms.

51:46 But yes, this is because we want to ensure that we do this very well and we cover all the cases

51:52 and we have enough with one operating system.

51:54 Our other tools work on macOS as well.

51:57 So the memory profiler, Memray, works on macOS.

51:59 So we don't only do tools that work on Linux, but this one only works on Linux.

52:03 And to be fair, it does also work on Windows in WSL.

52:07 That is my main development environment.

52:09 So if not natively on Windows, but at least if you're in a virtual machine on Windows, you're fine.

52:14 PyStack will work on WSL?

52:16 Yep.

52:16 Yeah, absolutely.

52:16 Yeah.

52:17 Okay, cool.

52:18 And I suppose it works on Docker running Linux on a bunch of machines.

52:23 Like on, you know, parallels on Mac.

52:25 And there's a lot of ways on the different platforms.

52:28 Yeah, I develop on Docker on Mac.

52:29 So for instance, I run PyStack on Docker on Mac.

52:31 No problem.

52:32 And even in the new ones, the M1 ones works nicely.

52:35 Cool.

52:36 Includes calls to inlined functions in the native stack.

52:39 Ah, that's a funny one.

52:41 So one of the things we do is that one thing that can happen is that the C compilers, they

52:46 really like to do this because it's very efficient.

52:47 Sometimes following some heuristics, they can say, well, you're calling this function, but

52:52 this function is kind of small.

52:53 So generating all the, you know, assembly code to prepare the call and finalize the call plus

52:59 all the locals and the stack and whatnot is kind of very expensive.

53:02 So what they do basically is copy paste the code in the caller.

53:05 So, and they set up everything.

53:06 So, you know, it works nicely.

53:08 The locals are not overwritten just because you use the name foo in both.

53:11 Right.

53:12 So, so it kind of works, but it is that instead of calling a function, just copy paste the code.

53:17 But the, basically the effect that this has on the backtrace is that there is no function

53:21 called.

53:22 So there is no function.

53:22 So when you are calling that function, it disappears.

53:25 So it's like, you never call it.

53:27 And this can be quite confusing when you're looking at a stack trace, because if you have function

53:30 B that calls function C, and C calls D, and the middle one is inlined, you're not

53:34 going to see it.

53:35 And then you're going to see B calling D. And you say, well, there's no way that happens

53:38 because I'm not calling D here.

53:40 This can make this kind of like backtraces very confusing.

53:42 This is in C, not in Python, right?

53:44 Because Python doesn't have inlining.

53:46 Exactly.

53:46 Python doesn't have inlining.

53:47 That's true.

53:48 We have something that we call inlining, but it's not the same thing.

53:51 So I'm not going to explain that.

53:53 Compiler optimization type of thing.

53:55 Yeah.

53:55 There is no inlining in Python.

53:56 That's correct.

53:56 Let's just leave it like that.

53:58 But it's in C and C++ and Rust and whatnot.

54:00 So if there is debugging information, we can recover these inline calls.

54:05 Which is something that, by the way, GDB can also do.

54:08 But we can do it as well.

54:10 So there is enough debugging information.

54:12 We actually work in some cases when GDB doesn't.

54:14 Just because GDB tries to be very correct in some of these cases.

54:17 But for whatever reason, it's overcorrect.

54:20 We can actually do it most of the time.

54:22 But yes, this is a feature that, so you have one of these inline calls.

54:26 And we do more than that.

54:27 So for instance, if you have like extreme debugging information that you can activate by passing, for instance,

54:32 if you compile something with GCC, you can pass -g3.

54:35 That's debugging information level three.

54:37 So put everything there.

54:38 We can even show you macros.

54:40 So you're using macros.

54:41 The macro expands to source, basically.

54:44 And then that source is passed to the compiler.

54:46 So there is no macro at the compiler level.

54:49 The compiler is going to see the source itself because the pre-processor kind of expands the macro.

54:53 But there is a technique in the debugging information that can include the fact that there was a macro there.

54:57 So we can show you the macro.

54:59 We can say this was a macro.

55:00 So we can pretend that there was a function call.

55:02 That's quite cool.

55:03 That's crazy.

55:04 I didn't see that coming, yeah.

55:05 So when are we getting inlining of functions as an optimization of Python, huh?

55:09 To be honest, there are some interesting things that are close.

55:11 There is this PEP that was approved to inline list comprehensions in function calls.

55:17 I was about to say.

55:18 As you can see, what I said, the consequence is that you basically copy-paste calls,

55:22 so the function call disappears.

55:23 This will happen in Python, by the way.

55:26 So when the PEP is implemented, which, by the way, it is implemented, if I recall correctly,

55:30 what happens is that you see a backtrace with PyStack, for instance.

55:34 You're not going to see the list comprehension frame, which is fine, because most of the time

55:38 doesn't add you anything, because it's going to tell you, here's a list comprehension, and

55:42 then you're calling a function called list comprehension.

55:43 So it's kind of weird.

55:45 But the interesting parts of list comprehensions being function calls, basically.

55:49 I mean, it's not really functions calls.

55:51 They have their own frame.

55:52 But the interesting part here, which was one big change from Python 2 to Python 3, is

55:56 that variables inside the comprehensions are local to the comprehension, which means that

56:00 you have a function, a variable called x outside, and then you use a variable called x inside,

56:05 and you assign to that by using the comprehension name.

56:08 The one outside is not modified, right?
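A small sketch of the scoping rule just described. The function name `outer` and the values are made up for illustration; the behavior shown is the same in every Python 3 release, whether or not the comprehension gets its own frame.

```python
def outer():
    x = 10
    # The loop variable is also called x, but it belongs to the comprehension.
    squares = [x * x for x in range(4)]
    # The outer x is untouched once the comprehension has finished.
    return x, squares


if __name__ == "__main__":
    x_after, squares = outer()
    print(x_after)
    print(squares)
```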

56:10 This is maintained here, even if it's inline, because, you know, even if it's inline, that

56:14 is maintained.

56:14 But there is some cases when that behavior is very, very tricky, particularly class scopes.

56:20 So you have a comprehension in a class scope, which already is something weird to do, but

56:24 you can absolutely do it.

56:25 Like, class scopes are quite wild.

56:27 They are not, they don't behave like function scopes.

56:29 There was a bunch of edge cases that we saw.

56:33 This comprehension inlining is deactivated on class scopes, for instance, just because

56:37 there was some consequences of the inlining.

56:39 There is some, in the discussion of the PEP, that is the case, if you want to see it here.
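One example of how class scopes differ from function scopes, as a sketch. This is a long-standing Python 3 quirk and not necessarily the exact edge case from the PEP discussion; the names `Config`, `factor`, and `values` are invented. Only the first iterable of a comprehension is evaluated in the class body, so other class-level names are not visible inside it.

```python
def class_scope_comprehension_error():
    """Return the NameError raised by the class body below, or None."""
    try:
        class Config:
            factor = 2
            values = [1, 2, 3]
            # `values` is found (outermost iterable), but `factor` is not:
            # the comprehension body cannot see class-level names.
            scaled = [v * factor for v in values]
    except NameError as exc:
        return exc
    return None


if __name__ == "__main__":
    print(repr(class_scope_comprehension_error()))
```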

56:43 Explaining this here will be a bit weird, but it's quite important because inlining always

56:47 has consequences.

56:49 One of them is the frame is missing, but this frame is not going to be missed by anyone

56:52 because it doesn't really add anything.

56:53 We sort of have them.

56:54 One of the things I think is really cool about asking for information here, and it's just

57:00 really helpful, maybe beyond even like a good log message and stuff, is not only do you see

57:05 the stack trace, the call stack here, when you, this line in this file called this function

57:10 and so on, but you can see optionally the local variables, right?

57:14 What's extremely interesting is that you would normally need to call like the dunder repr

57:19 method of an object in order to figure out how to print it out in a user-friendly way,

57:24 but we can't do that, right?

57:25 We're working on crashed processes.

57:27 We're reading one byte of memory at a time to try to interpret it.

57:29 So in order to give you these locals, PyStack needs to be able to understand the CPython

57:35 representation of a list and know how to iterate over a list manually to figure out what elements

57:39 it contains and recursively get the repr for each of those methods or for each of those

57:44 objects that are in the collection as well.

57:46 It can't just rely on being able to call Python code to get you this string.

57:50 And manually here means that it needs to know that a list in Python is really a bunch of

57:54 pointers that points to a buffer and the buffer is a bunch of PyObjects and then every object

57:59 can be different.
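To make "iterating over a list manually" concrete, here is a sketch that reads a list's raw memory with `ctypes`. It is only an illustration: PyStack reads another process or a core file and is not implemented this way. The field offsets assume a standard (non-free-threaded, non-debug) CPython build where a list object starts with `ob_refcnt`, `ob_type`, `ob_size`, then the `ob_item` pointer. The helper name `list_item_addresses` is hypothetical.

```python
import ctypes
import sys

POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)


def list_item_addresses(lst: list) -> list:
    """Return the addresses stored in a list's item buffer (CPython only)."""
    if sys.implementation.name != "cpython":
        raise RuntimeError("this sketch relies on CPython's object layout")
    base = id(lst)  # In CPython, id() is the object's memory address.
    # Assumed layout: ob_refcnt, ob_type, ob_size, ob_item (one word each).
    ob_size = ctypes.c_ssize_t.from_address(base + 2 * POINTER_SIZE).value
    ob_item = ctypes.c_void_p.from_address(base + 3 * POINTER_SIZE).value
    if ob_size == 0 or not ob_item:
        return []
    # ob_item points at a buffer of ob_size pointers, one per element.
    buffer = (ctypes.c_void_p * ob_size).from_address(ob_item)
    return list(buffer)


if __name__ == "__main__":
    items = ["spam", 1.5, (1, 2)]
    for address, item in zip(list_item_addresses(items), items):
        print(hex(address), type(item).__name__)
```

Each address then has to be interpreted according to the type it points to, which is why every supported type needs its own decoding logic.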

58:00 So obviously this means that we cannot print all objects.

58:03 So if you have a custom object, we cannot print that.

58:05 We will print something.

58:07 We will tell you, for instance, the name of the class.

58:09 We will show almost what you would get if there is no repr.

58:12 So we will say custom object instance at location, blah, blah, blah.

58:16 So the default repr that you will get if you create a class.
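For reference, this is what the default repr of a plain class looks like in Python itself. The class name `CustomObject` is made up, and the exact text PyStack prints for such objects may differ from this.

```python
class CustomObject:
    """A class that does not define __repr__."""

    def __init__(self, name):
        self.name = name


if __name__ == "__main__":
    obj = CustomObject("example")
    # Prints the module-qualified class name plus the memory address,
    # but nothing about the attributes stored on the instance.
    print(repr(obj))
```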

58:18 But for most of the common types, dictionaries, sets, integers, floats, etc.

58:24 We use functions, all of these things.

58:26 We actually are able to print it.

58:28 Again, here the idea is adding debugging, help you debug these things.

58:32 So obviously it's not going to be the same as having a debugger attached to something that

58:37 you can inspect.

58:38 But most of the time you don't really need it because most of the time you need to know

58:42 the locals is because you have a function call and the function call has,

58:46 the specific argument that are passed to the function and it modifies how the function

58:50 behaves.

58:50 Like for instance, imagine that it has a keyword argument and the keyword argument is strict

58:54 or replace.

58:55 Like in the Unicode encode, where you really want to know whether you passed one or the other because

58:59 otherwise it's going to trigger different code paths.
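A quick illustration of how that one keyword argument changes the code path, using `str.encode` and its `errors` parameter. The sample text is arbitrary.

```python
def encode_both_ways(text: str):
    """Encode `text` to ASCII with errors='replace' and errors='strict'."""
    # 'replace' substitutes '?' for characters ASCII cannot represent.
    replaced = text.encode("ascii", errors="replace")
    try:
        strict = text.encode("ascii", errors="strict")
    except UnicodeEncodeError as exc:
        # 'strict' (the default) raises instead of substituting.
        strict = exc
    return replaced, strict


if __name__ == "__main__":
    replaced, strict = encode_both_ways("café")
    print(replaced)
    print(type(strict).__name__)
```

Seeing the value of `errors` among the locals of a stuck or crashed frame tells you which of these two behaviors was in effect.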

59:00 So if you use this locals option in PyStack, you will see what arguments were

59:07 passed to the functions and also the local variables in the functions.

59:09 And if most of the time it's just this, you know, built-in types, like lists or things

59:14 like that, then it's going to be very useful and it's going to be mostly enough.

59:18 I think, I mean, it's kind of weird that being the authors, we say this because obviously we're

59:23 going to say nice things, but I swear it's true.

59:25 Every time I particularly myself needed this option, the things that we were printing were

59:30 the things that I needed to know.

59:31 So I didn't really need to know because you will say, well, if I have a NumPy array, I

59:35 won't see the array, right?

59:37 You will show me NumPy array.

59:38 Well, sure, but that won't help you debug a crashing code because it doesn't really

59:43 matter what is in the NumPy array.

59:44 It's also fair to point out that if someone ever finds like a built-in type that they needed

59:49 to know the value of in order to debug a problem, they can bring it to us and we can see if we

59:53 can implement it.

59:53 Benefits of open source.

59:55 Yeah.

59:55 Two thoughts sort of came up for me when I was listening to you all describe that.

59:59 And this is just such a cool feature.

01:00:00 One, you know, if it's a class custom object that does not have slots, you could grab just

01:00:07 the dunder dict and kind of print it as a dict.

01:00:10 It would be one, like if you...

01:00:11 Yes, but no.

01:00:13 This is a very interesting question, actually, as you ask.

01:00:16 For instance, in Python 3.12, there is an optimization in which there is no dunder dict.

01:00:21 So this is quite funny, actually.

01:00:23 It's quite funny because what happens is that this is one of the optimizations of the Faster

01:00:28 CPython project.

01:00:29 This, I think, was done by Inada-san and Mark Shannon.

01:00:32 So the idea here is that if you think about it, if you have an object that has a dunder dict,

01:00:37 right, and it has a hash table, unless you want the dictionary itself and wants to just

01:00:41 say, here is a dictionary, I can just, you know, take a photo and put it in a poster in

01:00:46 my room because I like it.

01:00:47 Unless you want the dictionary as itself, you normally want...

01:00:50 Maybe take a second to explain what dunder dict is.

01:00:53 All right.

01:00:53 So most objects in Python, like say you have a class Animal, and Animal has a bunch of attributes

01:00:59 like name and age and like, you know, kind of animal or whatever.

01:01:03 So in Python, those attributes are internally represented with a hash table, which in Python

01:01:07 we call a dictionary.

01:01:08 And you can ask the Python interpreter to show you that internal dictionary.

01:01:13 So normally you will say my animal dot name, that will print Bimo, which is the name of

01:01:18 my cat.

01:01:19 But you can actually ask for that hash table that is used internally.

01:01:21 And for that, you will need to write my animal dot dunder dict.

01:01:25 So underscore, underscore, dict, underscore, underscore.

01:01:26 And that will give you the internal hash table with the name of all the attributes that you

01:01:30 have.

01:01:30 So it will show name, kind, age as strings, and they will show you the actual values.

01:01:35 So that's normally how Python objects are represented internally.
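The explanation above, as a short runnable sketch. The `Animal` class mirrors the spoken example; the kind and age values are invented for illustration.

```python
class Animal:
    def __init__(self, name, kind, age):
        self.name = name
        self.kind = kind
        self.age = age


if __name__ == "__main__":
    my_animal = Animal("Bimo", "cat", 3)  # the age is a made-up value
    print(my_animal.name)
    # The attribute names and values as a regular dictionary.
    print(my_animal.__dict__)
    # vars() returns the very same dictionary object.
    print(vars(my_animal) is my_animal.__dict__)
    # Attributes added later show up in the dictionary too.
    my_animal.nickname = "B"
    print(sorted(my_animal.__dict__))
```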

01:01:38 When you do attribute access internally, it goes to this hash table in different ways and

01:01:43 fetches this attribute, right?

01:01:44 But in Python 3.12, we said, well, among other things, because this optimization touches

01:01:49 many things.

01:01:49 Really having a hash table represented with a full dictionary is a bit expensive.

01:01:54 Because if you think about it, if you access an attribute, having the full hash table is not

01:02:00 really needed.

01:02:00 Among other things, because the hash table has a bunch of things that allow it to work as

01:02:05 a Python object.

01:02:06 But you don't really enjoy those things.

01:02:08 Like, for instance, it has a pointer to the class and it's telling you, I'm a dictionary.

01:02:12 But you already know it's a dictionary because that's what we put there.

01:02:14 So having the whole full dictionary with reference counts and all that stuff as a normal Python

01:02:19 object is expensive memory-wise, but also forces you to have a bunch of indirections.

01:02:24 And we already have a bunch of optimizations here.

01:02:27 For instance, one of the optimizations that we had is that if you have a class, let's say

01:02:31 animal again, most of the instances of the class, if not all, are going to have the same

01:02:35 attributes because normally all animals, all cats have name, age, and whatever, right?

01:02:40 You absolutely can add new attributes.

01:02:42 So this is something that you can, but normally you don't.

01:02:44 So what we do is that instead of storing the same names in the dictionary of every instance,

01:02:50 we put those names in the class because they are going to be common.

01:02:53 And then if you add extra attributes, we kind of add it to the dictionary after the fact.

01:02:58 But that dictionary is already weird, in the sense that sometimes the keys are outside the

01:03:04 dictionary just because they're shared.

01:03:05 This is the shared key dictionary optimization.

01:03:08 So we went a step ahead and we just eliminated the dictionary.

01:03:12 So now what we have is the internals of the hash table, like let's say raw.

01:03:16 So there is no Python object kind of wrapping around it.

01:03:19 It's kind of like just raw pointers.

01:03:22 And only if you ask for that dunder dict and you want to say, well, I don't care about

01:03:27 your optimization.

01:03:27 Just give me the hash table because code that does that still needs to see the hash table.

01:03:31 We cannot break that.

01:03:32 So only when you ask for the dictionary, we instantiate that dictionary and then we give

01:03:37 it to you.

01:03:37 So before, calling dunder dict was just getting a pointer and that's it.

01:03:41 Here's the dictionary.

01:03:41 Now calling dunder dict computes stuff.

01:03:44 Like it just creates a dictionary on the fly and gives it to you.

01:03:46 Which means, by the way, this is a nice piece of trivia.

01:03:49 Before, if you want to calculate the approximate, because this is always approximate, there is

01:03:55 no full way to say how big is my Python object and everything it contains because Python objects

01:04:00 are a graph.

01:04:01 And that's that question.

01:04:02 Most of the time doesn't make sense.

01:04:04 For instance, Python objects point to their module and you don't want to also include the

01:04:08 module in the size, right?

01:04:10 Among other things.

01:04:11 But if you want to know the size of a custom object, you normally say size of the instance

01:04:16 plus size of the dict.

01:04:17 But now in Python 3.12, just by asking for the size of the dict by doing my instance dot

01:04:22 dunder dict, you just made it bigger.

01:04:24 So the real size of the object was actually smaller than what you will get.
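The usual approximation mentioned here, instance size plus dict size, can be sketched as below. `shallow_size` and `Point` are hypothetical names. The numbers vary between Python versions and builds, and as described above, on Python 3.12 merely accessing `__dict__` may create the dictionary that is then being measured.

```python
import sys


def shallow_size(obj) -> int:
    """Approximate size: the instance itself plus its __dict__, if it has one."""
    size = sys.getsizeof(obj)
    # Note: this attribute access can itself materialize the dictionary.
    instance_dict = getattr(obj, "__dict__", None)
    if instance_dict is not None:
        size += sys.getsizeof(instance_dict)
    return size


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


if __name__ == "__main__":
    point = Point(1, 2)
    print(sys.getsizeof(point), shallow_size(point))
```

This deliberately ignores everything the attributes point to, since following references across the whole object graph rarely gives a meaningful answer.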

01:04:28 I think that sounds like a great optimization, but it does make your life harder here.

01:04:32 The other stuff that I had is...

01:04:34 Well, I mean, it's not technically harder.

01:04:36 Among other things, because we know already how to print dictionaries.

01:04:39 So that's fine.

01:04:40 If it's a dictionary, we print dictionaries.

01:04:42 And if it's not a dictionary, we are already in the business of inspecting internal structures.

01:04:46 And we know absolutely how to interpret parts of hash tables.

01:04:50 And at the end of the day, what we have here is parts of hash tables for now.

01:04:54 The problem here is more about what happens if it changes in the future.

01:04:58 Because right now, for instance, it's easy.

01:04:59 But as we optimize more and more in the next release, it's not going to be easy.

01:05:04 And right now we can say, well, we support this, this, that, and this other thing.

01:05:08 Which means that every time that new Python version is published, we need to just go to

01:05:12 all those things and check if there were changes and then change our code, which is quite a lot

01:05:16 of work.

01:05:17 But if we support more types, it means that we need to do this thing for more things.

01:05:20 And sometimes it's harder, right?

01:05:22 Especially custom objects.

01:05:23 Who knows what we find there?

01:05:25 So we kind of like stay away from that because, you know, it's maybe a lot of work.

01:05:30 You're already busy, right?

01:05:31 Like we established.

01:05:32 All right.

01:05:33 Let me just flip through here and see if there's anything else that we want to cover.

01:05:37 I feel like that's pretty much it.

01:05:38 Maybe just one more shout out to the pytest plugin and whatever else you all think we should mention

01:05:43 because we're running out.

01:05:44 One last thing I think is interesting here just to highlight, which is kind of cool,

01:05:49 is that we, in both our tools, so PyStack and Memray, this kind of links into the conversation

01:05:54 that we had before around the UX and how we put a lot of emphasis on UX and making these

01:05:58 tools super easy.

01:05:59 So, for instance, as we mentioned before, we were talking about some of these features

01:06:02 and then I say, for instance, the inline, right?

01:06:04 We said, well, if you have the right information, we do this.

01:06:07 But as I said before at the beginning of the podcast, I said most of the things don't have

01:06:11 the right information.

01:06:12 So most people would say, well, then, like, what is the point of this, right?

01:06:15 So, but then we said, okay, so we really want to make this thing easy.

01:06:19 So we don't want to tell people, well, if you want to use this feature, then you need to

01:06:22 install this thing and just find your distribution, how to, yada, yada, yada, it's kind of annoying,

01:06:26 right?

01:06:26 So one of the things we leverage in both our tools, in PyStack and Memray, is this

01:06:31 thing that Matt mentioned before, this debuginfod server.

01:06:34 So this means that in most distributions, the modern distributions, so this means that the

01:06:38 latest versions of Ubuntu, Debian, Fedora, Arch Linux, it works on most of the new ones.

01:06:43 There is a way that debugging tools can say, okay, I have here a binary, like let's say Python,

01:06:48 and this binary doesn't have the right information, but I really need it.

01:06:52 So can you give me the right information?

01:06:54 And it will download it automatically for you, so you don't need to do anything.

01:06:57 The tool will do it for you.

01:06:58 So it will figure out what debug information it needs from the process it's analyzing.

01:07:03 It will go to the distribution.

01:07:04 It will say, hey, can you give me the debug information for this, this, this, and this?

01:07:08 It will download it for you.

01:07:10 It will automatically merge it to the binary, and it will use it for showing the inlines or

01:07:15 the C code or whatever it is.
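As a small sketch of how you might check whether such a server is configured on a machine: the elfutils debuginfod client reads a space-separated list of server URLs from the `DEBUGINFOD_URLS` environment variable, and distributions that enable this typically set it for you. The helper name `debuginfod_urls` is hypothetical, and this only inspects the environment; it does not contact any server.

```python
import os
from typing import List, Mapping, Optional


def debuginfod_urls(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the debuginfod servers configured via DEBUGINFOD_URLS."""
    env = os.environ if environ is None else environ
    # The variable holds zero or more URLs separated by whitespace.
    return env.get("DEBUGINFOD_URLS", "").split()


if __name__ == "__main__":
    urls = debuginfod_urls()
    if urls:
        print("debuginfod servers:", ", ".join(urls))
    else:
        print("DEBUGINFOD_URLS is not set; automatic downloads are likely off")
```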

01:07:17 So this means that most of the time, what you will see is that the first time you analyze

01:07:21 a Python process, it will take a bit more time just because it's loading these files, and

01:07:25 these files can be a bit big.

01:07:26 It will tell you that it's doing it.

01:07:27 And then these files are cached for subsequent calls, so you don't download

01:07:32 it every single time.

01:07:33 But then it just works by magic.

01:07:35 So you have this kind of process and have the debug information, and voila, it just works.

01:07:39 You don't need to do anything.

01:07:40 You don't need to know about the fact that you need the debug information or the fact that

01:07:44 your Python is optimized and doesn't have anything.

01:07:46 You just work.

01:07:47 It just works.

01:07:48 So you just need a new enough distribution.

01:07:50 And even if you don't have a new enough distribution, there is a way to set up in the old ones.

01:07:53 But anyway, if you're using one of the latest supported Ubuntu, Debian, Arch, Fedora, Red Hat,

01:07:58 all of these distros, then magically it will just work, which is something that we really, really

01:08:03 are happy about because it means that you don't need to know about all of these things.

01:08:08 You will just get it.

01:08:09 That's really fantastic.

01:08:09 And go out and grab it and just get it for you without you worrying about it.

01:08:13 It feels magical.

01:08:13 Like the first time I saw it, I said, I need this.

01:08:16 Like this is the future.

01:08:18 Like this is the future.

01:08:19 Because if you have to do this manually, it just makes you miserable.

01:08:22 And I know how to do it.

01:08:23 I know how you need to do it.

01:08:24 And it still makes me miserable.

01:08:25 So I don't want to do it.

01:08:27 I just want the tool to figure it out.

01:08:29 And both of these tools do it.

01:08:31 So that's kind of cool.

01:08:32 Yeah, that's fantastic.

01:08:33 All right, you guys, I think we're out of time here.

01:08:35 But, you know, final thoughts.

01:08:37 People are excited about PyStack.

01:08:38 Matt, what would you tell them?

01:08:40 Try it out.

01:08:41 Check it out.

01:08:42 How would they use it?

01:08:42 Yeah.

01:08:42 Try it out.

01:08:43 Download it.

01:08:44 It's as easy as pip install PyStack.

01:08:46 Find something that isn't working the way you expect it to.

01:08:48 Point PyStack at it and see if you can figure it out.

01:08:50 And of course, we're open to contributions.

01:08:52 Especially issues: if you find something that's broken, let us know.

01:08:56 If you find some platform it doesn't work on, let us know.

01:08:59 But yeah, it is my single go-to debugging tool whenever something gets stuck or doesn't do what I'm expecting it to do.

01:09:06 When I run a Python command at a command prompt and it just doesn't return, I reach for this all the time.
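
[Editor's sketch] As a rough illustration of "point PyStack at it", the helper below builds a command line for the two modes of the pystack CLI: `pystack remote <pid>` for a running (possibly hung) process and `pystack core <corefile>` for a core dump. The function name and its parameters are hypothetical; the `--native` and `--locals` flags are assumed to match your installed version, so check `pystack --help` before relying on them.

```python
import shlex


def pystack_command(pid=None, core=None, native=False, show_locals=False):
    """Build the argument list for a pystack invocation.

    Exactly one of `pid` (a running process) or `core` (a core dump path)
    must be given. This helper is illustrative and not part of PyStack.
    """
    if (pid is None) == (core is None):
        raise ValueError("pass exactly one of pid or core")
    cmd = ["pystack"]
    if pid is not None:
        cmd += ["remote", str(pid)]
    else:
        cmd += ["core", str(core)]
    if native:
        cmd.append("--native")  # also show C/C++ frames in the stack
    if show_locals:
        cmd.append("--locals")  # also show local variables per frame
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; 12345 is a made-up PID.
    print(shlex.join(pystack_command(pid=12345, native=True)))
```

To actually run it, pass the returned list to `subprocess.run`; attaching to a live process may require elevated privileges depending on the system's ptrace settings.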

01:09:11 I'm convinced it's a very useful tool for people.

01:09:13 Yeah, it looks amazing.

01:09:14 Pablo, final thoughts?

01:09:15 Yeah, the only thing I will add to what Matt said is one thing, and don't do this only for PyStack and our tools.

01:09:22 Do it for every tool that you use.

01:09:24 It's giving success stories.

01:09:26 So when you use the tool in that particular challenging situation and it really works, you know, you just say, wow, it just works.

01:09:32 Just go to the repo.

01:09:33 Again, not only ours, but any tool that you see that actually does this.

01:09:37 And tell the maintainers what you were trying to do and that you were really happy.

01:09:41 Among other things, this really helps maintainers because at the end of the day, if you think about it, we are putting in all this work and then we only hear about the cases where it doesn't work.

01:09:49 So it's a bit discouraging.

01:09:50 So it will keep us happy.

01:09:51 And that's kind of important in open source since these things are free.

01:09:54 And the other thing is that it allows us to know how people are using the tool.

01:09:58 So when we discuss new features and how we evolve the tool, we know how to do that.

01:10:03 So for instance, for Memray, we have the success stories page, and we are going to have the same for PyStack.

01:10:08 So if you just happen to use it and you like it or you use it successfully to fix something, just tell us.

01:10:14 We are super happy to learn from you and to know why it was useful to you and what kind of features you used from the tool so we can keep improving it.

01:10:22 Excellent.

01:10:22 Yeah, I think that's a really great idea.

01:10:24 I encourage people to do that as well.

01:10:26 Pablo, Matt, thanks for being on the show.

01:10:28 Always a pleasure, Michael.

01:10:29 Thanks for having us.

01:10:30 Yeah.

01:10:30 All right.

01:10:30 Bye, guys.

01:10:32 This has been another episode of Talk Python To Me.

01:10:34 Thank you to our sponsors.

01:10:36 Be sure to check out what they're offering.

01:10:38 It really helps support the show.

01:10:39 Take some stress out of your life.

01:10:41 Get notified immediately about errors and performance issues in your web or mobile applications with Sentry.

01:10:47 Just visit talkpython.fm/sentry and get started for free.

01:10:52 And be sure to use the promo code TALKPYTHON, all one word.

01:10:56 Listen to an episode of Compiler, an original podcast from Red Hat.

01:11:00 Compiler unravels industry topics, trends, and things you've always wanted to know about tech through interviews with the people who know it best.

01:11:07 Subscribe today by following talkpython.fm/compiler.

01:11:12 Want to level up your Python?

01:11:13 We have one of the largest catalogs of Python video courses over at Talk Python.

01:11:18 Our content ranges from true beginners to deeply advanced topics like memory and async.

01:11:22 And best of all, there's not a subscription in sight.

01:11:25 Check it out for yourself at training.talkpython.fm.

01:11:28 Be sure to subscribe to the show.

01:11:30 Open your favorite podcast app and search for Python.

01:11:33 We should be right at the top.

01:11:34 You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm.

01:11:44 We're live streaming most of our recordings these days.

01:11:47 If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube.

01:11:55 This is your host, Michael Kennedy.

01:11:56 Thanks so much for listening.

01:11:58 I really appreciate it.

01:11:59 Now get out there and write some Python code.

01:12:01 I'll see you next time.