#311: Get inside the .git folder Transcript

Recorded on Thursday, Apr 1, 2021.

00:00 These days, Git is synonymous with source control itself. Rare are the current debates of whether to use Git vs. SVN vs. some fossil like SourceSafe vs. you name it. But do you know how Git works? What about its internals? I'm sure you've seen a .git folder in your project's root. But to most folks, that's a black box. In this episode, you'll meet Rob Richardson. He's going to pop the lid on that black box as we dive into Git internals and the .git folder, among other things about source control. This is Talk Python To Me, Episode 311, recorded April 1, 2021.

00:46 Welcome to Talk Python To Me, a weekly podcast on Python: the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy, and keep up with the show and listen to past episodes at talkpython.fm. And follow the show on Twitter via @talkpython. Congratulations to Mike Manning. He's the final winner of our PyCon ticket giveaway. Thank you to everyone who entered. If you didn't win, I hope you're able to get a ticket and support the PSF and the Python community, and be part of that awesome conference. See you in May.

01:18 Rob, welcome to Talk Python To Me. So glad to be here. I'm really excited that I get to join you. Great to meet your audience. Yeah, it's great to have you here. We've got a neat intersection: you gave a talk at the Python Web Conference recently, which I also spoke at, and your talk was really interesting, and certainly relevant to the Python folks. So I thought it'd be cool to have you over here. And, you know, I should give credit to Paul Everitt for connecting us. He's like, oh, that was a great talk, you should go talk to Rob. So thanks, Paul, as well, who was on the show not long ago. Yeah, I've been chatting with Paul about thoughts around the talk as well. He's a really brilliant guy. Yeah, he is. He definitely is. He's been doing a lot of cool stuff for a long time. So yeah, he's a great guy. Now, before we get into Git and all those types of things, which, you know, it's really surprising to me how much it's taken over the world, right? It used to be there was always a question: well, what source control do you use? That's not a question I hear all that often these days, at least not as much as it used to be. But before we dive into the details of that, let's start with your story. How did you get into programming? This is actually a really fun story. I was 10. I was at the library, because they had the computer, and we'd play video games. And the methodology of how you do this is: you go up to the counter, and you flip through the book, and you go find the video game, and you show them that page, and they give you the disk. You know, the save icon. You take the save icon and you put it in the computer, and you play the game. So I had finished playing my game, and I went back to the desk to go pick another one, flipping through the, you know, plastic sheets, and I found a drawing program. And when I said that I'd like to play this game, they gave me an eight and a half by 11 sheet of paper. The top two thirds was graph paper, you know, graphs. 
And the bottom third was how to write the program to draw that on screen. Oh, cool. Okay, and it was so much fun, I got to start to build content that was in my mind, in real life in this artistic medium with a very technical implementation. So you know, that was so much fun. I didn't ever return that game. And so that kind of brought me into the world of software development. I always thought it was just a fun thing that people did. I didn't realize it was a career. So it wasn't until really late in my college experience, where I realized that I could do this for a career. And so after I graduated, I got into programming professionally. And I've had a really fun time coding now for professionally for more than 20 years. Yeah, awesome. I think programming is special, because it's one of those things you kind of hinted at where you, you think of something, you dream, something, you imagined something. And then with a little bit more thinking that thing can become real. Whereas you know, so much of what humans do, it's one or the other, I could tell an amazing story and write the book. Or I could go build an awesome house. But normally those things don't actually co exist where you think a lot about something and they they come into existence. But I do think that's a magical part of what we get to do. And I think it captures a lot of people's imagination. And what's really cool is that in this digital world, there are a lot less boundaries, a lot less constraints. There's nothing telling me that this pixel needs to be in this certain way. I can draw whatever I want on these pixels on the screen. Yeah, yeah. And modern day we have cloud computing, and we have incredible computers. Like the sky's the limit is really, really awesome. Also money. You don't have to go buy tons of hardware for many things that we do, right? The really cool right now, how about today? 
What are you up to these days? I'm doing a lot with software development: cloud-based development, a lot of websites, a lot of web properties, ASP.NET and Node on the back end, React and Vue on the front end, you know, taking that into interesting modalities. I've started

05:00 to play with Raspberry Pis, and that's really fun. And getting to dig into all the things, I've gotten really good at doing DevOps as well. Part of my passion is being able to share this knowledge with others. So I do a lot with teaching, both at user groups and conferences, and elsewhere mentoring. And so it's really fun to be able to not only learn these new skills, but also pass them on to the next generation of developers too. I'd love to say that it's not that I'm really good at it, it's just that I've been collecting things for a while. So let me add to your collection too. Well, I think one of the things that's really interesting about becoming an expert in programming is that people who are beginners, or maybe don't do programming at all, see that person as incredibly talented and incredibly smart. And they may be, they often are, but I feel like the real big difference is: I've spent 10 years gathering up these little tips. Like, oh, I tried this, that doesn't work very well. That crashes. You try to talk to the database that way, that's bad. Oh, by the way, I've also built up a couple of examples of what databases are. You just have this, almost more experience than, I don't know, innate skill, right? So it's really cool that you just kind of layer on these skills over your career. And the reason I think that's powerful is it's very easy to communicate back to other people, right? You know, the way Nietzsche did philosophy, or the way Euler did math, or Bach did music, you can't easily communicate that to someone if it's this crazy innate skill. You can sort of communicate it, but it's not the same. But with programming, I think it's very easy to transmit it and pass it on, and find ways to help people level up. It's super fun. And it's really easy to get started. You know, programming languages have become much more approachable of late. 
And so if you're new to programming and just starting to dabble in it, you don't need to buy a big expensive thing, you know, the laptop that you're using to browse the web is probably sufficient for building simple programs. And so you know, dive in, use free tools and and just start building stuff. It's really approachable and really fun. Absolutely. And one of the things that just never ceases to blow my mind is I can be in a coffee shop, working on a relatively cheap laptop, doing my coding, 'git push', speaking of 'Git' something happens on one part of the cloud, it triggers a web hook somewhere else that then grabs the code and could run that on a tremendously powerful data center, and computer or suite of computers, a cluster of computers. And yet I get the experience of basically building this super powerful thing on my very wimpy little laptop. It's just cool that you can create things like Facebook, or Google or you name these these really large, amazing apps, but you could kind of just do it on like a laptop, something that existed in my mind yesterday, exists in the cloud and scaled to any user that wants it tomorrow. Yeah, that is really fun. Yeah, it's super, super fun. Before we move on, you also mentioned that you've been playing around with Raspberry Pi's something I covered recently on 'Python Bytes'. My other podcast is somebody built a water cooled, Raspberry Pi cluster, computer. So eight Raspberry Pi's in one thing, all of them overclocked, and water cooled, have you seen people doing this stuff? It's crazy. It's really cool. And as you start to get into clustering, programming, clustered programming, you know, multi machine type of experiences, a Raspberry Pi is a really cheap barrier to entry, you know, for $40 or so you can get a Raspberry Pi, get three or four or five of them, cluster them together. And now you get the sense of what does it take to build parallel machinery. And it is really, really fun. 
So you know, to get an 8 or 10 or 100 or 1000 node Raspberry Pi cluster is pretty sweet. And it's awesome. I mean, you could do something similar with Docker, like fire up a bunch of Docker containers, but it's not the same feeling. Like, there's actually 8 of them over there, and they're actually talking to each other and working together. I think it's a very different feeling. It's super cool. Containers do help us start to approximate that. But yeah, there is some lying to ourselves to believe that all of these containers running in context on my one laptop are really a distributed system. Yeah, absolutely. It's not the same, but it does let you sort of play around there a bit. Right. All right. So I want to talk about Git primarily. That's what your presentation at the Python Web Conference was about, and that's what we're going to center our conversation on. But you know, you and I have both been around the industry for a while, and Git is not that old. It's the new source control on the block, I guess. So maybe let's talk a little bit about the history of source control. You know, I think of source control as a spectrum, from "what source control?" all the way to Git, distributed source control. Maybe, you know, I've talked to people, I've seen in action, source control that is: I've got

10:00 a file, a code file, and I've called it version one, version two, version two edited, version three, version three final, final two, you know, just like that. Or maybe it's a lot of files, so you zip the folder and you name it like that, right? Like, that's the beginning of source control. I mean, it's doing it wrong, but it's getting there, right? Well, it's doing it in exactly the way that you needed at that time. Copy-folder versioning is definitely a thing: '.pu', '.date', you know, copy that content off to make sure that you have it. And that's really what we're after with version control. When we think of version control, we're really talking about two things. One is archiving the history of my journey, so that I can get back to a known good state if things go bad. The other is communicating with my team, to be able to convey the progress of the system. And copy-folder versioning does the first one real well. It doesn't do the second one real well. There are systems that I've worked on where, you know, to upgrade the system is to first copy all the things into the .backup folder, and then upgrade the primary thing. And if it doesn't work correctly, then you take that and put it back. Put it back, yeah. Or they point the web server at the backup folder. And so now the system has been running out of the backup folder for, you know, six or eight months or a year, and now we go to upgrade. And step one is to take... oh, wait, we just took down the site. Now we have no known good backup. Yeah. Without some sort of source control, the real thing that falls apart, I think... maybe you're doing the file versioning thing, which is still not that ideal, but the thing that really falls apart is collaboration. Right? Right. As soon as two people want to work on something, it's not okay to say, well, here's my zipped version, can you merge that back together? And you probably don't have merge tools, either. So what does that even mean? Right? So that quickly gets us to where I think people probably should be, in some sort of version control. But back in the day, that was different stuff. For example, you know, maybe that was Subversion. Actually, if you were on Subversion, you were in a good place. I mean, a really good place. Yeah, Subversion was really cool. Subversion was an upgrade to CVS, where CVS would version each file or each folder separately. 
And so nested folders just happened to kind of be together in this clump. And what Subversion gave us was that we're versioning the entire project together in one piece. Before that, we might have had SourceSafe or other pieces. Team Foundation Server kind of fits into this realm as well. And so it's that mechanism of versioning all of the pieces together, and then being able to publish that to a central place. What makes all of these systems kind of unique, or specific, is that they're all really client-server pieces. So Subversion was really good at being a client-server piece. I'm gonna go out on a limb and say it was the best client-server version control system that I'm aware of. Yeah, I think so. These systems, though, have kind of a fundamental flaw. We want to use version control for those two pieces: we want to use it to be able to back up the work so that I can get back to a known good state, and to communicate with our team. And the hard part with these client-server version control systems is that we're doing both every time we commit. So when I commit a change to Subversion, I'm immediately publishing it to all of you. So the analogy that I like is: when I'm rock climbing, I want to be able to put a carabiner in the wall as frequently as possible. If I climb a foot and I fall, I'm only going to fall a foot. If I climb six or eight or 12 feet and I fall, I'm going to fall 12 feet. Well, actually, the nature of the rope is that it's going to swing all the way down, so I fall 24 feet. And that's a long way to go. I want to stick pieces in the wall as frequently as I can. But you don't want to see me spamming the thing every time I get there. So I get to the point where it's like, okay, I finished the thought, I'm good. I want to mark this safe point. But I'm not ready to publish it to all of you. 
Yeah, and really, the thing you should be working with most of the time is: if I publish it to the rest of the team, it should at least run, right? The tests should pass. Maybe you can fix that, like if you're gonna work with somebody. But it shouldn't just mean nobody can build or even start the software at all, because you've added a safe point in the middle of your work that is inconsistent, or halfway there, whatever, right? And so I've reached a stopping point, but I'm not done. It doesn't work. And so I have this moral dilemma: do I mark a safe point and inflict that on all of you, or do I not? And that's when I fall back to a secondary version control system, where I start doing copy-folder versioning again, where it's like, I just want to take all my stuff and stick it in this spot so that I have this known good state. And that's where we

15:00 get to pivot to distributed version control systems, of which Git is one, where we have a separation between the commit stage and the publish stage. Those aren't the official terms that Git or any of the rest of them use, but there's a process of marking those safe points, and then there's a process of collecting all of the safe points and publishing them to others. And that takes a bit of a mind shift to get used to as well when you're working on it. Because if you come from one of these other systems, it's: I committed, so it's saved, right? But commit in a distributed source control system means it's a local safe point until you git push, or whatever Mercurial's equivalent of a git push is, right? Yeah, an hg push. And so it's exactly that: it's marking safe points however frequently you want, and then combining those safe points together into a cohesive story to publish to your colleagues. And that's what makes distributed version control so powerful: separating those two concepts. Mercurial, Git, Perforce, there are other distributed version control systems. And as the world was moving from Subversion and TFS into this distributed world, we experimented with each of them. You know, arguably Git wasn't the best. We might have done a VHS and Betamax type of thing. But clearly, Git has become the de facto standard version control system. It is distributed, and now we can separate the safe points from the publish points.

16:32 Talk Python To Me is partially supported by our training courses at Talk Python. And we run a bunch of web apps and web APIs. These power the training courses, as well as the mobile apps on iOS and Android. If I had to build these from scratch again today, there's no doubt which framework I would use. It's FastAPI. To me, FastAPI is the embodiment of modern Python and modern APIs. You have beautiful usage of type annotations, you have model binding and validation with Pydantic, and you have first-class async and await support. If you're building or rebuilding a web app, you owe it to yourself to check out our course Modern APIs with FastAPI over at Talk Python Training. It'll take you from curious to production with FastAPI. To learn more and get started today, just visit talkpython.fm/fastapi or email us at sales@talkpython.fm.

17:23 I think another really important thing to highlight for people who haven't been there: you know, right on the Git homepage, they highlight Subversion and CVS, which we've been talking about, but also Perforce, ClearCase, SourceSafe, TFS. With a lot of these things, there are two things. One, they would lock files. Like, if you wanted to make a change to a file, you would claim it, like, I'm editing main.py, and then no one else can interact with that file. It's literally made read-only on your computer until, you know, that person is done. And they had better not forget and go on vacation while they've got some files checked out. That's the one thing. The other is that you need permission to participate in a project. You have these gatekeepers, and you need to sort of prove yourself to the gatekeeper. So say I wanted to work on Flask. If it was under Subversion, I'd have to go, can I have permission to go read it, read-only access to Flask? And if I want to make a change, I literally have to say, I need the permission to commit back to Flask. With the distributed ones, you clone it, you do your proof of work, your proposed idea, and if you want, you can contribute it back. Or you could just go in a different way, right? There's this very interesting separation: I can kind of work on it and then see if I want to contribute back, rather than the other way around, where I have to get permission to contribute. And I think that's a super critical thing in the open source space, where there's a very loose coupling of people and projects. Like, suppose I'm in charge of Flask, and somebody comes to me and says, I want to work on Flask. It's like, well, maybe. What else have you done? Show me. This is a huge project. Yes, we do not want you to mess up Flask. And we had a little bit of that with SourceForge. 
You know, you could clone the repository in Subversion and just work on it locally. But you weren't able to participate the moment that you wanted to help. It was a really friction-full process where, you know, okay, so I have this diff. Now, I don't have write permissions. So am I going to, you know, bake this diff into an email and hope somebody reads it? Do I just use it locally? Do I fork the project and only have our corporate version of it? It was very difficult to participate. And that's not a

19:41 feature of Git per se, but rather of GitHub, that shared hosted mechanism around Git that has grown up as well. Yeah, I mean, with Git, you can clone a thing and then work on it, as long as you have read access. But yeah, the additional mechanisms, the Git flow around it, is certainly something created by GitHub, with

20:00 like, PRs and forks and merging upstreams, and all that origin/upstream stuff. One thing I did want to ask you before we get into the details is: why do you think Git won? You did talk about this Betamax/VHS sort of thing, and there are other options out there for distributed source control. I have a theory. What are your thoughts? I have a theory too. I don't have the answer, and maybe our listeners will help us discover what the correct answer is. Or maybe there isn't one. In my mind, a lot of the time, we were looking at ways to compete with things. You know, we had things that would compete with CVS or Subversion, because, you know, we wanted a little bit more, we wanted to make money on the process of source control. And what's really interesting about Git is that it has become so pervasive. And so we're not building competitors to Git; we're building integrations into Git, we're building on top of Git. Yeah, arguably, GitHub helped with that too. GitHub has a really, really powerful community mechanism for that, and GitHub really only did Git. But I would argue that Git is really cool because it's free and open source. And because it's free and open source, and it has that community mechanism around it, we don't need to compete with it. We don't need to try to make money on this. Instead, we can build collaborations with it, and mechanisms working with it, and build up the community together. My thought as well was GitHub. Yeah, it's the thing that brought not just the server infrastructure to privately have code. It brought the community, and it brought the flow that allowed people to collaborate once they've proven they have something to collaborate on, right? Here's my PR. I've already shown you the thing that's amazing that I want to offer up to you. Oh, that does look amazing. Thank you. Who are you? Let's talk about this, right? 
It's a different conversation than: I've never seen you, why should I give you write access to Flask in SVN? And it's exactly that. GitHub has these magic levels to it. At the very first level, it is just an online source code repository system. And so, you know, how is that different from SourceForge or TFS before it? It isn't, at this level. And so if that's what you're using GitHub for, then that's perfect. You know, back up your local projects up to GitHub, get your content off of your machine in case there's a disaster. That is definitely the first level. The next level starts to build workflows around it, where we can say, I want to create issues, I want to create project management things, I want to create milestones, I want to create goals. And so that's kind of the next level. Leveling up again, we can start to create a social community around that, where we can start to have conversations around the content, where I can create a diff, and we can all talk about it, and we can collaborate on it. And once it's good enough, now we can pull it in. Add to that, then, the mechanism around pull requests and things like that. Git has a concept of push and pull. You know, publish and receive, I guess, might be the terminology that matches here. And what's interesting about a pull request is: I don't have write access to your repository, but I want to contribute. So instead of pushing my content to you, I'm going to request that you pull it from me. And so no longer do I need to create this email and write out all the content and hope you read the email. I create this code up in my space, and I request that you include it in your space. And that made collaborating with projects really, really easy. So with that comes the next level of GitHub, where we have these communities that can socialize and develop and hang out in this coding space. 
And that's really what made GitHub so magical: we have this community around coding, where previously with SourceForge or other environments, yeah, we had that online source control system, but we didn't have those levels of interaction. So pull requests, or merge requests, or whatever you're gonna call it, is that mechanism of being able to collaborate in low-trust types of environments. I want to offer up my solution to the community and see if that's gonna fit into this ecosystem. Yeah, I think that's why Git won as well. And Vera rose out there says open source is the best way to learn and improve technically, and to collaborate with people you don't know. Yeah, I think it's that, the people that you don't know. It makes it special because it allows you to create these connections with people all around the world who you would otherwise not meet, and you get a chance to work with them. Even if you live in rural Canada and you want to do software development, maybe no one around you is

25:00 really good at whatever you're trying to do, but go to GitHub, find a project, and you can collaborate with the best people in the world on that. We can create these communities around our passions for technology, or the problems that we want to solve, not necessarily based on the geographic boundaries that we find ourselves in. Yeah, absolutely. All right. So that's the history of it a little bit. Yeah, we talked a little bit about why distributed source control is really powerful. And I think it has really unlocked open source in a special way, and on a much larger scale than it had before. And it is interesting to note that Git and GitHub are different. GitHub uses Git under the covers to be able to build its social experiences. But Git is a thing that is separate and distinct. There is no pull request concept in Git, for example. And with Git on your local machine in a cave, you can version and create those safe points. When you're ready to socialize, to publish your content, to communicate with your team, you can use Git together with lots of services: GitHub, or Bitbucket, or GitLab, and there are lots of private services as well that allow you to create that online community. Now, GitHub has published their magic sauce to the world, and lots of us have cloned it. So GitHub is still the place where we code, for the most part. But if you prefer coding in another community, then that's totally fine. You can still use Git and all of the tools to be able to version, create your safe points, and publish that content to others. You just publish it to a different server. Git and GitHub are different. It's easy to see them as the same thing. But yeah, they're absolutely not. Right, we've got all these different locations. You know, I've got mixed emotions, mixed feelings, about if you have another project and you put it somewhere else. I'm not going to name any particular service, but let's just say somewhere that's not GitHub. 
That's totally good. But at the same time, so much of the Open Source flow is around 'GitHub' and the stuff that's happening there, it's I don't know, it's really interesting to think if why you might be at one place and not the other place, and so on. Yeah. And a lot of people were worried when GitHub was bought by 'Microsoft', is this going to be the end of the community collaboration. And I think 'Microsoft' has been a really good steward of the 'GitHub Community', and really making sure that 'GitHub' is still available to all of us and facilitating the success of that ecosystem. Yeah, there was a lot of hesitancy and concern within certain communities. And I feel like they've done a great job. I guess what I didn't realize was that GitHub really needed some help from somebody, like they financially they were not doing as well, I looked in like, this place must be incredibly successful. But you know, what came out after when some of the reports and stuff that was you know, it was kind of important that someone came along. And if that's the case that I was, you know, head over heels that Microsoft bought them. Last thing I want to see is them go away, and I think they've done a good job of just letting them be alright. Yeah.

28:00 It's working really well. So I think it's been a good deal that worked out there. See also Docker, for an example of a community that is amazing and contributing, but doesn't have a financial business model to be able to survive. Yeah, yeah. Hopefully things go well for Docker, but it's just tricky. They tried the enterprise thing, and then, yeah, they're switching to other things. Yeah, I love their pivot back to focusing on developers and the community, which is wonderful. But I still feel like they haven't found their spot that allows them to be successful as a business. And the hard part is, you can only do that for so long, and then you need to, you know, pivot to something that can start to facilitate the business. Yeah, absolutely. All right. Are you ready to go into the .git folder and find where the hidden magic lives? Yes. If you go into a project that you've git cloned, or you've git init-ed, and you create some files, and you mess around there, you don't see anything different. It looks just like any other folder that might have a project in it, right? But in there, actually, is contained almost the entire backup, the entire contents of all the versions of those files, at least every branch that you've checked out, hidden in the hidden .git folder. So, .git. And it's not almost: that is the entire history of the project. So the way to back up a Git database, misusing that term, is to grab that .git folder and copy it. Inside that .git folder are lots of files that describe the history of the project, from its inception down to the current version. And so, you know, kind of the only way that you can break Git is to open up that .git folder and change stuff. By default, this folder is hidden on most systems, so you may have to show hidden files and folders to be able to see the .git folder. But it's there, and it's really powerful. Yeah, so

30:00 on Windows, you go to Explorer, and in one of those little ribbon things that drops down, there's a checkbox for showing hidden folders and files. On macOS, I learned you can hit Shift+Command+. and that will show hidden files. That I did not know, and it was very delightful when some users told me about that. On Linux, I don't know. I mean, you could go and do an ll in there on the terminal, but there's probably some way to show it in the Explorer equivalent as well. You can navigate into it from your terminal or wherever. And once you're inside of it, yeah, all of the files are right there. Yeah. So we go in here, and we find things like HEAD, config, description, hooks, index, info, logs, objects, packed-refs, and refs. You want to maybe give us a rundown of what each one of these is? And then we can dive deeper, with one of the tools that you built, into maybe some things like refs and so on. But yeah, hooks, wherever you want to start. What's cool in this database is that it is the entire history of your project, and it's zlib-compressed. So for example, for the 20 year history of Perl, the .git folder is ever so slightly larger than the checkout folder. And that includes the entire history, including all of the changes and all of the authors. All of that is really nicely compressed into this. Well, it breaks down into a couple of groups of things. We have the content. We have branches and tags, you know, references to the content. We have configuration details around this repository. We have index files, we have temp files, and then we have automation tools. And so these are kind of the groups of things that we'll find in this folder. A lot of them happen to be in their own folder, which is really nice. So for example, hooks is the place that you go for automation. 
'refs' is the place where all of the content is... no, 'refs' is the place for branches; 'objects' is the place for the content. And so a lot of the things that we'll see will have their own folder, but some of them spill out. Configuration is in the 'config' file, but there's also some stuff in the 'info' folder for that. For indexes, we've got the 'index' file right there in the root directory, but we also end up with index files inside of pack folders. And so, you know, it gets a little bit interesting. The first one to dive into is probably the 'objects' folder, because this is the stash of all of the content in your repository. Now, as you commit something into Git, you'll first add it to the staging area, and then you'll commit it with a message. And as you do so, you'll end up with content inside the 'objects' folder. Now, what's interesting to note here is that if you look at a 'git log', you'll see a hexadecimal thing; it might be seven characters, or it might be much longer than that. Inside the 'objects' folder are folders with two-digit names. Those are the first two digits of the commit number, and inside each folder are all of the commits that happen to start with that two-digit number or letter. So that means that not all of the files will be in one directory; they'll be spread out a little bit, which gets around too-many-files-in-one-directory errors. But it's that 'objects' folder that then stores all of the content. Now, what's interesting, and that's where this talk was really cool, is that when I commit one thing and I go look in that 'objects' folder, I will have three different files. Now, they are zlib-compressed. Yeah, you can't just open them up and look at them, right? They're kind of scrambled up. Right, but it's not magic. You know, I built a tool that will un-zlib-compress one, which is pretty cool.
But once we identify a thing that we want to look at, we can also use 'git cat-file'. 'git cat-file' allows us to look at both the type and the content of a particular node. These are directed acyclic graph nodes, DAG nodes, that specify relationships between these things. But what's cool is that they say: here's a file that's in that branch, something like that. They're not branches, but they are folders. Here's a file within a folder. Here's the content. So we have three different types of these nodes. One is a commit. In the commit, we have the author's name, the date that it was committed, and the message that we gave, and also in that commit is a reference to the tree nodes that are part of that commit. Now, each tree node can specify files or folders, so a tree node can reference another tree node. And inside the tree node, we have references to those files.
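The storage scheme described above (a hash as the name, the first two hex digits as the folder, zlib for compression) can be sketched in a few lines of Python. This is a minimal illustration rather than Git's own code, and the function names are made up for the example. It assumes the classic SHA-1 object format, in which a full object name is 40 hexadecimal characters and is computed over a small header (type, a space, the byte length, a NUL byte) followed by the content.

```python
import hashlib
import zlib
from typing import Tuple


def _with_header(obj_type: str, content: bytes) -> bytes:
    """Prefix content with the '<type> <size>' header and a NUL byte."""
    return f"{obj_type} {len(content)}".encode() + b"\x00" + content


def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 name Git would give an object with this content."""
    return hashlib.sha1(_with_header(obj_type, content)).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Path (relative to .git) of a loose object: two-digit folder, then the rest."""
    return f"objects/{object_id[:2]}/{object_id[2:]}"


def encode_loose_object(obj_type: str, content: bytes) -> bytes:
    """Bytes as they would sit on disk: header plus content, zlib-compressed."""
    return zlib.compress(_with_header(obj_type, content))


def decode_loose_object(raw: bytes) -> Tuple[str, bytes]:
    """Undo encode_loose_object: return (type, content)."""
    header, _, body = zlib.decompress(raw).partition(b"\x00")
    obj_type, _size = header.decode().split(" ")
    return obj_type, body


blob_id = git_object_id("blob", b"hello\n")
print(blob_id)                     # ce013625030ba8dba906f756967f9e9ca394464a
print(loose_object_path(blob_id))  # objects/ce/013625030ba8dba906f756967f9e9ca394464a
```

Because the name depends only on the type and the bytes, identical file content always maps to the same blob, which is why the same content committed in two directories is stored once. In a real repository, `git cat-file -t <sha>` and `git cat-file -p <sha>` show the type and the content without any manual decompression.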

35:00 So I might have a tree node that references file '1.txt'. The third type is a blob, and the blob is the actual content in the thing. So go back and click on a file... I don't think I have a way to get back to the blob. But the cool part about this app: hit 'Refresh', and you'll get to that big blob of stuff. Here are all of the commits in this repository up to HEAD, so we have something visual to look at here, and it's about to pull up. It's rendering. Yeah, it's not super performant. That's all right. You built this thing called 'git-Explorer', which is a little web app that you point at a Git repository, and it lets you look at these things that you're describing visually and then click around on them, right? Right. So click on 'Show type', and we see the three different colors emerge: there are commits, trees, and blobs. And it's like, okay, I have a whole bunch of files in my 'objects' folder, and I can click on each one, and it'll use that 'git cat-file' thing to go figure out what it is. But, you know, I really wish I had more information about it. And so that's where I click 'Alphabetical', and that will put them all in order. Click on 'Tags', and now you can see the name of that thing. I'm only showing the first seven digits of the commit here, but now you can kind of get a sense for all the objects, and click on each one and open it up, right? And these names are what often go by 'SHAs' in Git parlance, which is just the type of hash, SHA, whatever it is. And I don't know how many people know this, but you can use sub-pieces of the SHA to refer to it in Git. So you don't have to say the full, I don't know, what is that, 32 characters or whatever to name it; as long as it's enough to be unique, you can issue commands against these things in this abbreviated form. Right? Right, exactly. So oftentimes only two digits are necessary, sometimes three or four.
And that's why often, when you're looking at Git history, it'll only show you the first seven. Sure enough, yeah. Now, as we're clicking through this, we start to get a feel for all of these nodes: the green nodes are the content in the files, and the blue nodes are the tree nodes. As I click on one of those blue tree nodes, it references other files; I can see their SHAs, their Git hashes, there in that list. And then as I look at the red ones, the commits, there's my commit message, and it includes the parent node, the commit right before this one. It also references the tree node that has the files for this commit. And so wouldn't it be nice if we could, I don't know, arrange them in a way that shows that? So instead of alphabetical, let's click on 'Parent child' and start to see the relationships. We'll need to turn on lines, and we probably also want to turn on tags. And now we can take a look at those commits and see how each one references the others. Now, if you have a very large repository, it shoots off the screen, and I haven't built scrolling yet. Sorry. Yeah, but you can see that the red commit nodes all reference each other, each referencing the previous one. And then they go into these tree nodes that may reference other tree nodes, and eventually those reference the file nodes. Part of my demo highlights that if we create the same file content and commit it in two different directories, it's actually only one blob on disk; there's only one green blob node. But the cool part here is that we were able to explore each of these objects in our repository and get a feel for how they work. So if I change one line in a very big file, what gets committed? Well, the entire file. Yeah, that's what I suspected. That's probably why large binary files are not ideal to commit here, even though we technically can put them there. Right? So that's the first group of things, these objects. So that's the top-level 'objects' folder in the '.git' folder?
Yeah, yes, exactly. Now, there is a 'pack' folder inside there. If you run various commands, Git will say, well, do I have too many commits, too many of these objects, that I need to pack together to make this repository smaller on disk? And if so, it'll automatically do a GC, a garbage collect, where it starts to pack those into pack files. Now, it's kind of a zlib-compressed group of zlib-compressed files, so it gets very meta there, but that's what the 'pack' folder inside the 'objects' folder is. Okay, so next up, let's talk about the 'refs' folder. When we look at refs, we look at branches, tags, and remotes. These are files that reference commits. One example is the 'HEAD' file in the root of the '.git' directory. Inside that 'HEAD' file, it will specify

40:00 What 'HEAD' points to. So if you do a 'git log' and you see that 'HEAD' has an arrow pointing to, I don't know, 'main' or 'trunk' or 'develop' or whatever, then if you open up that 'HEAD' file, you'll see that in the text of that file. It's basically the SHA, right? Is that what it is? It is the SHA if your HEAD is pointing at a SHA, but typically your HEAD won't be pointing at a SHA; it will be pointing at a branch. Oh, yeah, it's a ref. Like mine right now is 'refs/heads/main', which is the default branch for this project. So that's awesome: 'main' being the branch, 'HEAD' says it goes to 'refs/heads/main'. So we can go into the 'refs' folder, we can go into the 'heads' folder, and we can open up 'main'. And what's in 'main' is the SHA of the commit that 'main' points to. Okay. What's cool here is that each of these refs, both 'HEAD' and all of these branches, are just pointers to the commits in the 'objects' folder. Yeah, so the 'main' file is just a text file that literally has only the SHA of where that branch currently is. Exactly. Okay. So technically, to create a branch, I just create a file in 'refs/heads', I name it something, and I give it a SHA. And now I have a branch that points at that thing. Branches in Git are not these durable, fragile things like in TFS or in Subversion. Branches in Git are just name tags; they're pointers, they're references to the commits in this tree of objects. So the cool thing is we can move them around. They're basically named paths of commits through the overall history of it, right? Right. They're the labels that we give it so that we can understand it, because communicating in 32-digit SHAs is not as much fun. No, definitely not. Definitely not. In one of the talks that I like to do, I do a 'git log', and I show that 32-digit hash, and I read it out, and then I walk up to somebody in the audience and pretend they're the project manager, and I go, can I ship it?

42:07 And they're like, yeah... Thus, we have these labels. Yeah. Now, in the 'heads' folder are all of the branches, and in the 'tags' folder are all of the tags. They're also just files pointing at commits. Sorry, my repo is empty; it doesn't have any. But, you know, people might tag a release or a version or a beta version or something like that, so you can refer to it by name, by label, instead of, you know, 'main' with a SHA or something weird like that. Right, right. And then we have our 'remotes' folder, which references where I last saw another copy of this Git repository's branches. Yep. So in this case, you have one that says 'refs/remotes/origin/main'. And that's perfect. That's where I last saw the server's 'main' branch. Now, in this case, I chose to call my remote server 'origin'. This could be a server that we've designated as the server, it could be one of my coworkers, it could be a network share. You know, Git isn't really opinionated about what constitutes a remote repository, other than that it isn't this one. Yeah. Okay. How does it know what 'origin' is? As I create a remote, I'm going to name it. Okay. So as I clone, I'm going to say 'git clone' this repository, and it'll build one and by default call it 'origin'. But I could also say 'git remote add origin'; I just gave it a name, and then I give it a URL. I could say 'git remote add upstream'. I could say 'git remote add michael'. Now it's a reference from my repository to yours. And so in this case, in the 'refs/remotes' folder, it's just a folder referencing the branches that I saw on your machine.
Is there somewhere where it stores, like, the URL? It does, and that is the next section that we may want to look at, which is configuration. Let's open up the 'config' file in the root of the '.git' folder. All right. Now, this configuration file is really cool. It includes all kinds of configuration details associated with our repository. In this case, we have 'remote origin', where we've named this one, and here are the URLs that we go to there. In this case, it's github.com/talk Python. We have other configuration details associated with this repository. This config file is actually one of three on my machine. We'll start out with the config file that's installed when we installed Git. So it's probably in Program Files, or it's in /usr/local/bin, or, you know, somewhere off in the ether of how we install

45:00 We probably don't want to touch that one, but that's the base configuration of all the options that we chose when we installed Git. So if I run a command, if I were to say something like 'git email global', something like that, you know, with the '-g' option, maybe it's modifying that one? Well, the one that we just talked about was the system one. The second one is the global one, which is user-specific. I find that confusing. Yeah. But my user-specific one, the '.gitconfig' in my user home directory, so, you know, C:\Users\Rob, or the tilde directory ('~/') on Mac and Linux, that '.gitconfig' overrides any settings in my system configuration. And so oftentimes, when you first install Git, you'll say 'git config --global user.email' and 'user.name'. And so if you open up that '.gitconfig' in your user home directory, you'll see those settings: you'll see your name, your email, and all of the details that you've configured there. And then the third one is the 'config' file here in your repository, which will override any of those settings. So it doesn't make sense for us to have 'origin' in our system or our user-specific configuration file, because, well, each repository will have a different origin. But it probably does make sense to put our name and email in our user-specific file, because that would apply to all the repositories on our machine. Yeah, absolutely. Almost all of them. You might be doing home-based, open-source work, and you might be doing corporate, buttoned-up work, and your formal corporate place might not love your corporate email on the open-source project. Maybe. Yeah, exactly.
So when I have that scenario, where I need to set my email address, and maybe my name, differently in different repositories, I can set it in my '.gitconfig' in my user home directory, and then I can override it in each repository. Just copy those couple of lines and set them in your config file here, and now you've set this repository to track your email differently. Is there a Git command to change it, so I don't actually go into the '.git' folder? I'd say, like, 'git email', but not global? Or 'git config'? Yeah, leave off the '--global'. Yeah. Okay, perfect. Then you don't even have to know how; you just know, I do 'git email', and what my email is, right.
Now, there are other configuration files here in the '.git' folder, but the 'config' file is really the big one that we like to talk about. Okay, so you'll see a 'description' file here. That's a configuration file for 'git instaweb', a web server baked into Git that allows you to kind of browse through your repository. Now, 'git instaweb' works pretty well on Linux, and not so great on Windows. I bet you've never used instaweb in most scenarios. I'd never heard of it until you brought it up the other day. Yeah. But this 'description' file is the name of the website when you launch 'git instaweb'. So Git ships with a web server that can be the host of that Git repository? Yeah. Now, why would I ever do that? Why wouldn't I just use GitHub? Exactly, which is why you've never heard of 'git instaweb'.
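The three-level override order described above (system, then user-level "global", then the repository's own 'config' file) behaves like a layered lookup. Here is a minimal Python sketch of that idea. It does not read real Git config files; the three dictionaries and every value in them are hypothetical stand-ins for the settings each file might hold.

```python
from collections import ChainMap

# Hypothetical settings standing in for the three config files.
system_config = {"core.autocrlf": "true"}
global_config = {"user.name": "Rob", "user.email": "rob@home.example"}
local_config = {
    "user.email": "rob@work.example",
    "remote.origin.url": "https://example.com/repo.git",
}


def effective_config(system, global_, local):
    """Layer the three levels so the most specific one wins."""
    # ChainMap searches its mappings left to right, so local comes first.
    return ChainMap(local, global_, system)


config = effective_config(system_config, global_config, local_config)
print(config["user.email"])     # repository-level value overrides the user-level one
print(config["user.name"])      # not set locally, so the user-level value is used
print(config["core.autocrlf"])  # only set at the system level
```

In a real repository, running `git config user.email <address>` without `--global` writes to the repository's own 'config' file, which is the override the conversation describes.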

48:24 Yeah, I mean, you might say we want a private Git server, or a public Git server, or something like that. That might be it. But yeah, I'd never heard of it. So, very interesting.
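To recap the refs discussion in code: 'HEAD' is a small text file that usually names a branch ref, and the branch ref is a small text file holding a SHA. The Python sketch below follows that chain by reading the files directly. It is an illustration under simplifying assumptions: the function names are invented for the example, and it only looks at loose ref files, so a branch that Git has moved into 'packed-refs' would not be found. In practice, `git rev-parse HEAD` and `git branch` are the safe ways to do this.

```python
from pathlib import Path


def resolve_head(git_dir):
    """Return (ref_name, sha) for HEAD; ref_name is None for a detached HEAD."""
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Typical case: HEAD names a branch, e.g. "ref: refs/heads/main".
        ref_name = head[len("ref: "):]
        sha = (git_dir / ref_name).read_text().strip()
        return ref_name, sha
    # Detached HEAD: the file holds the SHA itself.
    return None, head


def create_branch_file(git_dir, name, sha):
    """Create a branch the low-level way: a file under refs/heads holding a SHA."""
    ref_path = Path(git_dir) / "refs" / "heads" / name
    ref_path.parent.mkdir(parents=True, exist_ok=True)
    ref_path.write_text(sha + "\n")
    return ref_path
```

Calling `resolve_head(".git")` from the root of a checkout returns something like `("refs/heads/main", "<40-character sha>")`. Writing ref files by hand is shown only to make the point that a branch is just a pointer.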

48:36 All right. What else is in this list here? Yeah. So we've talked about the content in the 'objects' folder. We've talked about the branches and tags in the 'refs' folder. We've talked about configuration. Let's go poke in the 'hooks' folder. Yeah, 'hooks' is interesting. Yeah, it is really cool. The 'hooks' folder is where we do automation. Yeah, so people have probably heard of pre-commit hooks, right? Probably the most popular example in the Python space is to run the Black formatter, so it automatically formats your code before it checks it in. So indentation, whitespace, like between a comma and an argument or something, is always consistent, and you don't get these back-and-forth, editor-driven merge issues where there's no real change: I format in my editor, you format in yours, and back and forth it goes between space after the comma, no space after the comma, space after the comma. And so you could set up a pre-commit hook to canonicalize it before it goes in, right? But there's more than pre-commit, right? Yeah, I could set up a pre-commit hook to make sure all my unit tests pass before I commit. And so what we see here in this 'hooks' directory is all different kinds of automation things: a pre-commit hook, a pre-merge hook, a pre-push hook, a pre-rebase hook. And each of these is a shell script. Well, with one exception; it's a Perl script. But you see at the very top

50:00 It says '/bin/sh'. What if I'm on a Windows box? Is this shell script still gonna work? Oh, yeah. Git ships with enough Linux-y, Unix-y, bash-like stuff to be able to kick off the shell scripts and run them as it would on any Linux system. Okay, interesting. So there's basically a little mini bash that comes with it. I remember people using that bash shell from Git to be more Unix-like on Windows. Exactly. So here in this shell script, I could do all kinds of things. Maybe I'm calling a PowerShell script, maybe I'm calling a Python script, maybe I'm calling a Node formatter; I can just call into whatever task I want to accomplish, and that task will then run whenever this event happens. So what I love to do in my demo is remove all the '.sample' pieces so that they're actual scripts, and then merely the presence of that file will kick off that automation. All right, so there's a bunch of files that are sample shell scripts, named things like 'pre-commit.sample' or 'pre-merge-commit.sample'. If I just call it 'pre-commit', without the '.sample', now it's going to be active? Exactly. Okay, nice.
Now, the cool part about these is that I have all my automation set up, I'm running the formatters, I've got my unit tests passing, and it's great. But this file is inside my '.git' folder, so I can't commit it. It's not one of the files that is available for me to add to the staging area. Right. Yeah, bummer. It would be inception if you tried to commit stuff in the '.git' folder, right? So often, we'll create shell scripts outside the '.git' folder and commit them, and then have something here inside the '.git' folder that calls into that other shell script. Yeah. And you mentioned some kind of Node-based tool that you can use that will manage that stuff, right? Right. There are lots of packages. The one that I show is 'git-hooks', which is an npm package. Once you install 'git-hooks', it will actually create all those aliases, from the folder where you actually build the scripts that you can commit, into this 'hooks' directory, so that then they'll run. Just installing this package installs those hooks into place. I see. So basically, if you just install the package once, it will find those other external scripts and make those be the ones that Git sees, with the advantage that you can commit them into GitHub, and if somebody makes a change, that change will propagate to everyone else. Yes. You can commit them into Git, push them up to GitHub, and they will run. Okay, yeah, fantastic. Very neat.
Okay. What else have we got here? I think maybe 'index'? Maybe that's an interesting one. Yeah, 'index' is really interesting. If we just pop 'index' open in an editor, it's just a bunch of gobbledygook, and we're like, what is this? It's a file, right? Yeah, yeah. This isn't the only index, but this is one of the really cool indexes where Git keeps track of interesting stuff. That's nice. Yeah, check out how this blew up when I tried to look at it. It's like a binary blob exploded and died on my terminal. But there are file names in there somewhere, so it must be something to do with that. Yeah, I think it's 'git ls-files' that lets you look through this index. And if we pass in flags to that, then it will be able to show the status of those files. But this is looking through that index.
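The 'index' file looks like gobbledygook in an editor because it is a binary format, but its first 12 bytes are simple: the 4-byte signature "DIRC", a 4-byte version number, and a 4-byte count of entries, all big-endian. The Python sketch below reads just that header. It is a small illustration, not a full index parser, and the function name is made up for the example.

```python
import struct


def read_index_header(data: bytes):
    """Return (version, entry_count) from the first 12 bytes of a Git index."""
    if len(data) < 12:
        raise ValueError("too short to be a Git index file")
    # ">4sII" = big-endian: 4-byte signature, two unsigned 32-bit integers.
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError("missing DIRC signature; not a Git index file")
    return version, entry_count
```

A usage example would be `read_index_header(open(".git/index", "rb").read())` from the root of a checkout. For the per-file details, `git ls-files --stage` prints the mode, object name, and path of each entry without any manual parsing.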
And the cool part about looking through that index is this: if Git wants to do a quick thing, like work out which files have changed, it needs to know the blob that is checked out in my working directory, you know, which blob did I start with? As we looked through those objects, we saw a big tree of things. And so opening up each commit node, finding all the tree nodes, opening up each tree node, finding all the blob nodes, that takes a while. And so this is a cache, an index of all the files that I checked out in my working directory. This allows Git to move really fast as it looks through my folder and identifies any files that have changed, or new files, or things like that. So that's what this index file is for. Yeah, and my Git incantations are not pulling it up here, but I think you can get it to show the SHA of each file as well, right? Right. In which case, instead of traversing the whole history, I could actually look at the file on the hard drive and say, well, what is this hash? Do I have an update for this file? Or I could just look in this binary file and get that answer. Right, exactly. Nice. Yeah. The next section of files that we want to look at is logs. And the cool thing about Git's logs is that they keep track of where all of our branches have been. So if we open '.git/logs/HEAD', then we get a thing that looks really weird. We've got really long lines in this. And in our first line, it says a whole bunch of zeros, a space, and then we've got

55:00 the Git SHA of the commit that it went to and a little bit about that commit. This is a log of where our branches have been, and so we'll have a file for each of our branches. In this case, we're looking at that 'HEAD' file. So we see that 'HEAD' started out nowhere and ended up at EDI13FC, and it has my username, my email, and then some other stuff. Yeah. The really interesting thing is that this log can be really useful if, for example, I switch branches and forget where I was, or I commit something and then I uncommit it (that's a thing) and I want to get back to it, or I delete a branch before I merged it in, or, you know, those types of things. If I do that quickly enough (remember, Git's gonna do that garbage collect and go prune nodes that aren't used anymore), I can use this log to go back through my refs and go find that commit. The objects are still there; I just don't have any refs pointing to them anymore. And so the command that we can use on the command line is called 'git reflog'. We can pass 'git reflog' a particular branch we want to look at, but by default, if we just say 'git reflog', all one word, then it will show the history of 'HEAD'. Now, in this case, we didn't move it very far. But we can see there, oh, here's the branch that I just deleted, and here's the SHA for this one. And so at that point, we can 'git checkout' that commit and get back to the content that we had created and then lost the reference to. Right. Okay, nice. There's a little bit of recovery, kind of an undelete if you had to, in there. Yeah, nice. The funny thing about this: the command is 'git reflog', ref log, but I've also heard it pronounced 'git re-flog'. And I'm like, so I've got this cat-o'-nine-tails? No, you can't git re-flog. Exactly. Do it again! But once you understand how the 'refs' folder works, then 'git reflog' makes a whole lot of sense. We're looking at what those ref files have said in the past.
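Each line in '.git/logs/HEAD' follows the shape described above: the old SHA, the new SHA, who made the change and when, then a tab and a message. The Python sketch below splits one such line into its parts. The sample line is invented for illustration (the name, email, timestamp, and SHA are all hypothetical), and the function name is not part of Git.

```python
ZERO_SHA = "0" * 40  # what the log shows when a ref "started out nowhere"


def parse_reflog_line(line):
    """Split one line of a file under .git/logs into its fields."""
    meta, _, message = line.rstrip("\n").partition("\t")
    # The first two space-separated tokens are SHAs; the rest is identity and time.
    old_sha, new_sha, who_and_when = meta.split(" ", 2)
    return {
        "old": old_sha,
        "new": new_sha,
        "who_and_when": who_and_when,
        "message": message,
        "created": old_sha == ZERO_SHA,
    }


sample = (
    ZERO_SHA
    + " 1234567890abcdef1234567890abcdef12345678"
    + " Rob <rob@example.com> 1617235200 -0700"
    + "\tcommit (initial): first commit\n"
)
entry = parse_reflog_line(sample)
print(entry["new"][:7], entry["message"])
```

The "new" column is what makes recovery possible: once the SHA of a lost commit is found here, `git checkout <sha>` (or creating a branch at that SHA) brings it back, provided garbage collection has not pruned it yet.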
Here's what it was before we changed it. Here's what it became after we changed it. So it's a little bit more context. Where you're currently working is where the HEAD is pointing, and often that's some branch. Yes. And this is like, what's the history of where that's been, throughout the branch that it's on? Yes. Yeah. Very cool. So we're getting sort of short on time here. What else should we be talking about? What else should we close this out with, in terms of the contents of our '.git' folder? The only other section in here is temp files. So if we've committed stuff, we might see a 'Commit_msg' file, or maybe it's called commit underscore message. We might see other temp files; we have a 'temp' folder sometimes baked into things. And so that's the last group of files here in the '.git' folder. Temp files, configuration, objects, refs, hooks: these are all the pieces that come together to make this Git database. And once again, you really can't break Git. You know, it's like, well, I did this incantation, and it's broken. Oh, no: you can use reflog to get back to a particular commit, or you can use various commands like 'checkout' to get back to where you need to be. Maybe you'll use 'reset' to, you know, kind of get your working directory back in shape. But that structure of Git, the double-entry bookkeeping inside this repository, is really good at keeping track of things, and so you really can't break Git. Yeah. And to back this up, you back up this folder, and you've backed up basically everything, right? Yes. Now, it might be easier to back it up not by just backing up this folder, but by publishing your changes to another repository. And that's where we have great workflows, like, I will push all of these changes to another server; maybe I'll call that server 'origin'. Yeah, absolutely. And that's automatic if you check out, like clone, from somewhere like GitHub, right?
So there's just a couple of other things maybe I want to touch on really quickly while we have a moment. When you talked about breaking Git, there's an interesting little thing called 'Dangit, Git'. Or even better, maybe I'll link to the better version, the not-safe-for-work version, where you're frustrated, and it's like, oh, no, I just did something terribly wrong, please tell me how to fix it. Reflog is right at the top of these things. Then: I committed and immediately realized I need to make a change, or I need to change my commit message. And yeah, anyway, that's a pretty interesting one.
Another thing: we've talked a lot about GitHub, and what we haven't really talked about is '.gitignore', right? As much as you want to track stuff, you don't want to automatically track a bunch of things that are working files, you know, build output from C++, or maybe stuff under node_modules, or PyCharm working files.

01:00:00 All sorts of things should not go into your project, right? Your venv directory? Yeah. Yes, exactly, your venv directory. Absolutely. So yeah, there's '.gitignore'. Any content that you download, any content that you compile, any of that content shouldn't be in your repository, because it changes too infrequently, and it's usually easier to either rebuild it or redownload it. All those things should be ignored. Yeah, it's a huge merge nightmare as well, even if you could keep it, right? Suppose I check in my venv directory, and you go on Windows. Well, you can't have the same contents as mine, because mine is the macOS version. So you change it, put your Windows version in there, and I get it back out and it breaks my Mac version. So right, there's stuff you should ignore. Absolutely. And when you create a new project on GitHub, it very handily says, hey, what kind of project is this? We can get you far down the road with your '.gitignore'. Is this a Python project? Is it a Node project, or whatever? Right. What I wanted to point out is that behind that drop-down list, there's actually a GitHub project called 'gitignore' that has the ignore files for all of these different languages. So if you want to make a change to, say, Python's '.gitignore', you can go there and pull it up and see it. And you could technically do a PR against it to say, there's this new thing that's common in the community now, please fix it. That's pretty cool. And these things aren't perfect. You know, most of them will exclude everything that starts with, or ends with, or contains 'log'. But your 'ilogger' or your 'log handler', yeah, might get excluded by that as well. So you may need to adjust this to get it the way you want. Yeah, but it is nice to know that at least it'll give you a bit of a start, and that it's a thing that you can contribute back to. It's not just magic inside of GitHub; it's its own open-source GitHub repository. Right. Yeah, quite neat.
Let's see, what else should we cover really quickly? I think maybe just one other thing that's worth throwing out there. It's pretty specific, but you've mentioned Windows a couple of times. Maybe two things, actually. One is that when you saw my screen on the show just a minute ago, when I was inside of a Git repository, it would actually show what branch it was on and the Git state and so on. And I have that because I have Oh My Zsh installed, which is a really nice shell for Mac and Linux that gives you things like branch awareness, the number of changes, and so on. You were talking about something like that for PowerShell and the new Microsoft terminal. What are you using for that? It's called Oh My Posh. And Scott Hanselman has a really cool video about Oh My Posh where he walks us through how to get it installed. There are various themes in Oh My Posh, but the theme that I really enjoy actually puts the cursor on the next line. One of the things that I frequently run into in the command prompt is, you know, I have the whole path to get to this folder, and so the command that I'm trying to teach ends up getting wrapped to the next line. And so Oh My Posh, or Oh My Zsh, gives you that additional context of how your Git repository is doing. You could also show your remote. It's basically just running a shell script behind the scenes, and so you can modify that shell script. Scott Hanselman is diabetic, and so needs to check his blood sugar a lot.
And so he has actually built his blood sugar number into his Oh My Posh script, because it's really easy to miss, and it's one of those things it's really important not to miss. So it's in his terminal all the time. Probably even color-coded, right? If it's out of range, make it red; if it's in range, make it green, something like that. Yes. Wow, interesting. Yeah, this looks fantastic. I've never played with this before, but it looks really nice. You recommend it? Yeah, I do. Oh, cool. All right. Well, I guess the one other thing that I was gonna throw out there is this thing I heard of called 'VFSForGit'. We talked about large files, and this sounds like it's very much a Windows-only thing, but it's a neat idea: this Virtual File System for Git. If you have a really large repository, it's kind of like the smart sync for Dropbox or something. It only pulls and interacts with the files that you actually touch, but it does that behind the scenes without you knowing it. Have you seen this yet? Yeah. And it's written 'VFSForGit', but it's VFS for Git, virtual file system. Yeah, it's great when your repository is just massively huge. 98% of our repositories are not, but when you have the code base of, I don't know, Windows, then you need something like this, because you can't 'git clone' the entire thing. GitHub... not GitHub, Google is famous for having their corporate monorepo, and I suspect that's bigger than you could 'git clone' onto each machine as well. And so the cool part is, one of the benefits of Subversion that we lost as we moved to Git

01:05:00 was that I could clone only part of a repository. And VFS kind of gives us that ability back. Most of the time, we don't need it. But if you've been really bad, and you've committed a whole bunch of binary files to your repository,

01:05:15 it's interesting, and it might be worth kicking the tires. It isn't necessarily Windows only; it is plugged into Git itself. But it allows you to put that checkout directory somewhere else. So for example, on a shared network drive, now I have all of those objects, all of those blobs, in one place, and I don't need to copy each of those to my machine. Yeah, interesting. The Windows people that were switching to Git said it was really a nightmare. So for example, the source code for the Linux repo is something like 600 megs, or 0.6 gigs. Windows is like 270 gigs. So it's really ginormous. And they said to do a clone took 12 hours, to do a checkout took three hours, to do a git status took eight minutes, and to do an add and commit took 30 minutes before they made this change. So they were suffering some hard pains to go down this path, for sure. I guess it probably is worth it for them. Right. All right. Well, I guess we probably should put a bow on it. We're more or less out of time there. But yeah, I'll ask you the two questions I always ask at the end of the show. If you're going to write some code, what editor do you use? It depends on the code that I'm trying to write. In most cases, I'll reach for VS Code, but I'll also reach for Visual Studio. Right, if you're going to be doing 'ASP.NET' stuff, like you said. Sometimes I'm also known to reach for... If you're doing ASP.NET like you were talking about, or maybe something like 'WPF' where the tools are built in, you basically have to, almost. Yes, but sometimes I also reach for Sublime Text or TextEdit, buku. And then I often ask for a Python package or library recommendation. Maybe we could make it your Git scripts, the one that runs the pre-commit stuff and moves that outside the .git folder. What was that called again? It's called Git hooks. And let me grab a link to it. It's actually a Node package. But, exactly, yeah, just install it wherever.
And it's good to go. Right? Yes. And so if you have maybe a 'Flask' server, and as part of it you have a 'React' or 'Vue' app where you need to pull down 'jQuery' as part of your client-side dependencies, then you may have enough Node stuff to be able to leverage this as well. Yeah, yeah, that makes a lot of sense. If you're already using npm because you're doing front-end stuff, then you might as well, right? Yes. Yeah, very cool. Other things that we didn't talk about, and it's really cool how this happened: 'Git Workflows'. What's beautiful about Git is that it isn't really opinionated about how you do your workflow. Are you going to do Git flow? Are you going to do GitHub flow? Are you going to do something else? Git can work for all of those scenarios, because it is just a mechanism for committing and sharing files. It doesn't impose a specific branching or naming convention. You can choose to put those on top, but 'git's workflow is really open to whatever you need it to do. Yeah. Well, when I was first getting familiar with this whole PRs and merging and those kinds of things, I felt like, oh, that's a 'git' thing. No, that's a GitHub thing. It has nothing to do with 'git', right? Git just facilitates that on top of it, so you can choose however you want to work. Right, right. Cool. All right. Well, I don't normally close out this show with a joke, but Robinson had a good one here in the live stream. So I'm going to put this up here for us as our parting thought, and then ask you for one more as well, maybe. Yeah, so he said a programmer once told him he couldn't use Git. He was afraid to commit.

01:08:46 Afraid of the 'git commitment'. Oh, that's awesome. Yeah, thank you for that. Thanks for making us laugh. Alright, final call to action. People want to go a little bit deeper with Git. Maybe they just do the three commands,

01:08:46 'git clone', 'git add', 'git commit', 'git push'. Well, that's four commands. Like, beyond that, how do you get more into this world? What's really interesting is, as we're coming off of those other systems, we want to kind of build up that tribal knowledge that we had. And so we're going to go grab those three or five commands, and we're going to stick them on the Post-it under our keyboard. Take the next step to go figure out, you know, what is the next command that I want to do? Or how does this command work? What we did today was explore that '.git' folder, so that we can take that next level to see how it works. 'Git' isn't a black box. It's not magic. It just works a little bit differently than the source control system you might have been familiar with. So definitely get familiar with it, Google the terms that you're looking for, and really start to embrace that mechanism and get really powerful with Git. I'm confident that you can get past just those few commands and you can make it just an inherent process in your workflow and use it to be really, really powerful. Especially

01:10:00 Specifically, separating the save points from the publish points. That's the thing you couldn't do before that you can now do with 'git'. Well said, I definitely agree with all of that. I think getting really good with source control, and source control these days really means Git, almost, allows you to be fearless with your code, right? So often people are like, I would like to try this, but what if I break it? What if it doesn't go right? Well, if you know how to, you know, create your branches, work locally, do all sorts of stuff, roll back, you can just go crazy and just explore things. And if it doesn't work, you know, throw it away. No harm, no foul. It's lovely. Then if you get really stuck, hit me up on Twitter '@rob_rich', and show me the code where you got stuck. And let's get you unstuck, because I would love to continue this conversation and really help you be successful. Awesome. All right. Well, thank you for taking the time and being here. It's been great to chat Git with you. Most definitely. Thanks for having me on. Yeah, see you later.
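
As a small illustration of the point that the '.git' folder isn't a black box, here is a minimal Python sketch of how a prompt tool could work out the current branch by reading '.git/HEAD' directly. It is not how Oh My Zsh or Oh My Posh actually do it; the function names `parse_head` and `current_branch` are hypothetical, and the sketch assumes '.git' is an ordinary directory (not a worktree or submodule pointer file) and that HEAD uses the usual `ref: refs/heads/<name>` form.

```python
from pathlib import Path
from typing import Optional

# When a branch is checked out, .git/HEAD holds a line of this form.
HEADS_PREFIX = "ref: refs/heads/"


def parse_head(head_text: str) -> Optional[str]:
    """Return the branch name stored in HEAD, or None for a detached HEAD.

    A detached HEAD contains a raw commit hash instead of a "ref:" line.
    """
    line = head_text.strip()
    if line.startswith(HEADS_PREFIX):
        return line[len(HEADS_PREFIX):]
    return None


def current_branch(repo_root: str = ".") -> Optional[str]:
    """Read <repo_root>/.git/HEAD and return the checked-out branch name.

    Assumes .git is a plain directory; raises FileNotFoundError otherwise.
    """
    head_file = Path(repo_root) / ".git" / "HEAD"
    return parse_head(head_file.read_text(encoding="utf-8"))
```

Calling `current_branch()` from the root of a repository returns the branch name, or `None` when HEAD is detached. Reading one small text file is cheap, which is roughly why a prompt can afford to show this on every command.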

01:10:55 This has been another episode of Talk Python To Me. Our guest in this episode was Rob Richardson. It's been brought to you by our courses over at Talk Python Training. Want to level up your Python? We have one of the largest catalogs of Python video courses over at Talk Python. Our content ranges from true beginners to deeply advanced topics like 'Memory' and 'Async'. And best of all, there's not a subscription in sight. Check it out for yourself at 'training.talkpython.fm'. Be sure to subscribe to the show: open your favorite podcast app and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. We're live streaming most of our recordings these days. If you want to be part of the show and have your comments featured on the air, be sure to subscribe to our YouTube channel at talkpython.fm/youtube. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code.
