Transcript for:
Intermediate Git Course - Tobias Günther

This is an intermediate Git course taught by Tobias Günther. He'll help you move beyond Git basics and improve your Git workflow.
Hello, freeCodeCamp friends, my name is Tobias, and I'm going to improve your Git knowledge today. There are a lot of beginner tutorials out there about version control with Git. But I will help you understand the concepts behind many things in Git: how to craft the perfect commit, how to choose a branching strategy, or how merge conflicts really work. So my goal is to give you more confidence when you work with Git, and to get you a little bit closer to becoming an advanced Git user.
Before we start, a huge shout-out to the freeCodeCamp team. Thank you so much for being on this mission of teaching people how to code, and thanks for letting me contribute a little bit.
A couple of words about my own background: I'm part of the team behind Tower, and Tower is a Git desktop GUI for Mac and Windows. We've been around for more than 10 years now and have helped over 100,000 developers and designers work more easily with Git, become more productive with Git, and make fewer mistakes. For today's session, you don't need to have Tower installed. You can follow along on the command line, no problem. All right, let's get started.
So let's talk a bit about how to create the perfect commit. The first part is to add the right changes, and the second part is to compose a good commit message. Let's start by adding changes to the commit. Our goal here is to create a commit that makes sense, one that only includes changes from a single topic. This is in contrast to the easy way, when we sometimes just cram all of our current local changes into the next commit. That is the bad thing we should not do. Being selective and carefully deciding what should go into the next commit is really important.
This is how a better commit could look, because it separates different topics. On the other hand, the bigger a commit gets, and the more topics are mixed into it, the harder it gets to understand, both for your colleagues and for yourself in the future. So Git's staging area concept is really helpful in this context: it allows you to select specific files, or even parts of those files, for the next commit. This is what the staging area can do for you. You can select individual files for one commit, and even parts of files for one commit, and leave others for a future commit.
So let's take a look at a practical example. Over the last few hours, or maybe even days, we've created a bunch of changes. Let's run git status here. But let's say that not all of those are about the same topic. So let's stick to that golden rule of version control: only combine changes from the same topic in a single commit. You probably already know that to include a specific file, we can simply type git add and the name of the file. So let's add that CSS file here. And voilà.
Let's take a closer look at another file, index.html, and see what changes it currently contains. We can use git diff for that, and we can see that there are two parts, or chunks, of changes at the moment. Let's say that the first one belongs to the next commit's topic, but not the second one. So let's just add the first part to the staging area. Let me just exit the output here. We can do that with git add and the -p flag. The -p brings us down to the patch level: we want to decide on the patch level what to include and what not. And we want to do that with index.html. So now Git steps through every single chunk of changes with us, and it asks us a simple question: do we want to add this chunk, or hunk, to the staging area or not?
Don't worry about all the other possible answers you can give in that situation. I don't know them, and I want to sleep at night. For us, a simple y for yes or n for no is sufficient. So let's say this one is actually the topic that we want to commit, so let's say yes, we want to include that. And for the second one, this is not the same topic, so let's leave that out of the staging area for the moment. If we now take another look at git status, we can see that parts of index.html will be included in the next commit, under "Changes to be committed", and other parts will be left for a future commit. So index.html is listed twice. Awesome. Crafting a commit like this, in a very granular way, will help you create a very valuable commit history, one that is easy to read and to understand. And this is crucial if you want to stay on top of things.
Now let's talk about the second part of creating that perfect commit, and that is providing a great commit message. We'll start with the subject line. Of course, conventions are different between teams, but generally the advice is to write something very concise, less than 80 characters if possible. The subject should be a very brief summary of what happened. And here's a little hint: if you have trouble writing something short and concise, this might be an indication that you've put too many different topics into that commit.
So let's go to the command line. I have a couple of changes staged again for the next commit. If I type git commit, I will get an editor window where I can enter a commit message. And we'll write something simple: "Add captcha for email signup". If we add an empty line after the subject, Git knows we are writing the body of the message, and this has room for a much more detailed explanation.
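As a rough sketch of the subject and body split, the script below passes the message on standard input instead of opening an editor. The file name and the body text are placeholders invented for illustration; only the subject line comes from the walkthrough.

```shell
set -e
repo="$(mktemp -d)" && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "<form>captcha</form>" > signup.html
git add signup.html

# -F - reads the message from stdin; the blank line separates subject and body.
git commit -q -F - <<'EOF'
Add captcha for email signup

Email signups now require a captcha to be completed.
This is a placeholder body for illustration only.
EOF

git log -1 --format='subject: %s'   # first line of the message
git log -1 --format='%b'            # everything after the blank line
```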
So here are a couple of questions you might want to answer with your commit message body: What's now different than before? What's the reason for the change? And is there anything to watch out for, or anything particularly remarkable about that commit? So I'll write my version of that in the text editor here. And voilà. Let's save and close this, and the commit is done. Let's take a quick look at git log, and we can see, okay, this is the last commit we just did. This is the subject, and this is the body of the message. By answering these questions, you're doing your colleagues and your future self a huge favor, because it helps to understand what exactly happened in this revision and what to watch out for.
Let's talk a bit about branching strategies. This is an important topic because Git leaves it completely up to you how you want to work with branches. It only provides the tool, but you and your team are responsible for using it in the optimal way. And this brings us to our first topic: conventions. If you work in a team, you need to come up with a clear convention on how you want to work with branches, and you need to write this down somewhere where it's accessible to everyone.
Why does your team need a written convention, you ask? Because Git allows you to create branches, but it doesn't tell you how to use them. You need a written best practice of how work is ideally structured in your team to avoid mistakes and collisions. This highly depends on your team and team size, on your project, and on how you handle releases of your software. Last but not least, it helps to onboard new team members. When new people join your team, you can point them to your documented convention, and they will quickly understand how branches are handled in your team.
When you think about how you want to work with branches, you automatically think about how you integrate changes and structure releases. These topics are tightly connected. To help you better understand your options, let's simplify a little bit. I'll show you two extreme versions of how you could design your branching workflows.
The motto of the first one is "always be integrating": mainline development. Always integrate your own work with the work of the team. That's the motto here, and this is how it could look. In this example we only have a single branch where everyone contributes their commits. This is a really simple example. I doubt that any team in the real world would have such a simple branching structure. But for illustration, this extremely simplified example helps us understand the advantages and disadvantages of this model. In an always-be-integrating model, you have very few branches, and this makes it easier to keep track of things in your project. Of course, commits in this model must also be relatively small. This is a natural requirement, because you cannot risk big, bloated commits in such an environment, where things are constantly integrated into production code. And this also means that you must have a high-quality testing environment set up. Again, the premise in this model is that code is integrated very quickly into your mainline, your production code. This means that testing and QA standards in your team must be top-notch. If you don't have this, this model will not work for you.
The other end of the spectrum is when multiple different types of branches enter the stage. Here, branches are used to fulfill different jobs. New features and experiments are kept in their own branches. Releases can be planned and managed in their own branches. And even different states in your development flow, like production and develop, can be represented by branches.
Remember that this all depends on the needs and requirements of your team and project. It's hard to say that one approach is better than the other. And although a model like this one seems complicated, it's mostly a matter of practice and getting used to it. As I already said, in reality most teams are working somewhere in between these extremes.
Now let's look closer at two main types of branches and how they are used. These two types are long-running and short-lived branches. The distinction between a long-running and a short-lived branch is one of the broadest you can make, and a very helpful one. So let's start by talking about the long-running branches first. Every Git repository contains at least one long-running branch, typically something called main or master. But there might also be other long-running branches in your project, something like develop, production, or staging. These branches all have something in common: they exist throughout the complete lifecycle of the project.
I've already mentioned one typical example of such a long-running branch: every project has a mainline branch like master or main. Another type of long-running branch is the so-called integration branch, often named develop or staging. Typically, these branches represent states in a project's release or deployment process. If your code moves through different states, for example from development to staging to production, it makes a lot of sense to mirror this structure in your branches, too. And finally, many teams have a convention connected to long-running branches: typically, commits are never directly added to these branches. Commits should only make it to the long-running branch through integration, in other words, through a merge or rebase. There are a couple of reasons for such a rule. One has to do with quality.
You don't want to add untested and unreviewed code to your production environment, as an example. That's why code should go through different states, tests, and reviews before it finally arrives on production. Another reason might be release bundling and scheduling: you might want to release new code in batches, maybe even thoroughly scheduled. Without such a rule, when code is directly committed to long-running branches like main, keeping an eye on what's released becomes pretty difficult.
Now the other type of branches are short-lived branches. In contrast to long-running branches, they are created for certain purposes and then deleted after they have been integrated. There are many different reasons to create short-lived branches, for example when you start working on a new feature, a bug fix, a refactoring, or an experiment. Typically, a short-lived branch will be based on a long-running branch. For example, when you start a new feature, you might base that new feature on your long-running main branch. After making some commits and finishing your work, you probably want to reintegrate it back into main. And after you've safely merged or rebased it, your feature branch can be deleted.
I've already said that branching strategies will be different for each team and project. It highly depends on your preferences, your team size, and your type of project. But I'd like to give you a glimpse into two pretty popular branching strategies, and you can take both of them as inspiration for your own individual branching strategy. Let's start with GitHub flow. GitHub advocates a workflow that's extremely lean and simple. It only has a single long-running branch, the default main branch, and anything you're actively working on is done in a separate branch, a short-lived branch, no matter if that's a feature, a bug fix, or a refactoring. So this is a very simple, very lean setup.
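The life of a short-lived branch in a setup like this can be sketched locally. This is a minimal illustration, not GitHub's own procedure: the branch name feature/login and the file contents are hypothetical, and a plain local merge stands in for the integration step, which on a hosting platform would usually go through a pull request.

```shell
set -e
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the long-running branch "main"
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt && git commit -q -m "Initial commit"

# Short-lived branch based on main (the name is hypothetical).
git checkout -q -b feature/login
echo "login form" >> app.txt
git commit -q -am "Add login form"

# Integrate the work back into main, then delete the branch.
git checkout -q main
git merge -q feature/login
git branch -d feature/login

git branch --list   # only the long-running branch is left
```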
Another very popular model is called Git flow. This offers a bit more structure, but also more rules to follow. The main branch is a reflection of the current production state. The other long-running branch is typically called develop, and any feature branches start from this one and will be merged back into it. Develop is also the starting point for any new releases: you would open a new release branch, do your testing, and commit any bug fixes to that release branch. Once you're confident that it's ready for production, you merge it back into main. You would then add a tag for that release commit on main and close the release branch.
As you can see, Git flow defines quite a couple of tasks and steps in the process. In Tower, the Git desktop GUI that we make, we support users by offering these tasks as shortcuts in the app. I can show you that here: you have all of the most important actions that Git flow brings to you, so you don't have to remember all of the bits and pieces that make up these different steps, what you have to do, and what comes next.
If you ask different teams how they are using branches, you will get many different answers. There is no perfect branching model that everyone should adopt. It's more about understanding your project, your release workflow, and your team, and then modeling a branching workflow that supports you in the best way possible.
Let's talk about pull requests. First of all, you need to understand that pull requests are not a core Git feature. They are provided by your Git hosting platform, which means they work and look a little bit different on GitHub, GitLab, Bitbucket, Azure DevOps, or whatever you're using. But the basic principles and ideas are always the same. Let's start by talking about why you would want to use pull requests at all. In essence, they are a way to communicate about code and review it.
The perfect example is when you've finished working on a feature. Without a pull request, you'd simply merge your code into main, master, or some other branch. In some cases, this might be totally fine. But especially when your changes are a bit more complex or a bit more important, you might want to have a second pair of eyes look over your code. This is exactly what pull requests were made for. With pull requests, you can invite other people to review your work and give you feedback. And after some conversation about the code, your reviewer might approve the pull request and merge it into another branch.
Apart from this, there's another important use case for pull requests: it's a way to contribute code to repositories that you don't have write access to. Think of a popular open source repository. You might have an idea for improving something, but you're not one of the main contributors, and you're not allowed to push commits to their repository. This is another use case for pull requests. We also have to talk about forks in this connection. A fork is your personal copy of a Git repository. Going back to our open source example, you can fork the original repository, make changes in your forked version, and open a pull request to include those changes in the original repository. One of the main contributors can then review your changes and decide to include them or not.
I already mentioned it: every Git platform has its own design and understanding of how pull requests should work, and they look a little bit different on GitHub, GitLab, Bitbucket, Azure DevOps, or whatever you're using. So here is an example where we'll use the GitHub interface. For this test case, let's use the Ruby on Rails open source repository, and let's see. All right, so here we are on GitHub, on the Ruby on Rails main repository.
And in the top right, I can fork this repository, so I can create my own personal version of the repository and its code base. Again, a reminder about why we're doing this: I don't have access to push code into the Ruby on Rails repository, and for good reasons, of course, because I'm not a Ruby on Rails pro. But in my own forked repository, I can make whatever changes I want. So I just did that: I forked the repository. I can now simply clone that. I'll get the clone URL, and then on the command line, it's git clone and the remote URL.
In a second, when this has finished cloning, we will create a branch and make some changes. This is also important to understand: pull requests are always based on branches, not on individual commits. So we're creating a new branch, which we later request to be included. Let's go into rails and open this in my editor. And I'll just create a branch: git branch test and git checkout test. All right, so I am now on a new branch and can make a silly little change. Let's change something in the README file: this is an awesome web application framework. Close this. All right, let's take a look at our changes: git add README and git commit -m "silly little change".
So we now have made some small changes on a separate branch, and we can push that branch to our own remote repository, our fork: git push --set-upstream origin test. Okay, so this has worked. We have now created the changes that we can request to be included. Once I've pushed them to my remote repository on GitHub, I can take another look at the repository in the browser and see what happened. And voilà, GitHub has noticed that I just pushed something here.
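The command-line part of this walkthrough can be tried without a GitHub account. The sketch below is an approximation under stated assumptions: a local repository named upstream stands in for the original project, a bare clone named fork.git stands in for the fork on GitHub, and the README contents are invented. Forking itself and opening the pull request happen in the hosting platform's interface and are not shown.

```shell
set -e
work="$(mktemp -d)" && cd "$work"

# Stand-in for the original project (hypothetical content).
git init -q upstream
(
  cd upstream
  git config user.name "Example User"
  git config user.email "user@example.com"
  echo "A web application framework" > README.md
  git add README.md && git commit -q -m "Initial commit"
)

# Stand-in for the fork: a bare copy that we are allowed to push to.
git clone -q --bare upstream fork.git

# The steps from the walkthrough, run against the stand-in fork.
git clone -q fork.git rails
cd rails
git config user.name "Example User"
git config user.email "user@example.com"
git branch test
git checkout -q test
echo "An awesome web application framework" > README.md
git add README.md
git commit -q -m "Silly little change"
git push -q --set-upstream origin test

git ls-remote --heads origin test   # the branch now exists on the remote
```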
And since it's a forked repository, GitHub detected my changes and automatically asks me if I want to create a pull request with those changes, because in a forked environment, this is mostly what you want to do. If I do, I can propose which branch they should be integrated into. So I'll start the pull request process here. At the moment, I'm proposing to integrate my changes from my little branch here in the fork back into the main branch in Rails, and let's say that is okay. I can add some comment. And I could then create the pull request, and the maintainers of the original repository would then be notified, and they can review my changes and possibly integrate them.
Merge conflicts: nobody likes them, but they are a fact of life when you're working with Git. And in most cases, they are not as tragic as we often think. So we'll talk about when they happen, what they actually are, and how to solve them.
All right, so the name already says it: merge conflicts can occur when you integrate, when you merge changes from a different source. But keep in mind that integration is not limited to only merging branches. Conflicts can also happen when rebasing or interactive rebasing, when performing a cherry-pick or a pull, or even when reapplying a stash. All of these actions perform some kind of integration, and that's when merge conflicts can happen. Of course, these actions don't result in a merge conflict every time, thank God. But when exactly do conflicts occur? Actually, Git's merging capabilities are one of its greatest features and advantages. Merging branches works effortlessly most of the time, because Git is usually able to figure things out on its own. But there are situations where contradictory changes were made, and that's when technology simply cannot decide what's right or wrong. These situations require a decision from a human.
The true classic is when the exact same line of code was changed in two commits on two different branches. Git has no way of knowing which change you prefer. There are some other similar situations that are a little bit less common, for example when a file was modified in one branch and deleted in another. But the problem is always the same: changes contradict. When you're working with a desktop GUI like Tower for Git, that can make things easier, especially because it's just more visual. I can select things here, and this helps me understand what actually happens. I can see these two changes conflict, and I can select one or both, or just this here, and solve the conflict pretty easily.
How do you know when a conflict has occurred? Don't worry about that: Git will tell you very clearly when a conflict has happened. First, it will let you know immediately in the situation, for example when a merge or rebase fails with a conflict. So let's try this out. Actually, we have something here, so let's provoke a merge conflict. I'll just try to merge develop into my main branch. And voilà, automatically I can see something is wrong here: conflict, conflict, conflict, automatic merge failed. So you can see that when I tried to perform the merge, I ran into a conflict, and Git tells me instantaneously about the problem. But even if I had overlooked these warning messages, I would find out about the conflict the next time I run git status. So let's do that. And pretty quickly, you see this "Unmerged paths" category in the status here. In other words, don't worry about not noticing merge conflicts. Git makes sure you can't overlook them.
All right, so you can't ignore a merge conflict. You really have to deal with it before you can continue your work. But dealing with a merge conflict doesn't necessarily mean you have to resolve it. You can also undo it.
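To see those messages without risking a real project, a conflict can be provoked in a throwaway repository. This is a sketch under assumptions: the file content is invented, and the branch names main and develop simply mirror the ones mentioned in the session.

```shell
set -e
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "<li>Item</li>" > index.html
git add index.html && git commit -q -m "Initial commit"

# Change the same line differently on two branches.
git checkout -q -b develop
echo "<li>Item from develop</li>" > index.html
git commit -q -am "Change item on develop"
git checkout -q main
echo "<li>Item from main</li>" > index.html
git commit -q -am "Change item on main"

# The merge stops with a conflict; "|| true" keeps this script running.
git merge develop || true

git status --short                     # "UU" marks a path that is unmerged
git diff --name-only --diff-filter=U   # lists only the conflicted files
```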
And this is sometimes very helpful. So keep this in mind: you can always undo a merge conflict and return to the state before. This is true even when you've already started resolving some of the conflicted files and you notice, oh god, I'm on the wrong track here. Even then, when you find yourself in a dead end, you can still undo the merge. Some commands come with an abort option that lets you do exactly that. The most prominent examples are git merge --abort and git rebase --abort. So in our example here, when I find I don't have the time to deal with this right now, or I've resolved something the wrong way, I can always type git merge --abort here, and git status shows me I'm back to normal again. This should give you the confidence that you really cannot mess up. You can always abort, return to a clean state, and start over.
So let's see what a conflict really looks like under the hood. We will demystify those little buggers and, at the same time, help you lose respect for them and gain a little bit of confidence. As an example, let's look at the contents of one of the conflicted files. I'll provoke that merge conflict once again, and I can see that in my index.html file I have a conflict. So let's take a look at that. Nope, not this one, but this one.
Git was kind enough to mark the problematic areas in the file. They're surrounded by these symbols here: this is the start, and this is the end of the problem area. The content that comes after the first marker originates from our current working branch. Then a line with some equals signs separates the two conflicting changes. And finally, this here came from the other branch, which is displayed as well. So in this case, it's pretty simple.
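Both points, the markers inside the file and the way back out, can be checked in a scratch repository. The sketch below uses invented file content; the marker lines it counts are the ones Git writes with its default conflict style.

```shell
set -e
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "<li>Item</li>" > index.html
git add index.html && git commit -q -m "Initial commit"
git checkout -q -b develop
echo "<li>Item from develop</li>" > index.html
git commit -q -am "Change item on develop"
git checkout -q main
echo "<li>Item from main</li>" > index.html
git commit -q -am "Change item on main"

git merge develop >/dev/null || true   # provoke the conflict

# Our side follows the "<<<<<<<" line, their side follows the "=======" line.
cat index.html
markers="$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' index.html)"

git merge --abort    # undo the merge attempt
git status --short   # prints nothing: the working copy is clean again
cat index.html       # back to the version from main
```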
In the develop branch, where I made some changes, I deleted these list items, and in my HEAD branch, I changed them. So Git is unsure: did you want to change them, like this? Or did you want to delete them, like here? And I have to tell Git what's correct and what's not.
Okay, so how can you solve a conflict? Solving the conflict is actually pretty simple: we need to clean up these lines, and after we finish, the file should look exactly as we want it to look. It might be necessary to talk to the teammate who wrote the other changes and decide which code is actually correct. Maybe it's ours, maybe it's theirs, maybe it's a mixture of the two. This process of cleaning up the file, making sure it contains what we actually want, doesn't have to involve any magic. You can do this simply by opening your text editor or IDE and making some changes.
Sometimes, though, you'll find that this is not the most efficient way. That's when dedicated tools can save you a little bit of time and effort. On the one hand, there are Git desktop GUIs. Some of the graphical user interfaces for Git can be helpful when solving conflicts. You've already seen one: this is Tower, where you can see what happened in the conflict, and this visualizes the problem. On the other hand, there are dedicated merge tools. For more complicated conflicts, it can be great to have a dedicated diff and merge tool at hand. You can configure a tool of your choice using the git config command, and then, in case of a conflict, you can simply type git mergetool and have it open the conflict. I have the Kaleidoscope app installed on my Mac, so let's just try this: git mergetool. I configured that. So the first one, as you can see, is a pretty easy one: error.html was deleted. I don't need to see that. I just need to decide: do I want to keep it, or do I want to delete it?
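The "modified in one branch, deleted in another" case can also be settled from the command line, without a merge tool. This is a minimal sketch under assumptions: the file contents are invented, and the choice made here is to keep the deletion.

```shell
set -e
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "Not found" > error.html
echo "<h1>Home</h1>" > index.html
git add . && git commit -q -m "Initial commit"

# develop deletes the file, main modifies it.
git checkout -q -b develop
git rm -q error.html
git commit -q -m "Remove error page"
git checkout -q main
echo "Page not found" > error.html
git commit -q -am "Reword error page"

git merge develop || true   # stops with a modify/delete conflict

# Decision: keep the deletion, then conclude the merge with a commit.
git rm -q error.html
git commit -q --no-edit

git log --oneline --graph
```

Keeping the file instead would mean running git add error.html at the decision step.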
So I'll stick with the deletion. And for the second one, there's really content in the file, where it makes sense to open that merge tool that I configured. I can see, well, this is the change that I made, and this is the change that came from another person or from a different branch. And what do I want it to look like? I can choose these changes, or these here, or I can make my own changes here. So after cleaning up the file, either manually or in a desktop GUI or merge tool, we have to commit this like any other change. I can save it here and say this is resolved. And if I type git status, I can see these changes would be committed. I've made some changes here in index.html. This here is just a safety copy. You can configure that to happen as well, so you can always return to the original file. But I would actually just commit this here. Simply by committing the resolved files, I signal to Git that the conflict is resolved, and I can go on with my work.
Most developers understand that it's important to use branches in Git, because having separate containers for your work is incredibly helpful. Let's talk a bit about integrating branches, about getting your new code back into an existing branch. There are different ways to do this, and the two most common ones are merge and rebase.
Let's start by talking about merge and how it actually works. When Git performs a merge, it looks for three commits. First, the common ancestor commit. If you follow the history of two branches in a project, they always have at least one commit in common. At this point, both branches had the same content, and after that they evolved differently. The other interesting commits are the endpoints of each branch. Remember that the goal of an integration is to combine the current states of two branches, so the latest revisions are of course important.
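Those three commits can be looked up directly. The sketch below builds a tiny history with hypothetical branch names (branch-a and branch-b) and asks Git for the common ancestor and the two endpoints.

```shell
set -e
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch-a
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt && git commit -q -m "Common ancestor"

git checkout -q -b branch-b
echo "b" > b.txt
git add b.txt && git commit -q -m "Work on branch-b"

git checkout -q branch-a
echo "a" > a.txt
git add a.txt && git commit -q -m "Work on branch-a"

git merge-base branch-a branch-b   # the common ancestor commit
git rev-parse branch-a branch-b    # the endpoints of the two branches
```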
Combining these three commits will perform the integration that we're aiming for. I've chosen a very simple example case here, because one of the two branches, branch A here, didn't receive any new commits after the branching happened. So its latest commit is also the common ancestor. In this case, the integration is dead simple: Git can just add all the new commits from branch B on top of the common ancestor commit. This simplest form of integration is called a fast-forward merge. Both branches then share the exact same history.
In most cases, however, both branches move forward differently, of course. And to make an integration, Git will have to create a new commit that contains the differences between them. This is what we call a merge commit. Normally, a commit is carefully created by a human being as a meaningful unit that wraps only related changes, and the commit message provides context and notes. Now, a merge commit is a bit different. It's not created by a developer; it gets created automatically by Git. And it also does not wrap a set of related changes. Its purpose is to connect two branches, just like a knot. If you want to understand a merge operation after the fact, you need to take a look at the history of both branches and their commit history.
Now let's talk about rebase. But before we start, let me emphasize something: rebase is not better or worse than merge. Most importantly, it's different. You can live a happy Git life just using merge. But rebase has its pros and cons, so knowing what it does and when it could be helpful is nice. All right, remember that we just talked about the automatic merge commit. Some people would prefer to go without these. They want the project history to look like a straight line, without any signs that it had been split into multiple branches at some point, even after branches have been integrated.
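The two merge outcomes described above, the fast-forward and the merge commit, can be reproduced side by side. This is only a sketch: the branch names and file contents are made up, and the two cases are run one after the other in the same scratch repository.

```shell
set -e
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch-a
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt && git commit -q -m "Common ancestor"

# Case 1: branch-a has not moved, so the merge is a fast-forward.
git checkout -q -b branch-b
echo "b" > b.txt
git add b.txt && git commit -q -m "Work on branch-b"
git checkout -q branch-a
git merge -q branch-b
ff_a="$(git rev-parse branch-a)"
ff_b="$(git rev-parse branch-b)"   # same commit as branch-a, no merge commit

# Case 2: both branches move forward, so Git creates a merge commit.
git checkout -q -b branch-c
echo "c" > c.txt
git add c.txt && git commit -q -m "Work on branch-c"
git checkout -q branch-a
echo "a" > a.txt
git add a.txt && git commit -q -m "Work on branch-a"
git merge -q --no-edit branch-c

git log --oneline --graph   # the newest commit has two parents
```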
And this is what happens with rebase. Let's walk through a rebase operation step by step. The scenario is the same as in the previous example: we want to integrate changes from branch B into branch A, but now by using rebase. The actual Git command to start this is really simple. It's just git rebase and the branch. Similar to git merge, we just tell Git which branch we want to integrate.
But let's take a look behind the scenes. First, Git will remove all commits on branch A that happened after the common ancestor commit. But don't worry, it will not throw them away. You can think of those commits as being parked, as saved somewhere temporarily. Then Git applies the new commits from branch B, and at this point, temporarily, both branches look exactly the same. But in the final step, those parked commits need to be included. The new commits from branch A are positioned on top of the integrated commits from branch B. They are rebased, as you can see, and the result looks like development had happened in a straight line. There is no merge commit that contains all the combined changes. We preserve the original commit structure.
There's one more important thing to understand about rebase: it rewrites commit history. So take a close look at this last diagram here. Commit C3* has an asterisk symbol. It has the same contents as C3, but it's effectively a different commit, because it now has a new parent. Before the rebase, C1 was its parent, and after the rebase, it's C4, which it was rebased onto. A commit has only a handful of important properties, like the author, date, change set, and who its parent commit is. Changing any of this information effectively creates a completely new commit with a new commit hash. So rewriting history like that is not a problem for commits that haven't been published or pushed yet.
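The change of identity is easy to observe. The sketch below recreates the scenario with the commit labels from the diagram used as commit messages; the branch names and file contents are hypothetical. It records the hash of C3 before and after the rebase.

```shell
set -e
repo="$(mktemp -d)" && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch-a
git config user.name "Example User"
git config user.email "user@example.com"

echo "c1" > c1.txt && git add c1.txt && git commit -q -m "C1"

git checkout -q -b branch-b
echo "c2" > c2.txt && git add c2.txt && git commit -q -m "C2"
echo "c4" > c4.txt && git add c4.txt && git commit -q -m "C4"

git checkout -q branch-a
echo "c3" > c3.txt && git add c3.txt && git commit -q -m "C3"

before="$(git rev-parse HEAD)"   # hash of C3 with parent C1
git rebase -q branch-b
after="$(git rev-parse HEAD)"    # hash of the rebased C3 with parent C4

git log --oneline                # a straight line: C3, C4, C2, C1
echo "C3 before rebase: $before"
echo "C3 after rebase:  $after"
```

The file contents of C3 are unchanged. Only its parent, and therefore its hash, differs.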
But if you're rewriting commits that have already been pushed to a remote repository, you might be in trouble, because another developer might have based their work on the original C3 commit, which is not here anymore. So let's close this topic with a simple rule: do not rewrite commits that you've already pushed to a shared repository. Tools like rebase should only be used for cleaning up your local commit history, for example for a feature branch that you've been working on for some time, before you integrate it back into a team branch. That's what tools like rebase were made for.
All right, so much for today. Be sure to check out my little Advanced Git Kit. It's completely free of charge. It's a little collection of short videos about a lot of advanced Git topics, from things like interactive rebase all the way to branching strategies, merge conflicts, submodules, what have you. It's really helpful if you want to become more productive with Git and version control. And again, it's free. All right, have fun, and see you soon here on the freeCodeCamp YouTube channel.