An Introduction to Git

The new Review Board UCOSP team just had its first IRC meeting (minutes forthcoming).  A ton of learning has already taken place — Christian Hammond (ChipX86) and David Trowbridge (purple_cow) were generous enough to stick around after the meeting to give an introduction to Git.

The following is a transcript of that introduction from our IRC meeting.

<ChipX86>    so, git is a bit different than what you’re probably used to. It’s a DVCS — distributed version control system

<ChipX86>    a Git checkout stores the entire history of the repository

<ChipX86>    where this is useful is that you don’t need a central server for your sourcecode. Anyone who has a checkout can work with anyone else with a checkout

<ChipX86>    this is done by way of “remotes.” A git repository can be linked up to other repositories by specifying a “remote” to their repository

<ChipX86>    you’ll be dealing with two remotes:

<ChipX86>    origin (our reviewboard git repo) and your own (usually designated by your github username)

<ChipX86>    so your first step will be to check out the repository

<ChipX86>    you’ll want to pick a place for the sourcecode and do:

<ChipX86>    git clone git://github.com/reviewboard/reviewboard.git

<ChipX86>    then, once you’ve “forked” our repository (go to https://github.com/reviewboard/reviewboard/ and click Fork near the top right), you can add it with:

<ChipX86>    git remote add yourusername git@github.com:yourusername/reviewboard.git
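
For reference, here is a minimal sketch of those two steps that runs entirely on your own machine. It assumes two local bare repositories as stand-ins for the GitHub ones (`upstream.git` for reviewboard/reviewboard, `fork.git` for your fork), and `yourusername` is a placeholder. GitHub has since retired the unauthenticated `git://` protocol, so against the real repository you would most likely clone the `https://` form of the same URL.

```shell
#!/bin/sh
set -e
# Local bare repositories stand in for the two GitHub repositories (assumption).
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/upstream.git"   # plays the role of reviewboard/reviewboard
git init -q --bare "$sandbox/fork.git"       # plays the role of yourusername/reviewboard

# Cloning sets up the "origin" remote automatically.
# (Git warns that the clone is empty; that is expected in this sandbox.)
git clone -q "$sandbox/upstream.git" "$sandbox/reviewboard"
cd "$sandbox/reviewboard"

# Add the fork as a second remote, named after the placeholder username.
git remote add yourusername "$sandbox/fork.git"

# List every remote with its fetch and push URLs.
git remote -v
```

After this, `origin` points at the project and `yourusername` points at your fork.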

<ChipX86>    you’ve probably done the initial checkout at this point, given that you’ve hit setup issues, but just to go over it..

<ChipX86>    now, any development you do should be done on “topic branches.” A branch in git is a very light-weight thing. It just gives a name to a set of commits. You can easily switch back and forth between branches.

<ChipX86>    “master” is the main branch. It will contain the commits made upstream, which you’ll need to sync every so often (more on this later)

<Mengyun>    question

<ChipX86>    sure

<Mengyun>    what’s the purpose of the 2nd step?

<ChipX86>    adding your own repo?

<Mengyun>    remote add…

<ChipX86>    the idea there is that any projects you’re working on will be occasionally “pushed” to your own fork of reviewboard

<Mengyun>    create own repo as a fork of rb?

<ChipX86>    which is a way of backing things up

<ChipX86>    and also lets us see your code

<ChipX86>    since git is distributed, it’s common for people to have their own copy of a project somewhere with their own changes

<Mengyun>    oh i see thx!

<ChipX86>    we had a case last term where a harddrive failed

<ChipX86>    and much of the project was lost

<ChipX86>    if the code had been up on the github fork, there wouldn’t have been as much of a setback

<ChipX86>    I’ll go into pushing the code more in a bit

<ChipX86>    so, branches

<ChipX86>    when you’re doing work, you’ll want to create a new branch for that work

<ChipX86>    to do this: git checkout -b <branchname>

<ChipX86>    that creates the branch and switches to it

<ChipX86>    from there, go about your coding

<ChipX86>    when you want to commit, you can do: git commit -a, which will commit all modified files, prompting you for a change description

<ChipX86>    you can use `git add` to add any new files

<ChipX86>    sometimes you’ll want to be more specific

<ChipX86>    you may have 10 files modified, but you only want to commit 2

<ChipX86>    in that case, do `git add` on the files you want to commit, and then `git commit`
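
A small sketch of that workflow, in a throwaway repository: create a topic branch, modify three files, and commit only two of them. The file names, the branch name `my-feature`, and the identity are all placeholders.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called "master"
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

# A starting commit on master with three files.
for f in a.txt b.txt c.txt; do echo "original" > "$f"; done
git add a.txt b.txt c.txt
git commit -q -m "Initial files"

# Create a topic branch and switch to it.
git checkout -q -b my-feature

# Modify all three files, but stage and commit only two of them.
for f in a.txt b.txt c.txt; do echo "changed" >> "$f"; done
git add a.txt b.txt
git commit -q -m "Update a and b"

# c.txt is still modified and uncommitted.
git status --short
```

Had you wanted all three files in the commit, `git commit -a` would have picked them up without the explicit `git add`.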

<ChipX86>    one thing that git is really good at is staying organized

<ChipX86>    you’ll want to develop a habit of trying to keep your commit history clean

<ChipX86>    there are a few ways to do this

<ChipX86>    `git add` has this awesome parameter, `-p`, which will allow you to stage only certain parts of your change for committing

<ChipX86>    useful for leaving out, say, debug output

<ChipX86>    if you’ve committed a bunch of changes that you want to clean up, you can do: git rebase -i master

<m_conley>    back

<ChipX86>    that will present the commit history between the master branch and the tip of your branch

<ChipX86>    you’ll be able to delete commits, squash them together, rearrange them, change the descriptions, etc.

<ChipX86>    powerful feature. Worth playing with with some test commits.

<ChipX86>    a good goal is to squash commits together often and have one commit per logical change

<ChipX86>    as an example:

<ChipX86>    say I’ve made 5 commits while trying to get something to work

<ChipX86>    commits 2, 3 and 4 are really the same effective change, but I’ve tried a few ways of doing it before settling on it, and I don’t want all that history around forever because it’s hacky or something

<ChipX86>    I can do:

<ChipX86>    git rebase -i master

<ChipX86>    scroll down to commits 3 and 4 and change the action from “pick” (which means to just keep that commit) to “squash” (which means to squash it into the previous commit)

<ChipX86>    when I save and close, it’ll squash those commits into one commit, let me edit the change description, and then save the new history
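
The five-commit example above can be reproduced in a scratch repository. This is only a sketch: instead of editing the todo list by hand, it uses `GIT_SEQUENCE_EDITOR` with a `sed` command to mark lines 3 and 4 as "squash", and `GIT_EDITOR=true` to accept the combined description unchanged. Branch and file names are placeholders.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called "master"
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "Base commit"

# Five commits on a topic branch; 2, 3 and 4 are attempts at the same change.
git checkout -q -b my-feature
for n in 1 2 3 4 5; do
    echo "change $n" >> file.txt
    git commit -q -a -m "Commit $n"
done

# Stand-in for editing the todo list by hand: mark lines 3 and 4 as "squash".
# GIT_EDITOR=true accepts the combined commit message as offered.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '3s/^pick/squash/' -e '4s/^pick/squash/'" \
GIT_EDITOR=true git rebase -q -i master

# Three commits remain: Commit 1, the squashed 2+3+4, and Commit 5.
git log --oneline master..HEAD
```

In day-to-day use you would just run `git rebase -i master` and change "pick" to "squash" in your editor.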

<ChipX86>    that’s one option for git rebase. rebase is typically used to take a branch of commits and move it onto another branch. Such as when you update the master branch and want to make all your changes up to date

<ChipX86>    the thing is, you only want to use rebase if you haven’t yet pushed those changes to your own github

<ChipX86>    if you have done that, you’ll instead want to use `git merge master`

<ChipX86>    the reason is that rebase actually creates new commits and deletes the old ones, which will mess up your history on your github. `git merge` will instead keep those commits and create a new one with the merge.

<ChipX86>    that can be kind of confusing, so if you have questions I’d be happy to answer them

<Steve_Sutcliffe>    so git merge master will do the same thing as rebase?

<ChipX86>    they’ll both have the same end result of making your branch up-to-date

<ChipX86>    but the way they do it is different

<ChipX86>    every commit has an identifier, a SHA1

<ChipX86>    if you use rebase, the commits all get new SHA1s

<ChipX86>    if someone else is using your github fork, and you rebase something that’s on there, they won’t be on your set of commits anymore. Those commits will have moved.

<ChipX86>    if you instead use merge, the commits retain their SHA1s. A new commit is then made with the merge.

<ChipX86>    so, rule of thumb: use rebase to clean things up, *before* those commits are pushed to your github. Use merge after.
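
The difference in SHA1s is easy to see for yourself. The sketch below builds a scratch repository with one commit on a topic branch and one newer commit on master, then brings two copies of the topic branch up to date, one with rebase and one with merge. The branch names `via-rebase` and `via-merge` are invented for the demonstration.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called "master"
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "Base commit"

# One commit on a topic branch...
git checkout -q -b topic
echo work > topic.txt
git add topic.txt
git commit -q -m "Topic work"
original=$(git rev-parse topic)

# ...and one new commit on master, as if upstream had moved on.
git checkout -q master
echo upstream > upstream.txt
git add upstream.txt
git commit -q -m "Upstream change"

# Two copies of the topic branch, updated in the two different ways.
git branch via-rebase topic
git branch via-merge topic

git checkout -q via-rebase
git rebase -q master
rebased=$(git rev-parse via-rebase)

git checkout -q via-merge
git merge -q --no-edit master

echo "original commit: $original"
echo "after rebase:    $rebased"
# The merged branch still contains the original commit; the rebased one does not.
git merge-base --is-ancestor "$original" via-merge && echo "merge kept the original commit"
git merge-base --is-ancestor "$original" via-rebase || echo "rebase replaced it with a new one"
```

Anyone who had already fetched the original commit would still find it in the merged branch, but not in the rebased one.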

<ChipX86>    that also means it’s important to organize your commits prior to pushing

<ChipX86>    once you push, you can’t really go back

<ChipX86>    so, pushing and pulling

<ChipX86>    you’re going to want to periodically update your master branch with any upstream changes

<ChipX86>    quickest way to do that is to check out the master branch, and type: git pull origin master

<ChipX86>    which pulls the master changes from the “origin” remote (our project’s github)

<ChipX86>    when you want to push to your own repository, be on that branch (git checkout <branchname>) and push with: git push <yourusername> <branchname>

<ChipX86>    the <branchname> is only needed the first time

<ChipX86>    after that, just: git push <yourusername>

<ChipX86>    (the first time, it’ll create that branch on your github, and subsequent “push <yourusername>” will push all branches that your github knows about)
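
Here is a sketch of that first push, using a local bare repository as a stand-in for your GitHub fork (an assumption, as are the names `yourusername` and `my-feature`). One thing to be aware of: the "push all branches" behaviour of a bare `git push <yourusername>` reflects Git's default at the time. Since Git 2.0 the default `push.default` setting is `simple`, which pushes only the current branch, so naming the branch explicitly is the form that behaves the same everywhere.

```shell
#!/bin/sh
set -e
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/fork.git"       # stand-in for your GitHub fork
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called "master"
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"
git remote add yourusername "$sandbox/fork.git"

echo base > base.txt
git add base.txt
git commit -q -m "Base commit"

# Do some work on a topic branch.
git checkout -q -b my-feature
echo work > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# Push the branch to the fork, naming it explicitly.
git push -q yourusername my-feature

# The fork now has the branch.
git ls-remote --heads yourusername
```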

<ChipX86>    questions?

<m_conley>    the more you use Git, the easier it gets.  So I recommend practice, practice, practice.

<ChipX86>    yeah

<ChipX86>    you can create branches and just play with them, and then delete the branches

<ChipX86>    also, there’s this tool, gitk

<ChipX86>    (on MacOS X, download and install gitx instead)

<ChipX86>    gitk comes with git. Run it with: gitk --all

<ChipX86>    leave it up and running while you code

<ChipX86>    it will visually show your commit history

<ChipX86>    help you to visualize how these operations impact things

<m_conley>    Is everybody still OK out there?

<purple_cow>    one thing which I want to reemphasize is that you should leave “master” as a tracking branch for “origin/master” and only commit your own work onto other branches

<ChipX86>    yeah, good point

<m_conley>    Steve_Sutcliffe / Teresa / Mengyun / markstrmr_ / KAmlani / CrystalLokKoo:  all good?

<CrystalLokKoo>    for hte time bieng yes

<ChipX86>    I know I threw a lot at you

<CrystalLokKoo>    being*

<Steve_Sutcliffe>    I think so

<Mengyun>    i think so

<markstrmr_>    Sounds good

<ChipX86>    it’ll make more sense once you play with it. Just don’t commit to master and you won’t mess anything up 🙂

<Teresa>    I think so

<m_conley>    I’ll be putting what ChipX86 wrote up on the blog for reference

<ChipX86>    one more thing to go over

<ChipX86>    our post-review tool

<Mengyun>    is push a kind of commit?

<ChipX86>    push just takes your commits and puts them on a remote

<Steve_Sutcliffe>    when we do a push does that go to the main repository? or to that branch we created? or both?

<Steve_Sutcliffe>    (I mean the fork we created)

<ChipX86>    that’s where the <yourusername> part comes in. That’s the remote name. It’ll try to push to whatever remote you specify

<ChipX86>    I’d use: git push chipx86

<Teresa>    ok, so all of our commits are local until we push them?

<m_conley>    Teresa:  exactly.

<ChipX86>    nothing you do will impact the main reviewboard codebase

<purple_cow>    If you get confused and do something and it looks like your work is gone, don’t panic. Come here and we can usually figure out how to get it back.

<m_conley>    Yep.  In Git, it’s actually really difficult to lose everything.

<Steve_Sutcliffe>    those updates we do then, the pulls, that will only be from the main reviewboard codebase? Not from any work that any of us are doing?

<m_conley>    what’s more likely is that it’s all there.  And more.  And you need to disentangle it.

<ChipX86>    even if it looks like the commits are deleted

<m_conley>    Steve_Sutcliffe:  It depends.  If you’re working with a partner, you might want to pull in commits from their repository.

<Mengyun>    will this happen : i work with my fork but the main repo is updated during my work

<m_conley>    Steve_Sutcliffe:  but you’ll definitely want to pull from the reviewboard codebase to keep up to date.

<purple_cow>    Steve_Sutcliffe: “git fetch” takes an argument for which remote you want to fetch (and also knows --all). Typically the only changes you’ll see are in the main codebase

<m_conley>    Mengyun:  definitely.

<Mengyun>    so what should i do then

<Steve_Sutcliffe>    ah I see

<m_conley>    Mengyun:  it’s your responsibility to keep your fork up to date.  On your machine, merge with reviewboard’s codebase, and then push to your fork.
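
Put together, keeping a fork current might look like the sketch below. Everything runs locally: `upstream.git` and `fork.git` are bare repositories standing in for the main Review Board repository and your fork, and the `seed` clone simulates someone else committing upstream. All of these names are assumptions for the sake of the demonstration.

```shell
#!/bin/sh
set -e
sandbox=$(mktemp -d)
# Placeholder identity for every commit made in this sandbox.
export GIT_AUTHOR_NAME="Example Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

git init -q --bare "$sandbox/upstream.git"   # stand-in for the main repository
git init -q --bare "$sandbox/fork.git"       # stand-in for your GitHub fork

# Seed upstream with one commit on master.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo v1 > README
git add README
git commit -q -m "Upstream v1"
git push -q "$sandbox/upstream.git" master

# Your working clone, with the fork as a second remote and a topic branch.
git clone -q -b master "$sandbox/upstream.git" "$sandbox/work"
cd "$sandbox/work"
git remote add yourusername "$sandbox/fork.git"
git checkout -q -b my-feature
echo work > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# Meanwhile, upstream moves on.
cd "$sandbox/seed"
echo v2 > README
git commit -q -a -m "Upstream v2"
git push -q "$sandbox/upstream.git" master

# Update local master, merge it into the topic branch, then push both to the fork.
cd "$sandbox/work"
git checkout -q master
git pull -q origin master
git checkout -q my-feature
git merge -q --no-edit master
git push -q yourusername master my-feature

git log --oneline --graph my-feature
```

Because no work is ever committed directly to master, the pull is a plain fast-forward and only the topic branch needs a merge.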

<Mengyun>    how often do u recommend to do this update?

<ChipX86>    doesn’t have to be every day. Just as often as you need to.

<m_conley>    Mengyun:  well, every time you do a git pull origin master, if you notice some things have changed upstream, it’s a good idea to push them.

<ChipX86>    I tend to do it before I’m about to put something up for review

<ChipX86>    you may have to resolve conflicts (which we’ll have to go over later)
