# Intro to Git & GitHub

Ha Khanh Nguyen (hknguyen)

• For this topic, we will use the materials provided in this textbook: Pro Git by Scott Chacon and Ben Straub.

## 1. What is Git?

• Git is a type of version control system (VCS).
• Well... What is a version control system?

### 1.1 Version Control System (VCS)

• Version Control is a system that manages changes to computer programs, documents, websites, databases, etc.
• Most VCSs think of the information they store as a set of files and the changes made to each file over time.
• This is commonly described as delta-based version control (focused on differences between versions).

### 1.2 How is Git different from other VCSs?

• Git thinks of its data more like a series of snapshots of the filesystem.
• With Git, every time you commit, or save the state of your project, Git basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot.
• To be efficient, if files have not changed, Git doesn’t store the file again, just a link to the previous identical file it has already stored.
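
The snapshot idea can be checked directly. The sketch below is only an illustration: the throwaway directory, the file names `a.txt` and `b.txt`, and the demo user identity are all made up for this example. It commits twice, changing only one file, and compares the object IDs that Git records for each file in the two commits.

```shell
set -e                      # stop on the first error (run this as a script)
repo=$(mktemp -d)           # throwaway directory, hypothetical demo project
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

echo "stays the same" > a.txt
echo "version 1" > b.txt
git add -A
git commit -q -m "first snapshot"

echo "version 2" > b.txt    # only b.txt changes
git add -A
git commit -q -m "second snapshot"

# Object IDs of each file as recorded in the two commits
a_old=$(git rev-parse HEAD~1:a.txt)
a_new=$(git rev-parse HEAD:a.txt)
b_old=$(git rev-parse HEAD~1:b.txt)
b_new=$(git rev-parse HEAD:b.txt)

echo "a.txt: $a_old -> $a_new"   # same ID: the stored file is reused
echo "b.txt: $b_old -> $b_new"   # different ID: a new version was stored
```

Both commits point at the same stored object for `a.txt`, which is the "link to the previous identical file" described above.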

• The table below summarizes the main differences between common VCSs (Perforce, CVS, Subversion, etc.) and Git:

| Common VCS | Git |
| --- | --- |
| delta-based version control | a stream of snapshots |
| requires connection to databases or other computers on the network to access certain files | everything is stored locally |
| allows limited operations without server connection | nearly every operation can be done locally |

### 1.3 The 3 states of Git

• Git has three main states that your files can reside in: modified, staged, and committed:

• Modified means that you have changed the file but have not committed it to your database yet.
• Staged means that you have marked a modified file in its current version to go into your next commit snapshot.
• Committed means that the data is safely stored in your local database.
• This leads us to the three main sections of a Git project: the working tree/directory, the staging area, and the Git directory.

• The basic Git workflow goes something like this:
• You modify files in your working tree.
• You stage just those changes you want to be part of your next commit, which adds only those changes to the staging area.
• You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.
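
The workflow above can be followed one state at a time with `git status --short`, which prints a two-character code per file (first character: staging area, second character: working tree). This is a minimal sketch in a throwaway repository; the file name `notes.txt` and the demo identity are assumptions made for the example.

```shell
set -e                      # stop on the first error
repo=$(mktemp -d)           # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "draft" > notes.txt
git add notes.txt
git commit -q -m "add notes"

echo "more" >> notes.txt            # 1. modify the file in the working tree
modified=$(git status --short)      # " M notes.txt": modified, not staged
echo "$modified"

git add notes.txt                   # 2. stage the change
staged=$(git status --short)        # "M  notes.txt": staged for the next commit
echo "$staged"

git commit -q -m "update notes"     # 3. commit the staged snapshot
committed=$(git status --short)     # empty: nothing left to stage or commit
echo "after commit: '$committed'"
```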

## 2. Creating a Git Repository

• You typically obtain a Git repository in one of two ways:
• You can take a local directory that is currently not under version control, and turn it into a Git repository, or
• You can clone an existing Git repository from elsewhere. (this is what you have been doing with the labs)

### 2.1 Initializing a repository in an existing directory

• First, open the command line prompt and navigate to the directory in which you want to create the repo.
• For example, it might look like this:
# macOS
cd Desktop/stat430/my-project

# Windows
cd Desktop/stat430/my-project
• Then, run the following command:
git init
• The above command creates a new subdirectory named .git that contains all of your necessary repository files.
• Note that at this point, nothing in your project is tracked yet.
• If you want to start version-controlling existing files (as opposed to an empty directory), you should probably begin tracking those files and do an initial commit.
git add -A
git commit -m "initial project version"
• Now we need to connect it to GitHub (at this moment, your repository exists only locally):
• To do this, we have a choice between GitHub and UIUC CS GitHub Enterprise.
• Create a new repository, then copy the URL (with .git at the end) provided after the repo is created.
• Go back to your command line prompt:
git remote add origin <your repo URL here>
git push origin master
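
The whole sequence can be rehearsed without a GitHub account. In the sketch below, a local bare repository stands in for the URL that GitHub would give you; that stand-in, the `README.md` file, and the demo identity are assumptions for illustration only. The branch name is read from Git rather than hard-coded, because a fresh repository's first branch may be called `master` or `main` depending on your Git configuration.

```shell
set -e                      # stop on the first error
work=$(mktemp -d)           # throwaway directory for the demo
remote="$work/remote.git"   # local stand-in for <your repo URL here>
git init -q --bare "$remote"

mkdir "$work/my-project"
cd "$work/my-project"
echo "# my-project" > README.md           # an existing file to put under version control

git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git add -A
git commit -q -m "initial project version"

git remote add origin "$remote"
branch=$(git symbolic-ref --short HEAD)   # name of the current branch
git push -q origin "$branch"

git ls-remote --heads origin              # the remote now has the branch
```

If your branch is named `main`, the push command from these notes becomes `git push origin main`.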

### 2.2 Cloning an existing repository

• We're very familiar with this option as we did this for our very first lab!
• To clone an existing repo, use the git clone command:
git clone <url>
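
As a rough illustration, the sketch below clones from a local path instead of a real URL (the `source` repository, `lab1.md`, and the target directory name `my-copy` are invented for this example). Cloning from GitHub works the same way, with the repository URL in place of the path.

```shell
set -e                      # stop on the first error
work=$(mktemp -d)           # throwaway directory for the demo

# Build a small repository to play the role of the existing one
git init -q "$work/source"
cd "$work/source"
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "lab 1" > lab1.md
git add -A
git commit -q -m "add lab 1"

cd "$work"
git clone -q "$work/source" my-copy       # same form as: git clone <url>
cd my-copy
git log --oneline           # the history comes along with the files
git remote -v               # "origin" points back to where we cloned from
```

The optional second argument (`my-copy` here) chooses the name of the new directory; without it, Git names the directory after the repository.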

## 3. Recording Changes to the Repository

• The diagram below explains all the states of the files in a repository: each file is either untracked or tracked, and a tracked file is unmodified, modified, or staged.

### 3.1 Checking the status of your repository

• The main tool you use to determine which files are in which state is the git status command.
• If you run this command directly after a clone or after pushing to the remote repo, you should see a message like "nothing to commit, working tree clean":

• Now, let's say you make some changes. For example, I will add a file called test.md to this directory and run git status again:

• git status now lists test.md under "Untracked files", which means that the new file test.md is untracked!
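
To reproduce both situations yourself, here is a small sketch in a throwaway repository (the directory and the demo identity are assumptions; the file names follow the ones used in these notes). It uses `git status --short` next to the full `git status`, since the short form is easier to compare.

```shell
set -e                      # stop on the first error
repo=$(mktemp -d)           # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "solutions" > hw1-soln.ipynb
git add -A
git commit -q -m "add hw1 solutions"

git status                        # reports a clean working tree
clean=$(git status --short)       # short form prints nothing when clean

echo "# test" > test.md           # create a new file
git status                        # lists test.md under "Untracked files"
untracked=$(git status --short)   # "?? test.md": ?? marks an untracked file
echo "$untracked"
```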

### 3.2 Tracking new files

• In order to begin tracking a new file, you use the command git add.
• To begin tracking the test.md file and ALL other new files, you can run this:
git add -A
• If you want to only track the test.md file, you can instead run the following command:
git add test.md
• I will go ahead and run the first one since I prefer tracking all the files.
• After that, let's run git status again to see if the status changes.
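
The difference between the two forms of `git add` shows up clearly when there is more than one new file. In this sketch, `notes.md` is a second, hypothetical file added only to make the contrast visible; the throwaway directory is also an assumption.

```shell
set -e                      # stop on the first error
repo=$(mktemp -d)           # throwaway directory for the demo
cd "$repo"
git init -q

echo "# test" > test.md
echo "# notes" > notes.md   # hypothetical second new file

git add test.md             # track only test.md
one=$(git status --short)
echo "$one"                 # "A  test.md" is staged, "?? notes.md" is still untracked

git add -A                  # track everything that is new or changed
all=$(git status --short)
echo "$all"                 # both files now show "A " (added to the staging area)
```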

### 3.3 Staging modified files

• Now, we can also edit existing files that are already tracked. I will change hw1-soln.ipynb.

• The hw1-soln.ipynb file appears under a section named “Changes not staged for commit” — which means that a file that is tracked has been modified in the working directory but not yet staged.
• To stage it, we run the git add command.
• The git add command is multipurpose: it can be used to begin tracking new files, to stage modified files, and for other things, such as marking merge-conflicted files as resolved.
• It may be helpful to think of it more as “add precisely this content to the next commit” rather than “add this file to the project”.
• We will stage the hw1-soln.ipynb file, then run git status again!

• OK, so now both files are staged and ready to be committed!
• Let's say we need to go back to hw1-soln.ipynb to make another change!

• What is happening? hw1-soln.ipynb is in both "Changes to be committed" and "Changes not staged for commit"!
• BIG NOTE: if you modify a file after you run git add, you have to run git add again to stage the latest version of the file!
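
Here is a minimal sketch of that situation in a throwaway repository (the file contents `v1`, `v2`, `v3` and the demo identity are made up). `git show :hw1-soln.ipynb` prints the version of the file that is currently in the staging area, so you can see exactly what would be committed.

```shell
set -e                      # stop on the first error
repo=$(mktemp -d)           # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "v1" > hw1-soln.ipynb
git add -A
git commit -q -m "add hw1 solutions"

echo "v2" >> hw1-soln.ipynb
git add hw1-soln.ipynb                    # stage the first change
echo "v3" >> hw1-soln.ipynb               # edit again AFTER staging

both=$(git status --short)                # "MM": staged change plus a newer unstaged change
echo "$both"
staged_content=$(git show :hw1-soln.ipynb)
echo "$staged_content"                    # v1 and v2 only; v3 is not staged yet

git add hw1-soln.ipynb                    # stage the latest version
restaged=$(git status --short)            # "M ": everything is staged again
echo "$restaged"
```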

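### 3.4 Committing your changes

• Once the staging area looks the way you want, you can commit your changes.
• Remember that anything that is still unstaged (any files you have created or modified that you haven’t run git add on since you edited them) won’t go into this commit.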
git commit -m "add test.md + changes hw1-soln"
• The -m flag indicates that the string that follows is the commit message.
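
The point about unstaged changes is easy to verify. The sketch below, again in a throwaway repository with a placeholder identity and made-up file contents, edits two files but stages only one of them before committing.

```shell
set -e                      # stop on the first error
repo=$(mktemp -d)           # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "one" > test.md
echo "v1" > hw1-soln.ipynb
git add -A
git commit -q -m "start"

echo "two" >> test.md                     # edit both files...
echo "v2" >> hw1-soln.ipynb
git add test.md                           # ...but stage only test.md
git commit -q -m "update test.md"

# Files recorded in the commit we just made
committed_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "$committed_files"                   # test.md

left_over=$(git status --short)
echo "$left_over"                         # " M hw1-soln.ipynb": still modified, not committed
```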

### 3.5 Pushing the changes to remote repository

• The changes are all committed in your local repo, but to update the remote repo (the one on GitHub or GitHub Enterprise) with these changes, you need to push them:
git push origin master

• And when we go to GitHub Enterprise to check, we see that the latest commit is there!
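
A similar check can be made from the command line by counting the commits that exist locally but not yet on the remote. In this sketch, a local bare repository stands in for the GitHub remote; that stand-in, the demo identity, and the commit messages are assumptions. The branch name is read from Git because it may be `master` or `main` on your machine.

```shell
set -e                      # stop on the first error
work=$(mktemp -d)           # throwaway directory for the demo
git init -q --bare "$work/remote.git"     # local stand-in for the GitHub repo
git init -q "$work/local"
cd "$work/local"
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo "first" > test.md
git add -A
git commit -q -m "first commit"
branch=$(git symbolic-ref --short HEAD)   # name of the current branch
git push -q origin "$branch"

echo "second" >> test.md
git add -A
git commit -q -m "second commit"          # committed locally only

before=$(git rev-list --count "origin/$branch..HEAD")
echo "commits not yet pushed: $before"    # 1

git push -q origin "$branch"
after=$(git rev-list --count "origin/$branch..HEAD")
echo "commits not yet pushed: $after"     # 0
```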

These lecture notes reference material from Sections 1.3, 2.1, and 2.2 of Pro Git by Scott Chacon and Ben Straub.