Version Control¶
A Version Control System (VCS) is a class of software that tracks and manages changes to a collection of files over time. It is crucial for:
- Collaboration: Allowing multiple people to work on the same project simultaneously without stepping on each other's toes.
- History Tracking: Maintaining a complete history of every change made, including who made it, when, and why.
- Rollbacks: Providing the ability to revert to any previous, working state of the code.
- Audit Trail: Providing accountability and traceability for compliance and debugging.
- Experimentation: Enabling risk-free experimentation through isolated branches.
Think of it as an "undo" button for your entire project, but one that works across time and collaborators.
Types of Version Control Systems¶
Local Version Control Systems¶
The simplest form of version control—copying files into another directory (perhaps with timestamps). This approach is incredibly error-prone:
- Easy to overwrite the wrong file
- No way to understand what changed between versions
- No collaboration support
Early tools like RCS (Revision Control System) kept patch sets on disk to reconstruct any file's state, but were limited to single files.
Centralized VCS (CVCS)¶
Systems like Subversion (SVN), Perforce, and CVS. They use a single, central server to store the versioned files, and clients check out a working copy.
Architecture:
┌─────────────────────────────────────────────────────────┐
│ Central Server │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Complete Repository │ │
│ │ - All versions │ │
│ │ - All history │ │
│ │ - All metadata │ │
│ └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
▲ ▲ ▲
│ │ │
Checkout Checkout Checkout
│ │ │
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│Client A │ │Client B │ │Client C │
│(Working │ │(Working │ │(Working │
│ Copy) │ │ Copy) │ │ Copy) │
└─────────┘ └─────────┘ └─────────┘
Advantages:
- Everyone knows what everyone else is doing (to a degree)
- Administrators have fine-grained control over permissions
- Simpler to manage than local VCS
Drawbacks:
- Single point of failure: If the central server fails, no one can commit changes
- Data loss risk: If the hard disk is corrupted without backups, you lose everything
- Network dependency: Operations require network connectivity
- Performance: Large operations (branching, history viewing) are slow over network
Distributed VCS (DVCS)¶
Git, Mercurial, Bazaar, and Darcs are examples. Clients don't just check out the latest snapshot; they mirror the entire repository (including its full history).
Architecture:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Developer A │ │ Developer B │ │ Developer C │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │ Complete │ │ │ │ Complete │ │ │ │ Complete │ │
│ │ Repository │ │◄───►│ │ Repository │ │◄───►│ │ Repository │ │
│ │ + History │ │ │ │ + History │ │ │ │ + History │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
│ ┌─────────────┐ │ │ ┌─────────────┐ │ │ ┌─────────────┐ │
│ │Working Copy │ │ │ │Working Copy │ │ │ │Working Copy │ │
│ └─────────────┘ │ │ └─────────────┘ │ │ └─────────────┘ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
▲ ▲ ▲
│ │ │
└─────────────────────┼───────────────────────┘
│
▼
┌─────────────────────┐
│ Remote Server │
│ (Central Reference) │
│ - Full Repository │
│ - Coordination Hub │
└─────────────────────┘
Advantages:
- Full redundancy: Every clone is a complete backup
- Offline work: Commit, branch, merge, and view history without network access
- Speed: Most operations are local
- Flexible workflows: Support for complex branching and merging patterns
- No single point of failure: Any client repository can restore the server
Git: The Distributed VCS¶
Git is the most widely used modern Distributed Version Control System (DVCS), created by Linus Torvalds in 2005 for Linux kernel development. Its distributed nature, speed, and powerful branching model are its key strengths.
Design Philosophy¶
Git was designed with specific goals in mind:
- Speed: Operations should be fast, especially for large projects like the Linux kernel
- Simple design: A clean internal model (content-addressable filesystem)
- Strong support for non-linear development: Thousands of parallel branches
- Fully distributed: Every developer has the complete history
- Able to handle large projects efficiently: Like the Linux kernel
Key Concepts in Git¶
Here are the core concepts that define how Git operates:
1. Repository (Repo)¶
The repository is the database containing all the files, the history of changes, and all the objects (commits, trees, blobs, tags). It's essentially the entire project, usually stored in a hidden folder named .git within your project directory.
There are two types of repositories:
- Bare repository: Contains only the `.git` directory contents (no working directory). Used on servers for sharing.
- Non-bare repository: Contains both the `.git` directory and a working directory. Used for development.
2. Snapshot Model (Not Delta/Diff-based)¶
Unlike older VCSs that might only store the differences (deltas) between files, Git stores data as a series of snapshots of the entire project every time you commit. If a file hasn't changed, Git simply stores a link to the previous identical file it has already stored. This makes retrieving any version very fast.
Comparison:
Delta-based VCS (SVN, CVS):
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│ V1 │───►│ Δ1→2 │───►│ Δ2→3 │───►│ Δ3→4 │
│(Full)│ │ │ │ │ │ │
└──────┘ └──────┘ └──────┘ └──────┘
To get V4: Apply V1 + Δ1→2 + Δ2→3 + Δ3→4 (slow!)
Snapshot-based (Git):
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│ V1 │ │ V2 │ │ V3 │ │ V4 │
│(Full)│ │(Full)│ │(Full)│ │(Full)│
└──────┘ └──────┘ └──────┘ └──────┘
To get V4: Just retrieve V4 (fast!)
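The snapshot idea can be illustrated with a toy content-addressed store. This is only a sketch under simplifying assumptions: `SnapshotStore` and its methods are hypothetical names, and the blob ID is a plain SHA-1 of the content rather than Git's real object format.

```python
import hashlib


class SnapshotStore:
    """Toy content-addressed store: each commit is a full path -> blob-id map."""

    def __init__(self):
        self.blobs = {}      # blob id -> file content
        self.snapshots = []  # one {path: blob id} mapping per commit

    def commit(self, files):
        snapshot = {}
        for path, content in files.items():
            blob_id = hashlib.sha1(content).hexdigest()
            # Identical content hashes to the same id, so it is stored once.
            self.blobs.setdefault(blob_id, content)
            snapshot[path] = blob_id
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def checkout(self, version):
        # No delta replay: each blob is looked up directly.
        return {p: self.blobs[b] for p, b in self.snapshots[version].items()}


store = SnapshotStore()
v1 = store.commit({"README.md": b"# Hello\n", "main.js": b"console.log(1)\n"})
v2 = store.commit({"README.md": b"# Hello\n", "main.js": b"console.log(2)\n"})
print(len(store.blobs))  # 3 blobs for 4 file versions: README.md is shared
```

Both snapshots list every file, yet the unchanged `README.md` costs no extra storage because the second snapshot simply points at the blob that already exists.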
3. The Three States (Working Directory, Staging Area, Repository)¶
Git manages your project files in three main states:
┌───────────────────────────────────────────────────────────────────┐
│ Your Computer │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌──────────────┐ │
│ │ Working │ │ Staging │ │ Git │ │
│ │ Directory │───►│ Area │───►│ Repository │ │
│ │ │add │ (Index) │commit│ (.git) │ │
│ │ - Your files │ │ - Next commit │ │ - History │ │
│ │ - Editable │ │ - Prepared │ │ - Permanent │ │
│ └─────────────────┘ └─────────────────┘ └──────────────┘ │
│ ▲ │ │
│ └─────────────────────────────────────────────┘ │
│ checkout │
└───────────────────────────────────────────────────────────────────┘
- Working Directory (Working Tree): The actual files you are currently seeing and editing on your disk. This is a single checkout of one version of the project.
- Staging Area (Index): An intermediary area where you prepare a snapshot of changes before committing them. This allows you to group related changes into a single, logical commit. Stored in `.git/index`.
- Git Directory (Repository): Where Git permanently stores the history of your changes as commits. This is the `.git` folder.
4. Commit¶
A commit is the primary way to record changes in Git. It is a snapshot of your project's files at a specific point in time.
Anatomy of a Commit:
┌─────────────────────────────────────────────────────┐
│ Commit Object │
│ │
│ SHA-1: a1b2c3d4e5f678901234... │
│ │
│ ┌───────────────────────────────────────────────┐ │
│ │ tree → 9f8e7d6c5b4a3210... │ │
│ │ parent → 0a1b2c3d4e5f6789... (or none) │ │
│ │ parent → (second parent for merges) │ │
│ │ author → "Alice <alice@example.com>" + ts │ │
│ │ committer→ "Bob <bob@example.com>" + ts │ │
│ │ │ │
│ │ Commit message describing the change │ │
│ └───────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
Each commit contains:
- A unique SHA-1 hash identifier (40 hexadecimal characters).
- A pointer to its parent commit(s) (creating a history chain).
- A pointer to a tree object (the snapshot of the files).
- Author: The person who originally wrote the work.
- Committer: The person who last applied the work (can differ after cherry-pick, rebase).
- Timestamps for both author and committer.
- A commit message describing the changes.
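The SHA-1 Hash: A commit's SHA-1 is computed over the commit object itself, which records:
- The ID of the root tree (which in turn covers the directory structure and, through blob IDs, every file's contents)
- The parent commit ID(s)
- The author/committer information, including timestamps
- The commit message
Because each commit embeds its parents' IDs, this creates a cryptographic chain: changing any commit in history changes the hash of every commit that descends from it.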
5. Branching¶
Branching is arguably Git's most powerful feature. A branch is simply a lightweight, movable pointer to one of the commits.
main
│
▼
○───○───○───○───○───○
▲
│
HEAD
After creating a feature branch:
main
│
▼
○───○───○───○───○───○
│
└───○───○───○
▲
│
feature ← HEAD
- The default branch is often called `main` or `master`.
- When you create a new branch, Git just creates a new pointer that points to the same commit the current branch is pointing to.
- When you commit on this new branch, the pointer moves forward, creating an isolated line of development.
- This isolation allows engineers to work on new features or bug fixes without affecting the stable `main` codebase.
How Branches are Stored:
Branches are simply files in .git/refs/heads/ containing the 40-character SHA-1 of the commit they point to.
$ cat .git/refs/heads/main
a1b2c3d4e5f6789012345678901234567890abcd
This is why branch operations in Git are so fast—they're just file operations.
6. HEAD¶
HEAD is a special pointer that tells Git which branch you're currently on. It's a symbolic reference that points to a branch (not directly to a commit, usually).
$ cat .git/HEAD
ref: refs/heads/main
Detached HEAD: When you check out a specific commit (not a branch), HEAD points directly to that commit. In this state, new commits don't belong to any branch; once you switch away they are reachable only through the reflog and can eventually be garbage-collected.
$ git checkout a1b2c3d
# Now in detached HEAD state
$ cat .git/HEAD
a1b2c3d4e5f6789012345678901234567890abcd
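The two HEAD forms shown above can be resolved with a few lines of file reading. The sketch below builds a fake `.git` layout in a temporary directory so it runs without Git; `resolve_head` is a hypothetical helper, and it assumes loose ref files only (it ignores `packed-refs` and other ref storage backends).

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir):
    """Return (ref_name_or_None, commit_sha) by reading HEAD and loose ref files."""
    head = (Path(git_dir) / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        ref = head[len("ref: "):]
        sha = (Path(git_dir) / ref).read_text().strip()
        return ref, sha
    return None, head  # detached HEAD: the file holds a commit id directly


# Build a fake .git layout so the example runs without Git installed.
fake_git = Path(tempfile.mkdtemp()) / ".git"
(fake_git / "refs" / "heads").mkdir(parents=True)
sha = "a1b2c3d4e5f6789012345678901234567890abcd"
(fake_git / "refs" / "heads" / "main").write_text(sha + "\n")

# Normal case: HEAD is a symbolic reference to a branch.
(fake_git / "HEAD").write_text("ref: refs/heads/main\n")
attached = resolve_head(fake_git)
print(attached)

# Detached case: HEAD contains the commit id itself.
(fake_git / "HEAD").write_text(sha + "\n")
detached = resolve_head(fake_git)
print(detached)
```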
7. Merging¶
Merging is the process of integrating changes from one branch into another.
Fast-Forward Merge: When the target branch hasn't diverged, Git simply moves the pointer forward.
Before:
main: ○───○───○
│
feature: └───○───○───○
After git merge feature:
main: ○───○───○───○───○───○
│
feature: ────────────────────┘
Three-Way Merge: When branches have diverged, Git creates a new merge commit with two parents.
Before:
main: ○───○───○───○───○
│
feature: └───○───○───○
After git merge feature:
main: ○───○───○───○───○───M (merge commit)
│ │
feature: └───○───○───┘
The three-way merge uses:
- Base: The common ancestor
- Ours: Current branch tip
- Theirs: Branch being merged
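Whether a merge can fast-forward comes down to an ancestry check on the commit graph. A minimal sketch, assuming the history is given as a hypothetical `parents` dictionary (commit name to list of parent names) rather than read from a real repository:

```python
def ancestors(parents, commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def can_fast_forward(parents, target_tip, source_tip):
    # A fast-forward is possible when the target tip is already part of
    # the source branch's history.
    return target_tip in ancestors(parents, source_tip)


# Hypothetical history: c1 <- c2 <- c3 (main), c3 <- f1 <- f2 (feature)
parents = {"c1": [], "c2": ["c1"], "c3": ["c2"], "f1": ["c3"], "f2": ["f1"]}
before_divergence = can_fast_forward(parents, "c3", "f2")
print(before_divergence)  # True

# After main gains its own commit c4, the branches have diverged.
parents["c4"] = ["c3"]
after_divergence = can_fast_forward(parents, "c4", "f2")
print(after_divergence)  # False
```

When the check fails, Git falls back to a three-way merge and records a merge commit.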
8. Rebasing¶
Rebasing is an alternative to merging that rewrites history by moving commits to a new base.
Before:
main: ○───○───○───○───○
│
feature: └───○───○───○
A B C
After git rebase main (on feature branch):
main: ○───○───○───○───○
│
feature: └───○'───○'───○'
A' B' C'
Rebase vs Merge:
| Aspect | Merge | Rebase |
|---|---|---|
| History | Preserves complete graph | Creates linear history |
| Commits | Creates merge commit | Rewrites commit SHAs |
| Conflicts | Resolve once | May resolve per commit |
| Safety | Safe for shared branches | Dangerous for shared |
| Traceability | Shows when merge happened | Hides merge point |
Golden Rule of Rebasing: Never rebase commits that have been pushed to a public/shared repository.
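The reason behind the rule is that a replayed commit gets a new ID even when its message and changes are identical, because the ID depends on the parent. The toy model below shows this; `commit_id` and `replay` are hypothetical helpers, and the ID is a simplified hash of parent plus message rather than a real commit object.

```python
import hashlib


def commit_id(parent_id, message):
    # Toy identifier: real Git hashes the full commit object, which also
    # names the parent, so the same dependency holds.
    return hashlib.sha1(f"{parent_id}\n{message}".encode()).hexdigest()


def replay(base_id, messages):
    """Re-create a chain of commits on top of base_id; returns the new ids."""
    ids, parent = [], base_id
    for msg in messages:
        parent = commit_id(parent, msg)
        ids.append(parent)
    return ids


messages = ["A", "B", "C"]
original = replay("old-base", messages)
rebased = replay("new-base", messages)
print(all(o != r for o, r in zip(original, rebased)))  # True
```

Anyone who already based work on the original IDs now refers to commits that are no longer on the branch, which is what makes rebasing shared history disruptive.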
Interactive Rebase:
git rebase -i allows you to modify commits before replaying them:
- pick: Use commit as-is
- reword: Change commit message
- edit: Stop to amend the commit
- squash: Combine with previous commit, keep message
- fixup: Combine with previous commit, discard message
- drop: Remove commit entirely
- Reordering: Not a keyword; change commit order by rearranging the lines in the todo list
Essential Git Commands¶
Basic Workflow Commands¶
| Command | Purpose |
|---|---|
| `git init` | Creates a new Git repository in the current directory. |
| `git clone [url]` | Downloads a complete copy of an existing remote repository. |
| `git status` | Shows which files are modified, staged, or untracked. |
| `git add [file]` | Moves changes from the Working Directory to the Staging Area. |
| `git add -p` | Interactively stage hunks of changes (partial staging). |
| `git commit -m "msg"` | Permanently records the staged snapshot into the repository. |
| `git commit --amend` | Modify the most recent commit (message or content). |
| `git push` | Uploads local branch commits to the remote repository. |
| `git pull` | Fetches changes and merges them into the current branch. |
| `git fetch` | Downloads changes from remote without merging. |
Branching Commands¶
| Command | Purpose |
|---|---|
| `git branch` | Lists all local branches. |
| `git branch [name]` | Creates a new branch pointer. |
| `git branch -d [name]` | Deletes a branch (safe, checks if merged). |
| `git branch -D [name]` | Force deletes a branch. |
| `git checkout [name]` | Switches to a different branch. |
| `git checkout -b [name]` | Creates and switches to a new branch. |
| `git switch [name]` | Modern command to switch branches. |
| `git switch -c [name]` | Modern command to create and switch. |
| `git merge [name]` | Integrates changes from the specified branch. |
| `git rebase [branch]` | Reapplies commits on top of another base. |
History and Inspection¶
| Command | Purpose |
|---|---|
| `git log` | Shows commit history. |
| `git log --oneline` | Compact one-line-per-commit view. |
| `git log --graph` | ASCII graph of branch structure. |
| `git log -p` | Shows patches (diffs) for each commit. |
| `git diff` | Shows unstaged changes. |
| `git diff --staged` | Shows staged changes. |
| `git diff [branch1] [branch2]` | Compares two branches. |
| `git show [commit]` | Shows details of a specific commit. |
| `git blame [file]` | Shows who changed each line and when. |
Undoing Changes¶
| Command | Purpose |
|---|---|
| `git restore [file]` | Discards changes in working directory. |
| `git restore --staged [file]` | Unstages a file (keeps changes in working directory). |
| `git reset [commit]` | Moves HEAD to commit and resets the staging area; keeps changes in the working directory (default `--mixed`). |
| `git reset --soft [commit]` | Moves HEAD only, keeps staging area and working directory. |
| `git reset --hard [commit]` | Moves HEAD, discards staging area and working directory changes. |
| `git revert [commit]` | Creates new commit that undoes specified commit. |
| `git clean -fd` | Removes untracked files and directories. |
Remote Operations¶
| Command | Purpose |
|---|---|
| `git remote -v` | Lists all remotes with URLs. |
| `git remote add [name] [url]` | Adds a new remote. |
| `git remote remove [name]` | Removes a remote. |
| `git push -u origin [branch]` | Pushes and sets upstream tracking. |
| `git push --force-with-lease` | Force push with safety check. |
| `git pull --rebase` | Fetches and rebases instead of merging. |
Advanced Commands¶
| Command | Purpose |
|---|---|
| `git cherry-pick [commit]` | Applies a specific commit to current branch. |
| `git stash` | Temporarily stores modified tracked files. |
| `git stash pop` | Restores most recently stashed changes. |
| `git bisect` | Binary search to find commit that introduced bug. |
| `git reflog` | Shows history of HEAD movements (recovery tool). |
| `git tag [name]` | Creates a tag at current commit. |
| `git worktree add [path] [branch]` | Creates additional working directory. |
Git Internals Deep Dive¶
At its core, Git is more than a version control tool: it is a content-addressable filesystem whose objects link together into a Merkle Directed Acyclic Graph (DAG).
1. The Core Abstraction: The Object Database¶
Git does not store "changes" or "deltas" as its primary unit of storage. It stores Objects. When you run git add and git commit, you are creating objects in the .git/objects directory.
The database is a key-value store where:
- Key: The SHA-1 hash (40-character hexadecimal) of a short header followed by the content.
- Value: The compressed data (zlib).
Git has only four object types:
A. The Blob (Binary Large Object)¶
- Concept: Represents the content of a file.
- Data: It contains the file data only. It does not contain the filename, timestamps, or permissions.
- Implication: If you have two identical files named `A.txt` and `B.txt` in different folders, Git stores only one blob. This provides automatic deduplication.
How the SHA-1 is computed:
$ echo "Hello World" | git hash-object --stdin
557db03de997c86a4a028e1ebd3a1ceb225be238
# The actual data stored is:
# "blob " + content_length + "\0" + content
# SHA-1("blob 12\0Hello World\n") = 557db03...
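The same computation can be reproduced without Git, which makes the header format concrete. This is a short sketch; `git_blob_hash` is a hypothetical helper name, and it covers SHA-1 repositories only.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Hash content the way Git hashes a blob: header, NUL byte, then the data."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


blob_id = git_blob_hash(b"Hello World\n")
print(blob_id)
# 557db03de997c86a4a028e1ebd3a1ceb225be238

# A loose object for this id would live under objects/<first 2 chars>/<remaining 38>.
print(f".git/objects/{blob_id[:2]}/{blob_id[2:]}")
```

Note that the trailing newline added by `echo` is part of the content, which is why the length in the header is 12 and not 11.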
B. The Tree¶
- Concept: Represents a directory.
- Data: A list of pointers (SHA-1s) to Blobs or other Trees.
- Metadata: It stores the filenames, permissions (e.g., `100644` for RW, `100755` for executable), and the type (blob or tree).
- Structure: This is what reconstructs your folder hierarchy.
Example Tree Structure:
100644 blob a1b2c3... README.md
100644 blob d4e5f6... package.json
040000 tree 789abc... src
040000 tree def012... tests
Tree modes:
| Mode | Meaning |
|---|---|
| `100644` | Regular file |
| `100755` | Executable file |
| `120000` | Symbolic link |
| `040000` | Directory (tree) |
| `160000` | Submodule (gitlink) |
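A tree object's ID can be derived the same way as a blob's, from a header plus a binary body. The sketch below assumes a directory containing only files (real Git orders subdirectory entries slightly differently and writes their mode as `40000`); `git_tree_hash` is a hypothetical helper name.

```python
import hashlib


def git_tree_hash(entries):
    """entries: list of (mode, name, hex object id) for files in one directory."""
    body = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1].encode()):
        # Each entry is "<mode> <name>", a NUL byte, then the 20 raw bytes of the id.
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    header = b"tree " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


hello_blob = "557db03de997c86a4a028e1ebd3a1ceb225be238"
tree_id = git_tree_hash([("100644", "hello.txt", hello_blob)])
print(tree_id)

# The empty tree has a well-known id in every SHA-1 repository.
print(git_tree_hash([]))  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

Because the filename lives in the tree and not in the blob, renaming `hello.txt` changes the tree's ID while the blob's ID stays the same.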
C. The Commit¶
- Concept: A snapshot of the project at a specific time.
- Data:
- Tree Pointer: A single SHA-1 pointing to the root Tree object (the top-level project folder).
- Parent(s): SHA-1 of the previous commit(s). Zero parents for initial commit, one for normal, two+ for merges.
- Metadata: Author, Committer, Timestamp, Message.
- Implication: Because a commit points to a specific Tree, and that Tree points to specific Blobs, a commit effectively freezes the entire state of the project.
Raw Commit Structure:
tree 9f8e7d6c5b4a3210fedcba9876543210fedcba98
parent a1b2c3d4e5f6789012345678901234567890abcd
author Alice <alice@example.com> 1234567890 -0500
committer Bob <bob@example.com> 1234567891 -0500

Add user authentication feature

This commit implements JWT-based authentication
with refresh token support.
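To make the link between this raw text and the commit ID concrete, the sketch below serializes the fields (header lines, one blank line, then the message) and hashes the result with a `commit` header. `git_commit_hash` is a hypothetical helper; it assumes an unsigned commit with no extra headers, and the input values are the placeholder IDs from the example above.

```python
import hashlib


def git_commit_hash(tree, parents, author, committer, message):
    """Serialize a commit object and return its SHA-1 id."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    # A blank line separates the header fields from the message.
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


fields = dict(
    tree="9f8e7d6c5b4a3210fedcba9876543210fedcba98",
    parents=["a1b2c3d4e5f6789012345678901234567890abcd"],
    author="Alice <alice@example.com> 1234567890 -0500",
    committer="Bob <bob@example.com> 1234567891 -0500",
)
first = git_commit_hash(message="Add user authentication feature\n", **fields)
second = git_commit_hash(message="Add user authentication\n", **fields)
print(first != second)  # True: editing only the message yields a new commit id
```

The same holds for every other field, including the timestamps, which is why amending a commit always produces a different ID.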
D. The Tag¶
- Concept: A persistent pointer to a specific commit.
- Types:
- Lightweight tag: Just a reference (like a branch that doesn't move). No object created.
- Annotated tag: Creates a tag object with metadata.
- Data (annotated): Contains the SHA-1 of the target object, tagger name, date, message, and optionally a GPG signature.
Tag Object Structure:
object a1b2c3d4e5f6789012345678901234567890abcd
type commit
tag v1.0.0
tagger Alice <alice@example.com> 1234567890 -0500

Release version 1.0.0

This is the first stable release.
-----BEGIN PGP SIGNATURE-----
...
-----END PGP SIGNATURE-----
2. The Internal Data Model: Merkle DAG¶
Git connects these objects into a Directed Acyclic Graph (DAG).
┌────────────────┐
│ Commit C3 │
│ tree: T3 │
│ parent: C2 │
└────────┬───────┘
│
┌────────▼───────┐
│ Commit C2 │
│ tree: T2 │
│ parent: C1 │
└────────┬───────┘
│
┌────────▼───────┐
│ Commit C1 │
│ tree: T1 │
│ parent: (none) │
└────────┬───────┘
│
┌────────▼───────┐
│ Tree T1 │
│ README.md → B1 │
│ src/ → T1a │
└───┬────────┬───┘
│ │
┌────────▼──┐ ┌──▼────────┐
│ Blob B1 │ │ Tree T1a │
│"# Hello" │ │main.js→B2 │
└───────────┘ └─────┬─────┘
│
┌─────▼─────┐
│ Blob B2 │
│"console." │
└───────────┘
- Directed: Links go one way (Child Commit → Parent Commit).
- Acyclic: You cannot loop back to a parent.
- Merkle Property: The ID of every node (Commit) depends cryptographically on its contents and the IDs of its parents.
- If you change a single byte in a file committed 10 years ago, that file's Blob hash changes.
- The Tree containing that file changes hash.
- The Commit pointing to that tree changes hash.
- All subsequent child Commits change hashes.
- Result: This makes Git history immutable and tamper-evident.
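The propagation described above can be demonstrated with a toy history. This is a simplified sketch: `object_id` and `build_history` are hypothetical helpers, the tree and commit bodies are abbreviated text rather than Git's real binary formats, and all three commits share one tree for brevity.

```python
import hashlib


def object_id(kind, body):
    """Hash a '<kind> <size>' header, a NUL byte, and the body."""
    data = body.encode()
    return hashlib.sha1(f"{kind} {len(data)}\0".encode() + data).hexdigest()


def build_history(readme_text):
    """Return [c1, c2, c3] commit ids for a toy three-commit history."""
    blob = object_id("blob", readme_text)
    tree = object_id("tree", f"README.md {blob}")  # simplified tree body
    commits, parent = [], ""
    for message in ["first", "second", "third"]:
        parent = object_id("commit", f"tree {tree}\nparent {parent}\n\n{message}")
        commits.append(parent)
    return commits


honest = build_history("# Hello\n")
tampered = build_history("# Hellp\n")  # one byte differs in the oldest file
print([a == b for a, b in zip(honest, tampered)])  # [False, False, False]
```

Comparing only the newest commit ID is therefore enough to detect a change anywhere in the history behind it.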
3. Anatomy of the .git Directory¶
When you run git init, this directory is created. Here is what matters inside:
.git/
├── HEAD # Current branch pointer
├── config # Repository-specific configuration
├── description # Used by GitWeb (rarely used)
├── hooks/ # Client-side and server-side scripts
│ ├── pre-commit.sample
│ ├── pre-push.sample
│ └── ...
├── index # The staging area (binary)
├── info/
│ └── exclude # Local ignore patterns (not shared)
├── objects/ # Object database
│ ├── 00/
│ ├── 01/
│ ├── ...
│ ├── ff/
│ ├── info/
│ └── pack/ # Packfiles for efficiency
├── refs/ # References (branches, tags, remotes)
│ ├── heads/ # Local branches
│ │ └── main
│ ├── remotes/ # Remote-tracking branches
│ │ └── origin/
│ │ └── main
│ └── tags/ # Tags
└── logs/ # Reflog data
├── HEAD
└── refs/
Key Components:
| Path | Purpose |
|---|---|
| `objects/` | The database. Contains subfolders `00`–`ff` (first 2 chars of SHA-1) or `pack/` (compressed archives). |
| `refs/` | Where "names" are stored. `refs/heads/main` is literally a text file containing the 40-char SHA-1 of the latest commit. |
| `HEAD` | A text file containing a symbolic reference to the current branch (e.g., `ref: refs/heads/main`). |
| `index` | The Staging Area. A binary file acting as a cache. It stores a sorted list of all files, their SHA-1s, and their OS stat data (timestamps, inode, etc.). |
| `config` | Repository-specific Git configuration (overrides global `~/.gitconfig`). |
| `hooks/` | Scripts that run on specific Git events. |
The Role of the "Index" (Staging Area)¶
The index is a crucial optimization. When you run git status, Git does not re-hash every file in your project to check for changes (that would be too slow).
Instead, it compares the file's OS metadata (modification time, size) against the data cached in the index. Only if the metadata differs does it re-read and re-hash the file.
Index structure (simplified):
# Binary format containing:
- Number of entries
- For each entry:
- ctime, mtime (cached stat data)
- dev, ino (device and inode)
- mode (file permissions)
- uid, gid (owner)
- file size
- SHA-1 of the blob
- flags (stage, name length)
- file path
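The stat-based shortcut can be sketched as a small cache keyed by path. This is an illustration only: `StatCache` is a hypothetical class that tracks just modification time and size, whereas the real index records more fields and handles edge cases such as files modified within the same timestamp tick.

```python
import hashlib
import os
import tempfile


class StatCache:
    """Toy index: path -> (mtime_ns, size, blob id)."""

    def __init__(self):
        self.entries = {}
        self.rehash_count = 0

    def blob_id(self, path):
        st = os.stat(path)
        cached = self.entries.get(path)
        if cached and cached[:2] == (st.st_mtime_ns, st.st_size):
            return cached[2]  # stat data unchanged: trust the cached id
        with open(path, "rb") as f:
            content = f.read()
        self.rehash_count += 1
        digest = hashlib.sha1(b"blob %d\0" % len(content) + content).hexdigest()
        self.entries[path] = (st.st_mtime_ns, st.st_size, digest)
        return digest


path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "wb") as f:
    f.write(b"Hello World\n")

cache = StatCache()
first = cache.blob_id(path)
second = cache.blob_id(path)   # served from the cache, file is not re-read
print(cache.rehash_count)      # 1
```

Only when the file's size or modification time differs from the cached values does the sketch fall back to reading and hashing the content again.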
4. Advanced Mechanics: Packfiles & Delta Compression¶
If Git stored every version of every file as a full blob, the repo would explode in size. To solve this, Git uses Packfiles.
Loose Objects vs Packed Objects:
- Loose Objects: Initially, commits create individual files on disk (one file per object in `objects/xx/`).
- Packing (`git gc`): Occasionally, Git gathers these loose objects and packs them into a single binary file (`.pack`) with a corresponding index (`.idx`).
- Delta Compression: Inside a packfile, Git uses heuristics to find similar files (even with different names) and stores one as a base and the other as a "delta" (diff).
Pack File Structure:
┌────────────────────────────────────────┐
│ Pack File │
├────────────────────────────────────────┤
│ Header: "PACK" + version + object count │
├────────────────────────────────────────┤
│ Object 1: type + size + data │
│ Object 2: type + size + delta_base + Δ │
│ Object 3: type + size + data │
│ ... │
├────────────────────────────────────────┤
│ Trailer: SHA-1 of pack contents │
└────────────────────────────────────────┘
┌────────────────────────────────────────┐
│ Index File (.idx) │
├────────────────────────────────────────┤
│ Fan-out table (256 entries) │
│ SHA-1 list (sorted) │
│ CRC32 checksums │
│ Pack file offsets │
│ Large offsets (if needed) │
│ Pack checksum + Index checksum │
└────────────────────────────────────────┘
Delta Chain:
Base Object (full content)
└── Delta 1 (changes from base)
└── Delta 2 (changes from delta 1)
└── Delta 3 (changes from delta 2)
Note: Git often stores the newest version as the full file and the older versions as reverse-deltas. This ensures that checking out the latest code (the most common operation) is fastest.
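A delta can be thought of as a list of "copy this range from the base" and "insert these new bytes" instructions, and a chain is resolved by applying the deltas in order. The sketch below is loosely modeled on that idea; the tuple representation and the names `apply_delta` and `resolve_chain` are assumptions made for illustration, not Git's binary encoding.

```python
def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild an object from its base using copy/insert instructions."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:  # ("insert", data)
            out += op[1]
    return bytes(out)


def resolve_chain(base, deltas):
    """Apply each delta to the result of the previous one."""
    current = base
    for ops in deltas:
        current = apply_delta(current, ops)
    return current


base = b"line one\nline two\nline three\n"
# Keep the first and last lines, replace the middle one.
delta1 = [("copy", 0, 9), ("insert", b"line 2\n"), ("copy", 18, 11)]
# Keep the first two lines of the previous result, replace the last one.
delta2 = [("copy", 0, 16), ("insert", b"line 3\n")]

print(resolve_chain(base, [delta1]))
print(resolve_chain(base, [delta1, delta2]))
```

Longer chains save space but cost more work to reconstruct, which is the trade-off Git's packing heuristics try to balance.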
Triggering Garbage Collection:
# Manual garbage collection
git gc
# Aggressive (more thorough, slower)
git gc --aggressive
# Check repository statistics
git count-objects -v
5. Merge Algorithms: "Recursive" vs "ORT"¶
When you merge two branches (say, feature into main), Git performs a Three-Way Merge. It identifies three points:
- Tip A (Ours): The latest commit on `main`.
- Tip B (Theirs): The latest commit on `feature`.
- Base: The "Lowest Common Ancestor" (LCA) in the DAG.
Base
│
┌────┴────┐
│ │
▼ ▼
Tip A Tip B
(main) (feature)
The Merge Algorithm:
- Find the common ancestor (merge base)
- Compare each file in Base vs A, and Base vs B
- For each file:
- If only A changed: use A's version
- If only B changed: use B's version
- If both changed same way: use that version
- If both changed differently: CONFLICT
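The decision rules above translate directly into code. The sketch below works at whole-file granularity, which is a simplification: real Git also merges non-overlapping changes inside a single file. `merge_file` and `merge_trees` are hypothetical helpers, and each version of the project is modeled as a path-to-content dictionary.

```python
def merge_file(base, ours, theirs):
    """Return (content, conflict) for one file; None means the file is absent."""
    if ours == theirs:
        return ours, False    # both sides agree (including both unchanged)
    if ours == base:
        return theirs, False  # only theirs changed
    if theirs == base:
        return ours, False    # only ours changed
    return None, True         # both changed differently


def merge_trees(base, ours, theirs):
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        content, conflict = merge_file(base.get(path), ours.get(path), theirs.get(path))
        if conflict:
            conflicts.append(path)
        elif content is not None:
            merged[path] = content
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1", "c.txt": "2"}
theirs = {"a.txt": "1", "b.txt": "3", "c.txt": "3"}
merged, conflicts = merge_trees(base, ours, theirs)
print(merged)     # {'a.txt': '2', 'b.txt': '3'}
print(conflicts)  # ['c.txt']
```

Without the base version the merge could not tell which side made a change, which is why a two-way comparison of the tips alone is not enough.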
The "Recursive" Strategy (Legacy Default): If the graph has "criss-cross" merges, there might be multiple valid common ancestors. The recursive strategy merges the ancestors first to create a "virtual" merge base, then uses that to merge A and B.
Criss-cross merge scenario:
○───○───○───○ (main)
\ / \ /
X X
/ \ / \
○───○───○───○ (feature)
Multiple common ancestors!
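Finding the merge base, and seeing why a criss-cross history yields more than one, can be sketched with plain set operations. The `parents` dictionary and commit names below are hypothetical, and the approach is a straightforward (not optimized) reading of "common ancestors that are not ancestors of another common ancestor".

```python
def ancestors(parents, commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def merge_bases(parents, a, b):
    common = ancestors(parents, a) & ancestors(parents, b)
    # Keep only the "best" candidates: drop any common ancestor that is
    # itself an ancestor of another common ancestor.
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}


# Simple fork: a single merge base.
simple = {"r": [], "m1": ["r"], "f1": ["r"]}
print(merge_bases(simple, "m1", "f1"))  # {'r'}

# Criss-cross: each branch has merged the other's earlier commit.
criss_cross = {"r": [], "m1": ["r"], "f1": ["r"],
               "m2": ["m1", "f1"], "f2": ["f1", "m1"]}
print(sorted(merge_bases(criss_cross, "m2", "f2")))  # ['f1', 'm1']
```

With two equally good bases, a merge strategy has to decide how to combine them, which is the problem the recursive strategy addresses by merging the bases first.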
The "ORT" Strategy (Current Default since Git 2.34): Stands for "Ostensibly Recursive's Twin". It is a complete rewrite of the recursive strategy that handles edge cases (like complex renames) correctly and runs significantly faster by avoiding the creation of temporary objects on disk during the calculation.
ORT Improvements:
- Better rename detection
- Better directory rename handling
- Faster: operations happen in memory
- Cleaner conflict markers
- More correct handling of edge cases
6. Network Protocols: The Negotiation¶
When you git push or git fetch, Git does not just copy files. It negotiates changes using the Smart Protocol.
Protocol Types:
| Protocol | URL Format | Features |
|---|---|---|
| SSH | `git@github.com:user/repo` | Secure, authenticated, efficient |
| HTTPS | `https://github.com/user/repo` | Firewall-friendly, authenticated |
| Git | `git://github.com/user/repo` | Fast, no auth (read-only usually) |
| File | `/path/to/repo` or `file://` | Local, for testing |
Smart Protocol Negotiation:
┌──────────────┐ ┌──────────────┐
│ Client │ │ Server │
└──────┬───────┘ └──────┬───────┘
│ │
│ 1. Reference Discovery │
│ ─────────────────────────────────>│
│ │
│ 2. Server sends refs + capabilities
│ <─────────────────────────────────│
│ │
│ 3. Client sends "wants" (commits) │
│ ─────────────────────────────────>│
│ │
│ 4. Client sends "haves" (existing)│
│ ─────────────────────────────────>│
│ │
│ 5. Server determines missing objects
│ │
│ 6. Server sends packfile │
│ <─────────────────────────────────│
│ │
Detailed Steps:
- Reference Discovery: The server lists its refs (branches/tags) and their SHA-1s.
- "Haves" and "Wants":
- The client calculates what it wants to send (for push) or receive (for fetch).
- The client says: "I want to send commit `C`."
- The server says: "I already have commit `A` (which is an ancestor of `C`)."
- Graph Traversal: Both sides traverse their local DAGs to find the minimal set of objects required to bridge the gap between "Have" and "Want".
- Packfile Generation: The sender generates a custom packfile on the fly containing only the missing objects and streams it over the network.
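The outcome of the negotiation is a set difference on the commit graph: everything reachable from the "wants" minus everything reachable from the "haves". A minimal sketch under stated assumptions: `reachable` and `commits_to_send` are hypothetical helpers, the graph is a plain dictionary, and only commits are considered (a real packfile also carries the trees and blobs those commits need, and the real exchange proceeds in rounds).

```python
def reachable(parents, tips):
    """All commits reachable from any of the given tips."""
    seen, stack = set(), list(tips)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen


def commits_to_send(parents, wants, haves):
    # Everything the receiver asked for, minus what it already has.
    return reachable(parents, wants) - reachable(parents, haves)


# Hypothetical history: A <- B <- C, plus a side branch A <- D
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"]}
print(sorted(commits_to_send(parents, wants=["C"], haves=["A"])))  # ['B', 'C']
print(sorted(commits_to_send(parents, wants=["C"], haves=["D"])))  # ['B', 'C']
```

In the second call the receiver has never seen `B` or `C`, but because it has `D` it must also have `A`, so `A` is not resent.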
Protocol v2 (Modern): Git protocol v2 (introduced in Git 2.18) adds:
- Server-side filtering (partial clone)
- Reduced round trips
- Better capability negotiation
- Support for fetch-by-SHA-1
7. Plumbing vs. Porcelain¶
Git is built in two layers. You mostly use the Porcelain (user-friendly commands). The Plumbing (low-level commands) exposes the internal machinery.
Porcelain Commands (user-facing):
- `git add`, `git commit`, `git push`, `git pull`
- `git branch`, `git checkout`, `git merge`
- `git log`, `git diff`, `git status`
Plumbing Commands (low-level):
| Command | Purpose |
|---|---|
| `git hash-object` | Compute object ID and optionally create blob |
| `git cat-file` | Examine object type, size, or content |
| `git ls-tree` | List contents of a tree object |
| `git ls-files` | Show information about files in index |
| `git read-tree` | Read tree into the index |
| `git write-tree` | Create a tree object from the index |
| `git commit-tree` | Create a commit object |
| `git update-ref` | Update a reference safely |
| `git symbolic-ref` | Read/modify symbolic refs |
| `git rev-parse` | Parse revision specifications |
| `git rev-list` | List commit objects in reverse chronological order |
| `git diff-tree` | Compare two trees |
| `git merge-base` | Find common ancestor(s) |
| `git for-each-ref` | Iterate over references |
| `git update-index` | Modify the index directly |
Hands-On: Creating a Commit Manually:
# Create a directory and init
mkdir git-deep-dive && cd git-deep-dive && git init
# Create a blob manually (Plumbing)
echo "Hello World" | git hash-object -w --stdin
# Output: 557db03de997c86a4a028e1ebd3a1ceb225be238
# This file now exists in .git/objects/55/7db03...
# Read the content back using the hash (Plumbing)
git cat-file -p 557db03de997c86a4a028e1ebd3a1ceb225be238
# Output: Hello World
# Check object type
git cat-file -t 557db03de997c86a4a028e1ebd3a1ceb225be238
# Output: blob
# Create a tree manually: register the blob in the index, then write the index out as a tree
git update-index --add --cacheinfo 100644 \
  557db03de997c86a4a028e1ebd3a1ceb225be238 hello.txt
TREE=$(git write-tree)
echo "$TREE"
# Output: (tree SHA-1)
# Create a commit manually (requires user.name and user.email to be configured)
COMMIT=$(echo "Initial commit" | git commit-tree "$TREE")
echo "$COMMIT"
# Output: (commit SHA-1)
# Point a branch to this commit
git update-ref refs/heads/main "$COMMIT"
Git Configuration¶
Git configuration exists at three levels:
- System (`/etc/gitconfig`): Applies to all users
- Global (`~/.gitconfig` or `~/.config/git/config`): User-specific
- Local (`.git/config`): Repository-specific (highest priority)
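The precedence between the three levels behaves like a layered lookup in which the most specific level that defines a key wins. The sketch below models this with `collections.ChainMap`; the dictionaries and their values are made up for illustration, and multi-valued keys are left out.

```python
from collections import ChainMap

# Hypothetical settings at each level.
system_cfg = {"core.editor": "vi", "color.ui": "auto"}
global_cfg = {"core.editor": "vim", "user.name": "Your Name"}
local_cfg = {"user.name": "Project Bot"}

# Lookups consult the local level first, then global, then system.
effective = ChainMap(local_cfg, global_cfg, system_cfg)

print(effective["user.name"])    # Project Bot (local overrides global)
print(effective["core.editor"])  # vim (global overrides system)
print(effective["color.ui"])     # auto (only set at system level)
```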
Essential Configuration¶
# Identity (required for commits)
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
# Default branch name
git config --global init.defaultBranch main
# Default editor (set one; a later value overrides an earlier one)
git config --global core.editor "code --wait" # VS Code
git config --global core.editor "vim" # Vim
# Line ending handling (set the one matching your OS)
git config --global core.autocrlf input # macOS/Linux
git config --global core.autocrlf true # Windows
# Color output
git config --global color.ui auto
# Push behavior
git config --global push.default current
git config --global push.autoSetupRemote true # Git 2.37+
# Pull behavior
git config --global pull.rebase true # Rebase instead of merge
# Diff tool
git config --global diff.tool vscode
git config --global difftool.vscode.cmd 'code --wait --diff $LOCAL $REMOTE'
# Merge tool
git config --global merge.tool vscode
git config --global mergetool.vscode.cmd 'code --wait $MERGED'
# Aliases (shortcuts)
git config --global alias.st status
git config --global alias.co checkout
git config --global alias.br branch
git config --global alias.ci commit
git config --global alias.unstage 'reset HEAD --'
git config --global alias.last 'log -1 HEAD'
git config --global alias.lg "log --oneline --graph --all"
git config --global alias.amend 'commit --amend --no-edit'
# View all configuration
git config --list --show-origin
.gitignore¶
The .gitignore file specifies intentionally untracked files to ignore.
Pattern Syntax:
# Comments start with #
# Blank lines are ignored
# Ignore specific file
secret.txt
# Ignore files with extension
*.log
*.tmp
# Ignore directory
node_modules/
__pycache__/
.venv/
# Ignore files in any subdirectory
**/*.pyc
# Negate pattern (don't ignore)
!important.log
# Ignore only in root directory
/build/
# Ignore files starting with pattern
temp*
# Ignore files matching character class (*.o and *.a files)
*.[oa]
Global .gitignore:
git config --global core.excludesfile ~/.gitignore_global
.gitattributes¶
Controls how Git handles specific files.
# Text files - normalize line endings
*.txt text
*.md text
*.py text eol=lf
# Binary files - don't diff, don't merge
*.png binary
*.jpg binary
*.pdf binary
# Custom diff drivers
*.json diff=json
*.lockb binary diff=lockb
# LFS tracking
*.psd filter=lfs diff=lfs merge=lfs -text
# Merge strategies (requires a merge driver named "ours", e.g. git config merge.ours.driver true)
database.xml merge=ours
# Export ignore (not included in archives)
.gitattributes export-ignore
.gitignore export-ignore
# Linguist (GitHub language statistics)
vendor/* linguist-vendored
docs/* linguist-documentation
Git Hooks¶
Git hooks are scripts that run automatically on specific events. They enable automation, enforcement of standards, and integration with external systems.
Hook Locations¶
- Client-side hooks: `.git/hooks/` (not shared via clone)
- Server-side hooks: In bare repos, or through hosting platforms
Client-Side Hooks¶
| Hook | Trigger | Common Use |
|---|---|---|
| `pre-commit` | Before commit is created | Linting, formatting, tests |
| `prepare-commit-msg` | After default message, before editor | Modify commit message template |
| `commit-msg` | After user enters message | Validate commit message format |
| `post-commit` | After commit is created | Notifications |
| `pre-push` | Before push to remote | Run tests, prevent force push |
| `pre-rebase` | Before rebase starts | Prevent rebasing published commits |
| `post-checkout` | After checkout completes | Update dependencies |
| `post-merge` | After merge completes | Update dependencies, rebuild |
Server-Side Hooks¶
| Hook | Trigger | Common Use |
|---|---|---|
| pre-receive | Before any refs are updated | Enforce policies, reject bad pushes |
| update | Per-ref, before update | Per-branch policies |
| post-receive | After all refs updated | CI/CD triggers, notifications |
Example Hooks¶
pre-commit (run linter):
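#!/bin/sh
# .git/hooks/pre-commit
# Run ESLint on staged JS files (added, copied, or modified)
STAGED_FILES=$(git diff --cached --name-only --diff-filter=ACM | grep '\.js$')
if [ -n "$STAGED_FILES" ]; then
    # $STAGED_FILES is left unquoted so each path becomes its own argument;
    # this breaks on paths that contain spaces
    npx eslint $STAGED_FILES
    if [ $? -ne 0 ]; then
        echo "ESLint failed. Commit aborted."
        exit 1
    fi
fi
exit 0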
commit-msg (enforce conventional commits):
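#!/bin/sh
# .git/hooks/commit-msg
# $1 is the path to the file that holds the commit message
commit_regex='^(feat|fix|docs|style|refactor|test|chore)(\(.+\))?: .{1,50}$'
# Check only the subject (first) line; the trailing $ enforces the 50-character limit
if ! head -n 1 "$1" | grep -qE "$commit_regex"; then
    echo "ERROR: Commit message does not follow Conventional Commits format."
    echo "Example: feat(auth): add login functionality"
    exit 1
fi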
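Hooks can be any executable, so the same subject-line check could also be written in Python. The sketch below is one possible version, not a standard implementation: the names `SUBJECT_RE` and `is_conventional` are hypothetical, and it assumes the intent is to validate only the first line and to cap the description at 50 characters.

```python
import re

# Same type list as the shell hook; fullmatch() anchors both ends,
# so the 50-character cap on the description is enforced.
SUBJECT_RE = re.compile(
    r"(feat|fix|docs|style|refactor|test|chore)(\(.+\))?: .{1,50}"
)


def is_conventional(message):
    """Return True if the first line of `message` matches the format."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    return SUBJECT_RE.fullmatch(subject) is not None


print(is_conventional("feat(auth): add login functionality"))  # True
print(is_conventional("added stuff"))                          # False
```

A real hook script would read the message from the file path Git passes as the first argument and exit non-zero when the check fails.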
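pre-push (prevent deletion of, or force push to, main):
#!/bin/sh
# .git/hooks/pre-push
# Git feeds one line per pushed ref on stdin:
#   <local ref> <local sha> <remote ref> <remote sha>
protected_ref='refs/heads/main'
zero_sha='0000000000000000000000000000000000000000'
while read -r local_ref local_sha remote_ref remote_sha; do
    [ "$remote_ref" = "$protected_ref" ] || continue
    # An all-zero local sha means the remote ref is being deleted
    if [ "$local_sha" = "$zero_sha" ]; then
        echo "Deleting main is not allowed"
        exit 1
    fi
    # A push is a fast-forward only if the remote tip is an ancestor of the new tip
    if [ "$remote_sha" != "$zero_sha" ] && ! git merge-base --is-ancestor "$remote_sha" "$local_sha"; then
        echo "Non-fast-forward (force) push to main is not allowed"
        exit 1
    fi
done
exit 0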
Managing Hooks with Tools¶
Husky (JavaScript projects):
npm install husky --save-dev
npx husky init
echo "npm test" > .husky/pre-commit
pre-commit (Python projects):
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/psf/black
    rev: 23.1.0
    hooks:
      - id: black
pip install pre-commit
pre-commit install
Advanced Git Operations¶
Git Stash¶
Stash temporarily saves modified tracked files so you can work on something else.
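How Stash Works Internally: Each stash entry is stored as two (or three) commits:
- A commit for the index state
- A commit for the working tree state (with HEAD and the index commit as parents)
- Optionally, a commit for untracked files (created with -u or -a), attached as a further parent of the working tree commit
The ref refs/stash points to the most recent stash; older entries (stash@{1}, stash@{2}, ...) are kept in that ref's reflog.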
# Save current changes
git stash
git stash push -m "Work in progress on feature X"
# Include untracked files
git stash -u
git stash --include-untracked
# Include all files (even ignored)
git stash -a
git stash --all
# List stashes
git stash list
# stash@{0}: WIP on main: a1b2c3d Add feature
# stash@{1}: On main: Work in progress
# Apply most recent stash (keeps in stash list)
git stash apply
# Apply and remove from list
git stash pop
# Apply specific stash
git stash apply stash@{1}
# Show stash contents
git stash show -p stash@{0}
# Create branch from stash
git stash branch new-branch stash@{0}
# Drop specific stash
git stash drop stash@{1}
# Clear all stashes
git stash clear
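The difference between apply, pop, and drop, and the way stash@{n} indices shift, can be illustrated with a toy stack. This is a conceptual model only, not how Git stores stashes; the class `StashStack` and its method names are hypothetical.

```python
class StashStack:
    """Toy model of the stash list: index 0 is always the newest entry."""

    def __init__(self):
        self._entries = []

    def push(self, message):
        # New entries go to the front, so older ones shift to higher indices
        self._entries.insert(0, message)

    def apply(self, index=0):
        # Returns the entry but keeps it in the list
        return self._entries[index]

    def pop(self, index=0):
        # Returns the entry and removes it from the list
        return self._entries.pop(index)

    def drop(self, index=0):
        # Removes the entry without returning it
        del self._entries[index]

    def entries(self):
        return [f"stash@{{{i}}}: {m}" for i, m in enumerate(self._entries)]


stack = StashStack()
stack.push("WIP on feature X")
stack.push("Quick experiment")
print(stack.entries())
# ['stash@{0}: Quick experiment', 'stash@{1}: WIP on feature X']
```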
Git Bisect¶
Binary search through commit history to find when a bug was introduced.
# Start bisect session
git bisect start
# Mark current commit as bad
git bisect bad
# Mark a known good commit
git bisect good v1.0.0
# Git checks out middle commit
# Test if bug exists, then:
git bisect good # or
git bisect bad
# Repeat until Git identifies the first bad commit
# End session
git bisect reset
# Automated bisect with test script
git bisect start HEAD v1.0.0
git bisect run ./test-bug.sh
How it Works: Binary search through N commits requires about \(\lceil \log_2 N \rceil\) tests. For 1000 commits, that is only about 10 tests.
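The search itself can be sketched as an ordinary binary search over a linear history. This is a simplified model under stated assumptions: history is a straight line (real bisect works on a DAG), the first commit is known good, the last is known bad, and the function name `bisect_first_bad` and the `is_bad` predicate are hypothetical stand-ins for running a test script.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in an oldest-first linear history.

    Assumes commits[0] is known good and commits[-1] is known bad.
    Returns (first_bad_commit, number_of_tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid   # the bug was introduced at or before mid
        else:
            good = mid  # the bug was introduced after mid
    return commits[bad], tests


# Hypothetical history of 1000 commits where commit 637 introduced the bug
commits = list(range(1000))
first_bad, tests = bisect_first_bad(commits, lambda c: c >= 637)
print(first_bad, tests)
```

Each test halves the remaining range, which is why the run above needs no more than 10 tests for 1000 commits.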
Git Reflog¶
The reflog records when tips of branches and other references were updated. It's your safety net for recovering "lost" commits.
# Show reflog for HEAD
git reflog
# a1b2c3d HEAD@{0}: commit: Add feature
# f4e5d6c HEAD@{1}: checkout: moving from feature to main
# 789abcd HEAD@{2}: commit: WIP
# Show reflog for specific branch
git reflog show feature
# Recover a "lost" commit
git checkout HEAD@{2}
# or
git branch recovered-branch HEAD@{2}
# Recover from bad reset
git reset --hard HEAD@{1}
# Reflog entries expire (default 90 days for reachable, 30 for unreachable)
# To expire all entries and prune immediately
# (DANGEROUS - "lost" commits become unrecoverable):
git reflog expire --expire=now --all
git gc --prune=now
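Why HEAD@{1} recovers from a bad reset can be seen in a toy model that logs every movement of HEAD. This is a conceptual sketch, not Git's storage format; the class `Repo`, its methods, and the short commit names are hypothetical.

```python
class Repo:
    """Toy model: a HEAD pointer plus its reflog (newest entry first)."""

    def __init__(self):
        self.head = None
        self.reflog = []

    def _move(self, target, reason):
        # Every movement of HEAD is recorded, whatever caused it
        self.head = target
        self.reflog.insert(0, (target, reason))

    def commit(self, sha):
        self._move(sha, f"commit: {sha}")

    def reset_hard(self, sha):
        self._move(sha, f"reset: moving to {sha}")

    def at(self, n):
        """Resolve HEAD@{n}: where HEAD pointed n movements ago."""
        return self.reflog[n][0]


repo = Repo()
repo.commit("a1")
repo.commit("b2")
repo.reset_hard("a1")          # "b2" is no longer reachable from HEAD
print(repo.at(1))              # b2
repo.reset_hard(repo.at(1))    # recover, like: git reset --hard HEAD@{1}
print(repo.head)               # b2
```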
Git Cherry-Pick¶
Apply specific commits from one branch to another.
# Apply single commit
git cherry-pick a1b2c3d
# Apply multiple commits
git cherry-pick a1b2c3d f4e5d6c
# Apply range of commits (exclusive start)
git cherry-pick a1b2c3d..f4e5d6c
# Apply range (inclusive start)
git cherry-pick a1b2c3d^..f4e5d6c
# Cherry-pick without committing
git cherry-pick -n a1b2c3d
# Continue after resolving conflicts
git cherry-pick --continue
# Abort cherry-pick
git cherry-pick --abort
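The exclusive versus inclusive range forms are easy to mix up. The sketch below models them on a linear, oldest-first history; the function `commit_range` and the two filler hashes are hypothetical, and real ranges are computed on the commit graph rather than a list.

```python
def commit_range(history, start, end, include_start=False):
    """Select commits the way `start..end` or `start^..end` would.

    `history` is an oldest-first linear list of commit ids.
    `start..end` excludes `start`; `start^..end` includes it.
    """
    i, j = history.index(start), history.index(end)
    first = i if include_start else i + 1
    return history[first : j + 1]


# Hypothetical history; only a1b2c3d and f4e5d6c come from the commands above
history = ["0000aaa", "a1b2c3d", "b7b7b7b", "f4e5d6c"]

print(commit_range(history, "a1b2c3d", "f4e5d6c"))
# ['b7b7b7b', 'f4e5d6c']
print(commit_range(history, "a1b2c3d", "f4e5d6c", include_start=True))
# ['a1b2c3d', 'b7b7b7b', 'f4e5d6c']
```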
Git Reset vs Revert¶
git reset: Moves branch pointer backward, potentially losing commits (history rewriting).
# Soft reset - move HEAD only, keep staging and working directory
git reset --soft HEAD~1
# Mixed reset (default) - move HEAD, reset staging, keep working directory
git reset HEAD~1
git reset --mixed HEAD~1
# Hard reset - move HEAD, reset staging AND working directory
git reset --hard HEAD~1 # DANGEROUS - loses changes
git revert: Creates new commit that undoes changes (safe for shared history).
# Revert single commit
git revert a1b2c3d
# Revert without auto-commit
git revert -n a1b2c3d
# Revert merge commit (specify which parent to keep)
git revert -m 1 <merge-commit>
When to Use:
- reset: Local commits not yet pushed
- revert: Published commits on shared branches
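The distinction can be made concrete with a toy model in which each commit is a numeric change and the project content is their sum. This is only an illustration of the idea; the functions `reset_last`, `revert_last`, and `content` are hypothetical.

```python
def reset_last(history):
    """Model of `git reset --hard HEAD~1`: the last commit disappears."""
    return history[:-1]


def revert_last(history):
    """Model of `git revert HEAD`: a new commit applies the inverse change."""
    return history + [-history[-1]]


def content(history):
    """Project state is the sum of all changes in this toy model."""
    return sum(history)


history = [5, 3, -2]

after_reset = reset_last(history)     # [5, 3]
after_revert = revert_last(history)   # [5, 3, -2, 2]

print(content(after_reset), content(after_revert))  # 8 8
```

Both end in the same content, but only the reverted history still begins with the commits that collaborators may already have pulled.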
Git Worktrees¶
Manage multiple working directories attached to the same repository.
# Create new worktree
git worktree add ../project-hotfix hotfix-branch
# Create worktree with new branch
git worktree add -b emergency-fix ../emergency main
# List worktrees
git worktree list
# Remove worktree
git worktree remove ../project-hotfix
# Prune stale worktree info
git worktree prune
Use Cases:
- Work on hotfix while keeping feature branch changes
- Run tests on different branch without switching
- Compare behavior between branches
Git Submodules and Subtrees¶
Submodules¶
Include other Git repositories as subdirectories.
# Add submodule
git submodule add https://github.com/user/lib.git vendor/lib
# Clone repo with submodules
git clone --recurse-submodules https://github.com/user/project.git
# Initialize submodules in existing clone
git submodule init
git submodule update
# Or combined:
git submodule update --init --recursive
# Update submodule to latest
cd vendor/lib
git checkout main
git pull
cd ../..
git add vendor/lib
git commit -m "Update lib submodule"
# Update all submodules
git submodule update --remote
# Remove submodule
git submodule deinit vendor/lib
git rm vendor/lib
rm -rf .git/modules/vendor/lib
Submodule Structure:
.gitmodules:
[submodule "vendor/lib"]
path = vendor/lib
url = https://github.com/user/lib.git
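For tooling that needs to read this file, a rough sketch using Python's standard configparser is shown below. It is an approximation: Git's config syntax is not identical to INI, so this only covers simple files like the one above. The function name `parse_gitmodules` is hypothetical.

```python
import configparser

GITMODULES = """\
[submodule "vendor/lib"]
path = vendor/lib
url = https://github.com/user/lib.git
"""


def parse_gitmodules(text):
    """Map each submodule name to its settings (simple files only)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    modules = {}
    for section in parser.sections():
        # Section headers look like: submodule "vendor/lib"
        kind, _, quoted = section.partition(" ")
        if kind == "submodule":
            modules[quoted.strip('"')] = dict(parser[section])
    return modules


print(parse_gitmodules(GITMODULES))
# {'vendor/lib': {'path': 'vendor/lib', 'url': 'https://github.com/user/lib.git'}}
```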
Subtrees¶
Alternative to submodules - merges external repo into your project.
# Add subtree
git subtree add --prefix vendor/lib \
https://github.com/user/lib.git main --squash
# Pull updates
git subtree pull --prefix vendor/lib \
https://github.com/user/lib.git main --squash
# Push changes back (if you have access)
git subtree push --prefix vendor/lib \
https://github.com/user/lib.git main
Submodules vs Subtrees:
| Aspect | Submodules | Subtrees |
|---|---|---|
| External repo | Referenced by SHA | Merged into history |
| Clone | Extra steps needed | Works normally |
| Versioning | Pinned to specific commit | Part of your history |
| Contributing | Standard workflow in sub-repo | Requires subtree push |
| History | Separate | Integrated |
Branching Strategies¶
Git Flow¶
A strict branching model designed for scheduled releases.
┌───────────────────────────────────────────────────────────────┐
│ main │
│ ○─────────────────○──────────────────────○────────────────── │
│ │ │ │
│ Tags: v1.0.0 v2.0.0 │
├───────────────────────────────────────────────────────────────┤
│ develop │
│ ○───○───○───○───○───○───○───○───○───○───○───○───○───○────── │
│ │ │ ▲ │ ▲ │
│ │ │ │ │ │ │
│ ┌───┴───┐ ┌─┴───────────┴─┐ ┌───┴───┐ ┌─┴──────┐ │
│ │feature│ │ release │ │feature│ │hotfix │ │
│ │ /A │ │ /1.0 │ │ /B │ │/urgent │ │
│ └───────┘ └───────────────┘ └───────┘ └────────┘ │
└───────────────────────────────────────────────────────────────┘
Branches:
- main: Production-ready code, tagged releases
- develop: Integration branch for features
- feature/*: New features, branch from develop
- release/*: Release preparation, branch from develop
- hotfix/*: Emergency fixes, branch from main
Commands:
# Start feature
git checkout develop
git checkout -b feature/user-auth
# Finish feature
git checkout develop
git merge --no-ff feature/user-auth
git branch -d feature/user-auth
# Start release
git checkout develop
git checkout -b release/1.0.0
# Finish release
git checkout main
git merge --no-ff release/1.0.0
git tag v1.0.0
git checkout develop
git merge --no-ff release/1.0.0
git branch -d release/1.0.0
# Hotfix
git checkout main
git checkout -b hotfix/security-patch
# ... fix ...
git checkout main
git merge --no-ff hotfix/security-patch
git tag v1.0.1
git checkout develop
git merge --no-ff hotfix/security-patch
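The branching rules in the commands above reduce to a small lookup table: where each branch type starts and where it is merged when finished. The sketch below encodes that table; the names `BASE_BRANCH`, `MERGE_TARGETS`, and `plan` are hypothetical and not part of any Git Flow tooling.

```python
# Where each branch type starts, per the Git Flow rules above
BASE_BRANCH = {"feature": "develop", "release": "develop", "hotfix": "main"}

# Where each branch type is merged when finished
MERGE_TARGETS = {
    "feature": ["develop"],
    "release": ["main", "develop"],
    "hotfix": ["main", "develop"],
}


def plan(branch):
    """Return (base_branch, merge_targets) for a name like 'feature/x'."""
    prefix, sep, name = branch.partition("/")
    if not sep or not name or prefix not in BASE_BRANCH:
        raise ValueError(f"not a Git Flow branch name: {branch!r}")
    return BASE_BRANCH[prefix], MERGE_TARGETS[prefix]


print(plan("feature/user-auth"))      # ('develop', ['develop'])
print(plan("hotfix/security-patch"))  # ('main', ['main', 'develop'])
```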
GitHub Flow¶
Simplified workflow for continuous deployment.
main: ○───○───○───○───○───○───○───○───○
│ ▲ │ ▲
│ │ │ │
└───○───┘ └───○───┘
feature/A feature/B
Rules:
- main is always deployable
- Create descriptive branch from main
- Commit and push regularly
- Open Pull Request for discussion
- Review and approve
- Merge to main and deploy
Trunk-Based Development¶
Emphasizes small, frequent commits to a single main branch.
main: ○───○───○───○───○───○───○───○───○───○
│ │ │ │ │ │ │ │ │ │
└───┴───┴───┴───┴───┴───┴───┴───┴───┘
Many small commits directly or via short-lived branches
Principles:
- Main branch is always releasable
- Feature branches live < 1 day
- Feature flags for incomplete features
- Continuous integration is mandatory
- No long-lived branches
GitLab Flow¶
Combines feature branches with environment branches.
main: ○───○───○───○───○───○───○───○
│ │
▼ ▼
pre-prod: ────────○───────────○─────────
│ │
▼ ▼
production: ────────────○───────────○─────
Signing Commits¶
GPG Signing¶
# Generate GPG key
gpg --full-generate-key
# List keys
gpg --list-secret-keys --keyid-format LONG
# Configure Git to use key
git config --global user.signingkey YOUR_KEY_ID
git config --global commit.gpgsign true
# Sign individual commit
git commit -S -m "Signed commit"
# Verify signatures
git log --show-signature
git verify-commit HEAD
SSH Signing (Git 2.34+)¶
# Configure SSH signing
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true
# Create allowed signers file
echo "$(git config --get user.email) $(cat ~/.ssh/id_ed25519.pub)" >> ~/.ssh/allowed_signers
git config --global gpg.ssh.allowedSignersFile ~/.ssh/allowed_signers
Git Large File Storage (LFS)¶
Git LFS replaces large files with text pointers inside Git, while storing file contents on a remote server.
# Install Git LFS
git lfs install
# Track file types
git lfs track "*.psd"
git lfs track "*.zip"
git lfs track "assets/**"
# Verify tracking
cat .gitattributes
# *.psd filter=lfs diff=lfs merge=lfs -text
# Check LFS files
git lfs ls-files
# Migrate existing files to LFS
git lfs migrate import --include="*.psd" --everything
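# Clone with LFS (with Git LFS installed, a plain clone downloads LFS files;
# the older `git lfs clone` command is deprecated)
git clone https://github.com/user/repo.git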
# Pull LFS files
git lfs pull
How LFS Works:
Repository:
file.psd → Pointer file (SHA-256 hash + size)
LFS Server:
objects/
ab/cd/abcd1234... → Actual file content
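A pointer file is small enough to build by hand. The sketch below follows the version 1 pointer layout (a version line, an oid line with the SHA-256 of the content, and a size line); the helper names `lfs_pointer` and `object_path` are hypothetical, and the exact directory layout on a given LFS server is an assumption taken from the diagram above.

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build the text that Git stores in place of the real file."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


def object_path(oid: str) -> str:
    """Mirror the objects/ab/cd/abcd... layout: two levels of fan-out."""
    return f"objects/{oid[:2]}/{oid[2:4]}/{oid}"


pointer = lfs_pointer(b"pretend this is a large PSD file")
print(pointer)
```

The pointer stays a few lines long no matter how large the tracked file is, which is what keeps the Git repository itself small.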
Monorepo Strategies¶
Structure¶
monorepo/
├── apps/
│ ├── web/
│ ├── mobile/
│ └── api/
├── packages/
│ ├── ui/
│ ├── utils/
│ └── config/
├── tools/
└── package.json
Sparse Checkout¶
Clone only specific directories:
# Enable sparse checkout
git clone --filter=blob:none --sparse https://github.com/org/monorepo.git
cd monorepo
# Checkout specific paths
git sparse-checkout set apps/web packages/ui
# Add more paths
git sparse-checkout add packages/utils
# List current sparse paths
git sparse-checkout list
Partial Clone¶
Reduce clone size by deferring blob downloads:
# Blobless clone (fetch blobs on demand)
git clone --filter=blob:none https://github.com/org/monorepo.git
# Treeless clone (fetch trees and blobs on demand)
git clone --filter=tree:0 https://github.com/org/monorepo.git
# Shallow clone (limited history)
git clone --depth 1 https://github.com/org/monorepo.git
# Unshallow later
git fetch --unshallow
Conflict Resolution¶
Types of Conflicts¶
- Content conflict: Same lines modified differently
- Rename/rename conflict: Same file renamed differently
- Modify/delete conflict: One side modified, other deleted
- Directory/file conflict: One creates directory, other creates file with same name
Conflict Markers¶
<<<<<<< HEAD
Your changes (current branch)
=======
Their changes (branch being merged)
>>>>>>> feature-branch
For three-way merge with diff3 style:
<<<<<<< HEAD
Your changes
||||||| merged common ancestors
Original content
=======
Their changes
>>>>>>> feature-branch
Enable diff3 style:
git config --global merge.conflictstyle diff3
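Because the markers are line-based, a conflicted region is straightforward to split into its parts. The sketch below handles a single conflict region in either style; the function `parse_conflict` is hypothetical and is not how Git or any merge tool is implemented.

```python
def parse_conflict(text):
    """Split one conflict region into (ours, base, theirs) line lists.

    `base` stays empty unless the diff3 style marker (|||||||) is present.
    """
    ours, base, theirs = [], [], []
    target = None
    for line in text.splitlines():
        if line.startswith("<<<<<<<"):
            target = ours
        elif line.startswith("|||||||"):
            target = base
        elif line.startswith("======="):
            target = theirs
        elif line.startswith(">>>>>>>"):
            target = None
        elif target is not None:
            target.append(line)
    return ours, base, theirs


conflict = """\
<<<<<<< HEAD
Your changes
||||||| merged common ancestors
Original content
=======
Their changes
>>>>>>> feature-branch
"""

print(parse_conflict(conflict))
# (['Your changes'], ['Original content'], ['Their changes'])
```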
Resolution Commands¶
# View conflicts
git status
git diff --name-only --diff-filter=U
# Use specific version
git checkout --ours path/to/file # Keep current branch version
git checkout --theirs path/to/file # Keep incoming branch version
# After resolving manually
git add path/to/file
git commit # or git merge --continue
# Abort merge
git merge --abort
# Use merge tool
git mergetool
Strategies for Complex Conflicts¶
# Re-run merge with a strategy option (affects only conflicting hunks)
git merge -X ours feature # Favor current branch
git merge -X theirs feature # Favor incoming branch
# Rerere (Reuse Recorded Resolution)
git config --global rerere.enabled true
# Git remembers how you resolved conflicts and reapplies automatically
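The idea behind rerere can be pictured as a cache keyed by the conflict itself. The sketch below is purely conceptual: Git's real implementation normalizes conflict hunks and stores them under .git/rr-cache, whereas the class `ResolutionCache` and its methods here are hypothetical.

```python
import hashlib


class ResolutionCache:
    """Toy model: remember how a given conflict was resolved."""

    def __init__(self):
        self._resolutions = {}

    @staticmethod
    def _key(ours, theirs):
        # Identify a conflict by the content of both sides
        data = "\0".join([ours, theirs]).encode()
        return hashlib.sha1(data).hexdigest()

    def record(self, ours, theirs, resolution):
        self._resolutions[self._key(ours, theirs)] = resolution

    def replay(self, ours, theirs):
        """Return the recorded resolution, or None if this conflict is new."""
        return self._resolutions.get(self._key(ours, theirs))


cache = ResolutionCache()
cache.record("timeout = 30", "timeout = 60", "timeout = 45")

print(cache.replay("timeout = 30", "timeout = 60"))  # timeout = 45
print(cache.replay("timeout = 30", "timeout = 90"))  # None
```

This is why rerere pays off during repeated rebases or long-lived branches: the same conflict tends to reappear with identical content on both sides.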
Other Version Control Systems¶
Subversion (SVN)¶
A centralized version control system that was very popular before Git. Some organizations still use it, especially for non-software projects.
Key Differences from Git:
| Aspect | SVN | Git |
|---|---|---|
| Architecture | Centralized | Distributed |
| Branching | Server-side directory copy (cheap copy, heavier to merge) | Cheap (pointer to a commit) |
| Offline work | Limited | Full capability |
| History | Linear | DAG |
| Storage | Delta-based | Snapshot-based |
Common Commands:
svn checkout URL # Like git clone
svn update # Like git pull
svn commit # Like git commit + push
svn status # Like git status
svn log # Like git log
svn diff # Like git diff
svn copy SRC_URL DST_URL # Create a branch (directory copy; there is no svn branch command)
Mercurial¶
A distributed version control system similar to Git but with a reputation for being easier to learn. It has been used at scale by organizations such as Facebook and Mozilla.
Key Differences from Git:
| Aspect | Git | Mercurial |
|---|---|---|
| Commands | More flexible | More consistent |
| History | Mutable (rebase, etc.) | Immutable by default |
| Learning curve | Steeper | Gentler |
| Extensions | Built-in + external | Extension-based |
Common Commands:
hg clone URL # Like git clone
hg pull # Like git fetch
hg update # Like git checkout
hg commit # Like git commit
hg push # Like git push
hg log # Like git log
hg branch # Like git branch
Perforce (Helix Core)¶
Enterprise-scale centralized VCS, popular in game development and large binary asset management.
Strengths:
- Excellent large file handling
- Fine-grained permissions
- Strong locking mechanism
- Atomic multi-file operations
Best Practices¶
Commit Practices¶
- Commit frequently with small, focused changes
- Write descriptive commit messages following conventions:
  type(scope): subject
  body (optional)
  footer (optional)
  Types: feat, fix, docs, style, refactor, test, chore
- Never commit secrets (API keys, passwords, credentials)
- Review changes before committing (git diff --staged)
Branching Practices¶
- Use meaningful branch names: feature/user-auth, fix/login-bug, refactor/database-layer
- Delete merged branches to keep repository clean
- Protect important branches (main, develop) with branch protection rules
- Keep branches short-lived to reduce merge complexity
Collaboration Practices¶
- Pull/fetch frequently to stay up-to-date
- Review code before merging to main
- Use Pull Requests/Merge Requests for code review
- Run CI/CD on all branches before merging
- Squash commits when merging feature branches (optional, team preference)
Repository Hygiene¶
- Configure proper .gitignore files
- Use .gitattributes for consistent line endings
- Document setup in README.md
- Run git gc periodically on large repos
- Use Git LFS for large binary files
Security Practices¶
- Sign commits for verified authorship
- Enable branch protection rules
- Require reviews before merging
- Audit access regularly
- Rotate credentials if accidentally committed
Benefits of Version Control¶
- History and audit trail of all changes
- Ability to revert to previous versions
- Concurrent development by multiple team members
- Branching and merging to support parallel development
- Backup and disaster recovery
- Experimentation without risk to the main codebase
- Code review workflows and quality gates
- Compliance with regulatory requirements
- Automation through hooks and CI/CD integration
- Traceability linking code changes to issues/tickets
Summary¶
Version control is essential for any software development project, regardless of size, as it provides structure, history, and collaboration capabilities that are fundamental to modern development practices.
Git's design as a content-addressable filesystem with a Merkle DAG structure provides:
- Integrity: Cryptographic verification of history
- Performance: Fast local operations
- Flexibility: Powerful branching and merging
- Reliability: Distributed redundancy
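Content addressing can be demonstrated in a few lines: a blob's ID is the hash of a short header plus the file content, so identical content always yields the same ID and any change yields a different one. The sketch assumes the default SHA-1 object format (not a SHA-256 repository); the function name `git_blob_id` is hypothetical.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob with this content."""
    # Header format: the object type, a space, the byte length, a NUL byte
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_id(b"hello\n") == git_blob_id(b"hello\n"))  # True
```

Trees and commits are hashed the same way over content that includes the IDs of the objects they reference, which is what makes the history a Merkle DAG: changing any file changes every ID above it.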
Understanding Git's internals—objects, trees, commits, refs, and the DAG—enables you to use it effectively, troubleshoot issues, and recover from mistakes. Combined with proper branching strategies, hooks, and workflow practices, version control becomes a powerful foundation for professional software development.