Overview

Git’s object model is the foundation of its version control system. At its core, Git is a content-addressable filesystem with a VCS user interface written on top of it. This means that Git stores all versioned content as objects, each identified by the SHA-1 hash of its contents (or SHA-256 in repositories created with the SHA-256 object format).

The Four Object Types

Git has four fundamental object types that represent all data in the repository:

Blob

Stores file contents only. The file name and mode are not part of the blob; they are recorded in the tree that references it.

Tree

Represents directories, containing references to blobs and other trees.

Commit

Captures a snapshot of the project at a point in time with metadata.

Tag

Gives a name to another object, usually a commit, and records the tagger and a message.

Object Storage

Content-Addressable Storage

Each object is stored with its SHA-1 hash as the identifier:
# Object ID format
OBJECT_ID = SHA1(OBJECT_TYPE + " " + CONTENT_SIZE + "\0" + CONTENT)
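For a loose object, the first two characters of the 40-character SHA-1 hash form the directory name under .git/objects/, and the remaining 38 characters form the filename. The file itself holds the zlib-compressed header and content.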
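As a rough illustration of the formula above, the following Python sketch computes an object ID and the matching loose-object path. The function names git_object_id and loose_object_path are made up for this example and are not part of Git. It assumes a SHA-1 repository. Because echo appends a newline, the bytes hashed here are "Hello, World!\n", the same bytes that git hash-object receives in the blob example.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Hash "<type> <size>\\0<content>" with SHA-1 and return the hex digest."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Split an ID into the directory/filename pair used under .git/objects/."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


oid = git_object_id("blob", b"Hello, World!\n")
print(oid)
print(loose_object_path(oid))
```

The size in the header is the decimal byte length of the content, so changing a single byte changes both the header and the resulting ID.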

Object Storage Format

.git/objects/
├── 01/
│   └── 234567... (object file)
├── ab/
│   └── cdef01... (object file)
├── info/
└── pack/
    ├── pack-*.idx (index file)
    └── pack-*.pack (packed objects)
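
The layout above can be reproduced in a few lines. This is a minimal sketch, assuming a SHA-1 repository and using a temporary directory as a stand-in for .git/objects/. The helper write_loose_object is a hypothetical name, and real Git additionally writes through temporary files and sets file permissions.

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_object(objects_dir: str, obj_type: str, content: bytes) -> str:
    """Store header + content, zlib-compressed, under <first 2 chars>/<rest>."""
    store = f"{obj_type} {len(content)}".encode("ascii") + b"\x00" + content
    object_id = hashlib.sha1(store).hexdigest()
    directory = os.path.join(objects_dir, object_id[:2])
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, object_id[2:])
    if not os.path.exists(path):  # same content maps to the same path
        with open(path, "wb") as fh:
            fh.write(zlib.compress(store))
    return object_id


objects_dir = tempfile.mkdtemp()  # stand-in for .git/objects/
oid = write_loose_object(objects_dir, "blob", b"Hello, World!\n")
print(os.path.join(oid[:2], oid[2:]))
```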

Blob Objects

Blobs store file content without any metadata:
# Creating a blob object
$ echo 'Hello, World!' | git hash-object -w --stdin
8ab686eafeb1f44702738c8b0f24f2567c36da6d

# View blob content
$ git cat-file -p 8ab686eafeb1f44702738c8b0f24f2567c36da6d
Hello, World!

# View blob type
$ git cat-file -t 8ab686eafeb1f44702738c8b0f24f2567c36da6d
blob
Two files with identical content always produce the same blob object, regardless of their names or locations, so the content is stored only once.
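
To make the deduplication concrete, here is a small sketch that hashes a hypothetical working tree into an in-memory dictionary standing in for the object database. The paths and the blob_id helper are invented for the example.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 blob ID for the given content."""
    header = f"blob {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical working tree: two paths share the same bytes.
files = {
    "README.md": b"Hello, World!\n",
    "docs/copy-of-readme.md": b"Hello, World!\n",
    "file.txt": b"hello world\n",
}

store = {}  # object ID -> content; stands in for the object database
for path, content in files.items():
    store[blob_id(content)] = content

print(len(files), "files,", len(store), "blobs")  # prints "3 files, 2 blobs"
```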

Tree Objects

Trees represent directory structures:
# View a tree object
$ git cat-file -p 'main^{tree}'
100644 blob 8ab686ea...  README.md
100644 blob 3b18e512...  file.txt
040000 tree 9d1a2e3f...  src

# Tree entry format
MODE OBJECT_TYPE OBJECT_ID    FILENAME

File Modes

Mode      Description
100644    Regular file (non-executable)
100755    Executable file
120000    Symbolic link
040000    Subdirectory (tree)
160000    Gitlink (submodule)
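
The listing printed by git cat-file -p is a rendering of a binary payload. As a sketch of that payload, assuming SHA-1 IDs: each entry is the mode, a space, the name, a NUL byte, and the raw 20-byte object ID, with entries sorted by name. The stored mode for a subdirectory is 40000, without the leading zero that the listing shows. The functions serialize_tree, parse_tree, and tree_id are hypothetical helpers written for this example.

```python
import hashlib


def serialize_tree(entries):
    """entries: iterable of (mode, name, hex_object_id) tuples."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git compares directory names as if they ended with "/".
        return (name + "/" if mode == "40000" else name).encode("utf-8")

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        body += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\x00"
        body += bytes.fromhex(hex_id)  # raw 20-byte ID, not hex text
    return body


def parse_tree(body):
    """Inverse of serialize_tree: return a list of (mode, name, hex_id)."""
    entries, pos = [], 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8")
        entries.append((mode, name, body[nul + 1:nul + 21].hex()))
        pos = nul + 21
    return entries


def tree_id(body):
    return hashlib.sha1(b"tree %d\x00" % len(body) + body).hexdigest()


body = serialize_tree([
    ("40000", "src", "4b825dc642cb6eb9a060e54bf8d69288fbee4904"),
    ("100644", "README.md", "8ab686eafeb1f44702738c8b0f24f2567c36da6d"),
])
for mode, name, hex_id in parse_tree(body):
    print(mode, hex_id, name)
```

An empty payload hashes to 4b825dc642cb6eb9a060e54bf8d69288fbee4904, the well-known ID of the empty tree.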

Commit Objects

Commits are snapshots with metadata:
# View a commit object
$ git cat-file -p HEAD
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent 8f1e7cf0826a8f8d8e5e9f8e8f8e8f8e8f8e8f8e
author John Doe <john@example.com> 1609459200 +0000
committer John Doe <john@example.com> 1609459200 +0000

Second commit

Commit Structure

  • tree: Points to the root tree object
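  • parent: References parent commit(s); a root commit has none, and a merge commit has two or more
  • author: Who wrote the changes, with a Unix timestamp and timezone offset
  • committer: Who created the commit, with a Unix timestamp and timezone offset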
  • message: Commit message
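
The fields listed above map directly onto the text payload of a commit object. The sketch below assembles such a payload and hashes it with the "commit <size>\0" header. The helpers build_commit and commit_id are hypothetical, and optional headers such as signatures or an explicit encoding are left out.

```python
import hashlib


def build_commit(tree, parents, author, committer, message):
    """Assemble the text payload of a commit object as bytes."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # zero, one, or several parents
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


def commit_id(body: bytes) -> str:
    header = f"commit {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


who = "John Doe <john@example.com> 1609459200 +0000"
tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # the empty tree
root = build_commit(tree, [], who, who, "Initial commit")
child = build_commit(tree, [commit_id(root)], who, who, "Second commit")
print(child.decode("utf-8"))
```

Because the parent ID is part of the hashed payload, a commit ID depends on the entire history behind it.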

Tag Objects

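Annotated tags are stored as tag objects that record the target object, its type, the tag name, the tagger, and a message. Lightweight tags are plain refs and have no tag object: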
# View a tag object
$ git cat-file -p v1.0
object 8f1e7cf0826a8f8d8e5e9f8e8f8e8f8e8f8e8f8e
type commit
tag v1.0
tagger John Doe <john@example.com> 1609459200 +0000

Version 1.0 release
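
Tag and commit payloads share the same shape: header lines, a blank line, then the message. The sketch below parses the sample tag shown above. The function parse_object_text is a hypothetical helper, and it deliberately ignores multi-line headers such as embedded signatures.

```python
def parse_object_text(text: str):
    """Split a commit- or tag-style payload into (headers, message)."""
    head, _, message = text.partition("\n\n")
    headers = []
    for line in head.splitlines():
        key, _, value = line.partition(" ")
        headers.append((key, value))
    return headers, message


sample = (
    "object 8f1e7cf0826a8f8d8e5e9f8e8f8e8f8e8f8e8f8e\n"
    "type commit\n"
    "tag v1.0\n"
    "tagger John Doe <john@example.com> 1609459200 +0000\n"
    "\n"
    "Version 1.0 release\n"
)
headers, message = parse_object_text(sample)
fields = dict(headers)
print(fields["type"], fields["object"])
```

Headers are kept as a list of pairs first because a commit may repeat the parent key; converting to a dict is only safe for tags, where each key appears once.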

Object Relationships

Git objects form a directed acyclic graph (DAG):
Commit A
├── tree (directory structure)
│   ├── blob (file1.txt)
│   ├── blob (file2.txt)
│   └── tree (subdir/)
│       └── blob (file3.txt)
└── parent (Commit B)
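
One way to read the diagram is as a reachability question: starting from a commit, which objects can be reached by following tree, parent, and entry links? The sketch below walks a hypothetical in-memory graph. The IDs are short labels rather than real hashes, and the contents of Commit B are invented for the example.

```python
# Hypothetical object graph mirroring the diagram; labels replace real hashes.
objects = {
    "commit-A": {"type": "commit", "tree": "tree-root", "parents": ["commit-B"]},
    "commit-B": {"type": "commit", "tree": "tree-old", "parents": []},
    "tree-root": {"type": "tree", "entries": {
        "file1.txt": "blob-1", "file2.txt": "blob-2", "subdir": "tree-sub"}},
    "tree-sub": {"type": "tree", "entries": {"file3.txt": "blob-3"}},
    "tree-old": {"type": "tree", "entries": {"file1.txt": "blob-1"}},
    "blob-1": {"type": "blob"},
    "blob-2": {"type": "blob"},
    "blob-3": {"type": "blob"},
}


def reachable(start):
    """Return the set of object IDs reachable from start."""
    seen, stack = set(), [start]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue  # shared objects (such as blob-1) are visited once
        seen.add(oid)
        obj = objects[oid]
        if obj["type"] == "commit":
            stack.append(obj["tree"])
            stack.extend(obj["parents"])
        elif obj["type"] == "tree":
            stack.extend(obj["entries"].values())
    return seen


print(sorted(reachable("commit-A")))
```

The graph cannot contain a cycle because an object's ID is derived from its content, which already includes the IDs it points to.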

Object Inspection Commands

Examine object contents and metadata:
# Show object content
git cat-file -p <object-id>

# Show object type
git cat-file -t <object-id>

# Show object size
git cat-file -s <object-id>
Create objects from files:
# Create blob object
git hash-object -w <file>

# Create blob from stdin
echo 'content' | git hash-object -w --stdin
List contents of tree objects:
# List tree contents
git ls-tree <tree-id>

# Recursive listing
git ls-tree -r <tree-id>
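
For intuition about what git cat-file reports, the following sketch reads a loose object directly and returns the same three pieces of information as the -t, -s, and -p options. It assumes a SHA-1 loose object and uses a temporary directory as a stand-in for .git/objects/. The function read_loose_object is a hypothetical helper, and packed objects are not handled.

```python
import hashlib
import os
import tempfile
import zlib


def read_loose_object(objects_dir, object_id):
    """Return (type, size, content) for a loose object."""
    path = os.path.join(objects_dir, object_id[:2], object_id[2:])
    with open(path, "rb") as fh:
        raw = zlib.decompress(fh.read())
    header, _, content = raw.partition(b"\x00")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode("ascii"), int(size), content


# Demo setup: store the bytes that `echo 'content'` would produce.
objects_dir = tempfile.mkdtemp()
store = b"blob 8\x00content\n"
oid = hashlib.sha1(store).hexdigest()
os.makedirs(os.path.join(objects_dir, oid[:2]))
with open(os.path.join(objects_dir, oid[:2], oid[2:]), "wb") as fh:
    fh.write(zlib.compress(store))

print(read_loose_object(objects_dir, oid))
```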

Object Packing

Git optimizes storage by packing objects:
# Pack loose objects
$ git gc

# View pack file contents
$ git verify-pack -v .git/objects/pack/pack-*.idx
Within a pack, Git can store an object as a delta against a similar object, so only the differences are kept. This usually makes the repository much smaller than the equivalent set of loose objects.
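
The idea behind a delta can be sketched with two kinds of instruction: copy a byte range from the base object, or insert new bytes. The example below is a simplified model of that idea and does not implement the binary encoding used in pack files. The function apply_delta and the tuple format are assumptions made for illustration.

```python
def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild a target object from a base object and a list of instructions."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, size = op
            out += base[offset:offset + size]  # reuse bytes from the base
        elif op[0] == "insert":
            out += op[1]  # literal bytes not present in the base
        else:
            raise ValueError(f"unknown instruction: {op[0]!r}")
    return bytes(out)


base = b"line one\nline two\nline three\n"
# Replace the middle line while reusing the first and last lines of the base.
ops = [("copy", 0, 9), ("insert", b"line 2\n"), ("copy", 18, 11)]
target = apply_delta(base, ops)
print(target.decode("ascii"), end="")
```

Only the inserted bytes and two small copy instructions need to be stored for the second version, which is where the space saving comes from.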