Digging Up the Dead: Disk Forensics & Git Object Archaeology (PicoCTF Walkthrough)
2026-5-18 10:22:50 Author: infosecwriteups.com

SAFAL GAUTAM

Most people think of Git as a version control tool. CTF players think of it as a crime scene.

In this challenge — “Git 2” from PicoCTF — a flag is buried inside a disk image. There are no commits. No branches. No obvious trail. Just raw objects sitting quietly in .git/objects, waiting for someone who knows where to look.

This article walks through the full forensic methodology: from cracking open a disk image, to mounting partitions, to understanding Git’s internal object database well enough to recover data that was never meant to be found.

Part 1: The Disk Image — What Are We Working With?

Step 1: Decompress the Image

gunzip disk.img.gz

You start with a .gz compressed file. gunzip simply decompresses it into disk.img — a raw binary snapshot of an entire hard drive.

Step 2: Identify the File Type

file disk.img

Output:

disk.img: DOS/MBR boot sector; partition 1 : ID=0x83, active, start-CHS (0x2,0,33),
end-CHS (0x263,8,56), startsector 2048, 614400 sectors;
partition 2 : ID=0x82, start-CHS (0x263,8,57) ...
partition 3 : ID=0x83, start-CHS (0x3ff,15,63) ...

file inspects signature bytes in the file to identify it (for an MBR, the 0x55 0xAA boot signature at byte offset 510). What you see here is a DOS/MBR boot sector: a real disk layout with a Master Boot Record at the very beginning, which holds the partition table. There are three partitions on this disk.
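To make the partition table less of a black box, here is a minimal Python sketch of the structure that file and fdisk decode. It assumes the classic MBR layout (four 16-byte entries starting at byte 446, little-endian LBA fields, and the 0x55 0xAA signature at byte 510). It runs against a synthetic sector built from the partition values reported in this walkthrough, not against the real disk.img. The names parse_mbr and build_demo_mbr are made up for illustration.

```python
import struct

TABLE_OFFSET = 446   # partition table position inside the 512-byte MBR
ENTRY_SIZE = 16      # each of the four primary entries is 16 bytes
# status(1) CHS-start(3, skipped) type(1) CHS-end(3, skipped) LBA-start(4) count(4)
ENTRY_FORMAT = "<B3xB3xII"


def parse_mbr(sector: bytes) -> list[dict]:
    """Return the non-empty primary partition entries of a 512-byte MBR."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        status, part_type, lba_start, count = struct.unpack(ENTRY_FORMAT, entry)
        if part_type == 0:
            continue  # unused slot
        partitions.append({
            "number": index + 1,
            "active": status == 0x80,
            "type": part_type,
            "start_sector": lba_start,
            "sectors": count,
        })
    return partitions


def build_demo_mbr() -> bytes:
    """Synthetic MBR carrying the three entries from this walkthrough."""
    sector = bytearray(512)
    layout = [
        (0x80, 0x83, 2048, 614400),
        (0x00, 0x82, 616448, 524288),
        (0x00, 0x83, 1140736, 956416),
    ]
    for index, fields in enumerate(layout):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        sector[start:start + ENTRY_SIZE] = struct.pack(ENTRY_FORMAT, *fields)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


for part in parse_mbr(build_demo_mbr()):
    print(part)
```

Against the real image you would pass in the first 512 bytes of disk.img instead of the synthetic sector.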

Step 3: Inspect the Partition Table

fdisk -l disk.img

Output:

Disk disk.img: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Device     Boot   Start     End Sectors  Size Id Type
disk.img1  *       2048  616447  614400  300M 83 Linux
disk.img2        616448 1140735  524288  256M 82 Linux swap / Solaris
disk.img3       1140736 2097151  956416  467M 83 Linux

fdisk -l reads the partition table and gives you a human-readable layout. Think of it like a table of contents for the disk.

Why partition 3? It is the largest non-swap Linux partition, which makes it the most likely place for user files, home directories, and application code.

Part 2: Mounting the Partition

The Byte Offset Calculation

To mount partition 3, we need to tell Linux exactly where in the file it starts. Partitions are measured in sectors, and each sector is 512 bytes.

Start sector of partition 3 = 1140736
Byte offset = 1140736 × 512 = 584,056,832 bytes

sudo mkdir -p /mnt/git2
sudo mount -o loop,offset=$((1140736 * 512)) disk.img /mnt/git2

Breaking down the mount command:

  • -o loop — treat the file as a loop device (a virtual block device)
  • offset=$((1140736 * 512)) — start reading from this byte offset
  • disk.img — the source file
  • /mnt/git2 — where to mount it

After this, /mnt/git2 behaves like a real mounted filesystem. You can ls, cat, and find files just like a normal drive.
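The offset arithmetic is easy to get wrong by hand, so here is a small Python sanity check. It is only a sketch: the start sectors and sector counts are copied from the fdisk -l output in this walkthrough, and byte_offset is a hypothetical helper name.

```python
SECTOR_SIZE = 512  # bytes per sector, from the "Units" line of fdisk -l

# (start sector, sector count) per partition, copied from fdisk -l
PARTITIONS = {
    1: (2048, 614400),
    2: (616448, 524288),
    3: (1140736, 956416),
}


def byte_offset(start_sector: int, sector_size: int = SECTOR_SIZE) -> int:
    """Byte position of a partition inside the raw image."""
    return start_sector * sector_size


for number, (start, count) in PARTITIONS.items():
    size_mib = count * SECTOR_SIZE // (1024 * 1024)
    print(f"partition {number}: offset={byte_offset(start)} size={size_mib} MiB")
```

The value printed for partition 3 is the number to pass as offset= in the mount command.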

Finding the Git Repository

find /mnt/git2 -name ".git" -type d 2>/dev/null

This recursively searches the entire partition for directories named .git. The 2>/dev/null suppresses permission errors.

Result:

/mnt/git2/home/ctf-player/Code/killer-chat-app/.git

Found it. A Git repository lives inside the home directory of a user called ctf-player.

Part 3: The Git Repository Has No Commits — But That Doesn’t Mean It’s Empty

cd /mnt/git2/home/ctf-player/Code/killer-chat-app/
git log --oneline

Output:

fatal: your current branch 'master' does not have any commits yet

git status

Output:

On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   client
        new file:   logs/1.txt
        new file:   logs/2.txt
        new file:   logs/4.txt
        new file:   server

Two things immediately stand out:

  • No commits: git log is useless here
  • logs/3.txt is missing — we have 1, 2, and 4 staged, but not 3

This is your first investigative clue. Something happened to logs/3.txt. It was either:

  • Removed from the staging area with git rm --cached
  • Part of a commit that was reset
  • Deliberately hidden

Regardless of which scenario, Git’s object database may still hold the answer.

Part 4: Understanding .git/objects — Git's Internal Database

This is the heart of the investigation, and the most important concept in this entire article.

What Is .git/objects?

Git uses a content-addressable storage system. Every piece of data Git ever processes — every file, every directory snapshot, every commit — is stored as an “object” in .git/objects.

Each object is named by the SHA-1 hash of its content (prefixed with a short header giving the object's type and size):

.git/objects/
  66/
    273877d2ff3f51a14473b7200aae5a798ff64f

Full hash = 66273877d2ff3f51a14473b7200aae5a798ff64f

The first 2 characters become the folder name. The remaining 38 characters become the filename. This is purely an optimization — it prevents a single directory from containing hundreds of thousands of files.
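The naming scheme can be reproduced in a few lines of Python. This is a sketch assuming the standard loose-object format for blobs, where the hashed bytes are the header "blob <size>" plus a NUL byte, followed by the file content. The function names are invented for this example.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """SHA-1 that Git assigns to a blob holding `content`."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_hash: str) -> str:
    """Relative path of a loose object: 2-character folder, 38-character file."""
    return f".git/objects/{object_hash[:2]}/{object_hash[2:]}"


empty = git_blob_hash(b"")
print(empty)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(empty))  # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The empty-blob hash is the same in every repository, which is content addressing in action: identical content always produces the same name.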

The Four Object Types

1. blob — Raw File Content

A blob is nothing more than the raw bytes of a file. No filename. No metadata. Just content.

git cat-file -p 66273877d2ff3f51a14473b7200aae5a798ff64f
# → (raw contents of whatever file this is)

Key insight: if two files have identical content, they share one blob. Git deduplicates automatically.

2. tree — Directory Snapshot

A tree maps filenames to blob hashes (and other tree hashes for subdirectories).

git cat-file -p a0c13fe974d95661f24e32bc0d79f54f05ea13c5

100644 blob 66273877...    1.txt
100644 blob 7178644...    2.txt
100644 blob f150f0b...    3.txt   ← could be here even if not staged
100644 blob aa1cc01...    4.txt
040000 tree 6b1ebe1...    src

(Entry names are relative to the directory the tree describes, so the tree for logs/ lists 3.txt rather than logs/3.txt.)

This is critical: a tree object left over from earlier activity can still reference logs/3.txt even though the file no longer appears in the index. Likewise, once git add has created the blob for 3.txt, that blob stays in .git/objects until garbage collection (git gc) prunes it.
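Under the hood a tree is a compact binary record, not the tidy text that git cat-file -p prints. The Python sketch below assumes the standard SHA-1 tree layout: for each entry, the mode in ASCII, a space, the entry name, a NUL byte, then the 20 raw hash bytes. The hashes in the demo are placeholders, and both function names are hypothetical.

```python
def build_tree_payload(entries: list[tuple[str, str, bytes]]) -> bytes:
    """Serialize (mode, name, 20-byte hash) entries the way a tree stores them."""
    payload = b""
    for mode, name, raw_hash in entries:
        payload += mode.encode() + b" " + name.encode() + b"\x00" + raw_hash
    return payload


def parse_tree_payload(payload: bytes) -> list[tuple[str, str, str]]:
    """Decode a tree payload into (mode, name, hex hash) tuples."""
    entries = []
    position = 0
    while position < len(payload):
        space = payload.index(b" ", position)
        null = payload.index(b"\x00", space)
        mode = payload[position:space].decode()
        name = payload[space + 1:null].decode()
        raw_hash = payload[null + 1:null + 21]  # 20 raw bytes, not hex text
        entries.append((mode, name, raw_hash.hex()))
        position = null + 21
    return entries


# Placeholder hashes, used only to exercise the parser.
demo = build_tree_payload([
    ("100644", "1.txt", bytes.fromhex("11" * 20)),
    ("100644", "3.txt", bytes.fromhex("33" * 20)),
])
for mode, name, hex_hash in parse_tree_payload(demo):
    print(mode, hex_hash, name)
```

Because the mode and name are stored as plain ASCII, a filename such as 3.txt is visible to any tool that scans the decompressed object for printable text.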

3. commit — Snapshot in Time

A commit points to a tree and stores metadata.

git cat-file -p 01533f718556a0e59f1467dae4fa462eed82c2a1

tree 22f7d0c9bd045563ae33bfacfbe46fe406a5b318
parent 2c0a9b2b15dce92f800393d5030c7454efc278ae
author ctf-player <[email protected]> 1693000000 +0000
committer ctf-player <[email protected]> 1693000000 +0000

initial commit

Even though git log shows no commits on master, commit objects can exist in the database if they were created on another branch, or if the branch pointer was reset.

4. tag — Named Reference

Annotated tags point to commits and add a name and message. Less relevant for forensics in this case.

The Object Graph

Git stores everything as objects:

  • commit → points to a tree
  • tree → points to blobs/files and subtrees
  • blob → actual file contents

commit ──→ tree ──→ blob (logs/1.txt)
                ├──→ blob (logs/2.txt)
                ├──→ blob (logs/3.txt) ← orphaned, the flag is here
                ├──→ blob (logs/4.txt)
                └──→ blob (client)

Every commit is a complete snapshot, not a diff. Git’s diffs are computed on-the-fly by comparing blobs between commits.

Reachable vs Unreachable

Git normally starts from refs:

HEAD
branch refs
tags

and walks the graph.

Anything connected to those refs is reachable.

Reachable Example

main → commit A → tree → blob(flag.txt)

Git can reach everything.

Unreachable Objects

Suppose you delete a branch:

main → commit A
old_branch → commit B → tree → blob(flag)

After deleting old_branch, nothing references commit B.

Now:

commit B = unreachable
tree from B = unreachable
blob from tree = unreachable

BUT the objects still physically exist in .git/objects.

That’s why forensic recovery works.
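The reachability walk itself is an ordinary graph traversal. The following Python sketch models the deleted-branch example with a hand-written dictionary. The object names and the reachable_from helper are illustrative stand-ins, not anything Git exposes.

```python
from collections import deque

# Hypothetical object graph: each key points to the objects it references.
EDGES = {
    "commit_A": ["tree_A"],
    "tree_A": ["blob_readme"],
    "commit_B": ["tree_B"],
    "tree_B": ["blob_flag"],
    "blob_readme": [],
    "blob_flag": [],
}


def reachable_from(refs: dict[str, str], edges: dict[str, list[str]]) -> set[str]:
    """Walk the graph breadth-first from every ref target."""
    seen = set()
    queue = deque(refs.values())
    while queue:
        obj = queue.popleft()
        if obj in seen:
            continue
        seen.add(obj)
        queue.extend(edges.get(obj, []))
    return seen


before = reachable_from({"main": "commit_A", "old_branch": "commit_B"}, EDGES)
after = reachable_from({"main": "commit_A"}, EDGES)  # old_branch deleted

print(sorted(set(EDGES) - before))  # []
print(sorted(set(EDGES) - after))   # ['blob_flag', 'commit_B', 'tree_B']
```

Deleting the ref changes only the starting points of the walk. The objects themselves are untouched, which is exactly what a forensic examiner relies on.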

Part 5: Listing All Objects

git cat-file --batch-all-objects --batch-check

This command iterates over every object in .git/objects and prints a one-line summary:

01533f718556a0e59f1467dae4fa462eed82c2a1 commit 238
201c707b43219a63c1d3499b29c7d539af079861 tree 99
2151ef0ccc15aed1ab88e1afdc7484aaeff211c4 commit 244
66273877d2ff3f51a14473b7200aae5a798ff64f blob 140
7178644433e7cb6da3adf028f1c80d382a18e7b6 blob 188
...

Format: <hash> <type> <size in bytes>
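If Git is not installed on the analysis machine, a similar survey can be approximated by reading loose objects directly. The sketch below assumes the loose-object format (a zlib-compressed "<type> <size>" header, a NUL byte, then the payload) and deliberately ignores packfiles under .git/objects/pack. It builds a throwaway objects directory so it can run anywhere, and both function names are made up for this example.

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_object(objects_dir: str, obj_type: str, payload: bytes) -> str:
    """Store one object the way Git stores loose objects; return its hash."""
    raw = f"{obj_type} {len(payload)}".encode() + b"\x00" + payload
    object_hash = hashlib.sha1(raw).hexdigest()
    folder = os.path.join(objects_dir, object_hash[:2])
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, object_hash[2:]), "wb") as handle:
        handle.write(zlib.compress(raw))
    return object_hash


def list_loose_objects(objects_dir: str) -> list[tuple[str, str, int]]:
    """Return (hash, type, size) for every loose object, sorted by hash."""
    results = []
    for folder in sorted(os.listdir(objects_dir)):
        if len(folder) != 2:
            continue  # skip "pack" and "info"
        for name in sorted(os.listdir(os.path.join(objects_dir, folder))):
            with open(os.path.join(objects_dir, folder, name), "rb") as handle:
                raw = zlib.decompress(handle.read())
            header, _, _payload = raw.partition(b"\x00")
            obj_type, size = header.decode().split(" ")
            results.append((folder + name, obj_type, int(size)))
    return results


with tempfile.TemporaryDirectory() as objects_dir:
    write_loose_object(objects_dir, "blob", b"hello\n")
    write_loose_object(objects_dir, "blob", b"")
    listing = list_loose_objects(objects_dir)

for object_hash, obj_type, size in listing:
    print(object_hash, obj_type, size)
```

On the mounted image, list_loose_objects would be pointed at the repository's .git/objects directory instead of the temporary one.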

--batch-check vs --batch

Flag            What it outputs                         Use case
--batch-check   Header only (hash + type + size)        Survey: understand what exists
--batch         Header + raw content of every object    Extract: dump everything for searching

Think of --batch-check as the table of contents and --batch as reading every page.

Part 6: Finding the Flag — Two Approaches

Approach 1: The Blunt Hammer (Fast)

git cat-file --batch-all-objects --batch | strings | grep -i "picoCTF\|3.txt"

Breaking this down:

Part                         What it does
--batch-all-objects          Iterate over every object
--batch                      Output raw content of each object
strings                      Extract printable ASCII strings from binary data
grep -i "picoCTF\|3.txt"     Search for the flag format OR the missing filename

Output:

.100644 3.txt
Jay: Ask Rusty at the door and use password picoCTF{g17_r35cu3_********}.

The flag was inside the blob for logs/3.txt. The file is absent from the current index and from every branch, but its content was added with git add at some point, and the resulting blob object persisted.
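The strings and grep stages of that pipeline are simple enough to imitate in Python, which helps when inspecting a dump on a machine without those tools. This is a rough approximation: printable_strings only matches runs of printable ASCII of length four or more, and the byte stream and the flag in it are fabricated for the demo.

```python
import re


def printable_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Rough stand-in for `strings`: runs of printable ASCII bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def grep(lines: list[str], expression: str) -> list[str]:
    """Case-insensitive filter, similar in spirit to `grep -i`."""
    regex = re.compile(expression, re.IGNORECASE)
    return [line for line in lines if regex.search(line)]


# Fabricated object dump: a tree entry, 20 raw hash bytes, then blob text.
stream = (
    b"100644 3.txt\x00" + bytes(range(1, 21)) + b"\x00"
    + b"Ask at the door, password picoCTF{hypothetical_flag}\n"
    + b"\x00server started\x00"
)

hits = grep(printable_strings(stream), r"picoCTF|3\.txt")
for line in hits:
    print(line)
```

Escaping the dot in 3\.txt keeps the pattern from matching unrelated text where any character sits between the 3 and txt.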

Approach 2: The Surgical Scalpel (Thorough)

git fsck --unreachable

fsck stands for File System Check. It walks the entire object graph starting from known references (HEAD, branches, tags) and identifies objects that cannot be reached.

Example output:

unreachable blob 66273877d2ff3f51a14473b7200aae5a798ff64f
unreachable commit 2151ef0ccc15aed1ab88e1afdc7484aaeff211c4
dangling commit 01533f718556a0e59f1467dae4fa462eed82c2a1

dangling vs unreachable

  • dangling: nothing points to this object at all. Truly orphaned.
  • unreachable: not reachable from any ref. This is the broader category. A dangling object is also unreachable, and the remaining unreachable objects are referenced only by other unreachable objects.

dangling commit → unreachable tree → unreachable blob ← (flag lives here)
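The distinction comes down to whether anything at all points at an unreachable object. A small Python sketch of that classification follows, using a hypothetical graph and taking the reachable set as a given input. The classify helper is an illustration, not how git fsck is implemented.

```python
# Hypothetical graph: each key points to the objects it references.
EDGES = {
    "commit_main": ["tree_main"],
    "tree_main": ["blob_client"],
    "blob_client": [],
    "commit_lost": ["tree_lost"],
    "tree_lost": ["blob_flag"],
    "blob_flag": [],
}

# Assumed result of walking the graph from HEAD, branches, and tags.
REACHABLE = {"commit_main", "tree_main", "blob_client"}


def classify(edges: dict[str, list[str]], reachable: set[str]) -> tuple[set[str], set[str]]:
    """Split unreachable objects into dangling ones and referenced ones."""
    unreachable = set(edges) - reachable
    referenced = {child for children in edges.values() for child in children}
    dangling = unreachable - referenced
    return dangling, unreachable - dangling


dangling, referenced_only = classify(EDGES, REACHABLE)
print(sorted(dangling))         # ['commit_lost']
print(sorted(referenced_only))  # ['blob_flag', 'tree_lost']
```

A dangling commit is therefore the natural entry point: it is the head of an orphaned chain, and everything below it can be reached by following its pointers.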

Then follow the chain:

# Read the dangling commit → get the tree hash
git cat-file -p 01533f718556a0e59f1467dae4fa462eed82c2a1
# Read the tree → see all files including 3.txt
git cat-file -p <tree-hash>
# Read the blob → get the flag
git cat-file -p <blob-hash>
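The first hop of that chain means pulling the tree hash out of the commit text. Here is a minimal Python sketch of that step, using the commit body shown earlier in this article with the author and committer lines left out. parse_commit is a hypothetical helper, and it assumes the usual layout of header lines, a blank line, then the message.

```python
COMMIT_TEXT = """tree 22f7d0c9bd045563ae33bfacfbe46fe406a5b318
parent 2c0a9b2b15dce92f800393d5030c7454efc278ae

initial commit
"""


def parse_commit(text: str) -> dict:
    """Split a commit object's text into its tree, parents, and message."""
    header_block, _, message = text.partition("\n\n")
    commit = {"tree": None, "parents": [], "message": message.strip()}
    for line in header_block.splitlines():
        key, _, value = line.partition(" ")
        if key == "tree":
            commit["tree"] = value
        elif key == "parent":
            commit["parents"].append(value)
    return commit


commit = parse_commit(COMMIT_TEXT)
print(commit["tree"])     # 22f7d0c9bd045563ae33bfacfbe46fe406a5b318
print(commit["parents"])  # ['2c0a9b2b15dce92f800393d5030c7454efc278ae']
```

The tree hash is what you pass to the next git cat-file -p call. The parent hashes let you keep walking backwards through orphaned history.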

When to use which approach

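Approach 1 (--batch | strings | grep) is fast and broad. Approach 2 (git fsck --unreachable) is slower but structured: it shows which objects are orphaned and lets you walk from commit to tree to blob.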

Part 7: Why Orphaned Objects Exist — The Core Forensic Insight

This is the principle that makes Git forensics possible:

Git objects are immutable and persist until garbage collection (git gc) prunes them.

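Here are the common scenarios that create orphaned objects, all of which came up earlier in this walkthrough:

  • A file was staged with git add and later removed from the index with git rm --cached
  • A branch pointer was reset away from a commit
  • A branch was deleted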

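The only thing that cleans this up is garbage collection: git gc (or git prune) removes objects that are not reachable from any reference, and by default it only prunes unreachable objects older than two weeks. Some Git commands also trigger git gc --auto on their own. Until pruning happens, the data is fully recoverable.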

This is by design. Git prioritizes data safety over storage efficiency. It would rather keep a “deleted” file than risk losing something the user might need.

Key Takeaways

1. Disk images are layered. MBR → partition table → filesystem → files. Each layer requires a different tool: file, fdisk, mount, then standard shell commands.

2. Partition byte offsets matter. offset = start_sector × sector_size. Getting this wrong means mounting nothing, or the wrong partition.

3. Git never truly deletes. git log showing "no commits" is not the whole story. The object database is the whole story.

4. .git/objects is a content-addressed database. Every blob, tree, and commit has a SHA-1 hash name. Objects are immutable. The database only grows until git gc is run.

5. Two forensic strategies, different tradeoffs. --batch | strings | grep is fast and broad. git fsck --unreachable is slow and structured. Use both depending on what you need.

6. The missing file is the clue. logs/1.txt, logs/2.txt, logs/4.txt were staged. logs/3.txt was not. Gaps in sequences are almost always intentional. Always look for what's missing.

7. Another useful command. Find unreachable/dangling objects, ignoring reflog entries:
git fsck --full --no-reflogs

The Git 2 challenge demonstrates how powerful forensic analysis becomes when low-level system knowledge is combined with an understanding of application internals. What initially appeared to be an empty repository with no commit history ultimately revealed recoverable evidence hidden inside Git’s object database.


Article source: https://infosecwriteups.com/digging-up-the-dead-disk-forensics-git-object-archaeology-picoctf-walkthrough-465fcfdefd07?source=rss----7b722bfd1b8d---4