What is GSD-2: The Autonomous AI Agent That Builds Software While You Sleep about?

A complete guide to Get Shit Done 2 — the CLI that turns Claude Code into a fully autonomous coding agent. Setup on Ubuntu/Linux, full workflow walkthrough, and why it changes everything about how you build software.

Who should read this article?

This article is written for engineers, technical leads, and data teams working with GSD-2, Claude Code, and AI agents.

What can readers use from it?

Readers can use the article as a practical reference for AI tooling decisions, implementation tradeoffs, and production engineering workflows.


The original Get Shit Done (GSD) framework went viral. Engineers at Amazon, Google, Shopify, and Webflow adopted it. It amassed 31,000+ GitHub stars in weeks. The idea was simple and powerful: instead of letting Claude’s context window fill up with accumulated garbage until quality collapses, break your project into small, well-defined tasks — each one executed in a fresh 200K-token context window.

That worked. But it was still a prompt framework. You were still the orchestrator. You ran the commands, reviewed each phase, and advanced the workflow manually.

GSD-2 removes you from the loop.

It is a standalone CLI built on the Pi SDK. It has direct TypeScript access to the agent harness — meaning it can actually clear context between tasks, inject exactly the right files at dispatch time, manage git branches, track cost and token usage, detect stuck loops, recover from crashes, and auto-advance through an entire milestone without you touching anything.

One command. Walk away. Come back to a built project with clean git history.

Why Context Rot Destroys AI-Generated Code

Before diving into GSD-2 setup, it helps to understand the problem it solves.

When you work with Claude Code on a long project in a single session, the context window fills up progressively. The model is attending to everything — your early architecture decisions, the failed approaches you discarded, the debug output from three hours ago, the half-written functions you told it to ignore. This accumulated noise degrades output quality in a measurable way. Engineers have a name for it: context rot.

Single long session (what most people do)
─────────────────────────────────────────

Context window filling over time:

0%   [──────────────────────────] 100%
      │                          │
   Start                      Context rot
   (sharp, focused)            (degraded quality,
                                forgotten decisions,
                                inconsistent code)

GSD-1 attacked this with prompt engineering — forcing the model to work in bounded chunks. GSD-2 attacks it at the infrastructure level, programmatically clearing and rebuilding context for every task.
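To make the difference concrete, here is a minimal Python sketch of how window usage evolves under each approach. The token counts and the `preload_tokens` briefing size are made-up illustrative numbers, not measurements from GSD-2.

```python
def single_session_usage(task_tokens):
    """Context only grows: every task's tokens stay in the window."""
    usage, total = [], 0
    for tokens in task_tokens:
        total += tokens
        usage.append(total)
    return usage


def fresh_context_usage(task_tokens, preload_tokens):
    """Each task starts from an empty window plus a fixed preloaded briefing."""
    return [preload_tokens + tokens for tokens in task_tokens]


tasks = [40_000, 55_000, 60_000, 70_000]  # hypothetical per-task token costs
print(single_session_usage(tasks))         # [40000, 95000, 155000, 225000]
print(fresh_context_usage(tasks, 15_000))  # [55000, 70000, 75000, 85000]
```

With these hypothetical numbers, the single session crosses the ~200K limit on the fourth task, while the per-task windows never come close to it.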

The Architecture: How GSD-2 Thinks About Work

GSD-2 organizes all software work into a strict three-level hierarchy:

PROJECT
│
└── MILESTONE  (shippable version — 4 to 10 slices)
    │            e.g. "v1.0 — user authentication system"
    │
    └── SLICE  (one demoable vertical capability — 1 to 7 tasks)
        │        e.g. "JWT login and session management"
        │
        └── TASK  (one context-window-sized unit of work)
                   e.g. "Implement /api/auth/login endpoint"

Every task must fit within ~200K tokens. That constraint is not optional — it is the mechanism that keeps quality high. Tasks that are too large get decomposed further during the planning phase.
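The sizing rules above can be expressed as a small data model. This is an illustrative sketch only: the class names, the `estimated_tokens` field, and `validate` are hypothetical and do not reflect GSD-2's internal types.

```python
from dataclasses import dataclass, field

TASK_TOKEN_LIMIT = 200_000  # approximate per-task window cited in the article


@dataclass
class Task:
    title: str
    estimated_tokens: int  # hypothetical planning estimate


@dataclass
class Slice:
    title: str
    tasks: list = field(default_factory=list)


@dataclass
class Milestone:
    title: str
    slices: list = field(default_factory=list)


def validate(milestone):
    """Return human-readable violations of the sizing rules (empty if none)."""
    problems = []
    if not 4 <= len(milestone.slices) <= 10:
        problems.append(f"{milestone.title}: {len(milestone.slices)} slices, expected 4-10")
    for sl in milestone.slices:
        if not 1 <= len(sl.tasks) <= 7:
            problems.append(f"{sl.title}: {len(sl.tasks)} tasks, expected 1-7")
        for task in sl.tasks:
            if task.estimated_tokens > TASK_TOKEN_LIMIT:
                problems.append(f"{task.title}: too large, decompose further")
    return problems
```

A planner built along these lines would keep splitting any task flagged as too large until `validate` returns an empty list.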

Each slice flows through four phases automatically:

┌─────────┐    ┌─────────┐    ┌──────────┐    ┌──────────┐
│  PLAN   │───▶│ EXECUTE │───▶│ COMPLETE │───▶│REASSESS  │
│         │    │         │    │          │    │          │
│Research │    │Fresh    │    │Summaries │    │Validate  │
│codebase │    │context  │    │UAT script│    │roadmap   │
│Decompose│    │per task │    │Git commit│    │still OK? │
└─────────┘    └─────────┘    └──────────┘    └──────────┘

The PLAN phase researches your codebase and decomposes the slice into atomic tasks with explicit, verifiable success criteria. EXECUTE runs each task in an isolated context window with pre-loaded relevant files. COMPLETE writes summaries and commits. REASSESS checks if the roadmap is still valid given what was just built.
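One way to picture the pipeline is as a linear state machine. The sketch below is an assumption about the shape of the flow, not GSD-2's actual code; `Phase`, `next_phase`, and `run_slice` are hypothetical names, and the handlers stand in for the real work.

```python
from enum import Enum


class Phase(Enum):
    PLAN = "plan"
    EXECUTE = "execute"
    COMPLETE = "complete"
    REASSESS = "reassess"


ORDER = [Phase.PLAN, Phase.EXECUTE, Phase.COMPLETE, Phase.REASSESS]


def next_phase(current):
    """Return the phase after `current`, or None when the slice is finished."""
    index = ORDER.index(current)
    return ORDER[index + 1] if index + 1 < len(ORDER) else None


def run_slice(handlers):
    """Call each phase handler in order and return the phases visited."""
    visited = []
    phase = Phase.PLAN
    while phase is not None:
        handlers[phase]()  # placeholder for the real phase work
        visited.append(phase)
        phase = next_phase(phase)
    return visited
```

In this picture, pausing for approval between phases would amount to waiting for input before each call to `next_phase`.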

Prerequisites

Ubuntu 20.04+ (this guide uses Ubuntu 22.04 LTS)
Node.js 20+ — check with node --version
npm configured with a user-writable global prefix
Claude API key or access to Claude Code
Git initialized in your project directory

Check Node.js and npm:

node --version   # Should print v20.x.x or higher
npm --version
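If you want to script this check, for example in a setup script, the version string can be parsed as below. This is a small sketch; `node_major` and `meets_requirement` are hypothetical helpers, and the input is whatever `node --version` printed.

```python
import re


def node_major(version_output):
    """Extract the major version from output such as 'v20.11.1'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    return int(match.group(1))


def meets_requirement(version_output, minimum=20):
    """True when the reported Node.js major version is at least `minimum`."""
    return node_major(version_output) >= minimum
```

For instance, `meets_requirement("v18.19.1")` is false, which signals that the upgrade step is needed.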

If Node.js is not installed or is outdated:

# Install via NodeSource (Ubuntu/Debian)
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt-get install -y nodejs

Step 1 — Install GSD-2

GSD-2 is distributed as an npm package called gsd-pi:

npm install -g gsd-pi@latest

If npm gives a permissions error, set a user-writable global prefix first:

mkdir -p ~/.npm-global
npm config set prefix ~/.npm-global
echo 'export PATH="$HOME/.npm-global/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

Then retry the install:

npm install -g gsd-pi@latest

Verify:

gsd --version

You should see the version number printed, something like GSD-2 v2.38.0.

Step 2 — Log In and Select a Provider

GSD-2 supports 20+ AI providers including Claude (Anthropic), OpenAI, Gemini, and local models. For the best results with autonomous agentic work, Claude Sonnet or Opus is recommended.

Start the login flow:

gsd        # launch the GSD-2 CLI from your shell
/login     # then type this at the GSD-2 prompt

This opens an interactive provider selection menu:

Select provider:
  ▶ Anthropic (Claude)
    OpenAI
    Google Gemini
    Local (Ollama)
    [20+ more...]

Select Anthropic (Claude) and enter your API key when prompted. GSD-2 stores credentials securely in its config directory.

To verify login:

gsd
/gsd status

You should see your provider listed as connected.

Step 3 — Initialize a New Project

Navigate to your project directory (or create a new one):

mkdir my-project && cd my-project
git init

Then initialize GSD-2:

gsd
/gsd new-project

GSD-2 will guide you through:

Project name and description — What you are building
Tech stack — Languages, frameworks, key dependencies
First milestone — What the first shippable version looks like
Slices — The 4–10 capabilities that make up the milestone

This creates a .gsd/ directory in your project root with:

.gsd/
├── roadmap.md          # Full milestone plan
├── decisions.md        # Architecture decisions register
├── M001/               # Milestone 1
│   ├── slices/         # One file per slice
│   └── summaries/      # Task completion summaries
└── config.json         # Project-level preferences
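A quick sanity check that initialization produced this layout might look like the following. The list of expected paths is copied from the tree in this article; the real tool may create more or different files, and `missing_entries` is a hypothetical helper.

```python
from pathlib import Path

# Paths relative to .gsd/, taken from the tree shown in this article.
EXPECTED = [
    "roadmap.md",
    "decisions.md",
    "config.json",
    "M001/slices",
    "M001/summaries",
]


def missing_entries(project_root, expected=EXPECTED):
    """Return the expected .gsd/ entries that do not exist under project_root."""
    gsd_dir = Path(project_root) / ".gsd"
    return [rel for rel in expected if not (gsd_dir / rel).exists()]
```

An empty list means every entry from the tree is present.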

Step 4 — Running GSD-2

Step Mode (Review Each Phase)

Step mode pauses after each phase and waits for your approval before advancing. This is the right mode when you are learning GSD-2 or working on a critical codebase.

gsd
/gsd

GSD-2 picks up wherever you last stopped, runs the next phase, and waits. You review the output, then type /gsd again to advance.

Auto Mode (Walk Away)

Auto mode executes everything autonomously — all slices, all phases, all tasks — until the milestone is complete or it hits a problem it cannot resolve alone.

gsd
/gsd auto

What happens during auto mode:

/gsd auto
│
├── Slice 1: "User authentication"
│   ├── [PLAN]    Researches codebase, decomposes into 4 tasks
│   ├── [EXECUTE] Task 1 in fresh context → commit
│   ├── [EXECUTE] Task 2 in fresh context → commit
│   ├── [EXECUTE] Task 3 in fresh context → commit
│   ├── [EXECUTE] Task 4 in fresh context → commit
│   ├── [COMPLETE] Writes summaries, generates UAT script
│   └── [REASSESS] Roadmap still valid ✓
│
├── Slice 2: "Dashboard UI"
│   └── [same pipeline]
│
└── ... continues until milestone complete

You can walk away at this point. GSD-2 continuously persists its state to disk, so if the process is interrupted, for example because the terminal closes, running /gsd auto again resumes from the last checkpoint.

Check Progress Without Interrupting

gsd headless query

This gives you a JSON snapshot of current status in ~50ms — no LLM call, no interruption to the running agent.
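Because the snapshot is plain JSON, it is easy to consume from scripts. The sketch below parses a snapshot string and builds a one-line summary. The field names (`milestone.progress.slices_complete` and so on) are taken from the example output shown later in this article and should be treated as assumptions about the schema.

```python
import json


def summarize(snapshot_json):
    """Turn a status snapshot (JSON text) into a one-line progress summary."""
    snapshot = json.loads(snapshot_json)
    progress = snapshot["milestone"]["progress"]  # assumed schema
    done = progress["slices_complete"]
    total = progress["slices_total"]
    percent = round(100 * done / total) if total else 0
    return f"{done}/{total} slices ({percent}%), phase {progress['current_phase']}"


sample = json.dumps({"milestone": {"progress": {
    "slices_complete": 4, "slices_total": 6, "current_phase": "EXECUTE"}}})
print(summarize(sample))  # 4/6 slices (67%), phase EXECUTE
```

In a real script, the input string would come from the stdout of `gsd headless query` rather than from a hard-coded sample.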

Step 5 — The Status Dashboard

gsd
/gsd status

This shows:

GSD-2 Status Dashboard
═══════════════════════════════════════════════════════

Project     my-project
Milestone   M001 — v1.0 User Auth System
Progress    ████████████░░░░░░░░  3/5 slices complete

Current     Slice 4: Dashboard UI
Phase       EXECUTE — Task 2 of 4
Model       claude-sonnet-4-6

Cost        $0.47 spent │ $2.00 budget │ $1.53 remaining
Tokens      94,832 total │ 22,341 this slice

Git         gsd/M001/  (worktree)
Commits     12 commits this milestone

Step 6 — Completing a Milestone and Starting the Next

When GSD-2 finishes all slices:

/gsd complete-milestone

This:

Squash-merges all task commits into a single semantic commit on main
Deletes the worktree (gsd/M001/)
Generates a self-contained HTML report (see below)
Archives the milestone in .gsd/archive/

To start the next milestone:

/gsd new-milestone

Same workflow as new-project, but GSD-2 already knows your codebase — it reads prior task summaries, the decisions register, and the architecture overview to plan the next milestone in full context.

What Makes GSD-2 Actually Powerful

1. Context Isolation Per Task

This is the core mechanism. Every task gets its own fresh agent session. GSD-2 pre-constructs the dispatch with:

The task plan and must-haves
The slice plan it belongs to
Summaries of prior tasks in the same slice
Relevant source files (pre-inlined, not discovered via tool calls)
The decisions register
Roadmap excerpts

The agent is never disoriented. It never wastes tool calls figuring out where things are. It starts with exactly the right context and nothing else.

What a task agent receives at dispatch time:
─────────────────────────────────────────────

  ┌──────────────────────────────────────────┐
  │  Task: "Implement /api/auth/login"        │
  │                                          │
  │  Must-haves:                             │
  │    ✦ POST /api/auth/login returns JWT    │
  │    ✦ Invalid credentials → 401          │
  │    ✦ Token validates in middleware       │
  │                                          │
  │  Prior task summary: "auth middleware    │
  │  implemented in src/middleware/auth.ts"  │
  │                                          │
  │  Pre-loaded files:                       │
  │    • src/middleware/auth.ts              │
  │    • src/routes/index.ts                 │
  │    • src/models/user.ts                  │
  └──────────────────────────────────────────┘
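A rough sketch of how such a dispatch could be assembled is shown here. The function name, argument shapes, and section headings are hypothetical; the point is only that everything the agent needs is concatenated up front rather than discovered through tool calls.

```python
def build_dispatch(task, slice_plan, prior_summaries, files, decisions, roadmap_excerpt):
    """Assemble one self-contained briefing for a fresh agent session.

    task: dict with "title" and "must_haves" (hypothetical shape)
    files: dict mapping a file path to its full text
    """
    sections = [
        ("Task", task["title"]),
        ("Must-haves", "\n".join(f"- {item}" for item in task["must_haves"])),
        ("Slice plan", slice_plan),
        ("Prior task summaries", "\n".join(prior_summaries) or "(none)"),
        ("Decisions register", decisions),
        ("Roadmap excerpt", roadmap_excerpt),
    ]
    # Source files are inlined in full, so the agent never has to go looking.
    for path, content in files.items():
        sections.append((f"File: {path}", content))
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections)
```

Since the function takes only explicit arguments and keeps no state, nothing from an earlier session can leak into the next task's window.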

2. Crash Recovery

GSD-2 writes lock files and session state to disk continuously. If the process dies, power cuts out, or your machine reboots:

gsd
/gsd auto   # Resumes from exact last checkpoint

It reads the lock files, determines which task was in progress, checks git history to see what was actually committed, and picks up from there. No work is lost.
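The recovery logic described above can be sketched as follows, treating git as the source of truth and the lock file as a hint. The lock shape (`{"task": ...}`) and the task IDs are hypothetical, chosen for illustration.

```python
def resume_point(task_order, committed, lock):
    """Return (task_id, was_interrupted) for the next task to run.

    task_order: task IDs in planned execution order
    committed:  task IDs that already have a commit in git history
    lock:       dict such as {"task": "T02"} read from disk, or None
    """
    committed = set(committed)
    for task_id in task_order:
        if task_id not in committed:
            # The lock only tells us whether this task was mid-flight.
            interrupted = lock is not None and lock.get("task") == task_id
            return task_id, interrupted
    return None, False  # every task is committed; nothing left to resume
```

If a crash happens after a commit but before the lock is cleared, the commit wins: the stale lock is ignored and the next uncommitted task is selected.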

3. Verification Gates

Tasks declare must-haves — explicit, verifiable success criteria:

Truths: Observable behaviors (“POST /api/auth/login returns 200 with valid JWT”)
Artifacts: Files with real implementation (not stubs)
Key Links: Import wiring between components (“auth.ts imported in routes/index.ts”)

After execution, GSD-2 runs verification against these criteria. If verification fails, it auto-retries with diagnostics. Only after passing verification does it advance to the next task.
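In outline, a verification gate is a retry loop around the task. The sketch below is an assumption about the control flow, not GSD-2's implementation: `run_with_verification`, the `checks` mapping, and the `max_attempts` limit are all hypothetical.

```python
def run_with_verification(execute, checks, max_attempts=3):
    """Run a task until all checks pass or the attempts are used up.

    execute: callable taking the previous attempt's failed check names
             (diagnostics) and returning a result object
    checks:  dict mapping a must-have name to a predicate over the result
    """
    diagnostics = []
    for attempt in range(1, max_attempts + 1):
        result = execute(diagnostics)
        failures = [name for name, check in checks.items() if not check(result)]
        if not failures:
            return {"passed": True, "attempts": attempt, "failures": []}
        diagnostics = failures  # feed the failures into the next attempt
    return {"passed": False, "attempts": max_attempts, "failures": diagnostics}
```

The caller would advance to the next task only when `passed` is true, and otherwise stop and surface the remaining failures.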

4. Automatic Git Management

Each milestone gets its own git worktree (gsd/M001/). GSD-2 commits after every completed task with a message derived from the task plan and actual changes. When the milestone is done, all commits squash-merge to main as one clean, semantic commit.

git log --oneline (after milestone complete)
────────────────────────────────────────────

a3f891c  feat(auth): implement JWT authentication system
         with login, registration, session management,
         password reset flow, and middleware protection
         [GSD M001 — 5 slices, 18 tasks, 12 commits squashed]

5. Budget Enforcement

Set a cost ceiling before running auto mode:

/gsd prefs
# Set budget: $5.00

If auto mode approaches the ceiling, it pauses and shows the dashboard. No surprise bills.
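The ceiling check itself is simple arithmetic. Here is a minimal sketch, assuming the tool pauses once spending reaches some fraction of the ceiling; the 90% `warn_ratio` is a made-up threshold, and `Decimal` is used to keep currency math exact.

```python
from decimal import Decimal


def budget_status(spent, ceiling, warn_ratio="0.9"):
    """Return (remaining, action) for the given spend and ceiling.

    Amounts are passed as strings so Decimal parses them exactly.
    """
    spent, ceiling = Decimal(spent), Decimal(ceiling)
    remaining = ceiling - spent
    action = "pause" if spent >= ceiling * Decimal(warn_ratio) else "continue"
    return remaining, action


print(budget_status("0.47", "2.00"))  # (Decimal('1.53'), 'continue')
print(budget_status("4.60", "5.00"))  # (Decimal('0.40'), 'pause')
```

The first call reproduces the figures from the status dashboard earlier in this article: $0.47 spent against a $2.00 budget leaves $1.53.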

6. Parallel Reactive Dispatch (v2.38+)

In GSD-2 v2.38 and later, tasks within a slice that have no dependencies on each other can be dispatched in parallel via subagents. A slice whose 4 tasks previously ran strictly one after another can now run its independent tasks concurrently, with a dependent task waiting only for the tasks it needs.

Before v2.38 (sequential):    After v2.38 (parallel):

Task 1 ──▶ Task 2             Task 1 ──┐
              │                Task 2 ──┼──▶ Task 4
           Task 3              Task 3 ──┘
              │
           Task 4
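The scheduling idea in the diagram can be sketched as grouping tasks into waves, where each wave contains only tasks whose dependencies finished in earlier waves. This is a generic topological-layering sketch, not GSD-2's scheduler; `dispatch_waves` and the dependency map are hypothetical.

```python
def dispatch_waves(deps):
    """Group tasks into waves that can each run in parallel.

    deps: dict mapping a task name to the list of tasks it depends on
    """
    remaining = {task: set(needs) for task, needs in deps.items()}
    done, waves = set(), []
    while remaining:
        # A task is ready once all of its dependencies have completed.
        ready = sorted(task for task, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


deps = {
    "Task 1": [],
    "Task 2": [],
    "Task 3": [],
    "Task 4": ["Task 1", "Task 2", "Task 3"],
}
print(dispatch_waves(deps))  # [['Task 1', 'Task 2', 'Task 3'], ['Task 4']]
```

A fully sequential chain degenerates to one task per wave, which matches the pre-v2.38 behavior in the diagram.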

Migrating from GSD-1

If you have projects with .planning/ directories from the original GSD, migration is one command:

cd your-old-project
gsd
/gsd migrate

GSD-2 reads the .planning/ structure, converts it to .gsd/ format, and preserves your existing milestones, slices, and any completed task summaries.

Key Commands Reference

| Command | What it does |
| --- | --- |
| `gsd` | Start the GSD-2 CLI |
| `/login` | Set up an AI provider |
| `/gsd new-project` | Initialize a new project |
| `/gsd` | Step mode: one phase at a time |
| `/gsd auto` | Autonomous mode: run the entire milestone |
| `/gsd status` | Progress dashboard with cost tracking |
| `/gsd discuss` | Architecture decisions (works alongside auto mode) |
| `/gsd prefs` | Model selection, timeouts, budget ceiling |
| `/gsd complete-milestone` | Squash commits, generate report, archive |
| `/gsd new-milestone` | Start the next milestone on an existing project |
| `/gsd migrate` | Convert GSD-1 `.planning/` to `.gsd/` format |
| `gsd headless query` | JSON state snapshot (~50ms, no LLM call) |

HTML Report

After /gsd complete-milestone, GSD-2 generates a self-contained HTML report in .gsd/reports/. Open it in any browser — no server required, all CSS/JS is inlined.

The report contains:

Project and milestone summary
Slice completion timeline
Per-task cost and token breakdown
Dependency graph (which tasks depended on which)
Knowledge base sections (key decisions made during development)
UAT test scripts for every slice

The report is PDF-printable — useful for handing off to a QA team or including in project documentation.

Common Issues on Ubuntu/Linux

npm global install fails with EACCES

# Check current prefix
npm config get prefix

# Set to a user-writable path
mkdir -p ~/.npm-global
npm config set prefix ~/.npm-global

# Add to shell profile
echo 'export PATH="$HOME/.npm-global/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Retry install
npm install -g gsd-pi@latest

`gsd` command not found after install

# Check whether npm's global bin directory is in PATH (prints nothing if it is missing)
echo "$PATH" | tr ':' '\n' | grep -F "$(npm config get prefix)/bin"

# If missing, add it
echo 'export PATH="$HOME/.npm-global/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

Node.js version too old

GSD-2 requires Node.js 20+. Check your version:

node --version

Upgrade via NodeSource:

curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt-get install -y nodejs

Git not initialized

GSD-2 requires a git repository. If you see an error about git:

git init
git add .
git commit -m "Initial commit"

Process dies mid-milestone

If your terminal closes or the process is killed:

gsd
/gsd auto   # Automatically resumes from last checkpoint

GSD-2 reads lock files from .gsd/locks/ and reconstructs where it was.

Budget ceiling reached

If auto mode pauses with a budget warning:

/gsd prefs   # Increase the budget ceiling
/gsd auto    # Resume

GSD-2 vs. GSD-1 vs. Just Using Claude Code

It helps to be clear about when to use what:

When to use each tool:
──────────────────────────────────────────────────────

  Small task / quick fix
  └── Claude Code directly — fastest, no overhead

  Medium project, you want control
  └── GSD-1 — structured prompting, manual orchestration
      good when: you want to review each step closely

  Large project, you want autonomy
  └── GSD-2 — autonomous execution, crash recovery, git automation
      good when: multi-day work, many interdependent components

  CI/CD pipeline
  └── GSD-2 headless mode — JSON state, scriptable

GSD-2 is not better than Claude Code for a quick bug fix. It has overhead — project initialization, the planning phase, the slice structure. That overhead pays off on projects with more than a few hours of work.

A Real Workflow Example

Here is what a typical day with GSD-2 looks like on Ubuntu:

# Morning — check what GSD-2 did overnight
gsd headless query | jq '.milestone.progress'
# Output: {"slices_complete": 4, "slices_total": 6, "current_phase": "EXECUTE"}

# Open the dashboard
gsd
/gsd status

# Check git commits from overnight
git log --oneline gsd/M001/

# Not happy with a task output — discuss it
/gsd discuss "The login endpoint is missing rate limiting — add it to the next task"

# GSD-2 adds a note to the decisions register and factors it into upcoming planning

# Evening — check if milestone is complete
gsd headless query | jq '.milestone.status'
# Output: "complete"

# Wrap up the milestone
gsd
/gsd complete-milestone

# Open the HTML report
xdg-open .gsd/reports/M001-report.html

# Start the next milestone
/gsd new-milestone

Summary

GSD-2 solves the three hard problems of autonomous AI development:

Context rot — eliminated by giving every task its own fresh 200K-token session with exactly the right pre-loaded context.

Reliability — addressed by lock files, crash recovery, verification gates, and stuck-loop detection with automatic retry.

Git hygiene — handled automatically with worktree isolation, task-level commits, and squash merges that produce clean, semantic git history.

The result is a system where you describe what you want to build, press go, and come back to working software with a clean git history, a cost report, and a ready-to-use HTML document for handoff.

For large projects, it is not a productivity improvement. It is a category change in how the work gets done.

GitHub: gsd-build/gsd-2
Original GSD (v1): gsd-build/get-shit-done
GSD Organization: github.com/gsd-build