imarch.dev
8 min read

Your AI Assistant Gets Dumber Every Minute

I recently stumbled upon an interesting approach to AI-assisted development called GSD (“Get Shit Done”). And you know what? It solves a problem I constantly face: when you chat with Claude or GPT for too long in one session, the model starts “drifting” and forgetting important details. It’s the classic context rot problem - quality degradation as the context window fills up.

As an architect, I often use AI for code generation, but I’m always frustrated: you start with clear requirements, and an hour later you get a mess of contradictory solutions. GSD offers a radical solution: stop treating chat as your project build system.

Context rot - every developer’s pain

Let’s be honest: we’ve all been through this. You start a project with an AI assistant, and the first iterations are fire. The model perfectly understands the architecture, follows patterns, writes clean code. And then, after a few hours of work, you get:

  • Long-session drift - the agent slowly forgets initial constraints
  • Invisible decision loss - you agreed on a pattern, but the agent stops following it
  • Unreviewable diffs - one giant commit with no explanations or rollback points
  • Fake progress - lots of code, little working product

Sound familiar? I’ve had plenty of such sessions. In my experience, it’s especially painful when a project is close to the finish line, and then it turns out that half the code was written in violation of the architectural principles we discussed at the beginning.

What is GSD and why it’s not just another wrapper

GSD is not just a wrapper over Claude or GPT. It’s a whole philosophy of spec-driven development that solves context rot through:

  1. Externalized state - all system memory is moved to files
  2. Fresh contexts - each task is executed in a clean context
  3. Atomic commits - each commit is tied to one atomic task
  4. Explicit verification - each stage has clear verification criteria

Essentially, it’s an AI agent orchestration system with strict discipline in context management.
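
The “externalized state” idea is easy to sketch: each task runs as a fresh call that knows nothing except what is on disk, and writes its decisions back to files rather than to a chat transcript. A minimal Python illustration (my own sketch, not GSD’s code; the directory name is hypothetical, STATE.md is one of the artifacts GSD uses):

```python
from pathlib import Path

STATE_DIR = Path("project-state")  # hypothetical location for externalized state

def run_task(task: str) -> str:
    """Run one task in a 'fresh context': the only memory available
    is whatever earlier tasks externalized to files on disk."""
    memory = {p.name: p.read_text() for p in STATE_DIR.glob("*.md")}
    # ... the model call would go here; we just record the decision ...
    with (STATE_DIR / "STATE.md").open("a") as f:
        f.write(f"- {task}: completed\n")
    return f"{task} ran with files: {sorted(memory)}"

STATE_DIR.mkdir(exist_ok=True)
(STATE_DIR / "STATE.md").write_text("# Decisions\n")

for task in ["create login endpoint", "add logout endpoint"]:
    print(run_task(task))
```

The point of the pattern: nothing survives between calls except the files, so a decision that was never written down simply doesn’t exist for the next task.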

Installation and first steps

It all starts simple:

npx get-shit-done-cc@latest

The installer supports Claude Code, OpenCode, Gemini CLI, and Codex. For Claude Code, it’s recommended to run with the flag:

claude --dangerously-skip-permissions

Yeah, the flag name is telling. It’s a compromise between security and productivity - a topic for a separate conversation.

Development phases: discuss → plan → execute → verify

The most interesting thing about GSD is the clear process structure. Each stage has a specific purpose and artifacts.

1. Initialize - creating project skeleton

/gsd:new-project

The system asks questions, understands the project idea and creates basic artifacts:

  • PROJECT.md - project vision
  • REQUIREMENTS.md - requirements split into v1/v2
  • ROADMAP.md - development phase plan
  • STATE.md - current decisions and blockers

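GSD generates these artifacts per project; as a rough illustration of the shape (not GSD’s actual template), a STATE.md might look like:

```markdown
# STATE

## Decisions
- Use jose for JWT (not jsonwebtoken - CommonJS issues)
- Auth via httpOnly cookies, no localStorage tokens

## Blockers
- Waiting on final schema for the users table

## Current phase
- Phase 1: authentication (plan 02 of 03 in progress)
```
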
For existing projects, there’s the /gsd:map-codebase command that analyzes architecture and code conventions.

2. Discuss Phase - underrated stage

Here’s where the main GSD feature lies. Before planning comes the discussion phase:

/gsd:discuss-phase 1

The system explicitly identifies gray areas that usually lead to misunderstandings:

  • Visual features: layout density, interactivity, empty states
  • API/CLI: response formats, flags, error handling, detail level
  • Content systems: structure, tone, depth of elaboration
  • Organizational tasks: grouping criteria, naming, duplicates

The result is a {phase}-CONTEXT.md file that feeds the researcher and planner.

Honestly, this is gold! How many projects have I seen where the developer and AI spoke different languages precisely because of underdiscussed details.

3. Plan Phase - structuring work

/gsd:plan-phase 1

Planning includes three stages:

  1. Research - investigation based on phase context
  2. Plan creation - creating 2-3 atomic tasks
  3. Plan verification - checking plans until they pass, with a bounded number of retries

Plans are described in XML format:

<task type="auto">
  <name>Create login endpoint</name>
  <files>src/app/api/auth/login/route.ts</files>
  <action>
    Use jose for JWT (not jsonwebtoken - CommonJS issues).
    Validate credentials against users table.
    Return httpOnly cookie on success.
  </action>
  <verify>curl -X POST localhost:3000/api/auth/login returns 200 + Set-Cookie</verify>
  <done>Valid credentials return cookie, invalid return 401</done>
</task>

This removes ambiguity and provides clear completion criteria.

4. Execute Phase - fresh context for each task

/gsd:execute-phase 1

Execution is organized in waves considering dependencies:

┌─────────────────────────────────────────────────────────────────────┐
│  PHASE EXECUTION                                                    │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  WAVE 1 (parallel)          WAVE 2 (parallel)          WAVE 3       │
│  ┌─────────┐ ┌─────────┐    ┌─────────┐ ┌─────────┐    ┌─────────┐  │
│  │ Plan 01 │ │ Plan 02 │ →  │ Plan 03 │ │ Plan 04 │ →  │ Plan 05 │  │
│  │    ↑    │ │    ↑    │    │    ↑    │ │    ↑    │    │    ↑    │  │
│  │ User    │ │ Product │    │ Orders  │ │ Cart    │    │ Checkout│  │
│  │ Model   │ │ Model   │    │ API     │ │ API     │    │ UI      │  │
│  └─────────┘ └─────────┘    └─────────┘ └─────────┘    └─────────┘  │
│                                                                     │
│  Dependencies: 01,02 → 03,04 → 05                                   │
└─────────────────────────────────────────────────────────────────────┘

Key point: each plan is executed in a fresh context, which prevents accumulating “garbage” in the model’s memory.
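
The wave layout above is just a dependency-depth grouping: a plan runs in the first wave that starts after all of its dependencies have completed. A small sketch (plan names follow the diagram; the code is my own illustration, not GSD’s scheduler):

```python
from collections import defaultdict

def waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group plans into waves: a plan's wave index is one more than
    the deepest wave among its dependencies."""
    depth: dict[str, int] = {}

    def d(plan: str) -> int:
        if plan not in depth:
            depth[plan] = 1 + max((d(p) for p in deps[plan]), default=-1)
        return depth[plan]

    for plan in deps:
        d(plan)
    grouped = defaultdict(list)
    for plan, k in depth.items():
        grouped[k].append(plan)
    return [sorted(grouped[k]) for k in sorted(grouped)]

# Dependencies from the diagram: 01,02 → 03,04 → 05
plan_deps = {
    "01-user-model": set(),
    "02-product-model": set(),
    "03-orders-api": {"01-user-model", "02-product-model"},
    "04-cart-api": {"01-user-model", "02-product-model"},
    "05-checkout-ui": {"03-orders-api", "04-cart-api"},
}
print(waves(plan_deps))
# → [['01-user-model', '02-product-model'], ['03-orders-api', '04-cart-api'], ['05-checkout-ui']]
```

Everything inside one wave has no mutual dependencies, so those plans can run in parallel, each in its own clean context.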

5. Verify Phase - manual testing with support

/gsd:verify-work 1

The system extracts testable results and walks you through verifying each one (“Can you log in with email?”). When problems are found, debug agents are launched to create fix plans.

Nyquist Validation - controversial but interesting idea

One of GSD’s most intriguing concepts is Nyquist validation. During plan research, the system can map automated test coverage for each requirement before any code is written, producing a {phase}-VALIDATION.md file with a feedback contract.

The plan checker treats the absence of verification commands as a failure condition.
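
That rule is easy to picture as a mechanical check: reject any task whose verification blocks are missing or empty. A sketch of such a checker (my own illustration, not GSD’s implementation) against the XML task format shown earlier:

```python
import xml.etree.ElementTree as ET

def check_task(task_xml: str) -> list[str]:
    """Return a list of problems; an empty list means the task passes."""
    task = ET.fromstring(task_xml)
    problems = []
    for tag in ("name", "action", "verify", "done"):
        node = task.find(tag)
        if node is None or not (node.text or "").strip():
            problems.append(f"missing or empty <{tag}>")
    return problems

ok = "<task><name>login</name><action>add route</action>" \
     "<verify>curl returns 200</verify><done>cookie set</done></task>"
bad = "<task><name>login</name><action>add route</action></task>"

print(check_task(ok))   # → []
print(check_task(bad))  # → ['missing or empty <verify>', 'missing or empty <done>']
```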

In my opinion, the idea is right, but in practice it might create overhead. In startups you often need to quickly validate hypotheses, and writing tests can slow things down. But for production - absolutely necessary.

Configuration: from YOLO to production-ready

GSD allows configuring the process for different scenarios. Standard configuration:

{
  "mode": "interactive",
  "depth": "standard",
  "model_profile": "balanced",
  "workflow": {
    "research": true,
    "plan_check": true,
    "verifier": true,
    "nyquist_validation": true
  },
  "planning": {
    "commit_docs": true,
    "search_gitignored": false
  },
  "git": {
    "branching_strategy": "none"
  }
}

And for quick prototyping:

{
  "mode": "yolo",
  "depth": "quick",
  "model_profile": "budget",
  "workflow": {
    "research": false,
    "plan_check": false,
    "verifier": false,
    "nyquist_validation": false
  },
  "planning": {
    "commit_docs": false
  }
}

Quick mode for ad-hoc tasks

For one-off tasks there’s /gsd:quick mode, which provides guarantees (state tracking, atomic commits) without the full research→plan→check→verify cycle.

What’s actually useful and what raises doubts

Useful:

  • Atomic commits - git bisect becomes a superpower for debugging
  • Artifact file system - code becomes reviewable and observable
  • Discuss phase - solves most AI miscommunication problems
  • Fresh context - code quality doesn’t degrade over time

Questionable:

  • Planning overhead - many artifacts for simple tasks
  • Security - the --dangerously-skip-permissions flag speaks for itself
  • Learning curve - takes time to master the entire command system

Frankly dubious:

  • Documentation generation in git - committing generated docs can be both a feature and a problem for history
  • Rigid structure - doesn’t always fit exploratory projects

Philosophical question: who are you?

GSD makes you think about a fundamental question: who are you in the relationship with AI?

  • Option 1: a developer who chats with a tool
  • Option 2: a developer who manages a build system with a non-deterministic code generator

GSD assumes the second option. Hence the apparent “heaviness” compared to a simple prompt, but also greater reliability compared to raw chat.

Practical conclusions

In the era of AI assistants, we need new engineering practices. Classic principles remain relevant:

  • Short feedback loops
  • Explicit verification
  • Atomic changes
  • Decision traceability

GSD is an attempt to adapt these principles to the realities of working with AI. Should you use GSD specifically? Depends on the project. But the ideas it promotes are definitely worth attention:

  1. Context is an exhaustible resource
  2. Memory should live in files, not in chat
  3. Work should be sized for fresh context
  4. Verification should be tied to tasks
  5. Git history is part of system observability

If you’re seriously working with AI assistants in development, these principles will help avoid many pitfalls. And GSD is one tool that makes their application systematic.

And most importantly: don’t be afraid to experiment with new approaches. In the banking sector we’re used to conservatism, but AI tools are developing so fast that lagging by six months could become critical.


Original article: GET SH*T DONE: Meta-prompting and Spec-driven Development for Claude Code and Codex
