Architecture Reference

How the System Works

System Overview

Claude Code’s power comes from five configuration layers that work together. For details on each layer, see the Customization Guide.

| Layer | Location | Loaded when |
| --- | --- | --- |
| CLAUDE.md | Project root | Every session |
| Rules | .claude/rules/ | Always-on or path-scoped |
| Skills | .claude/skills/ | On demand (slash commands) |
| Agents | .claude/agents/ | On demand (via skills or orchestrator) |
| Hooks | .claude/settings.json | On events (automatic) |
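As a rough illustration of the hooks layer, an entry in .claude/settings.json might look like the sketch below. The event name and matcher follow the general Claude Code hooks shape, but the script path and the specific matcher are hypothetical examples, not part of this project:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "./scripts/format-check.sh" }
        ]
      }
    ]
  }
}
```

Because hooks fire on events rather than on instructions, they run automatically and consume none of the instruction budget described below.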

Claude reliably follows about 100–150 custom instructions. Your system prompt uses ~50, leaving ~100 for your project. CLAUDE.md and always-on rules share this budget. Path-scoped rules, skills, and agents load on demand.


Agent Workflow Diagram

How 16 agents coordinate across the research pipeline. Each phase pairs workers (who create) with critics (who review). The Orchestrator manages dispatch, quality gates, and escalation. For detailed descriptions of each agent, see Meet the Agents.

The Pipeline

%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#1a1a36', 'primaryTextColor': '#00f0ff', 'primaryBorderColor': '#00f0ff', 'lineColor': '#b44dff', 'secondaryColor': '#14142a', 'tertiaryColor': '#080810', 'edgeLabelBackground': '#080810', 'clusterBkg': '#0c0c18', 'clusterBorder': '#1e1e35'}}}%%
flowchart LR
    YOU:::start --> D:::phase1
    D --> S:::phase2
    S --> E:::phase3
    E --> R:::phase4
    R -->|score >= 95| SUB:::phase5
    R -.->|missing lit| D
    R -.->|flawed ID| S
    R -.->|more checks| E
    E -.->|parallel| T:::talks

    YOU((You))
    D(Discovery)
    S(Strategy)
    E(Execution)
    R(Peer Review)
    SUB(Submission)
    T(Talks)

    classDef start fill:#080810,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
    classDef phase1 fill:#14142a,stroke:#00f0ff,color:#00f0ff,stroke-width:2px
    classDef phase2 fill:#14142a,stroke:#00cc66,color:#00cc66,stroke-width:2px
    classDef phase3 fill:#14142a,stroke:#b44dff,color:#b44dff,stroke-width:2px
    classDef phase4 fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
    classDef phase5 fill:#14142a,stroke:#00f0ff,color:#00f0ff,stroke-width:2px
    classDef talks fill:#0c0c18,stroke:#8888a8,color:#8888a8,stroke-width:1px,stroke-dasharray: 5 5

Agents by Phase

Each phase dispatches worker → critic pairs. The worker creates; the critic reviews (read-only, never edits). If they can’t agree after 3 rounds, the Orchestrator escalates.

Phase 1: Discovery

| Worker | What it creates | Critic | What it checks |
| --- | --- | --- | --- |
| Librarian | Bibliography, frontier map | librarian-critic | Coverage, gaps, recency |
| Explorer | Data sources, feasibility | explorer-critic | Measurement, sample, ID fit |

Phase 2: Strategy

| Worker | What it creates | Critic | What it checks |
| --- | --- | --- | --- |
| Strategist | DiD/IV/RDD/SC design, strategy memo | strategist-critic | Assumptions, inference, sanity |

Phase 3: Execution

| Worker | What it creates | Critic | What it checks |
| --- | --- | --- | --- |
| Data-engineer | Cleaning scripts, figures | coder-critic | Code quality, reproducibility |
| Coder | Analysis scripts, tables | coder-critic | Strategy alignment, correctness |
| Writer | Paper sections, humanizer pass | writer-critic | Notation, hedging, claims vs evidence |

Phase 4: Peer Review

| Referee | Focus | Independence |
| --- | --- | --- |
| domain-referee | Contributions, literature, external validity | Blind; doesn’t see the other report |
| methods-referee | Identification, estimation, robustness | Blind; doesn’t see the other report |

The Orchestrator synthesizes both reports into an editorial decision: Accept / Minor / Major / Reject.

Phase 5: Submission

| Agent | What it does | Scoring |
| --- | --- | --- |
| Verifier | 10-check AEA replication audit | Pass/fail (binary) |

Talks (parallel with Peer Review)

| Worker | What it creates | Critic | What it checks |
| --- | --- | --- | --- |
| Storyteller | Beamer slides (4 formats) | storyteller-critic | Narrative, visuals, fidelity |

The Worker-Critic Loop

%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#14142a', 'primaryTextColor': '#d0d0e0', 'primaryBorderColor': '#b44dff', 'lineColor': '#b44dff', 'secondaryColor': '#0c0c18', 'tertiaryColor': '#080810', 'edgeLabelBackground': '#080810'}}}%%
flowchart TD
    W[WORKER creates artifact]:::worker --> C[CRITIC reviews, scores 0-100]:::critic
    C -->|score >= 80| A([APPROVED]):::approved
    C -->|score < 80| F[WORKER fixes issues]:::worker
    F --> C2[CRITIC re-reviews]:::critic
    C2 -->|pass| A
    C2 -->|fail, round 3| ESC([ESCALATE to user]):::escalate

    classDef worker fill:#14142a,stroke:#b44dff,color:#b44dff,stroke-width:2px
    classDef critic fill:#14142a,stroke:#00f0ff,color:#00f0ff,stroke-width:2px
    classDef approved fill:#14142a,stroke:#00cc66,color:#00cc66,stroke-width:2px
    classDef escalate fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px

Why it works: The critic can’t fix files (read-only), so it has no incentive to downplay issues. The worker can’t approve itself (the critic re-audits). This prevents Claude from saying “looks good” about its own work.

Severity gradient: Critics calibrate harshness by phase — encouraging in Discovery, precise in Execution, adversarial in Peer Review.
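The loop above can be sketched in a few lines of Python. This is an illustration of the protocol, not the system’s actual code; the callables, the (score, issues) return shape, and the artifact type are assumed for the example:

```python
def worker_critic_loop(create, review, fix, threshold=80, max_rounds=3):
    """Run the worker-critic loop: create, review, fix, re-review.

    `review` returns (score, issues). If the pair has not converged
    after `max_rounds` review rounds, the loop escalates to the user.
    """
    artifact = create()
    for round_num in range(1, max_rounds + 1):
        score, issues = review(artifact)
        if score >= threshold:
            return "approved", artifact
        if round_num == max_rounds:
            # Three strikes: hand the disagreement to the user.
            return "escalated", artifact
        artifact = fix(artifact, issues)
```

Note that `review` and `fix` are separate callables: the critic only scores and lists issues, and only the worker ever modifies the artifact, mirroring the read-only rule described above.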


The Orchestrator (Contractor Mode)

Once a plan is approved, the orchestrator takes over autonomously, dispatching agents based on the dependency graph.

The Mental Model

Think of the orchestrator as a general contractor. You are the client. You describe what you want. The plan-first protocol is the blueprint phase. Once you approve the blueprint, the contractor takes over: hires the right specialists (agents), inspects their work (verification), sends them back to fix issues (review-fix loop), and only calls you when the job passes inspection (quality gates).

The Orchestrator:

  • Dispatches worker-critic pairs based on the dependency graph
  • Enforces quality gates: 80 (commit), 90 (PR), 95 (submission)
  • Escalates when pairs can’t converge after 3 rounds
  • Synthesizes referee reports into editorial decisions
  • Selects target journals at submission time
  • Tracks the research journal with scores and phase transitions

Dependency-Driven Dispatch

%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#14142a', 'primaryTextColor': '#d0d0e0', 'primaryBorderColor': '#b44dff', 'lineColor': '#b44dff', 'secondaryColor': '#0c0c18', 'tertiaryColor': '#080810', 'edgeLabelBackground': '#080810'}}}%%
flowchart TD
    P1:::phase --> P2:::phase
    P2 --> P3:::phase
    P3 --> P4:::phase
    P4 -->|score >= 95| P5:::phase

    P1(Discovery)
    P2(Strategy)
    P3(Execution)
    P4(Peer Review)
    P5(Submission)

    subgraph s1 [" "]
        direction LR
        L[Librarian]:::worker ---|reviewed by| LC[librarian-critic]:::critic
        X[Explorer]:::worker ---|reviewed by| XC[explorer-critic]:::critic
    end

    subgraph s2 [" "]
        ST[Strategist]:::worker ---|reviewed by| STC[strategist-critic]:::critic
    end

    subgraph s3 [" "]
        direction LR
        DE[Data-engineer]:::worker ---|reviewed by| CC1[coder-critic]:::critic
        CO[Coder]:::worker ---|reviewed by| CC2[coder-critic]:::critic
        WR[Writer]:::worker ---|reviewed by| WC[writer-critic]:::critic
    end

    subgraph s4 [" "]
        direction LR
        DR[domain-referee]:::referee
        MR[methods-referee]:::referee
    end

    subgraph s5 [" "]
        VE[Verifier]:::infra
    end

    P1 --- s1
    P2 --- s2
    P3 --- s3
    P4 --- s4
    P5 --- s5

    classDef phase fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
    classDef worker fill:#14142a,stroke:#b44dff,color:#b44dff,stroke-width:1px
    classDef critic fill:#14142a,stroke:#00f0ff,color:#00f0ff,stroke-width:1px
    classDef referee fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:1px
    classDef infra fill:#14142a,stroke:#00cc66,color:#00cc66,stroke-width:1px

Phases activate when their dependencies are met, not in a forced sequence. Within a phase, independent workers are dispatched in parallel (e.g., Librarian and Explorer run simultaneously in Phase 1).
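The dispatch rule amounts to a readiness check over the dependency graph. A minimal Python sketch, using the phase names from the diagram (the data structure and function are illustrative, not the orchestrator’s actual implementation):

```python
# Phase -> set of phases that must complete before it can start.
DEPENDENCIES = {
    "Discovery": set(),
    "Strategy": {"Discovery"},
    "Execution": {"Strategy"},
    "Peer Review": {"Execution"},
    "Talks": {"Execution"},  # runs in parallel with Peer Review
    "Submission": {"Peer Review"},
}

def ready_phases(deps, completed):
    """Phases that are not yet done and whose dependencies are all met."""
    return {p for p, reqs in deps.items()
            if p not in completed and reqs <= completed}
```

Once Execution completes, both Peer Review and Talks become ready at the same time, which is exactly the parallelism shown in the pipeline diagram.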

The Loop

%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#14142a', 'primaryTextColor': '#d0d0e0', 'primaryBorderColor': '#b44dff', 'lineColor': '#b44dff', 'secondaryColor': '#0c0c18', 'tertiaryColor': '#080810', 'edgeLabelBackground': '#080810'}}}%%
flowchart TD
    START([Plan approved]):::start --> IMPL[IMPLEMENT]:::step
    IMPL --> VERIFY[VERIFY: compile, render, check]:::step
    VERIFY --> REVIEW[REVIEW: worker-critic pairs]:::step
    REVIEW --> FIX[FIX: critical then major then minor]:::step
    FIX --> REVERIFY[RE-VERIFY: confirm fixes]:::step
    REVERIFY --> SCORE[SCORE: weighted aggregate]:::step
    SCORE -->|score >= threshold| DONE([Present summary]):::approved
    SCORE -->|score < threshold| REVIEW
    SCORE -.->|max 5 rounds| DONE

    classDef start fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
    classDef step fill:#14142a,stroke:#b44dff,color:#b44dff,stroke-width:2px
    classDef approved fill:#14142a,stroke:#00cc66,color:#00cc66,stroke-width:2px

Three-Strikes Escalation

If a worker-critic pair can’t resolve after 3 rounds:

| Pair | Escalation target |
| --- | --- |
| Coder + coder-critic | Strategist (design may be wrong) |
| Writer + writer-critic | domain-referee (framing may need rethinking) |
| Strategist + strategist-critic | User (fundamental design disagreement) |
| Storyteller + storyteller-critic | Writer (paper content may need revision) |

“Just Do It” Mode

When you say “just do it”, the orchestrator still runs the full verify-review-fix loop, but skips the final approval pause and auto-commits if the score is >= 80.


Plan-First Workflow

For any non-trivial task, Claude enters plan mode before writing code.

The Protocol

Non-trivial task arrives
  │
  Step 1: Enter Plan Mode
  Step 2: Draft plan (approach, files, verification)
  Step 3: Save to quality_reports/plans/YYYY-MM-DD_description.md
  Step 4: Present plan to user
  Step 5: User approves (or revises)
  Step 6: Save initial session log
  Step 7: Orchestrator takes over (dependency-driven dispatch)
  Step 8: Update session log + plan status to COMPLETED

Plans are saved to disk so they survive context compression. The rule: avoid /clear — prefer auto-compression.

Session Logging (3 Triggers)

  1. After plan approval — create the log with goal, plan summary, rationale
  2. During implementation — append 1–3 lines as design decisions happen
  3. At session end — add what was accomplished, open questions, unresolved issues

Parallel Agents

Claude Code can spawn multiple agents simultaneously using the Task tool.

When to Use

| Scenario | Sequential (slow) | Parallel (fast) |
| --- | --- | --- |
| Paper review | Run each agent sequentially | Run strategist-critic + coder-critic + writer-critic + Verifier simultaneously |
| Literature search | Search one database at a time | Spawn agents for journals, NBER, SSRN |
| Data exploration | Check one source at a time | Spawn agents for each data category |

The orchestrator recognizes independent subtasks and spawns parallel agents automatically.

Practical Limits

  • 3–4 agents is the sweet spot. More increases overhead without proportional speedup.
  • Agents are independent — they cannot see each other’s work. Dependent tasks run sequentially.
  • Parallel agents multiply token usage.

Scoring Protocol

The weighted aggregate score determines quality gate passage:

| Component | Weight | Agent(s) |
| --- | --- | --- |
| Literature | 10% | Librarian + librarian-critic |
| Data | 10% | Explorer + explorer-critic |
| Identification | 25% | strategist-critic |
| Code | 15% | coder-critic |
| Paper | 25% | Avg(domain-referee + methods-referee) |
| Polish | 10% | writer-critic |
| Replication | 5% | Verifier |

Missing components: If a component hasn’t been scored (e.g., no literature review yet), its weight is redistributed proportionally across available components.

Submission gate: Aggregate >= 95 AND every individual component >= 80.
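The aggregate, the redistribution rule, and the submission gate can be sketched as follows. The weights come from the table above; the function names and the plain-dict interface are assumptions for this illustration, not the system’s actual code:

```python
WEIGHTS = {
    "literature": 0.10, "data": 0.10, "identification": 0.25,
    "code": 0.15, "paper": 0.25, "polish": 0.10, "replication": 0.05,
}

def aggregate(scores):
    """Weighted aggregate over the components scored so far.

    Renormalizing by the available weight redistributes a missing
    component's weight proportionally across the scored components.
    """
    available_weight = sum(WEIGHTS[k] for k in scores)
    return sum(scores[k] * WEIGHTS[k] for k in scores) / available_weight

def passes_submission_gate(scores):
    """Submission gate: aggregate >= 95 AND every component >= 80."""
    return aggregate(scores) >= 95 and all(s >= 80 for s in scores.values())
```

For example, with only identification (100) and code (80) scored, the aggregate is (100 × 0.25 + 80 × 0.15) / 0.40 = 92.5: the two components keep their 25:15 relative weighting.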


Additional Workflows: Slides & Lectures

The Clo-Author inherits a lecture production pipeline from its origin. These workflows are available for users who also produce course materials.

Creating Talks from Papers

The /talk skill dispatches the Storyteller (creator) and storyteller-critic (reviewer) to generate Beamer presentations in 4 formats (job market, seminar, short, lightning). All content derives from Paper/main.tex.

Replication-First Coding

When working with papers that have replication packages:

Phase 1: Inventory original code → record "gold standard" numbers
Phase 2: Translate (e.g., Stata → R) → match original spec EXACTLY
Phase 3: Verify match → tolerance < 0.01 for estimates, < 0.05 for SEs
Phase 4: Only then extend with new estimators and specifications
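The Phase 3 match check is a tolerance comparison against the gold-standard numbers. A minimal sketch, using the tolerances stated above (the dict-of-pairs interface is an assumption for the example):

```python
def replication_matches(original, translated, tol_est=0.01, tol_se=0.05):
    """Compare translated results against the original 'gold standard'.

    Each dict maps an estimate name to an (estimate, standard_error)
    pair; every pair must agree within tolerance.
    """
    for name, (est, se) in original.items():
        new_est, new_se = translated[name]
        if abs(est - new_est) >= tol_est or abs(se - new_se) >= tol_se:
            return False  # divergence beyond tolerance: flag as a bug
    return True
```

Only once this check passes does Phase 4 extend the analysis; otherwise any later divergence could not be attributed to the new estimators.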

Cross-Language Replication

Run the same analysis in two languages simultaneously to catch implementation bugs:

  • /analyze --dual r,python dispatches two Coder agents in parallel with the same strategy memo
  • /review --replicate python re-implements existing R code in Python and compares outputs
  • Tolerances come from domain-profile.md — divergence beyond tolerance is flagged as a bug

Inspired by Scott Cunningham’s approach: if two independent implementations produce the same estimates, it is very unlikely that either has a bug. If they diverge, you’ve caught something before it reaches the paper.

Branch Isolation with Git Worktrees

Git worktrees create a separate working directory linked to the same repository. Useful for major translations, risky refactors, or multi-day projects.

| Situation | Use worktree? |
| --- | --- |
| Quick fix to one file | No |
| Major refactor | Yes |
| Experimenting with a new approach | Yes |

For most users, working directly on main with frequent commits is simpler and sufficient.


Constitutional Governance (Optional)

As your project grows, some decisions become non-negotiable. The templates/constitutional-governance.md template helps you distinguish between:

  • Immutable principles (Articles I–V): Non-negotiable rules
  • User preferences: Flexible patterns that can vary

Use it after you’ve established 3–7 recurring patterns.