```mermaid
%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#1a1a36', 'primaryTextColor': '#00f0ff', 'primaryBorderColor': '#00f0ff', 'lineColor': '#b44dff', 'secondaryColor': '#14142a', 'tertiaryColor': '#080810', 'edgeLabelBackground': '#080810', 'clusterBkg': '#0c0c18', 'clusterBorder': '#1e1e35'}}}%%
flowchart LR
YOU:::start --> D:::phase1
D --> S:::phase2
S --> E:::phase3
E --> R:::phase4
R -->|score >= 95| SUB:::phase5
R -.->|missing lit| D
R -.->|flawed ID| S
R -.->|more checks| E
E -.->|parallel| T:::talks
YOU((You))
D(Discovery)
S(Strategy)
E(Execution)
R(Peer Review)
SUB(Submission)
T(Talks)
classDef start fill:#080810,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
classDef phase1 fill:#14142a,stroke:#00f0ff,color:#00f0ff,stroke-width:2px
classDef phase2 fill:#14142a,stroke:#00cc66,color:#00cc66,stroke-width:2px
classDef phase3 fill:#14142a,stroke:#b44dff,color:#b44dff,stroke-width:2px
classDef phase4 fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
classDef phase5 fill:#14142a,stroke:#00f0ff,color:#00f0ff,stroke-width:2px
classDef talks fill:#0c0c18,stroke:#8888a8,color:#8888a8,stroke-width:1px,stroke-dasharray: 5 5
```
# Architecture Reference

*How the System Works*

## System Overview
Claude Code’s power comes from five configuration layers that work together. For details on each layer, see the Customization Guide.
| Layer | Location | Loaded When |
|---|---|---|
| CLAUDE.md | Project root | Every session |
| Rules | .claude/rules/ | Always-on or path-scoped |
| Skills | .claude/skills/ | On demand (slash commands) |
| Agents | .claude/agents/ | On demand (via skills or orchestrator) |
| Hooks | .claude/settings.json | On events (automatic) |
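As a rough sketch, the layers above might sit on disk like this (the layout follows the table; individual file names are illustrative):

```text
project/
├── CLAUDE.md               # loaded every session
└── .claude/
    ├── settings.json       # hooks fire on events
    ├── rules/              # always-on or path-scoped
    ├── skills/             # on demand, via slash commands
    └── agents/             # dispatched by skills or the orchestrator
```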
Claude reliably follows about 100–150 custom instructions. Your system prompt uses ~50, leaving ~100 for your project. CLAUDE.md and always-on rules share this budget. Path-scoped rules, skills, and agents load on demand.
## Agent Workflow Diagram
How 16 agents coordinate across the research pipeline. Each phase pairs workers (who create) with critics (who review). The Orchestrator manages dispatch, quality gates, and escalation. For detailed descriptions of each agent, see Meet the Agents.
### The Pipeline

The five-phase flow, including the feedback edges back to earlier phases, is shown in the flowchart at the top of this page.

### Agents by Phase
Each phase dispatches worker → critic pairs. The worker creates; the critic reviews (read-only, never edits). If they can’t agree after 3 rounds, the Orchestrator escalates.
#### Phase 1: Discovery
| Worker | What it creates | Critic | What it checks |
|---|---|---|---|
| Librarian | Bibliography, frontier map | librarian-critic | Coverage, gaps, recency |
| Explorer | Data sources, feasibility | explorer-critic | Measurement, sample, ID fit |
#### Phase 2: Strategy
| Worker | What it creates | Critic | What it checks |
|---|---|---|---|
| Strategist | DiD/IV/RDD/SC design, strategy memo | strategist-critic | Assumptions, inference, sanity |
#### Phase 3: Execution
| Worker | What it creates | Critic | What it checks |
|---|---|---|---|
| Data-engineer | Cleaning scripts, figures | coder-critic | Code quality, reproducibility |
| Coder | Analysis scripts, tables | coder-critic | Strategy alignment, correctness |
| Writer | Paper sections, humanizer pass | writer-critic | Notation, hedging, claims vs evidence |
#### Phase 4: Peer Review
| Referee | Focus | Independence |
|---|---|---|
| domain-referee | Contributions, literature, external validity | Blind — doesn’t see the other report |
| methods-referee | Identification, estimation, robustness | Blind — doesn’t see the other report |
The Orchestrator synthesizes both reports into an editorial decision: Accept / Minor / Major / Reject.
#### Phase 5: Submission
| Agent | What it does | Scoring |
|---|---|---|
| Verifier | 10-check AEA replication audit | Pass/fail (binary) |
#### Talks (parallel with Peer Review)
| Worker | What it creates | Critic | What it checks |
|---|---|---|---|
| Storyteller | Beamer slides (4 formats) | storyteller-critic | Narrative, visuals, fidelity |
### The Worker-Critic Loop

```mermaid
%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#14142a', 'primaryTextColor': '#d0d0e0', 'primaryBorderColor': '#b44dff', 'lineColor': '#b44dff', 'secondaryColor': '#0c0c18', 'tertiaryColor': '#080810', 'edgeLabelBackground': '#080810'}}}%%
flowchart TD
W[WORKER creates artifact]:::worker --> C[CRITIC reviews, scores 0-100]:::critic
C -->|score >= 80| A([APPROVED]):::approved
C -->|score < 80| F[WORKER fixes issues]:::worker
F --> C2[CRITIC re-reviews]:::critic
C2 -->|pass| A
C2 -->|fail, round 3| ESC([ESCALATE to user]):::escalate
classDef worker fill:#14142a,stroke:#b44dff,color:#b44dff,stroke-width:2px
classDef critic fill:#14142a,stroke:#00f0ff,color:#00f0ff,stroke-width:2px
classDef approved fill:#14142a,stroke:#00cc66,color:#00cc66,stroke-width:2px
classDef escalate fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
```
Why it works: The critic can’t fix files (read-only), so it has no incentive to downplay issues. The worker can’t approve itself (the critic re-audits). This prevents Claude from saying “looks good” about its own work.
Severity gradient: Critics calibrate harshness by phase — encouraging in Discovery, precise in Execution, adversarial in Peer Review.
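The loop above can be sketched in Python; the function signature and agent interfaces here are illustrative, not the orchestrator's actual API:

```python
APPROVE_THRESHOLD = 80   # critic score needed for approval
MAX_ROUNDS = 3           # after three failed rounds, escalate

def review_loop(worker, critic, task):
    artifact = worker(task)
    for round_no in range(1, MAX_ROUNDS + 1):
        score, issues = critic(artifact)       # critic is read-only: it scores, never edits
        if score >= APPROVE_THRESHOLD:
            return "APPROVED", artifact
        if round_no == MAX_ROUNDS:
            return "ESCALATE", issues          # hand the disagreement to the user
        artifact = worker(task, fixes=issues)  # worker fixes; it cannot approve itself
```

The asymmetry is the point: the critic never touches files, and the worker never assigns the score.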
## The Orchestrator (Contractor Mode)
Once a plan is approved, the orchestrator takes over autonomously, dispatching agents based on the dependency graph.
### The Mental Model
Think of the orchestrator as a general contractor. You are the client. You describe what you want. The plan-first protocol is the blueprint phase. Once you approve the blueprint, the contractor takes over: hires the right specialists (agents), inspects their work (verification), sends them back to fix issues (review-fix loop), and only calls you when the job passes inspection (quality gates).
The Orchestrator:
- Dispatches worker-critic pairs based on the dependency graph
- Enforces quality gates: 80 (commit), 90 (PR), 95 (submission)
- Escalates when pairs can’t converge after 3 rounds
- Synthesizes referee reports into editorial decisions
- Selects target journals at submission time
- Tracks the research journal with scores and phase transitions
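The three gates map directly onto score thresholds; a minimal sketch (the function name is invented for illustration):

```python
def gate_action(score):
    """Map a weighted aggregate score to the highest quality gate it clears."""
    if score >= 95:
        return "submission"
    if score >= 90:
        return "pull request"
    if score >= 80:
        return "commit"
    return "revise"          # below every gate: back to the review-fix loop
```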
### Dependency-Driven Dispatch

```mermaid
%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#14142a', 'primaryTextColor': '#d0d0e0', 'primaryBorderColor': '#b44dff', 'lineColor': '#b44dff', 'secondaryColor': '#0c0c18', 'tertiaryColor': '#080810', 'edgeLabelBackground': '#080810'}}}%%
flowchart TD
P1:::phase --> P2:::phase
P2 --> P3:::phase
P3 --> P4:::phase
P4 -->|score >= 95| P5:::phase
P1(Discovery)
P2(Strategy)
P3(Execution)
P4(Peer Review)
P5(Submission)
subgraph s1 [" "]
direction LR
L[Librarian]:::worker ---|reviewed by| LC[librarian-critic]:::critic
X[Explorer]:::worker ---|reviewed by| XC[explorer-critic]:::critic
end
subgraph s2 [" "]
ST[Strategist]:::worker ---|reviewed by| STC[strategist-critic]:::critic
end
subgraph s3 [" "]
direction LR
DE[Data-engineer]:::worker ---|reviewed by| CC1[coder-critic]:::critic
CO[Coder]:::worker ---|reviewed by| CC2[coder-critic]:::critic
WR[Writer]:::worker ---|reviewed by| WC[writer-critic]:::critic
end
subgraph s4 [" "]
direction LR
DR[domain-referee]:::referee
MR[methods-referee]:::referee
end
subgraph s5 [" "]
VE[Verifier]:::infra
end
P1 --- s1
P2 --- s2
P3 --- s3
P4 --- s4
P5 --- s5
classDef phase fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
classDef worker fill:#14142a,stroke:#b44dff,color:#b44dff,stroke-width:1px
classDef critic fill:#14142a,stroke:#00f0ff,color:#00f0ff,stroke-width:1px
classDef referee fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:1px
classDef infra fill:#14142a,stroke:#00cc66,color:#00cc66,stroke-width:1px
```
Phases activate when their dependencies are met, not on a forced sequence. Independent pairs within a phase dispatch in parallel (e.g., Librarian and Explorer run simultaneously in Phase 1).
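One way to picture this activation rule in Python (a sketch, not the orchestrator's real scheduler):

```python
# Dependency graph mirroring the diagram; Submission is additionally
# gated on the aggregate score (>= 95), which this sketch omits.
DEPS = {
    "Discovery": [],
    "Strategy": ["Discovery"],
    "Execution": ["Strategy"],
    "Peer Review": ["Execution"],
    "Talks": ["Execution"],        # runs parallel with Peer Review
    "Submission": ["Peer Review"],
}

def ready_phases(done):
    """Phases that can dispatch now: not yet finished, all dependencies finished."""
    return [p for p, deps in DEPS.items()
            if p not in done and all(d in done for d in deps)]
```

After Execution completes, both Peer Review and Talks become ready at once, which is exactly the parallelism shown in the pipeline diagram.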
### The Loop

```mermaid
%%{init: {'theme': 'dark', 'themeVariables': {'primaryColor': '#14142a', 'primaryTextColor': '#d0d0e0', 'primaryBorderColor': '#b44dff', 'lineColor': '#b44dff', 'secondaryColor': '#0c0c18', 'tertiaryColor': '#080810', 'edgeLabelBackground': '#080810'}}}%%
flowchart TD
START([Plan approved]):::start --> IMPL[IMPLEMENT]:::step
IMPL --> VERIFY[VERIFY: compile, render, check]:::step
VERIFY --> REVIEW[REVIEW: worker-critic pairs]:::step
REVIEW --> FIX[FIX: critical then major then minor]:::step
FIX --> REVERIFY[RE-VERIFY: confirm fixes]:::step
REVERIFY --> SCORE[SCORE: weighted aggregate]:::step
SCORE -->|score >= threshold| DONE([Present summary]):::approved
SCORE -->|score < threshold| REVIEW
SCORE -.->|max 5 rounds| DONE
classDef start fill:#14142a,stroke:#ff2d7b,color:#ff2d7b,stroke-width:2px
classDef step fill:#14142a,stroke:#b44dff,color:#b44dff,stroke-width:2px
classDef approved fill:#14142a,stroke:#00cc66,color:#00cc66,stroke-width:2px
```
### Three-Strikes Escalation
If a worker-critic pair can’t resolve after 3 rounds:
| Pair | Escalation Target |
|---|---|
| Coder + coder-critic | → Strategist (design may be wrong) |
| Writer + writer-critic | → domain-referee (framing may need rethinking) |
| Strategist + strategist-critic | → User (fundamental design disagreement) |
| Storyteller + storyteller-critic | → Writer (paper content may need revision) |
### “Just Do It” Mode
When you say “just do it”, the orchestrator still runs the full verify-review-fix loop, but skips the final approval pause and auto-commits if the score is >= 80.
## Plan-First Workflow
For any non-trivial task, Claude enters plan mode before writing code.
### The Protocol

```text
Non-trivial task arrives
        │
Step 1: Enter Plan Mode
Step 2: Draft plan (approach, files, verification)
Step 3: Save to quality_reports/plans/YYYY-MM-DD_description.md
Step 4: Present plan to user
Step 5: User approves (or revises)
Step 6: Save initial session log
Step 7: Orchestrator takes over (dependency-driven dispatch)
Step 8: Update session log + plan status to COMPLETED
```
Plans are saved to disk so they survive context compression. The rule: avoid `/clear` — prefer auto-compression.
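A hypothetical plan file following the naming pattern above (the date, slug, section names, and file paths are invented for illustration):

```markdown
<!-- quality_reports/plans/2025-01-15_add-event-study.md (hypothetical example) -->
# Plan: Add event-study figure

Status: PENDING  <!-- flipped to COMPLETED in Step 8 -->

## Approach
One paragraph on what will change and why.

## Files
Code/event_study.R, Paper/results.tex  <!-- illustrative paths -->

## Verification
How success is checked (scripts run, paper compiles, numbers match).
```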
### Session Logging (3 Triggers)
- After plan approval — create the log with goal, plan summary, rationale
- During implementation — append 1–3 lines as design decisions happen
- At session end — add what was accomplished, open questions, unresolved issues
## Parallel Agents
Claude Code can spawn multiple agents simultaneously using the Task tool.
### When to Use
| Scenario | Sequential (slow) | Parallel (fast) |
|---|---|---|
| Paper review | Run each agent sequentially | Run strategist-critic + coder-critic + writer-critic + Verifier simultaneously |
| Literature search | Search one database at a time | Spawn agents for journals, NBER, SSRN |
| Data exploration | Check one source at a time | Spawn agents for each data category |
The orchestrator recognizes independent subtasks and spawns parallel agents automatically.
### Practical Limits
- 3–4 agents is the sweet spot. More increases overhead without proportional speedup.
- Agents are independent — they cannot see each other’s work. Dependent tasks run sequentially.
- Parallel agents multiply token usage.
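The fan-out/fan-in shape can be sketched with Python threads. This is only an analogy — the real system dispatches via the Task tool — and every name below is invented:

```python
from concurrent.futures import ThreadPoolExecutor

def dispatch_parallel(agents, artifact, max_workers=4):
    """Run independent agents simultaneously and collect their results.
    max_workers=4 reflects the 3-4 agent sweet spot noted above."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn, artifact) for name, fn in agents.items()}
        # Agents are independent: none can see another's in-progress work.
        return {name: f.result() for name, f in futures.items()}
```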
## Scoring Protocol
The weighted aggregate score determines quality gate passage:
| Component | Weight | Agent(s) |
|---|---|---|
| Literature | 10% | Librarian + librarian-critic |
| Data | 10% | Explorer + explorer-critic |
| Identification | 25% | strategist-critic |
| Code | 15% | coder-critic |
| Paper | 25% | Avg(domain-referee + methods-referee) |
| Polish | 10% | writer-critic |
| Replication | 5% | Verifier |
Missing components: If a component hasn’t been scored (e.g., no literature review yet), its weight is redistributed proportionally across available components.
Submission gate: Aggregate >= 95 AND every individual component >= 80.
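The aggregation and redistribution rules above can be sketched as follows (component keys are shorthand for the table rows; function names are illustrative):

```python
WEIGHTS = {
    "literature": 0.10, "data": 0.10, "identification": 0.25, "code": 0.15,
    "paper": 0.25, "polish": 0.10, "replication": 0.05,
}

def aggregate(scores):
    """Weighted average over scored components; weights of missing
    components redistribute proportionally across the rest."""
    present = {k: WEIGHTS[k] for k in scores}
    total = sum(present.values())
    return sum(scores[k] * w / total for k, w in present.items())

def passes_submission_gate(scores):
    # Aggregate >= 95 AND every individual component >= 80.
    return aggregate(scores) >= 95 and all(s >= 80 for s in scores.values())
```

With only literature (80) and data (100) scored, each carries half the weight and the aggregate is 90, not a literature-dominated number.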
## Additional Workflows: Slides & Lectures
The Clo-Author inherits a lecture production pipeline from its origin. These workflows are available for users who also produce course materials.
### Creating Talks from Papers

The `/talk` skill dispatches the Storyteller (creator) and storyteller-critic (reviewer) to generate Beamer presentations in 4 formats (job market, seminar, short, lightning). All content derives from `Paper/main.tex`.
### Replication-First Coding

When working with papers that have replication packages:

```text
Phase 1: Inventory original code → record "gold standard" numbers
Phase 2: Translate (e.g., Stata → R) → match original spec EXACTLY
Phase 3: Verify match → tolerance < 0.01 for estimates, < 0.05 for SEs
Phase 4: Only then extend with new estimators and specifications
```
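The Phase 3 tolerance check amounts to a simple comparison (the function name is invented; the thresholds are the ones stated above):

```python
def replication_matches(est_new, est_gold, se_new, se_gold):
    """True when a translated result matches the original within tolerance:
    < 0.01 for point estimates, < 0.05 for standard errors."""
    return abs(est_new - est_gold) < 0.01 and abs(se_new - se_gold) < 0.05
```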
### Cross-Language Replication
Run the same analysis in two languages simultaneously to catch implementation bugs:
- `/analyze --dual r,python` dispatches two Coder agents in parallel with the same strategy memo
- `/review --replicate python` re-implements existing R code in Python and compares outputs
- Tolerances come from `domain-profile.md` — divergence beyond tolerance is flagged as a bug
Inspired by Scott Cunningham’s approach: if two independent implementations produce the same estimates, neither has a bug. If they diverge, you’ve caught something before it reaches the paper.
### Branch Isolation with Git Worktrees
Git worktrees create a separate working directory linked to the same repository. Useful for major translations, risky refactors, or multi-day projects.
| Situation | Use Worktree? |
|---|---|
| Quick fix to one file | No |
| Major refactor | Yes |
| Experimenting with new approach | Yes |
For most users, working directly on main with frequent commits is simpler and sufficient.
## Constitutional Governance (Optional)

As your project grows, some decisions become non-negotiable. The `templates/constitutional-governance.md` template helps you distinguish between:
- Immutable principles (Articles I–V): Non-negotiable rules
- User preferences: Flexible patterns that can vary
Use it after you’ve established 3–7 recurring patterns.