Meet the Agents
16 Specialists That Do the Work
How Agents Work
When you ask Claude to analyze data or draft a paper, it doesn’t do everything itself. It dispatches specialized agents — think of them as 16 different research assistants, each with a specific skill.
The key pattern: every agent that creates something (a literature review, analysis code, paper text) has a paired critic that checks its work. The critic can read but never edit — so it has no reason to go easy. If the critic finds problems, the worker fixes them. If they can’t agree after 3 rounds, you get asked to decide.
This is what prevents Claude from saying “looks good” about its own work.
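The worker-critic pattern above can be sketched in a few lines. This is an illustrative toy, not the plugin's actual code: the function names, the `Verdict` type, and the demo worker/critic are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    approved: bool
    feedback: str = ""

def worker_critic_loop(produce, review, revise, max_rounds=3):
    """Worker drafts; critic reviews (read-only); worker revises.
    If there is no approval after max_rounds, escalate to the user."""
    draft = produce()
    for _ in range(max_rounds):
        verdict = review(draft)            # critic reads, never edits
        if verdict.approved:
            return draft, "approved"
        draft = revise(draft, verdict.feedback)
    return draft, "escalated"              # the user decides

# Toy demo: the critic insists on robustness checks.
def produce():
    return "Main results."

def review(draft):
    ok = "robustness" in draft.lower()
    return Verdict(ok, "" if ok else "add robustness checks")

def revise(draft, feedback):
    return draft + " Robustness checks confirm the findings."
```

Here the first draft fails review, the worker revises once, and the second round approves, so the loop terminates without escalation.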
The Researchers (Discovery Phase)
These agents help you find the literature and data you need before designing your empirical strategy.
Librarian
Searches top-5 economics journals, NBER working papers, SSRN, and RePEc for papers related to your topic. Produces an annotated bibliography with BibTeX entries, maps where the research frontier is, and identifies where your paper fits in the conversation.
librarian-critic
Reads the bibliography the Librarian produced and looks for blind spots. Are there important recent papers missing? Is the review biased toward one methodology or subfield? Does it cover the right adjacent literatures? Scores the bibliography and sends it back for fixes if needed.
Explorer
Searches for public, administrative, and survey datasets that could answer your research question. Evaluates each source for variable coverage, sample size, time span, access restrictions, and how well it fits your identification strategy. Produces a ranked list with feasibility grades.
explorer-critic
Reviews the Explorer’s data assessment with a skeptical eye. Checks whether the recommended datasets actually measure what you need, whether the sample is representative enough for your question, and whether the data limitations would threaten your identification strategy.
The Strategist (Strategy Phase)
These agents design and validate your empirical approach before you write a single line of code.
Strategist
Given your research question, literature, and available data, designs the identification strategy. Produces a strategy memo covering: the estimand (what you’re trying to measure), the estimator (DiD, IV, RDD, synthetic control), the key assumptions, a robustness plan, and falsification tests. Can also draft Pre-Analysis Plans in AEA/OSF/EGAP format.
strategist-critic
The toughest reviewer in the system. Reads the strategy memo and stress-tests it: Are the identifying assumptions defensible? Is the estimator appropriate for this design? Are there threats to inference (clustering, multiple testing, weak instruments)? Would a skeptical referee accept this? Sends it back with specific objections until the design is sound.
The Builders (Execution Phase)
These agents produce the actual analysis code, data pipelines, and paper manuscript.
Data-engineer
Handles the unglamorous but critical work: data cleaning, variable construction, panel building, merges, and missing data. Also creates publication-quality figures (ggplot2 themes, multi-panel layouts, accessibility-conscious colors). Paired with coder-critic for review.
Coder
Translates the strategy memo into working analysis scripts. Implements the main specification, robustness checks, and produces publication-ready tables. Supports R, Stata, Python, and Julia. If you’re replicating someone else’s work first, it matches the original results before extending.
coder-critic
Reviews all code — from both the Data-engineer and the Coder. Checks that scripts are reproducible (no hardcoded paths, packages loaded at top, seeds set for stochastic processes), that the code actually implements what the strategy memo says, and that results are numerically sensible.
Writer
Drafts paper sections with proper economics structure: an introduction that states the contribution within the first two pages, an empirical strategy section templated to your design (DiD, IV, RDD), and results that distinguish statistical from economic significance. Runs an automatic humanizer pass that strips common AI writing patterns (inflated symbolism, rule-of-three lists, hedging language).
writer-critic
Reads the manuscript looking for problems a referee would catch: inconsistent notation (\(Y_{it}\) in one place, \(y_i\) in another), hedging language (“it could potentially suggest”), claims not supported by the evidence shown, and LaTeX issues (overfull hboxes, broken references). Sends it back for fixes.
The Reviewers (Peer Review Phase)
These agents simulate what happens when you submit to a journal. Two independent referees review your paper — neither sees the other’s report.
domain-referee
The subject expert. Evaluates whether your paper makes a real contribution to the literature, whether you’ve positioned it correctly relative to existing work, whether your substantive arguments hold up, and whether your results generalize beyond the specific setting. Calibrated to your research field via the domain profile. When you specify a target journal (/review --peer JHR), it emulates that journal’s review culture using journal profiles.
methods-referee
The econometrics expert. Evaluates whether your identification strategy is valid, whether the estimation is appropriate, whether inference is correct (clustering, standard errors, multiple testing), whether the robustness checks are sufficient, and whether another researcher could replicate your results. Also calibrates to a target journal when specified.
The Orchestrator synthesizes both referee reports into an editorial decision: Accept, Minor Revisions, Major Revisions, or Reject.
The Infrastructure
These agents coordinate and verify — they don’t produce research content.
Verifier
Runs in two modes. Standard mode (4 checks): does the paper compile? Do the scripts run? Are output files fresh? Are file references valid? Submission mode (10 checks): adds a full AEA replication package audit — numbered script order, README completeness, data provenance, and dependency documentation. Pass/fail only.
Orchestrator
The general contractor. Manages the entire pipeline: dispatches worker-critic pairs based on what phase you’re in, enforces quality gates (80 to commit, 90 for PRs, 95 for submission), escalates when pairs can’t converge, synthesizes referee reports into editorial decisions, and recommends target journals at submission time. You rarely interact with it directly — it works behind the scenes when you run /new-project or when Claude plans a complex task.
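The quality gates amount to a simple threshold check per pipeline stage. A minimal sketch of the idea, with a hypothetical function and dict (not the Orchestrator's actual implementation):

```python
# Gate thresholds as described: 80 to commit, 90 for PRs, 95 for submission.
GATES = {"commit": 80, "pr": 90, "submission": 95}

def passes_gate(score, stage):
    """Return True if an agent's quality score clears the gate for a stage."""
    return score >= GATES[stage]
```

Under these thresholds, a bibliography scored 88 could be committed but would not clear the PR gate.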
Quick Reference
All 16 agents at a glance:
| Agent | File | What It Does |
|---|---|---|
| Librarian | librarian.md | Literature collection, bibliography, frontier mapping |
| librarian-critic | librarian-critic.md | Reviews bibliography for coverage, gaps, recency |
| Explorer | explorer.md | Data source discovery, feasibility scoring |
| explorer-critic | explorer-critic.md | Reviews data for measurement validity, ID fit |
| Strategist | strategist.md | Identification strategy design, Pre-Analysis Plans |
| strategist-critic | strategist-critic.md | Validates assumptions, inference, robustness |
| Data-engineer | data-engineer.md | Data cleaning, wrangling, publication figures |
| Coder | coder.md | Analysis implementation (R/Stata/Python/Julia) |
| coder-critic | coder-critic.md | Code quality, reproducibility, strategy alignment |
| Writer | writer.md | Paper drafting, humanizer pass |
| writer-critic | writer-critic.md | Notation, hedging, LaTeX, claims vs evidence |
| domain-referee | domain-referee.md | Blind peer review (subject expertise) |
| methods-referee | methods-referee.md | Blind peer review (econometric methods) |
| Storyteller | storyteller.md | Beamer talk creation (4 formats) |
| storyteller-critic | storyteller-critic.md | Talk narrative, visuals, content fidelity |
| Verifier | verifier.md | Compilation + AEA replication audit |
| Orchestrator | orchestrator.md | Agent dispatch, quality gates, referee synthesis |
Multi-Model Strategy
Not all agents need the same model. Each agent file has a model: field that controls which Claude model it uses:
| Task Type | Recommended Model | Why |
|---|---|---|
| Deep reasoning (strategy, peer review) | model: opus | Needs nuanced judgment |
| Fast, constrained work (code review, compilation) | model: sonnet | Speed matters more |
| Default | model: inherit | Uses whatever model your session is running |
This saves cost and time — a Verifier checking if LaTeX compiles doesn’t need the most powerful model. A strategist-critic stress-testing your identification design does.
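For example, the frontmatter of an agent file might set the model like this. A sketch only: the exact field set in your agent files may differ, and the description text is illustrative.

```yaml
---
name: strategist-critic
description: Stress-tests identification strategies before coding begins
model: opus   # deep reasoning; use sonnet for fast checks, inherit for the session default
---
```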
Talks (Presentation Phase)
These agents create and review Beamer presentations derived from your paper.
Storyteller
Creates Beamer presentations from Paper/main.tex in 4 formats: job market talk (45–60 min, full results), seminar (30–45 min, standard format), short talk (15 min, conference session), and lightning (5 min, elevator pitch). All content derives from the paper — the Storyteller never invents results.
storyteller-critic
Reviews the talk for narrative flow (does the argument build?), visual quality (readable slides, not text walls), content fidelity (does the talk match the paper?), and format appropriateness (is a 15-min talk actually 15 minutes?). Scored as advisory — talks don’t block the pipeline.