What’s New

Release History

v4.2.0 — Theorist Pair, Personal Style Guide, Checkpoint

Released 2026-04-17

Three feature additions inspired by parallel work in the Claude-Code-for-economists space and by a theorist pair developed in the bad-controls project.

Theorist + theorist-critic pair

A first-class pair for formal theory sections — assumptions, definitions, lemmas, theorems, and proofs calibrated to top methods journals (Econometrica, Journal of Econometrics, Quantitative Economics, Annals of Statistics).

The theorist drafts identification results, consistency, asymptotic normality, influence functions, DML, bootstrap validity, test properties, and comparative-static propositions. The pair is paper-type aware: it activates for econometric methods, theory+empirics, structural identification, and methodological reduced-form papers.

The theorist-critic reviews in 4 sequential phases, stopping early on critical gaps: (1) triage; (2) proof validity (logical, measurability, expansions, identification, asymptotic distribution); (3) assumption minimality and statement calibration; (4) citations, linkage, and polish.

Field-specific anchors live in a new Theoretical Foundational References table in domain-profile.md; broad defaults cover DiD, IV, RDD, DML, semiparametric efficiency, GMM, and bootstrap. A Paper Author Team table lets the critic avoid lecturing authors on results they themselves wrote.

Scoring: the theory score carries a 20% weight in the aggregate when a theory section is present; when there is none, the remaining weights are renormalized. Invoke via /strategize theory [target] or audit via /review --theory [target].
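The renormalization can be sketched as follows. This is a minimal illustration, not the pipeline's actual scoring code: only the 20% theory weight comes from these notes, and the other component names and weights (writing, code) are hypothetical.

```python
def aggregate_score(scores, weights):
    """Weighted mean over the components that have a score.

    Components scored as None are dropped and the remaining
    weights are rescaled so that they sum to one.
    """
    present = {k: w for k, w in weights.items() if scores.get(k) is not None}
    total = sum(present.values())
    if total <= 0:
        raise ValueError("no scored components")
    return sum(scores[k] * w for k, w in present.items()) / total


# Hypothetical weights: only the 0.20 theory share is taken from the release notes.
WEIGHTS = {"theory": 0.20, "writing": 0.30, "code": 0.50}

with_theory = aggregate_score({"theory": 80, "writing": 90, "code": 70}, WEIGHTS)
without_theory = aggregate_score({"theory": None, "writing": 90, "code": 70}, WEIGHTS)
```

With a theory section, all three weights apply as given. Without one, the writing and code weights are divided by 0.80 so the aggregate stays on the same scale.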

Personal style guide

A /write style-guide [paper-dir] mode that extracts the user’s writing voice from their prior papers, then feeds it back to the writer on every subsequent invocation.

Strategic sampling across intros, section openings, abstracts, conclusions, and results paragraphs. Quantitative patterns (sentence-length distribution, passive/active ratio, em dash rate) and qualitative patterns (paragraph openings, section openings, lexicon used and avoided, hedging, comparison style, citation split, tone markers). Self-citation check surfaces author self-citations missing from Bibliography_base.bib.
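Two of the quantitative patterns above can be approximated in a few lines. This is a rough sketch under simplifying assumptions (a naive regex sentence splitter, whitespace tokenization); the function name style_metrics is hypothetical, and the passive/active ratio is left out because it needs a parser.

```python
import re
import statistics

EM_DASH = "\u2014"  # the em dash character, written as an escape


def style_metrics(text):
    """Return sentence count, mean sentence length, and em dash rate."""
    # Naive split: break after ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    n_words = sum(lengths)
    return {
        "n_sentences": len(sentences),
        "mean_sentence_length": statistics.mean(lengths) if lengths else 0.0,
        "em_dashes_per_1000_words": (
            1000 * text.count(EM_DASH) / n_words if n_words else 0.0
        ),
    }


sample = (
    "We study minimum wages. "
    "The effect is small \u2014 but precisely estimated. Why?"
)
metrics = style_metrics(sample)
```

A real extractor would likely work from a sentence tokenizer and report the full length distribution rather than a mean.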

Writes to .claude/references/personal-style-guide.md with quoted examples from the corpus and never invents patterns. The writer auto-loads the guide if it contains real content; the voice guide overrides generic academic defaults but never the content invariants (INV-1 through INV-21).

Compaction discipline + /checkpoint skill

Codifies compaction hygiene and adds a project-level session-handoff skill.

Compaction discipline (workflow.md Section 5): manual /compact at natural stopping points over auto-compression; 5–10 turn focused sessions; /checkpoint before /compact or session end; Session Recovery starts at Step 0 — read recent checkpoint artifacts before the plan.

/checkpoint — scaffold-friendly port. Core (always on, fork-friendly): auto-memory updates, SESSION_REPORT.md append, research_journal.md append, git-state snapshot. Obsidian integration is gated: activates only when .claude/state/obsidian-config.md exists AND Obsidian MCP is connected. Fork users get the template; user-specific paths stay local via .gitignore.
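The Obsidian gate is a plain two-condition check. The sketch below assumes the connection status is already known and passed in as a boolean; how /checkpoint actually detects the Obsidian MCP connection is not described here, and obsidian_enabled is a hypothetical name.

```python
import tempfile
from pathlib import Path


def obsidian_enabled(project_root, mcp_connected):
    """True only when the config file exists AND the MCP connection is up."""
    config = Path(project_root) / ".claude" / "state" / "obsidian-config.md"
    return config.is_file() and bool(mcp_connected)


with tempfile.TemporaryDirectory() as root:
    # Fresh fork: no config file, so the integration stays off.
    before = obsidian_enabled(root, mcp_connected=True)

    state_dir = Path(root) / ".claude" / "state"
    state_dir.mkdir(parents=True)
    (state_dir / "obsidian-config.md").write_text("placeholder\n", encoding="utf-8")

    # Config present and connected: integration activates.
    after = obsidian_enabled(root, mcp_connected=True)
    # Config present but not connected: still off.
    offline = obsidian_enabled(root, mcp_connected=False)
```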


v4.1.1 — Modern LaTeX Stack

Released 2026-04-11

Modernizes the LaTeX infrastructure with Overleaf-compatible upgrades.

latexmk Build System

paper/latexmkrc configures XeLaTeX, TEXINPUTS, and BIBINPUTS. One command replaces the manual 4-command build: cd paper && latexmk main.tex. The file name has no leading dot, so Overleaf reads it automatically. Updated across CLAUDE.md, verifier, /tools compile, and /talk compile.

tabularray

Modern table engine with key-value interface. Hand-written tables use tblr/talltblr (captions, notes, and rules in one declarative block). R/Python/Julia output continues exporting bare tabular wrapped with threeparttable. Both approaches documented with examples.

cleveref

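\usepackage[nameinlink]{cleveref} is loaded after hyperref. \cref{fig:x} generates the reference name and number together (e.g. “Figure 1”; the exact form depends on cleveref's capitalise and noabbrev options), which eliminates Figure~\ref{} boilerplate. Writer-critic deducts for missing cleveref and for manual ref patterns.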

microtype Promotion

Promoted from Recommended to Required. Writer-critic now deducts 2 points for missing microtype.

GitHub Actions CI

.github/workflows/compile-paper.yml compiles the paper on push/PR when paper/** files change. Skips gracefully in template repos (no main.tex). Uploads compiled PDF as artifact.


v4.1.0 — Enforcement Layer

Released 2026-04-09

Adds mechanical enforcement, structured pre-flight reporting, and decision traceability across the pipeline.

Content Invariants

21 numbered rules (INV-1 through INV-21) in .claude/rules/content-invariants.md. Non-negotiable standards for paper, code, and talks — critics now cite violations by invariant number. Covers table format, figure notes, notation consistency, anti-hedging, reproducibility requirements, and more.

Pre-Flight Reports

Two new structured reports that agents produce before doing work:

  • Pre-Strategy Report (Strategist): documents paper type classification, available data, and candidate designs before proposing a strategy
  • Pre-Code Report (Coder): documents naming map, script structure, and numerical guards before writing code

Grep-Based Linter

/tools lint [file|dir] runs mechanical checks on R, Python, and Julia scripts — prohibited patterns, style violations, reproducibility issues. /review --code now runs lint as Step 1 before dispatching coder-critic. Fast, deterministic, no LLM calls.
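A grep-style linter of this kind reduces to a table of regular expressions applied line by line. The sketch below illustrates the idea only: the three rules are hypothetical examples, not the project's actual rule set, and lint_text and lint_path are assumed names.

```python
import re
from pathlib import Path

# Hypothetical rules keyed by file suffix; the real rule set is not shown in these notes.
RULES = {
    ".R": [(re.compile(r"\bsetwd\s*\("), "avoid setwd(); use project-relative paths")],
    ".py": [(re.compile(r"^\s*from\s+\S+\s+import\s+\*"), "avoid wildcard imports")],
    ".jl": [(re.compile(r"\bcd\s*\("), "avoid cd(); use project-relative paths")],
}


def lint_text(text, suffix):
    """Return (line_number, message) pairs for every rule hit in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, message in RULES.get(suffix, []):
            if pattern.search(line):
                findings.append((lineno, message))
    return findings


def lint_path(path):
    """Lint a single file, or every file with a known suffix under a directory."""
    path = Path(path)
    if path.is_file():
        files = [path]
    else:
        files = sorted(p for p in path.rglob("*") if p.suffix in RULES)
    return {str(f): lint_text(f.read_text(encoding="utf-8"), f.suffix) for f in files}


demo = 'library(fixest)\nsetwd("C:/Users/me/project")\n'
findings = lint_text(demo, ".R")
```

Because every check is a regex match, the same input always produces the same findings, which is what makes the step fast and deterministic.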

Decision Records

ADR-style records at discovery and strategy stages. Template: templates/decision-record.md. Saved to quality_reports/decisions/. Each record captures the decision, alternatives considered, rationale, and status.

PostToolUse Lint Hook

Auto-lints R, Python, and Julia files on every Edit or Write operation. Advisory only — reports issues but does not block. Catches problems at write time instead of review time.


v4.0.0 — Paper-Type Architecture

Released 2026-04-08

Every agent now knows whether it’s working on a reduced-form, structural, theory+empirics, or descriptive paper — and adapts accordingly.

7 Agents Rewritten

| Agent | Before | After |
|---|---|---|
| Writer | Humanizer-first, one template | Paragraph-level argument moves, 4 paper-type section templates |
| Writer-critic | 6 checks, format-focused | 8 checks: structure, coherence, design-specific completeness |
| Strategist | Reduced-form only | 4 paper types: reduced-form, structural, theory+empirics, descriptive |
| Strategist-critic | DiD/IV/RDD checklists | + structural model checks, prediction sharpness, construct validity |
| Coder | Basic script standards | Engineering discipline: naming maps, numerical guards, function-per-file |
| Coder-critic | 12 checks | 16 checks: + numerical discipline, prohibited patterns |
| Methods-referee | Reduced-form evaluation only | Paper-type-specific dimensions and scoring rubrics |

Numerical Discipline

New in the Coder and Coder-critic, derived from C++ Core Guidelines:

  • Float comparison guards (no == on floats)
  • CDF clamping to [0, 1], inverse link protection
  • Integer literals (1L, seq_len(n))
  • Pre-allocation (no growing lists in loops)
  • Bootstrap/parallel patterns with proper seed handling
  • Language-specific coding standards: .claude/references/coding-standards-{r,python,julia}.md
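Several of these guards translate directly into Python. The sketch below is illustrative and does not reproduce the project's coding standards: the function names and tolerances are assumptions, and "inverse link protection" is interpreted here as an overflow-safe inverse logit.

```python
import math
import random


def floats_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Tolerance-based comparison instead of == on floats."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


def clamp_probability(p, eps=0.0):
    """Clamp a computed CDF value to [eps, 1 - eps]."""
    return min(1.0 - eps, max(eps, p))


def inv_logit(x):
    """Inverse logit that never exponentiates a large positive number."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bootstrap_means(data, n_boot, seed):
    """Bootstrap sample means with an explicit seed and a pre-allocated result."""
    rng = random.Random(seed)      # local generator, no global state
    means = [0.0] * n_boot         # pre-allocate instead of growing in the loop
    for b in range(n_boot):
        resample = rng.choices(data, k=len(data))
        means[b] = sum(resample) / len(resample)
    return means


naive_equal = (0.1 + 0.2 == 0.3)            # exact comparison fails here
guarded_equal = floats_equal(0.1 + 0.2, 0.3)
run_a = bootstrap_means([1.0, 2.0, 3.0, 4.0], n_boot=50, seed=123)
run_b = bootstrap_means([1.0, 2.0, 3.0, 4.0], n_boot=50, seed=123)
```

Passing the seed explicitly makes the two bootstrap runs identical. For parallel workers, each worker would need its own independently seeded generator, which this sketch does not cover.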

Scope: Back to Economics

The v3.0 release claimed to serve “all empirical social science.” That was too broad — the pipeline’s agents, rules, and section templates are built for economics.

v4.0 is honest about this: built for economics, adaptable to adjacent fields. Finance, accounting, marketing, and management researchers can customize the domain profile and journal profiles. The 30 journal profiles across these fields are retained as useful reference data.

Profile-Aware Table Standards

Significance stars are now journal-dependent:

  • Working papers (default): Stars OK
  • AEA journals (AER, AEJ:Applied, AEJ:Policy, AER:Insights): No stars, per current AEA style guide
  • All others: Stars acceptable

Journal profiles now include a Table format field for overrides.
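The rule above amounts to a small lookup with an optional per-profile override. The sketch below is hypothetical: the journal list is taken from the bullets, but the shape of the profile and the "stars" override key are assumptions, since the actual Table format field is not specified here.

```python
# Journals listed in the release notes as following the AEA no-stars style.
AEA_JOURNALS = {"AER", "AEJ:Applied", "AEJ:Policy", "AER:Insights"}


def stars_allowed(journal=None, profile=None):
    """Decide whether significance stars are acceptable for a target journal.

    journal=None stands for a working paper (the default case).
    profile is a hypothetical dict; a "stars" key, if present, overrides the default.
    """
    profile = profile or {}
    if "stars" in profile:          # assumed override mechanism
        return bool(profile["stars"])
    return journal not in AEA_JOURNALS


working_paper = stars_allowed()
aer = stars_allowed("AER")
other = stars_allowed("JPE")
overridden = stars_allowed("AER", {"stars": True})
```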

Compilation Alignment

  • Fixed bibtex → biber mismatch across CLAUDE.md and tools/SKILL.md
  • Fixed remaining uppercase paths (Paper → paper, Talks → paper/talks)

Dead Code Removal

  • Deleted scripts/quality_score.py (761 lines targeting nonexistent directories)
  • Deleted scripts/sync_to_docs.sh (92 lines targeting nonexistent deployment structure)

README Honesty

  • “turns your terminal into a research assistant” → “scaffold for empirical economics research”
  • “works autonomously” → removed
  • “Realistic Peer Review” → “Simulated Peer Review”
  • “calibrated referee pools” → “configured referee pools”
  • Added Limitations section

v3.1.1 — 2026-03-24

Output organization setting (by-script default). Bug fixes for Paper/ → paper/ path migration, domain-profile path, plan mode in /new-project.


v3.1.0 — 2026-03-23

Skill detail restoration — 22 items lost in the v2→v3 consolidation were restored while keeping v3 structure.


v3.0.0 — 2026-03-20

Major architecture redesign. Peer review simulation with Editor + dispositions + desk reject. 30 journal profiles. Demand-loaded references. Skill consolidation (26 → 10 commands). Scope clarification to economics.