Herding Cats in the AI Age

AI doesn’t need more intelligence — it needs doctrine, process discipline, and quality assurance.

Start: Paper 1, The Five-Year-Old | Then: Paper 6, Form a Team | Read the full Stance

Every few years an industry convinces itself that the problem it keeps failing to solve will yield to a smarter version of the same thing it has been trying. AI is in that moment now.

Gartner projects 40% of enterprise agentic AI projects will be canceled by the end of 2027. Deloitte found only one in five companies has mature governance for AI agents. The market is projected to surge from $7.8 billion to $52 billion by 2030 — most of that money chasing agents organizations cannot reliably govern. The industry’s response has been to ask for larger models, longer context windows, and more guardrails.

This series takes a different stance. The constraint is not raw capability. The constraint is coordination — getting one agent to finish what another started, getting a team of agents to agree on what “done” means, getting any of it to hold up when the user steps away for a day.

The military solved this problem in the 1790s. Industrial manufacturers solved it again in the 1950s. Both solutions are documented, battle-tested, and freely available. Neither was cited in the research papers the current agentic AI platforms were built on. The AI industry is, in the most literal sense, rediscovering things other disciplines already know.

Civilian AI’s coordination failure has four recurring signatures. They appear in every case study in this series.

The agent that forgets the mission. Given a clear goal at session start, the agent drifts mid-execution and ends the session having worked on something adjacent. Doctrine calls it a failure of mission command; Lean Six Sigma calls it scope creep. Civilian AI’s response: ask the model to summarize the goal more often.

The agent that cannot hand off. One session produces an artifact; the next inherits it and cannot tell what was shipped, deferred, or assumed. Doctrine calls it a broken running estimate. Six Sigma calls it a broken SIPOC (suppliers, inputs, process, outputs, customers). Civilian AI’s fix: longer README files.

The agent that ships defects past every review. Self-validation passes, the supervisor signs off, the after action review (AAR) grades green, and twenty-six hours later production crashes because a field was typed as a string instead of an integer. Doctrine has rehearsals and backbriefs. Quality engineering has design reviews and failure mode and effects analysis (FMEA). Civilian AI’s fix: add another eval.

The team that never learns. The same defect appears in three sessions, each time logged as a novel finding, each time promised a fix that never lands. Continuous process improvement (CPI) solved this in manufacturing. The AAR solved this in the Army. Civilian AI has no standard mechanism.

Every paper in this series applies an existing remedy — military doctrine, Lean Six Sigma, information theory — to one or more of those signatures.
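To make the fourth signature concrete, here is a minimal sketch of a defect log that keys each finding by a normalized description, so a repeat is counted as a repeat and not logged as a novel finding. Everything in it is an assumption made for illustration: `DefectLog`, `record`, `recurring`, and the naive case and whitespace normalization are hypothetical and are not the mechanism used in the series.

```python
from collections import defaultdict


class DefectLog:
    """Hypothetical log that tracks which sessions reported each defect."""

    def __init__(self) -> None:
        # Maps a normalized defect description to the sessions that reported it.
        self._sessions: dict[str, list[str]] = defaultdict(list)

    @staticmethod
    def _key(description: str) -> str:
        # Naive normalization (an assumption): lowercase, collapse whitespace.
        return " ".join(description.lower().split())

    def record(self, session_id: str, description: str) -> int:
        """Log a defect and return how many distinct sessions have reported it."""
        seen = self._sessions[self._key(description)]
        if session_id not in seen:
            seen.append(session_id)
        return len(seen)

    def recurring(self, threshold: int = 2) -> list[str]:
        """Return normalized defects reported in at least `threshold` sessions."""
        return [k for k, s in self._sessions.items() if len(s) >= threshold]


log = DefectLog()
log.record("session-1", "Field typed as string, expected integer")
log.record("session-2", "field typed as  string, expected integer")
count = log.record("session-3", "FIELD TYPED AS STRING, EXPECTED INTEGER")
print(count)  # 3
```

A real system would need a sturdier notion of "same defect" than string normalization; the point of the sketch is only that recurrence has to be counted somewhere before it can be acted on.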

The research that produced this series ran inside a single Obsidian vault — a working second brain that doubled as the operational surface for a multi-agent AI team. Not a demo. The author’s daily knowledge base, used for caregiving coordination, writing, inbox triage, and operational work. Everything described was invented because it was needed to function.

The governance model that emerged is called the Toboggan Doctrine (Paper 8), and it has a definite shape. Agents enter the channel loaded with templates, knowledge wells, and pre-made decisions. Gravity pulls them through a pre-execution gate, an execution phase, and a completion gate. At the bottom, an after action review captures lessons that feed back up the continuous improvement loop, updating the templates for the next session. The system improves itself without dedicated improvement effort. The factory worker does not push the template; the template pushes the factory worker.
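The flow described above can be sketched in a few lines. This is a rough illustration under stated assumptions, not the Paper 8 specification: `Template`, `run_channel`, the gate functions, and the `done` and `lessons` keys are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Template:
    """Hypothetical template: pre-made decisions plus lessons learned so far."""
    name: str
    required_fields: list[str]
    lessons: list[str] = field(default_factory=list)


def pre_execution_gate(template: Template, task: dict) -> None:
    # Refuse to start if the task lacks anything the template requires.
    missing = [f for f in template.required_fields if f not in task]
    if missing:
        raise ValueError(f"pre-execution gate: missing {missing}")


def completion_gate(result: dict) -> None:
    # Refuse to close out work that is not explicitly marked done.
    if not result.get("done"):
        raise ValueError("completion gate: work is not marked done")


def after_action_review(template: Template, result: dict) -> None:
    # Feed new lessons back into the template for the next session.
    for lesson in result.get("lessons", []):
        if lesson not in template.lessons:
            template.lessons.append(lesson)


def run_channel(
    template: Template,
    task: dict,
    execute: Callable[[dict, Template], dict],
) -> dict:
    """Run one session: gate, execute, gate, then review."""
    pre_execution_gate(template, task)
    result = execute(task, template)
    completion_gate(result)
    after_action_review(template, result)
    return result
```

In this sketch the agent is just the `execute` callable. It never decides whether to run the gates or the review; the channel does that for it, which is the sense in which the template pushes the worker.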

Each paper stands alone; the series is designed so a reader can enter through whichever one matches their current problem. The recommended path for a first read is the order listed here, but any order works.

  • Paper 1: The Super-Intelligent Five-Year-Old. Names the problem. Frontier AI clears every cognitive benchmark that a decade ago would have ranked as superhuman, and still produces the behavioral profile of a capable, unsupervised child.
  • Paper 2: The Digital Battle Staff. Napoleon’s 1795 headquarters and SOCOM’s 2026 agentic AI experiments converged on the same architecture: a staff of specialists coordinating under a shared commander’s intent.
  • Paper 3: The PARA Experiment. One practitioner, one knowledge vault, 33 days, 1,768 git commits. Twelve of fourteen predicted failure modes appeared within 72 hours.
  • Paper 4: The Creative Middleman. Adobe Firefly routes prompts to competitors’ models because its own can’t render readable text. A case study in coordination failure.
  • Paper 5: When the Cats Talk to Each Other. Two frontier AI systems with opposing design philosophies engaged in structured dialogue and produced a formal coordination framework neither could produce alone.
  • Paper 6: When the Cats Form a Team. Four frontier AI systems assigned military staff roles produced six strategic insights that the solo agents missed. The most direct empirical demonstration in the series.
  • Paper 6b: When the Cats Take the Same Test. Six AI systems received identical Commander’s Intent and produced experimental designs of vastly different quality. A rebuke to the assumption that “frontier model” is a meaningful quality tier.
  • Paper 7: MDMP Platform Blueprint. The platform spec. The paper to hand an engineering lead who has asked, “fine, what would you actually build?”
  • Paper 8: The Toboggan Doctrine. The governance synthesis. Template-driven channels outperform hook-based walls. Build the channel, let gravity work.
  • Paper 9: Finding the Breaking Point. Where do channels fail? Under what load does the toboggan break? The case for the doctrine’s limits.

The series is not a sales pitch and not an academic treatise. It is a field report from a working system, written for practitioners who already know their AI coordination is failing and have run out of patience for advice that amounts to “try a larger model.”

If you lead an AI team inside an enterprise, the papers give you a doctrine-based language for describing what is going wrong and a specific, already-proven remedy for each failure class. If you research agentic AI, the papers give you empirical data from a production multi-agent system to measure your own ideas against. If you build for yourself, the papers give you a toolkit. The vault that produced this research is small enough for one person to operate and large enough to coordinate a real workload.

The stance, stated one last time: AI does not need more intelligence. It needs doctrine, process discipline, and quality assurance — all three of which existed before the current generation of models was trained, all three of which work when applied, and all three of which the civilian AI industry has so far declined to adopt.

Build the channel. Let gravity work. Measure the results. Report back.

Jeep Marshall — LTC, US Army (Retired). Airborne Infantry, Special Operations, Process Improvement. Writes from the intersection of military doctrine, Lean Six Sigma, and production AI operations.