# Innovation Inventory
## Three Patterns With No Academic Precedent
> *"When you build a system that runs 270+ sessions, 168 scripts, 25 hooks, 33 skills, and 420+ lessons learned — and you measure it — you find things nobody has published about."*
---
## What This Page Is
This series began as a practitioner's account of applying military doctrine to AI coordination. Along the way, the vault — a live Obsidian knowledge management system operating as a multi-agent AI laboratory — produced fourteen distinct coordination patterns. Most have academic parallels. Three do not.
This page documents those three HIGH-novelty patterns: what they are, what academic field they extend, what gap they fill, and what evidence supports them. All three emerged from operational necessity, not theoretical design. They were invented under time pressure, refined through 420+ lessons learned, and measured across 270+ sessions.
The claim is specific: these patterns work in a live system, fill gaps in published research, and are independently verifiable from the vault's git history.
---
## The Full Innovation Inventory
Fourteen patterns emerged from vault operations. Novelty is assessed against published academic literature as of April 2026.
| # | Pattern | Novelty | Academic Field | Gap |
|---|---------|---------|----------------|-----|
| 1 | **Tetris Task Primitives** | **HIGH** | Constraint scheduling, workflow automation | No published work on composable atomic AI task operations |
| 2 | **Gravity-Fed Pipeline** | **HIGH** | Continuous improvement (DMAIC), cognitive architecture | No published work on self-improving governance-as-momentum |
| 3 | **Toboggan Doctrine** | **HIGH** | Behavioral economics, formal methods | Choice architecture applied to AI agent enforcement — novel |
| 4 | PAT Brainstorming | MED-HIGH | Ensemble methods, voting theory | Parallel orthogonal review extends Condorcet |
| 5 | OODA + IG&C | MED-HIGH | Decision theory, military science | Boyd's advanced framework applied to AI — unexplored |
| 6 | OC AAR as Facilitated Dialogue | MED | Institutional learning, AAR doctrine | TC 25-20 extended to AI agent teams |
| 7 | Standing Task Orders | MED | State machines, workflow engines | Template-based persistent state across sessions |
| 8 | Knowledge Wells | LOW-MED | Information retrieval, RAG | Pull-based domain loading (47 wells) |
| 9 | CPI Scorecard | MED | Lean Six Sigma, dashboards | Auto-populated from gravity deposits — no manual entry |
| 10 | Multi-Agent Role Identity | MED | Agent architecture, organization theory | Named persistent specialists vs. anonymous workers |
| 11 | MDMP for AI Planning | LOW-MED | Military planning, AI planning | FM 5-0 adapted to AI task decomposition |
| 12 | Gate A/B Structural Enforcement | MED | Formal verification, CI/CD | PreToolUse/PostToolUse hooks as formal verification gates |
| 13 | DMAIC Integration | LOW-MED | Quality management, process engineering | Lean Six Sigma woven into AI operational backbone |
| 14 | Ralph Loop | MED | Batch processing, task schedulers | Serial/parallel executor with embedded DMAIC metrics |
---
## Pattern 1: Tetris Task Primitives
### What It Is
Every task in the vault decomposes into a small set of atomic operations — primitives — that combine like Tetris pieces. Thirteen primitives cover the full range of vault work, and six recurring combinations ("stacks") account for 93% of session work.
**The 13 primitives:** Research, Read, Classify, Enrich, Route, Create, Edit, Validate, Deploy, Monitor, Diagnose, Coordinate, Archive.
Each primitive has standardized inputs and outputs. A "Research" primitive takes a question and produces findings. An "Enrich" primitive takes a file and produces an enriched file. They compose: Research → Classify → Enrich → Route is the inbox triage stack. Research → Create → Validate → Deploy is the skill creation stack.
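The composition rule described above can be sketched in a few lines. This is a minimal illustration, not the vault's actual implementation: the schemas here are plain sets of field names, whereas the real contracts live in `task-schema.yaml` and are richer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Primitive:
    name: str
    inputs: frozenset   # required input fields
    outputs: frozenset  # produced output fields

def composable(p1: Primitive, p2: Primitive) -> bool:
    """P2 after P1 is valid iff P1's outputs satisfy P2's required inputs."""
    return p2.inputs <= p1.outputs

def validate_stack(stack) -> bool:
    """A stack is valid when every adjacent pair composes cleanly."""
    return all(composable(a, b) for a, b in zip(stack, stack[1:]))

# Illustrative schemas only — field names are assumptions for the sketch.
research = Primitive("Research", frozenset({"question"}), frozenset({"findings"}))
classify = Primitive("Classify", frozenset({"findings"}), frozenset({"category", "findings"}))
enrich   = Primitive("Enrich",   frozenset({"findings"}), frozenset({"file"}))
route    = Primitive("Route",    frozenset({"file"}),     frozenset({"destination"}))

inbox_triage = [research, classify, enrich, route]
print(validate_stack(inbox_triage))  # True — the inbox triage stack composes
```

The same check rejects invalid chains (Research feeding directly into Route fails, because Route needs a `file` that Research never produces) — which is the point of typed contracts over opaque task boxes.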
### What Academic Field It Extends
**Constraint-based scheduling** (Gomes et al., AAAI 2007)[^61] demonstrated that structural constraints produce optimal multi-agent schedules. **Workflow automation** research has produced task decomposition frameworks for decades.
Formally, Tetris task primitives define a typed workflow algebra: each primitive is a typed function (input schema → output schema), composition is function application, and the six recurring stacks are pre-compiled function chains.[^62] Coverage(S) = |sessions decomposable into stacks in S| / |sessions| ≈ 0.93. Six recurring stacks account for 93% of session work — Pareto concentration tighter than typical operational distributions.
### What's New
No published work treats AI task operations as composable atomic units with standardized I/O contracts. Existing workflow systems define tasks as opaque black boxes. Existing AI agent frameworks define tasks as natural language prompts. Neither provides the composability that emerges when primitives have typed inputs, typed outputs, and measurable cycle times.
The vault demonstrates that primitive-level decomposition enables:
- **Predictive routing:** If a task decomposes into known primitives, execution time and agent assignment can be predicted before the task starts.
- **Cross-session learning:** Primitive-level metrics (cycle time, error rate, rework frequency) transfer across tasks. A "Research" primitive that takes 45 seconds in one context takes roughly 45 seconds in another.
- **Dependency coloring:** Primitives with shared dependencies (same file, same git index) are colored to prevent contention. This is constraint scheduling applied at the operation level, not the agent level.
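Dependency coloring can be sketched as greedy graph coloring over shared resources. The operation names and resource paths below are hypothetical; the vault's actual scheduler is not published here.

```python
def color_operations(ops: dict) -> dict:
    """Greedy coloring: ops sharing any resource must get different colors.

    ops maps an operation name to the set of resources it touches.
    Operations with the same color have no shared dependencies and
    can therefore run concurrently without contention.
    """
    colors = {}
    for op, resources in ops.items():
        taken = {colors[other] for other, r in ops.items()
                 if other in colors and resources & r}
        c = 0
        while c in taken:
            c += 1
        colors[op] = c
    return colors

# Hypothetical session slice: two ops touch the inbox file, two touch the git index.
ops = {
    "Enrich:A":   {"notes/inbox.md"},
    "Edit:B":     {"notes/inbox.md", ".git/index"},
    "Archive:C":  {".git/index"},
    "Research:D": set(),  # no shared state — schedulable anywhere
}
print(color_operations(ops))
```

Enrich:A and Edit:B collide on the inbox file and land in different colors; Research:D touches nothing shared and can run alongside anything. This is the operation-level (not agent-level) constraint scheduling the bullet describes.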
### Figure 1 - Tetris Task Primitives - Composition Architecture
```mermaid
%%{init: {"theme": "base", "themeVariables": {"darkMode": true, "background": "#282a36", "mainBkg": "#44475a", "nodeBorder": "#6272a4", "clusterBkg": "#383a4a", "clusterBorder": "#6272a4", "titleColor": "#f8f8f2", "primaryColor": "#44475a", "primaryTextColor": "#f8f8f2", "primaryBorderColor": "#6272a4", "lineColor": "#6272a4", "edgeLabelBackground": "#383a4a"}}}%%
flowchart LR
subgraph Primitives["13 Atomic Primitives — typed I/O contracts"]
Research[Research]
Read[Read]
Classify[Classify]
Enrich[Enrich]
Route[Route]
Create[Create]
Validate[Validate]
Deploy[Deploy]
Monitor[Monitor]
Diagnose[Diagnose]
Coordinate[Coordinate]
Edit[Edit]
Archive[Archive]
end
subgraph Stacks["6 Recurring Stacks — 93 pct coverage"]
S1[Inbox Triage Stack]
S2[Skill Creation Stack]
S3[Bug Fix Stack]
S4[Research Stack]
S5[Session Close Stack]
S6[Audit Stack]
end
Research --> S1
Classify --> S1
Enrich --> S1
Route --> S1
Research --> S2
Create --> S2
Validate --> S2
Deploy --> S2
Diagnose --> S3
Edit --> S3
Validate --> S3
Research --> S4
Read --> S4
Monitor --> S5
Archive --> S5
Coordinate --> S5
Read --> S6
Classify --> S6
Diagnose --> S6
classDef primitive fill:#1E3A8A,stroke:#6272a4,stroke-width:2px,color:#f8f8f2
classDef stack fill:#D97706,stroke:#ffb86c,stroke-width:2px,color:#f8f8f2,font-weight:bold
classDef default fill:#44475a,stroke:#6272a4,stroke-width:1px,color:#f8f8f2
class Research,Read,Classify,Enrich,Route,Create,Validate,Deploy,Monitor,Diagnose,Coordinate,Edit,Archive primitive
class S1,S2,S3,S4,S5,S6 stack
```
*Figure 1 - 13 atomic task primitives compose into 6 recurring stacks covering 93% of vault session work. Each primitive has typed input/output contracts, enabling Tetris-like composition; the six stacks are pre-verified composition chains.*
### Evidence
- 13 primitives identified from 270+ sessions of operational data
- 6 recurring stacks covering 93% of vault operations
- Task files carry a `primitives:` field in YAML frontmatter
- Primitive metrics tracked in task-schema.yaml
- Documented in `Task-Primitives-Taxonomy.md` knowledge well
---
## Pattern 2: Gravity-Fed Pipeline
### What It Is
The vault's governance system improves itself without dedicated improvement effort. Each operational action — booting a session, closing a task, running an AAR — deposits metrics and artifacts into standardized locations. Downstream processes (CPI scorecard, quality dashboards, lesson learned shards) consume these deposits automatically. Governance improves because the system is designed so that doing normal work *is* doing improvement work.
The metaphor: water flows downhill. You don't pump it. You build channels.
### What Academic Field It Extends
**DMAIC continuous improvement** (Lean Six Sigma) defines a structured improvement cycle: Define, Measure, Analyze, Improve, Control. **SOAR cognitive architecture** (Laird, 2022)[^63] integrates episodic memory — experiences that inform future decisions.
The gravity pipeline implements potential energy minimization: each session deposits artifacts at lower-energy states, and the system flows toward the minimum-energy configuration.[^64] Defect rate acts as a Lyapunov function for the CPI loop, which drives it monotonically downward: V(x) ≥ 0 and dV/dt ≤ 0, where V(x) = system defect rate and x = system state.[^65]
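The Lyapunov claim is directly checkable against deposited metrics. A minimal sketch, assuming per-session defect rates have already been read out of the CPI scorecard deposits (the numbers below are illustrative, not vault data):

```python
def is_lyapunov_trajectory(v) -> bool:
    """V must stay non-negative and never increase across sessions:
    V(x) >= 0 and dV/dt <= 0, discretized per session."""
    return all(x >= 0 for x in v) and all(b <= a for a, b in zip(v, v[1:]))

# Hypothetical per-session defect rates pulled from CPI deposits.
defect_rate = [0.31, 0.24, 0.24, 0.18, 0.11, 0.07]
print(is_lyapunov_trajectory(defect_rate))  # True — the discrete condition holds
```

A single session where the defect rate ticks upward falsifies the condition, which is what makes the claim empirically testable rather than rhetorical.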
### What's New
DMAIC requires *deliberate* improvement effort — someone must run the cycle. SOAR's episodic memory is *pull-based* — the system retrieves relevant episodes when needed. The gravity-fed pipeline is *push-based and self-organizing*: normal operations deposit artifacts that downstream processes consume without retrieval effort.
No published work describes a governance system where:
1. **Normal operations produce governance data** (boot metrics, close metrics, AAR findings) as a side effect, not a separate activity.
2. **Quality improves through momentum** — each session's deposits make the next session's analysis richer, without anyone running a DMAIC cycle.
3. **The CPI loop closes inside the template** — templates carry institutional knowledge from prior sessions, so any agent picking up the template inherits the system's learning.
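The push-based mechanics reduce to something very simple: producers append, consumers aggregate, and neither knows about the other. A sketch under stated assumptions — the deposit path and field names here are hypothetical stand-ins for the vault's actual metrics tree:

```python
import json, tempfile, time
from pathlib import Path

# Hypothetical deposit location; a temp dir keeps the sketch self-contained.
DEPOSIT = Path(tempfile.mkdtemp()) / "close-metrics.jsonl"
DEPOSIT.touch()

def deposit(event: dict, path: Path = DEPOSIT) -> None:
    """Push-based: session close appends here as a side effect of normal work."""
    event = {**event, "ts": time.time()}
    with path.open("a") as f:
        f.write(json.dumps(event) + "\n")

def scorecard(path: Path = DEPOSIT) -> dict:
    """Downstream consumer: aggregates whatever flowed in — no manual entry,
    no retrieval logic, no knowledge of who deposited what."""
    rows = [json.loads(line) for line in path.read_text().splitlines() if line]
    return {"sessions": len(rows),
            "rework_rate": sum(r.get("rework", 0) for r in rows) / max(len(rows), 1)}

deposit({"session": "taupe", "rework": 0})
deposit({"session": "umber", "rework": 1})
print(scorecard())  # {'sessions': 2, 'rework_rate': 0.5}
```

The design choice worth noticing: the scorecard gets richer each session without any process existing whose job is "improve the scorecard" — that is the gravity metaphor in code.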
### Figure 2 - Gravity-Fed Pipeline - Self-Improving Governance Flow
```mermaid
%%{init: {"theme": "base", "themeVariables": {"darkMode": true, "background": "#282a36", "mainBkg": "#44475a", "nodeBorder": "#6272a4", "clusterBkg": "#383a4a", "clusterBorder": "#6272a4", "titleColor": "#f8f8f2", "primaryColor": "#44475a", "primaryTextColor": "#f8f8f2", "primaryBorderColor": "#6272a4", "lineColor": "#6272a4", "edgeLabelBackground": "#383a4a"}}}%%
flowchart TB
Start["Agent Enters Heavy\nTemplates + Wells + MA loaded"]
GateA["Gate A - Pre-flight\nresource scan + validation method"]
Exec["Execution\nwork flows downhill"]
GateB["Gate B - Completion\nartifacts deposited automatically"]
AAR["AAR at Bottom\nlessons captured + metrics logged"]
CPI["CPI Loop\ntemplates refined + skills upgraded"]
Gravity[" GRAVITY "]
Start --> GateA
GateA --> Exec
Exec --> GateB
GateB --> AAR
AAR --> CPI
CPI -.->|feeds back to next session| Start
Gravity -.->|push-based auto-deposit| AAR
classDef forward fill:#1E3A8A,stroke:#6272a4,stroke-width:2px,color:#f8f8f2,font-weight:bold
classDef gate fill:#3a2860,stroke:#bd93f9,stroke-width:2px,color:#f8f8f2,font-weight:bold
classDef cpi fill:#059669,stroke:#50fa7b,stroke-width:3px,color:#f8f8f2,font-weight:bold
classDef note fill:#2a3050,stroke:#6272a4,stroke-width:1px,color:#f8f8f2
classDef default fill:#44475a,stroke:#6272a4,stroke-width:1px,color:#f8f8f2
class Start,Exec forward
class GateA,GateB gate
class CPI,AAR cpi
class Gravity note
```
*Figure 2 - Agents enter heavy with full preparation and flow downhill through gates. Normal operations auto-deposit metrics and artifacts at each stage. The CPI loop at the bottom feeds improvements back into templates for the next session - no manual improvement effort required.*
### Evidence
- 168 scripts, 25 hooks, 33 skills in the operational pipeline
- Boot metrics: ~580 entries auto-deposited across sessions
- Close metrics: ~90 entries auto-deposited
- CPI scorecard reads gravity deposits — no manual data entry
- 4 fully automated pipelines, 4 cron-ready
- Documented in `Gravity-Pipeline-Design-Spec.md` and `Vault-Operating-Architecture.md`
---
## Pattern 3: Toboggan Doctrine
### What It Is
AI agents don't follow rules because they understand them. They follow rules because the system makes following rules easier than breaking them. The Toboggan Doctrine applies choice architecture — the behavioral economics principle that how choices are structured determines which choice gets made — to AI agent enforcement.
Instead of walls (hard blocks that agents route around) or suggestions (soft guidance that agents ignore under pressure), the toboggan builds channels: structural paths where the compliant action is the path of least resistance. Agents enter the channel at the top and slide to the correct outcome. The channel doesn't require understanding. It requires gravity.
**Implementation:** PreToolUse hooks intercept agent actions before execution. Non-compliant actions receive `deny` (structural block) rather than `ask` (advisory prompt). The hook doesn't explain why — it blocks. The agent, unable to proceed on the non-compliant path, routes to the compliant alternative. The alternative is designed to be obvious and easy.
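The deny pattern above can be sketched as follows. This is an illustration of the pattern, not the vault's hook code: the event field names (`tool_name`, `tool_input`, `file_path`, `permissionDecision`) and the guarded paths are assumptions standing in for whatever the actual hook protocol defines.

```python
import json

PROTECTED = ("/vault/templates/", "/vault/hooks/")  # hypothetical guarded paths

def decide(event: dict) -> dict:
    """Structural block: return deny, not ask. No explanation is offered;
    the compliant alternative exists elsewhere by design."""
    path = event.get("tool_input", {}).get("file_path", "")
    if event.get("tool_name") == "Write" and path.startswith(PROTECTED):
        return {"permissionDecision": "deny"}
    return {"permissionDecision": "allow"}

# In production the event JSON would arrive on stdin and the decision would be
# printed back; a direct call illustrates the contract.
event = {"tool_name": "Write", "tool_input": {"file_path": "/vault/templates/boot.md"}}
print(json.dumps(decide(event)))  # {"permissionDecision": "deny"}
```

Note what is absent: no reasoning about the agent's intent, no advisory text. The hook fires on the action alone, which is what makes it channel rather than wall.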
### What Academic Field It Extends
**Choice architecture** (Thaler & Sunstein, *Nudge*, 2008)[^66] established that the design of choice environments determines outcomes more reliably than education or incentives. **Differential privacy** (Dwork, ICALP 2006)[^67] established that formal mathematical guarantees outperform empirical promises. **AI governance** (Gebru et al., FAccT 2021)[^68] argued for structural oversight mechanisms over capability-based safety.
The deny hook is jidoka applied to agent governance: errors caught at the point of production.[^69] The template is the poka-yoke: structurally preventing the error before it can be made.[^69] The compliance rate Γ(t) = 1 − |violations(t)| / |enforcement_opportunities(t)| → Γ ≈ 1.0 sustained over 270 sessions after framework deployment.
The mechanism is incentive-compatible:[^70] the deny hook makes non-compliance a dominated strategy (no benefit, certain cost), while the template makes compliance the minimum-effort path. Agents don't need to understand the rules — the channel structure makes compliance the path of least resistance.
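The compliance rate Γ(t) defined above is a one-line computation; the figures below are illustrative, drawn from the evidence section's hook-firing counts rather than a published dataset.

```python
def compliance_rate(violations: int, opportunities: int) -> float:
    """Gamma(t) = 1 - violations / enforcement opportunities.
    Vacuously 1.0 when no enforcement opportunities occurred."""
    return 1.0 - violations / opportunities if opportunities else 1.0

# Illustrative: thousands of hook firings logged, zero gate violations.
print(compliance_rate(0, 4882))  # 1.0
```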
### What's New
No published work applies choice architecture to AI agent enforcement. The existing AI safety literature focuses on:
- **Alignment:** Training the model to want the right things (RLHF, Constitutional AI)
- **Guardrails:** Filtering outputs after generation (content filters, toxicity classifiers)
- **Capability control:** Limiting what the model can do (sandboxing, permission systems)
The toboggan doctrine operates at a different level: it structures the *operational environment* so that the compliant action is the default, the non-compliant action is blocked, and the agent doesn't need to understand why. This is not alignment (the agent's values are unchanged), not guardrails (the filtering happens before generation, not after), and not capability control (the agent retains full capability — it just can't use it on the blocked path).
The closest parallel is Dwork's differential privacy: a formal guarantee that holds regardless of the adversary's strategy. The toboggan hook holds regardless of the agent's "intent" — it fires on the action, not the reasoning.
### Figure 3 - Toboggan Doctrine - Wall-Based vs Channel-Based Governance
```mermaid
%%{init: {"theme": "base", "themeVariables": {"darkMode": true, "background": "#282a36", "mainBkg": "#44475a", "nodeBorder": "#6272a4", "clusterBkg": "#383a4a", "clusterBorder": "#6272a4", "titleColor": "#f8f8f2", "primaryColor": "#44475a", "primaryTextColor": "#f8f8f2", "primaryBorderColor": "#6272a4", "lineColor": "#6272a4", "edgeLabelBackground": "#383a4a"}}}%%
flowchart LR
subgraph Walls["Wall-Based Governance - reactive"]
WA[Agent Action]
WB{Hook Check}
WC[Block or Allow]
WD[Agent re-routes\nor fails]
WA --> WB --> WC --> WD
end
subgraph Channel["Channel-Based Governance - proactive"]
CA[Agent Receives Template]
CB[Compliant path = default path]
CC[Right outcome emerges\nno routing required]
CA --> CB --> CC
end
Walls -.->|"upgrade path"| Channel
classDef danger fill:#5c1a1a,stroke:#ff5555,stroke-width:2px,color:#f8f8f2,font-weight:bold
classDef warning fill:#4a3a1a,stroke:#ffb86c,stroke-width:2px,color:#f8f8f2
classDef success fill:#1a3d2a,stroke:#50fa7b,stroke-width:3px,color:#f8f8f2,font-weight:bold
classDef action fill:#1a3d2a,stroke:#50fa7b,stroke-width:2px,color:#f8f8f2
classDef decision fill:#1a4a5c,stroke:#8be9fd,stroke-width:2px,color:#f8f8f2,font-weight:bold
classDef default fill:#44475a,stroke:#6272a4,stroke-width:1px,color:#f8f8f2
class WA,WD danger
class WB decision
class WC warning
class CA action
class CB,CC success
```
*Figure 3 - Wall-based governance blocks non-compliant actions reactively - agents route around or fail. Channel-based governance structures the environment so the compliant action is the path of least resistance - compliance emerges without enforcement events. The toboggan doctrine operates at the channel level.*
### Evidence
- 25 hooks in production, enforcing compliance via `deny` pattern
- Hook effectiveness tracked in `hook-effectiveness.jsonl`; Pareto analysis (T-690, complete) over 4,882 entries across 16 sessions classified 6 hooks as keepers, 5 as noise, and 12 as observers. See `Hook-Effectiveness-Pareto-Report.md`
- Zero gate violations after compliance framework deployed (Paper 3, [^60])
- "Kill zone" channel metaphor originated in session taupe (2026-03-29)
- 420+ lessons learned, with structural fixes replacing behavioral guidance at 3-strike threshold
- Documented in `Herding-Cats-Toboggan-Doctrine-DRAFT.md` (48K words, Paper 8 candidate)
---
## Academic Foundations
These three patterns did not emerge in a vacuum. They build on established research across multiple fields:
| Researcher | Institution | Contribution | Vault Pattern It Validates |
|-----------|-------------|-------------|---------------------------|
| **John Laird** | Michigan | SOAR cognitive architecture — multiple knowledge types | Knowledge wells + skills + hooks |
| **Maja Matarić** | USC | Structure over intelligence in multi-robot coordination | OC pattern, agent teams, PAT |
| **Carla P. Gomes** | Cornell | Constraint-based multi-agent scheduling | Tetris task primitives |
| **Benjamin Grosof** | MIT/IBM | Defeasible logic — rule systems with exceptions | Hook/rule/skill architecture |
| **Cynthia Dwork** | Harvard | Differential privacy — formal guarantees | Structural deny hooks |
| **Timnit Gebru** | DAIR Institute | Structural governance over capability | Gate enforcement, oversight patterns |
| **Andrej Karpathy** | Eureka Labs | LLM OS — layered operating architecture | Three-layer vault architecture |
Full citations: [[References|References & Bibliography]]
---
## What Makes This System Different
This is not a research paper proposing theoretical patterns. This is a running implementation with measurable outputs:
| Metric | Value |
|--------|-------|
| Total sessions | 270+ |
| Git commits | 6,000+ |
| Scripts in production | 168 |
| Hooks enforcing compliance | 25 |
| Skills (executable procedures) | 33 |
| Knowledge wells | 47 |
| Lessons learned (documented) | 420+ |
| CPI metrics (auto-collected) | ~670 entries |
| Task primitives identified | 13 |
| Recurring task stacks | 6 (93% coverage) |
The system improves itself. Every session deposits data. Every AAR produces fixes. Every fix propagates through templates and skills. The gravity pipeline ensures this happens as a side effect of normal operations, not as a separate improvement program.
Three patterns have no academic precedent. The vault provides the evidence. The published series provides the narrative. This page provides the map.
---
[^61]: Gomes, Carla P., van Hoeve, Willem-Jan, Selman, Bart, & Lombardi, Michele. "Optimal Multi-Agent Scheduling with Constraint Programming." AAAI 2007, pp. 1813–1818.
[^62]: The typed workflow algebra framing: each primitive P has type signature τ(P) = (Input, Output). Composition P₂ ∘ P₁ is valid iff Output(P₁) ⊆ Input(P₂). The six recurring stacks are pre-verified composition chains. This formalizes what practitioners discover empirically: not all task combinations compose cleanly.
[^63]: Laird, John E. "Introduction to the Soar Cognitive Architecture." arXiv:2205.03854, 2022. SOAR integrates procedural, semantic, and episodic memory — the three knowledge types mirrored by vault skills, knowledge wells, and session logs.
[^64]: The potential energy analogy: each artifact deposit moves the system from a higher-entropy state (scattered lessons, unprocessed data) toward a lower-entropy state (structured knowledge, actionable metrics). Entropy H(X) = −Σ p(xᵢ) log₂ p(xᵢ) decreases as artifact deposits constrain the action space for future sessions. Shannon, Claude E. "A Mathematical Theory of Communication." *Bell System Technical Journal*, 27(3): 379–423, 1948.
[^65]: Lyapunov stability: a dynamical system is stable if there exists a scalar function V(x) ≥ 0 (Lyapunov function) with dV/dt ≤ 0. The CPI loop satisfies this: defect frequency V is non-negative and decreases monotonically as fixes propagate through templates and hooks. Lyapunov, A.M. "The General Problem of the Stability of Motion." *International Journal of Control*, 55(3): 531–534, 1892 (translated 1992).
[^66]: Thaler, Richard H. & Sunstein, Cass R. *Nudge: Improving Decisions About Health, Wealth, and Happiness.* Yale University Press, 2008. Choice architecture: the way choices are presented systematically influences which choice is made, independent of rational deliberation.
[^67]: Dwork, Cynthia. "Differential Privacy." ICALP 2006. DOI: 10.1007/11787006_1. The parallel to toboggan hooks: both provide guarantees that hold regardless of adversary strategy, not dependent on adversary intent.
[^68]: Bender, Emily M., Gebru, Timnit, et al. "On the Dangers of Stochastic Parrots." FAccT 2021. DOI: 10.1145/3442188.3445922.
[^69]: Jidoka (自働化): Toyota Production System principle — machines detect defects and stop automatically, rather than passing defects downstream. Poka-yoke (ポカヨケ): mistake-proofing device that makes defects physically impossible to produce. Both from Shingo, Shigeo. *Zero Quality Control: Source Inspection and the Poka-yoke System.* Productivity Press, 1986.
[^70]: Mechanism design (reverse game theory): design rules/incentives such that each agent's self-interested strategy produces the desired collective outcome. Hurwicz, Leonid, Maskin, Eric, & Myerson, Roger B. — 2007 Nobel Prize in Economics for mechanism design theory. The deny hook satisfies incentive compatibility: compliance dominates non-compliance in expected value for any agent, regardless of the agent's objectives.
---
## Navigation
[[Home|← Series Home]] | [[References|Full References]] | [[Paper-3-The-PARA-Experiment|Paper 3: The Live Laboratory]] | [[Paper-7-MDMP-Platform-Blueprint|Paper 7: Platform Blueprint]]
---
## Related
- [[Home|Herding Cats in the AI Age — Home]]
- [[Paper-3-The-PARA-Experiment|Paper 3: The PARA Experiment]]
- [[Paper-7-MDMP-Platform-Blueprint|Paper 7: MDMP Platform Blueprint]]
- [[References|References & Bibliography]]