# THE DIGITAL BATTLE STAFF

## How Two Centuries of Military Doctrine Predicted the AI Agent Coordination Problem

**Jeep Marshall**
LTC, US Army (Retired)
Airborne Infantry | Special Operations | Process Improvement

February 2026

---

## 1. INTRODUCTION: THE DIGITAL BATTLE STAFF

In 1795, Major General Louis Alexandre Berthier was assigned as Chief of Staff to the French Army of Italy. His office organized the army's headquarters into four departments: Movements, Secretariat, Accounting, and Intelligence. When Napoleon Bonaparte took command the following year, he recognized what Berthier had built: a system that allowed one mind to coordinate an army too large for one mind to control.

Napoleon did not fight his campaigns alone. He fought them through a staff — specialized sections that gathered intelligence, planned operations, managed logistics, and tracked personnel, all synchronized through a hierarchical command structure with clear authorities and standardized communication formats.[^1]

That staff system conquered Europe. It also survived Napoleon. The Prussians studied it, refined it, and formalized the Great General Staff — the Grosse Generalstab — that became the template for every modern military. The numbered staff sections that every NATO nation uses today — G-1 Personnel, G-2 Intelligence, G-3 Operations, G-4 Logistics — descend directly from Berthier's four departments. Two hundred years of warfare, from Austerlitz to Afghanistan, validated the core architecture: specialized agents coordinated by a hierarchical command structure under unified intent.[^2]

Now the AI industry has independently arrived at the same design. Every major technology company building multi-agent AI systems has converged on what the military calls command and control — an orchestrator that decomposes complex objectives into subtasks and delegates them to specialized worker agents. Anthropic's multi-agent research system uses a lead agent (Claude Opus) that breaks queries into subtasks for specialized subagents (Claude Sonnet). OpenAI's Agents SDK replaced its experimental Swarm framework with production-ready handoff patterns built on hierarchical delegation. Microsoft merged AutoGen with Semantic Kernel for enterprise orchestrator-worker deployments. Google's Agent Development Kit implements parallel and sequential agent coordination under a central controller. Cursor's production system — which generated over one million lines of code in a single week — uses a strict planner-worker-judge hierarchy. Steve Yegge's Gastown framework, running 20-30 Claude Code instances in parallel, deploys a Mayor (commander) who assigns tasks to ephemeral Polecats (workers) that execute in isolation and report results through a merge queue.[^3]

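The shared pattern is simple enough to sketch in code. The following is a minimal illustration of the orchestrator-worker loop; every name in it (`Subtask`, `decompose`, `worker`) is hypothetical rather than any vendor's API. A lead agent splits an objective into bounded subtasks, farms them out in parallel, and collects only the results back up the hierarchy.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class Subtask:
    objective: str     # what this worker must accomplish
    constraints: str   # left and right limits
    done_when: str     # checkable end state

def decompose(mission: str) -> list[Subtask]:
    """Lead agent / planner: break the mission into bounded subtasks.
    Stubbed here; in practice this is an LLM call."""
    return [
        Subtask(f"{mission}: part {i}", "stay inside assigned module", "tests pass")
        for i in range(3)
    ]

def worker(task: Subtask) -> str:
    """Worker agent: executes one subtask in isolation and reports back.
    Stubbed; in practice an LLM-driven agent with tools."""
    return f"result for {task.objective!r}"

def run(mission: str) -> list[str]:
    tasks = decompose(mission)
    # Workers execute in parallel; only results flow back up the hierarchy.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(worker, tasks))

print(run("refactor the auth service"))
```
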
None of these teams consulted military doctrine. They did not need to. The physics of coordination are the physics of coordination. When you have more agents than one mind can direct, you build a staff. When you have more tasks than one agent can execute, you build a hierarchy. When agents need to share information without drowning in communication overhead, you standardize the message formats. Napoleon knew this. Berthier built it. Two centuries of combat refined it. Silicon Valley is rediscovering it from first principles — and paying the tuition the military already paid.

The tuition is steep. A December 2025 study from Google Research, Google DeepMind, and MIT ran 180 controlled experiments across five agent architectures and three model families. The results demolished the assumption that more agents produce better outcomes. On sequential tasks, multi-agent systems degraded performance by 39% to 70%. In decentralized configurations, errors amplified 17.2 times faster than in single-agent setups. Even centralized coordination — the architecture closest to military C2 — still amplified errors 4.4 times. The computational cost was devastating: hybrid multi-agent systems burned five times the tokens per successful task compared to a single agent. Most of the budget evaporated into agents talking to each other instead of doing the work.[^4]

A companion study from UC Berkeley's Sky Computing Lab, awarded Spotlight designation at NeurIPS 2025, provided the autopsy. Researchers analyzed over 1,600 execution traces across seven state-of-the-art multi-agent frameworks and identified fourteen distinct failure modes organized into three categories: specification and system design, inter-agent misalignment, and task verification. Failure rates ranged from 41% to 86.7% across the frameworks tested. Not edge cases. Not stress tests. Standard operating conditions.[^5]

Sixty-eight percent of all failures occurred either before execution began or after execution ended. The agents failed not because they could not do the work, but because nobody told them what the work was, or nobody checked whether they did it correctly.

Military professionals recognize this distribution instantly. It is the same distribution that drives planning doctrine. The Military Decision-Making Process front-loads the thinking — spending up to 40% of available time on Mission Analysis — precisely because the planning phase is where most operational failures originate.

The convergence is not coincidental. It is structural. Multi-agent coordination is a command-and-control problem, and command and control is the military's core competency. The AI industry is building agents that can reason, plan, and execute. What it has not built — and what the military has refined across seven decades of doctrine development — is the organizational architecture that makes agents effective at scale.

Nate B. Jones, AI strategist and former Head of Product at Amazon Prime Video, synthesized the production data from Cursor, Gastown, and the Google/MIT research into a single observation: "The job is not to make one brilliant Jason Bourne agent running around for a week. It's actually 10,000 dumb agents that are really well coordinated in the system running around for an hour at a time progressively getting work done against a very tight definition of the goal they're accomplishing."[^6]

That sentence describes a battalion operation, not a software deployment. A battalion commander does not send one brilliant soldier to accomplish a brigade-level mission. The commander deploys hundreds of soldiers — each with a narrow task, clear left and right limits, a defined end state, and no need to understand the division commander's strategy. The orchestration complexity lives at the staff level, not in the rifleman's head.

Scaling AI agents is herding cats. The cats are brilliant, tireless, and fast — but nobody told them where the barn is, and half of them are chasing mice that don't exist.[^7]

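Jones's "tight definition of the goal" has a direct military analogue: the task-and-purpose statement of a mission-type order. A minimal sketch of what that discipline looks like as a data structure follows; the field names are my illustration, not any framework's schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskOrder:
    """Illustrative mission-type order for one agent. The fields mirror
    the military's task/purpose/end-state discipline."""
    task: str          # WHAT: the single bounded action
    purpose: str       # WHY: lets the agent exercise judgment inside limits
    limits: list[str]  # left and right boundaries the agent may not cross
    end_state: str     # the checkable definition of done
    time_box: int      # minutes before the agent must report or halt

order = TaskOrder(
    task="migrate the payments module to the new client library",
    purpose="unblock the checkout team's release",
    limits=["touch only services/payments/", "no schema changes"],
    end_state="all payments tests pass on CI",
    time_box=60,
)
print(order.task, "->", order.end_state)
```
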
This paper argues that the military is not just a metaphor for the AI agent coordination problem. The military is the solution.

Two institutions are converging on the same problem from opposite directions. The AI industry builds capability and discovers it needs doctrine. The military builds doctrine and discovers it needs capability. The Center for Strategic and International Studies has published the framework for rethinking the Napoleonic staff model for the age of agentic warfare. The Pentagon has released a strategy to make the Department of War an "AI-first warfighting force," backed by billions in funding and seven Pace-Setting Projects designed to demonstrate AI agent capabilities by mid-2026. Special Operations Command is actively seeking agentic AI demonstrations. The Army has created a new officer career field — AOC 49B — dedicated to AI and machine learning.

The thesis is straightforward: the military solved the multi-agent coordination problem before electricity existed. The AI industry is rediscovering the same principles — and making the same mistakes the military learned to avoid seventy years ago. The organizations that survive the coming shakeout will stop treating AI as a software problem and start treating it as an operations problem. They will hire military planners, process engineers, and quality assurance specialists. They will apply doctrine. And they will build the digital battle staff that two centuries of warfare already designed.

---

## 2. THE NAPOLEONIC MODEL UNDER PRESSURE

### The Architecture That Conquered Europe

Napoleon did not invent the military staff. He perfected it.

Before Napoleon, armies were commanded through personal presence — the general rode to each unit, issued orders directly, and relied on messengers to carry updates across the battlefield. This worked when armies numbered in the tens of thousands. When Napoleon assembled the Grande Armée — 600,000 soldiers for the invasion of Russia in 1812 — personal command became a physical impossibility.[^8]

Berthier's solution was organizational. The Army General Headquarters divided staff functions into specialized departments that processed information, generated plans, and transmitted orders in parallel. The corps system subdivided the Grande Armée into self-contained formations of 10,000 to 50,000 troops, each with its own staff, supply trains, and administrative services. Each corps traveled designated roads, maintained specific foraging areas, and operated semi-autonomously under the commander's intent. Napoleon directed the campaign. The staff translated his intent into executable orders. The corps executed independently within defined boundaries.[^9]

This architecture solved three problems simultaneously. First, span of control: no single mind could track 600,000 soldiers, so the staff created a hierarchy that compressed information upward and distributed decisions downward. Second, communication bandwidth: standardized order formats meant that a courier carrying a two-page operations order could synchronize an entire corps without the commanding general being present. Third, parallel execution: corps could march, forage, and prepare for battle simultaneously because each operated under mission-type orders — the intent was clear even when the specific situation was not.[^10]

The Prussians formalized these innovations into the General Staff system after Napoleon's defeat. By the mid-nineteenth century, every major European army had adopted some version of it. The numbered staff sections — personnel, intelligence, operations, logistics — became universal. The Continental Staff System that NATO uses today traces its lineage directly to Berthier's headquarters in 1795.[^11]

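The span-of-control arithmetic behind that hierarchy is worth making explicit. Assuming, as this sketch does, that one commander can effectively direct roughly three to ten subordinates (a figure consistent with doctrine but chosen here for illustration), the required depth of the hierarchy follows directly from the size of the force:

```python
import math

def levels_needed(n_agents: int, span: int) -> int:
    """Depth of a command hierarchy in which no node
    directs more than `span` subordinates."""
    return math.ceil(math.log(n_agents, span))

for span in (3, 5, 10):
    print(f"span {span}: {levels_needed(600_000, span)} levels")
# span 3 -> 13 levels; span 5 -> 9 levels; span 10 -> 6 levels
```
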
The model worked because it matched the physics of its era. Decisions traveled at the speed of a galloping horse. Battles unfolded over hours or days. The planning cycle — receive the mission, analyze the situation, develop options, issue orders — could unfold over a timeline measured in days without losing operational relevance. For two centuries, the tempo of warfare and the processing speed of the staff were roughly aligned.

That alignment is now broken.

### The Speed Problem

In July 2025, the Center for Strategic and International Studies published "Agentic Warfare and the Future of Military Operations: Rethinking the Napoleonic Staff." Authors Benjamin Jensen and Matthew Strohmeyer of the CSIS Futures Lab delivered a verdict that would have been unthinkable a decade ago: the Napoleonic staff model — the command architecture that won every major conflict from 1800 to 2025 — is too slow for the emerging character of war.[^12]

The fundamental problem is temporal compression. AI agents operate in milliseconds. A multi-agent system can fuse intelligence from a dozen sensor feeds, generate three courses of action, war-game each against adversary responses, and recommend a decision — all within the time it takes a human staff officer to open a laptop. The Napoleonic staff was designed for a decision cycle measured in hours and days. When the adversary operates at machine speed, a staff that plans at human speed is not just slower. It is operationally irrelevant.

Jensen and Strohmeyer's analysis went beyond diagnosis. They developed three alternative models for AI-enabled command structures, each grounded in a distinct theoretical framework from the social sciences, and tested each against operational requirements.[^13]

### Three Models for the AI-Enabled Staff

**The Networked Staff** draws on Bruno Latour's concept of actants — entities (human and non-human) that act within networks to produce outcomes. In this model, smaller staff elements work through functional AI agents that ingest live operational data, doctrine, military history, and theory. These functional agents adjudicate inputs from other functional agents in real time. A small G-2 intelligence section interacts with its AI agent to develop possible enemy courses of action. That output feeds directly to the fires agent, which assesses target areas of interest, cross-references munition inventories, and begins developing fire support tasks and attack guidance matrices — all without waiting for a formal planning meeting. The networked staff eliminates the sequential bottleneck. But it introduces fragmentation risk: without a unifying authority, functional agents may optimize locally while degrading the overall plan.[^14]

**The Relational Staff** builds on Harrison White's network sociology, where identity and control emerge from relational networks rather than hierarchical positions. In White's framework, a "switcher" bridges distinct social domains — "netdoms" — translating between communities that would otherwise operate in isolation. Applied to military operations, the Relational Staff creates AI-powered netdoms aligned with traditional staff sections — maritime, air, cyber, intelligence — that continuously share information and adapt force posture. A human "switcher," likely the commander or chief of staff, evaluates competing options generated by each netdom's AI agents. The relational model enables deep integration across domains.
But the switcher becomes a critical bottleneck: if the human bridge fails or is overwhelmed, the netdoms lose synchronization.[^15]

**The Adaptive Staff** is grounded in Andrew Abbott's insights about how professions create knowledge through non-linear, iterative processes rather than rigid sequences. Abbott observed that professional expertise emerges from interconnected cycles of planning, execution, and assessment that continuously adjust to shifting realities. The Adaptive Staff applies this principle to command and control: the entire planning and operations process becomes an agent-informed evolutionary system. Staff sections interact with planning agents that integrate data on doctrine, lessons learned, historical precedent, and military theory. These planning agents inform operations agents, which in turn generate feedback that updates the planning cycle. AI agents embedded within each process provide real-time data and analysis. Human facilitators ensure alignment with higher-level objectives. The result is a fluid, iterative decision cycle that evolves with changing circumstances rather than following a rigid sequence from planning to execution.[^16]

Jensen and Strohmeyer's assessment: the Adaptive model proved "most effective and resilient." Its advantage lies in the feedback loop. The Networked model processes fast but fragments. The Relational model integrates deeply but bottlenecks. The Adaptive model does both — processing in parallel while continuously adjusting based on outcomes. It is, in effect, a military OODA loop running at machine speed with human oversight at the critical decision points.

### The Timeline That Changes Everything

The CSIS report documents a temporal compression that renders the traditional staff planning cycle obsolete. AI now automates intelligence fusion, refines threat assessments, and recommends courses of action — compressing decision timelines from days to minutes.[^17]

Consider what this means for a brigade combat team conducting operations in a contested environment. Under the traditional Napoleonic model, the staff receives a mission from higher headquarters, conducts mission analysis (four to eight hours for a complex operation), develops courses of action (two to four hours), war-games each option (two to four hours per COA), compares and selects (one to two hours), and produces the operations order (four to eight hours). Total planning timeline: roughly sixteen to twenty-six hours for a single decision cycle. If the enemy disrupts communications during this process, the staff may need to restart with updated intelligence.

Under the Adaptive model, AI agents conduct parallel analysis across all staff functions simultaneously. Intelligence agents fuse sensor data and generate threat assessments. Operations agents develop multiple courses of action. Fires agents assess target sets and munition requirements. Logistics agents model sustainment timelines. All of this happens in minutes rather than hours, with the human commander reviewing AI-generated recommendations rather than waiting for sequential staff briefings.

The difference is not incremental. It is generational. An adversary operating at machine speed will complete three or four decision cycles before a traditionally staffed headquarters completes one. In military terms, this is the equivalent of the German blitzkrieg against the French in 1940 — not a technology advantage, but a tempo advantage enabled by organizational design.

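Reduced to its skeleton, the Adaptive model is a loop rather than a pipeline. The sketch below is an illustration of the cycle the CSIS report describes, not its implementation; the agents are stand-ins for planning, operations, and assessment functions, and the human approval gate sits at the decision point.

```python
from dataclasses import dataclass
import random

@dataclass
class Situation:
    progress: float = 0.0          # fraction of the end state achieved

def plan(situation: Situation) -> str:
    # Planning agent: pick a course of action given the current picture.
    return "press the attack" if situation.progress < 0.8 else "consolidate"

def execute(coa: str) -> float:
    # Operations agent: execute and report observed progress (stubbed).
    return random.uniform(0.1, 0.3)

def human_approves(coa: str) -> bool:
    # Human oversight at the critical decision point; auto-approved here.
    return True

def adaptive_staff(max_cycles: int = 20) -> int:
    """Plan -> approve -> execute -> assess, with each assessment feeding
    the next planning cycle instead of a one-way Napoleonic sequence."""
    sit = Situation()
    for cycle in range(1, max_cycles + 1):
        coa = plan(sit)
        if not human_approves(coa):
            continue                      # human rejects: replan immediately
        sit.progress += execute(coa)      # assessment updates the shared picture
        if sit.progress >= 1.0:
            return cycle                  # end state met
    return max_cycles

print("cycles to end state:", adaptive_staff())
```
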
### The China Problem

The CSIS report places this organizational challenge in stark strategic context. China's strategy to disrupt U.S. decision networks through cyber attacks, electronic warfare, and long-range precision strikes directly targets the Napoleonic model's greatest vulnerability: its reliance on centralized command and reliable communications.[^18]

The traditional staff assumes that information flows vertically — from sensors to analysts to staff officers to the commander and back down to executing units. Every node in this chain is a target. Degrade the communications, and the staff cannot function. Destroy the headquarters, and the entire formation loses direction. China's anti-access/area-denial strategy is designed to do exactly this: isolate U.S. command nodes, overwhelm communication networks, and force American formations to operate without the centralized coordination they depend on.

The Adaptive model counters this by distributing decision authority across AI-enabled nodes that can continue operating when centralized communications fail. If the brigade headquarters loses connectivity, AI agents at the battalion level continue fusing local intelligence, generating courses of action, and recommending decisions — operating under the commander's intent rather than waiting for orders that may never arrive. This is not autonomy for its own sake. It is resilience through distributed intelligence.

Before the CSIS report formalized these concerns, Jensen joined Dan Tadross and Strohmeyer to publish "Agentic Warfare Is Here. Will America Be the First Mover?" in War on the Rocks. The argument had sharpened. The question was no longer whether AI would transform military command structures. The question was whether the United States would transform first — or cede the advantage to an adversary that would.[^19]

### The Recommendations

The CSIS report closes with five recommendations that read less like policy proposals and more like operational imperatives:[^20]

1. **Sustained experimentation** with new command structures — not studies about experimentation, but actual experimentation with AI-enabled staffs in operational settings.
2. **Expanded computing infrastructure** to support AI-enabled decision-making at the tactical edge, not just in rear-area headquarters.
3. **AI-focused officer education** that trains military professionals to work alongside AI agents — not as operators pressing buttons, but as commanders understanding capability and limitation.
4. **Hardened, cyber-resilient decision networks** that can sustain AI-enabled operations under contested conditions.
5. **Rapid learning cycles** that capture lessons from experimentation and field them as updated doctrine at operational speed.

The report stresses what should be obvious but apparently is not: the continued importance of human expertise to manage AI systems, verify recommendations, and assume control when networks fail. The Adaptive staff does not replace the commander. It amplifies the commander. But a commander who does not understand AI capability — who cannot distinguish between a reliable recommendation and an AI-generated hallucination, who cannot diagnose a system failure under operational pressure — is worse than no commander at all.

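The fallback logic this implies is simple to state precisely: continue under cached commander's intent when the uplink drops, fail safe when that intent goes stale, and return control to the human when the link is restored. A sketch under those assumptions (the class and field names are illustrative, not any fielded system's):

```python
import time

class IntentCache:
    """Last commander's intent received; the agent falls back to it
    when the uplink is down. Purely illustrative."""
    def __init__(self, intent: str, issued_at: float, stale_after: float):
        self.intent = intent
        self.issued_at = issued_at
        self.stale_after = stale_after   # seconds before intent expires

    def usable(self) -> bool:
        return time.time() - self.issued_at < self.stale_after

def next_action(uplink_ok: bool, cache: IntentCache) -> str:
    if uplink_ok:
        return "request orders from higher"          # normal centralized mode
    if cache.usable():
        return f"continue under cached intent: {cache.intent}"
    return "hold and attempt to re-establish comms"  # fail safe, not silent

cache = IntentCache("screen the northern axis", time.time(), stale_after=3600)
print(next_action(uplink_ok=False, cache=cache))
```
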
The Napoleonic model served the profession of arms for two centuries because it matched the tempo and complexity of its era. That era is ending. What replaces it will determine whether the United States maintains decision superiority in the conflicts ahead — or discovers, under fire, that the world's most powerful military is running a two-hundred-year-old operating system against adversaries who upgraded.

---

## 3. THE PENTAGON'S AI ACCELERATION

### The Strategy

On January 9, 2026, Secretary of Defense Pete Hegseth released the "Artificial Intelligence Strategy for the Department of War" — a six-page document that represents the most aggressive AI adoption mandate in Department of Defense history. The document rebrands the Department of Defense as the Department of War, eliminates the previous administration's "Responsible AI" ethical framework, and directs the military to become an "AI-first warfighting force" operating at what Hegseth calls "wartime speed."[^21]

The strategy's centerpiece is seven Pace-Setting Projects (PSPs), each assigned a single accountable leader with aggressive timelines:

| # | Pace-Setting Project | Description |
|---|---------------------|-------------|
| 1 | **GenAI.mil** | Secure, enterprise-wide generative AI platform providing frontier LLMs to all military personnel |
| 2 | **Swarm Forge** | Competitive mechanism for developing and fielding autonomous drone swarm tactics |
| 3 | **Agent Network** | Semi-autonomous AI battle management agents operating from strategic to tactical echelons |
| 4 | **Ender's Foundry** | AI-enabled simulation capabilities for training, planning, and wargaming |
| 5 | **Open Arsenal** | Intelligence-to-capability pipeline linking threat assessment to weapons development |
| 6 | **Project Grant** | Transformation of strategic deterrence through AI-enabled capabilities |
| 7 | **Enterprise Agents** | AI agents for back-office administrative systems across the Department |

Each PSP must demonstrate initial capabilities within six months — by July 2026. This is not an aspirational timeline. It is a directed mandate with named accountability.[^22]

### The Money

The FY2026 budget request allocates $13.4 billion specifically for AI and autonomy — the first year the Department has broken out a dedicated budget line for these capabilities. This includes $9.4 billion for unmanned aerial vehicles, $1.7 billion for maritime autonomous systems, $734 million for underwater capabilities, and $1.2 billion for supporting software and cross-domain integration. An additional $200 million targets AI and automation technology directly, while $150 million funds the replacement of legacy business systems with AI-enabled alternatives.[^23]

Beyond the dedicated AI line, the Office of the Under Secretary of Defense (Comptroller) funds approximately $2.5 billion for AI programs, with the largest allocation — $1 billion — directed at improving munition depth and supply chain resiliency through next-generation automated production facilities. Every service branch increased its AI allocation. The Navy alone added $308 million — a 22.7% year-over-year increase.[^24]

These numbers represent a fundamental shift in how the Department invests. Previous AI strategies were aspirational documents with distributed funding. This strategy concentrates resources on seven named projects with measurable deliverables and six-month deadlines.

### The Policy Shift

The strategy's most consequential provision may be its most bureaucratic. The document directs the Chief Digital and AI Officer to establish benchmarks for "model objectivity" as a primary procurement criterion within 90 days — by April 2026.
Within 180 days — by July 2026 — all Department of War AI contracts must incorporate standard "any lawful use" language.[^25]

That contract language eliminates the guardrails that AI companies have imposed on military use of their models. Under the previous administration's approach, each AI company negotiated its own terms of service with the Department — defining what the military could and could not do with commercial models. The "any lawful use" mandate inverts this relationship. The Department will determine lawful use. The companies will provide the capability. Any company that imposes restrictions beyond the legal standard faces exclusion from the largest AI procurement market in the world.

The strategy explicitly addresses the philosophical shift. Under a section titled "Clarifying 'Responsible AI' at the Department of War — Out with Utopian Idealism, In with Hard-Nosed Realism," the document states: "Responsible AI at the War Department means objectively truthful AI capabilities employed securely and within the laws governing the activities of the department." The previous framework's emphasis on equity, transparency, and bias mitigation is replaced by a single criterion: does the AI help fight and win wars?[^26]

Hegseth framed the shift in his January 12 remarks: the Department will not employ AI models "that won't allow you to fight wars" and will judge models on being "factually accurate and mission-relevant without ideological constraints that limit lawful military applications." The Secretary directed that models free from usage-policy constraints that may limit lawful military applications must be prioritized in procurement.[^27]

### GenAI.mil: The Scale of Adoption

The flagship visible deliverable is GenAI.mil — a secure, enterprise-wide platform providing frontier large language models to the Department's three million personnel. Launched in late 2025, GenAI.mil reached 1.1 million unique users by February 2026. Five of six military branches — Army, Air Force, Space Force, Marine Corps, and Navy — have designated it as their primary enterprise AI platform. Only the Coast Guard, which is developing its own "Ask Hamilton" tool, has not adopted it.[^28]

The platform hosts models from multiple frontier AI companies. Google's Gemini was available at launch. Elon Musk's xAI contributed Grok. OpenAI's ChatGPT joined in February 2026 under a contract worth up to $200 million. The platform operates at Impact Level 5, meaning it handles Controlled Unclassified Information and sensitive unclassified data — one step below classified operations.[^29]

The scale is significant. In less than three months, GenAI.mil achieved broader adoption than most enterprise software platforms achieve in years. The 1.1 million users represent a workforce that is now developing operational intuition for AI capabilities and limitations — the kind of hands-on experience that no classroom training can replicate.

### The Frontier Partnerships

The Department's AI procurement strategy rests on direct partnerships with every major frontier AI company. In July 2025, the Chief Digital and AI Officer awarded contracts of up to $200 million each to four companies: OpenAI, Anthropic, Google, and xAI. The purpose: develop agentic AI workflows across warfighting, intelligence, and enterprise mission areas.[^30]

These contracts represent something new in defense procurement. The Department is not buying finished products.
It is buying access to frontier research capabilities and directing that capability toward military applications. Each contract covers model access, fine-tuning for military use cases, and deployment across classification levels.

Pentagon Chief Technology Officer Emil Michael — the Under Secretary for Research and Engineering who now serves as the unified CTO overseeing the Department's innovation ecosystem — has pushed each company to make its models available "across all classification levels" for "all lawful purposes." The message is unambiguous: the Department wants unrestricted access to the most capable AI models in the world, deployed on networks ranging from unclassified internet to Top Secret/SCI systems.[^31]

### The Organizational Restructuring

The strategy arrived alongside a sweeping organizational redesign. In January 2026, a companion memorandum — "Transforming the Defense Innovation Ecosystem to Accelerate Warfighting Advantage" — unified the Department's fragmented innovation infrastructure under the CTO. The Chief Digital and AI Office (CDAO), previously an independent organization reporting directly to the Deputy Secretary, was restructured under the Under Secretary for Research and Engineering. The Silicon Valley-based Defense Innovation Unit (DIU) and the Strategic Capabilities Office (SCO) were reestablished as departmental "Field Activities" under the CTO's authority.[^32]

The consolidation eliminated the bureaucratic fragmentation that had slowed previous AI initiatives. Under the old structure, the CDAO, DIU, SCO, and service-specific AI offices operated with overlapping mandates and competing priorities. Under the new structure, a single CTO — Michael — owns the entire pipeline from research through prototype through fielding. The organizational design mirrors what the CSIS report recommended: unified command with clear authority.

### What It Means

The Department of War AI Strategy is not a technology plan. It is an organizational transformation plan that uses technology as its mechanism. The seven Pace-Setting Projects span the full range of military operations — from enterprise back-office functions (PSP 7) to battlefield decision-making (PSP 3) to strategic deterrence (PSP 6). GenAI.mil puts frontier AI in the hands of over a million users. The frontier partnerships give the Department access to the most capable models being built anywhere. The "any lawful use" mandate removes the friction between commercial AI development and military application. The CTO consolidation creates a single authority to drive execution.

The timeline — six months for initial demonstrations — is deliberately compressed. Hegseth rejected what he called the "linear" model that moves from laboratory to program of record over many years. The strategy demands concurrent development: field capability, learn from use, iterate, and field again. This is the rapid learning cycle that the CSIS report recommended — implemented not as an experiment but as policy.[^33]

The scale of this acceleration creates its own risks. Deploying frontier AI models across classification levels in six months leaves minimal time for the kind of systematic testing that prevents catastrophic failure. The 41% to 86.7% failure rates documented in the UC Berkeley MAST taxonomy apply to military multi-agent systems just as they apply to civilian ones. The fourteen failure modes — disobeyed specifications, inter-agent misalignment, absent verification — do not respect institutional boundaries. An AI agent that fails to verify its output is equally dangerous whether it is writing marketing copy or developing a target nomination.

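The verification failure mode in particular has a cheap structural countermeasure: no agent output merges until an independent check passes. A minimal sketch follows; the judge here is a stubbed string check, and every name is mine for illustration (in practice the check would be a test suite or a separate model).

```python
from dataclasses import dataclass

@dataclass
class Result:
    task: str
    output: str

def judge(result: Result, done_when: str) -> bool:
    """Independent verification step, separate from the worker that
    produced the output. Stubbed as a substring check."""
    return done_when.lower() in result.output.lower()

def rework(result: Result) -> str:
    # Worker retries with the judge's feedback (stubbed).
    return result.output + " [revised: tests pass]"

def accept(result: Result, done_when: str, max_retries: int = 2) -> Result:
    for _ in range(max_retries + 1):
        if judge(result, done_when):
            return result                          # verified: merge the work
        result = Result(result.task, rework(result))  # send it back, don't ship it
    raise RuntimeError(f"unverifiable after {max_retries} retries: {result.task}")

print(accept(Result("write parser", "draft complete"), done_when="tests pass").output)
```
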
The Department is betting that speed of adoption will outweigh the risks of premature deployment — that an imperfect AI-enabled force fielded now is more valuable than a perfect one fielded too late. It is a bet the military has made before. The question is whether the doctrine exists to manage what the technology delivers.

The Pace-Setting Projects will answer that question by July 2026. The answer will shape not just the American military, but every organization on earth that deploys AI agents at scale.

---

## 4. SOCOM — WHERE AGENTS MEET OPERATORS

Special Operations Command does not wait for doctrine to catch up. It never has. While the conventional force debates organizational charts and publishes field manual updates on eighteen-month cycles, SOCOM runs experiments, breaks things, and fields what works. That institutional DNA makes it the natural proving ground for agentic AI — and SOCOM knows it.

On December 15, 2025, SOCOM posted a Request for Information on SAM.gov for Technical Experimentation 26-2, an event scheduled for April 13-17, 2026, at Avon Park Air Force Range, Florida.[^34] The RFI was not asking for chatbots. It was not asking for dashboards. SOCOM was asking for AI systems that can "reason, adapt to their environments, and make their own decisions with human-like agency" — language that explicitly distinguishes agentic AI from the rule-based automation that has characterized military software for decades.[^35]

The distinction matters. Rule-based systems execute predetermined logic trees. Agentic systems assess situations, generate options, and act. One follows orders. The other exercises judgment. SOCOM wants the latter.

### The Eight Domains

SOCOM identified eight focus domains for agentic AI integration into special operations:

| # | Domain | Operational Application |
|---|--------|------------------------|
| 1 | Software development and integration | Accelerate tool building for SOF-specific needs |
| 2 | Cybersecurity and business intelligence | Defensive and offensive cyber operations |
| 3 | Decision support | Commander and staff augmentation |
| 4 | Intelligence gathering and analysis | Fusion of multi-source intelligence |
| 5 | Mission planning | Automated MDMP and deliberate planning |
| 6 | Mission execution | Real-time operational support |
| 7 | Mission control | Command and control during operations |
| 8 | Tactical workflow support | Streamline SOF-specific processes |

The technology areas under evaluation read like an AI agent framework specification: agentic protocols and agent-to-agent communication, agentic workflows and orchestration, human-machine teaming, knowledge representation systems, low size-weight-and-power-compute (SWaP-C) solutions, AI agent frameworks, metrics and accuracy assessment, and collaborative autonomous systems optimization.[^36]

That last item — collaborative autonomous systems optimization — is the multi-agent coordination problem, stated in military language. SOCOM is not building a single super-agent. It is building teams of agents that must coordinate under operational conditions, on hardware that fits in a rucksack, in environments where connectivity is unreliable.

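Agent-to-agent communication under those constraints pushes toward fixed-format messages, for the same reason the military standardized reports like SALUTE decades ago: a structured message survives a narrow, lossy channel and parses without interpretation. An illustrative schema (mine, not anything from the RFI):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SpotReport:
    """Fixed-format agent-to-agent report in the spirit of the military's
    SALUTE report: every field mandatory, so receivers parse structured
    data instead of interpreting free text. Schema is illustrative."""
    sender: str
    observation: str
    location: str
    time_utc: str
    confidence: float     # 0.0-1.0; forces the sender to commit
    action_taken: str

msg = SpotReport(
    sender="recon-agent-3",
    observation="two vehicles at checkpoint",
    location="38SMB 123 456",
    time_utc="2026-04-14T03:20Z",
    confidence=0.7,
    action_taken="continuing observation",
)
print(json.dumps(asdict(msg)))   # wire format: structured, not prose
```
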
### The Safety Constraint

One line in the RFI deserves particular attention. SOCOM officials acknowledged that "online learning is not allowed" in kinetic operations, as it "may lead to undesired behavior."[^37]

This is a doctrinal constraint, not a technical limitation. The military has decades of experience with systems that adapt in real time — electronic warfare suites, radar warning receivers, adaptive jamming platforms. The problem is accountability. If an AI agent learns a new behavior during a live operation and that behavior produces an unintended outcome, the chain of command cannot explain what happened or why. In garrison, you can tolerate adaptation and study the results. In combat, you need predictability.

The civilian AI industry has no equivalent constraint. Commercial agents learn, adapt, and modify their behavior continuously. The military position is that this freedom is a liability, not an asset, when lives depend on the output. This is a design philosophy that the enterprise AI world would benefit from examining.

### The SOCOM AI Ecosystem

TE 26-2 is not an isolated experiment. It sits within an accelerating portfolio of SOCOM AI initiatives that together reveal a command-wide transformation.

In January 2026, SOCOM posted a separate notice exploring how AI can process biometrics and other data gathered by operators during sensitive site exploitation. The command sought industry solutions for facial recognition, speaker identification, and DNA profiling capabilities that would allow operators to generate a DNA profile and compare it against existing databases to help decide whether to hold or release a target within twenty-four hours.[^38] This is the intelligence-to-action cycle compressed from weeks to a single operational period.

In February 2026, SOCOM — working through SOFWERX — posted another notice exploring how AI and automated solutions could accelerate the Integrated Survey Program, which provides detailed tactical planning data for key diplomatic facilities and vessels. The command specifically asked whether AI could map compounds without existing data, automate route data collection between key terrain features, and rapidly photograph buildings.[^39] Every one of these tasks currently requires deploying trained surveyors to hostile or semi-permissive environments. AI replaces boots on the ground with bytes in the cloud.

General Bryan Fenton, then the SOCOM commander, articulated the strategic vision before departing command in October 2025. "We see the emergence of 'physical AI,' meaning the convergence of AI, autonomy, and robotics, as the big bet and principal to the success of future warfighting," Fenton stated. He described a "special ops renaissance" in which technologies such as distributed AI and autonomy give smaller teams an edge against larger adversaries — AI that helps operators sift through "the mammoth mountains and glaciers of data that we have," while autonomy enables swarms of drones coordinated by a single controller.[^40]

The infrastructure is following the vision. SOCOM's SOFWERX is evaluating industry hardware solutions to upgrade remote locations with high-performance GPU servers supporting large language model workloads for over 100 concurrent users. The requirement specifies turnkey rack-mounted server solutions including GPUs, memory, storage, networking, cooling, and power infrastructure, ready for immediate deployment at remote data centers.[^41]

### Why SOF Is the Natural Proving Ground

The convergence of SOCOM and agentic AI is not accidental. It is structural.

Special Operations Forces share four characteristics with the environments where AI agents perform best:

**Small teams.** A twelve-person Operational Detachment Alpha operates with a level of autonomy that conventional battalions cannot match. Small teams can adopt, test, and discard technology faster than large formations. AI agents designed for small-team integration face fewer coordination challenges and faster feedback loops.

**High autonomy.** SOF operators make consequential decisions at the tactical level without waiting for higher headquarters approval. They are trained to exercise judgment in ambiguous situations. This is exactly the operational model that agentic AI requires — a human who can evaluate AI recommendations and override them when necessary, without waiting for a committee.

**Adaptive doctrine.** Special Operations doctrine evolves at the speed of the threat. Techniques, tactics, and procedures update continuously based on operational feedback. This doctrinal agility means SOCOM can absorb AI capabilities and codify their employment faster than any other part of the military.

**Rapid feedback loops.** SOF operations generate immediate performance data. A mission succeeds or fails within hours, not months. This feedback density is ideal for evaluating AI agent performance and iterating on system design.

The civilian AI industry builds agents for enterprise environments — large organizations, bureaucratic approval processes, long evaluation cycles. SOCOM operates at the opposite end of that spectrum. If agentic AI works for a Special Forces team conducting a direct action mission in a denied environment, it will work for a Fortune 500 company managing its supply chain. The reverse is not necessarily true.

---

## 5. BLUE — AUTOMATING THE MILITARY DECISION-MAKING PROCESS

If Section 4 describes where agentic AI meets the battlefield, this section describes where it meets the planning process.

The Military Decision-Making Process is the most structured, most tested, and most widely taught planning methodology in the Department of Defense. It is also the most labor-intensive. A brigade-level MDMP cycle typically consumes 48 to 72 hours of continuous staff work — terrain analysis, intelligence preparation, course of action development, war-gaming, synchronization. The question is not whether AI will automate parts of this process. The question is how far and how fast.

The most concrete answer comes from a startup in Bellevue, Washington.

### Exia Labs and the Blue System

Exia Labs was founded by Jonathan Pan, a U.S. Army veteran who spent years in the video game industry at Amazon, Meta, and Walmart before co-founding the company with Serj Kazar, a former colleague at Riot Games.[^42] The combination is not accidental. Video game development demands the same skills that military AI planning requires: real-time simulation, procedural scenario generation, adaptive adversary modeling, and decision trees that branch based on player actions.

Exia raised $2.5 million in pre-seed funding from Anorak Ventures, Pathbreaker Ventures, Mana Ventures, a16z Speedrun, and Space Capital.[^43] With that capital, they built Blue — a system of AI workflow agents, each designed to automate a specific step of the Military Decision-Making Process.

### How Blue Works

Blue does not replace MDMP. It accelerates it. The system deploys specialized AI agents for each phase of the process:

**Receipt of Mission Agent.** Operators upload Operations Orders, Warning Orders, annexes, and Concepts of Operations. The agent categorizes the content into structured fields: enemy intent, terrain features, friendly forces, and tasks to subordinate units. What takes a staff section hours of reading and annotation, the agent accomplishes in minutes.[^44]

**Mission Analysis Agent.** This agent conducts METT-TC and OAKOC terrain analysis workflows — the foundational analytical frameworks that every Army planner learns at the Infantry Officer Basic Course and refines through years of practice. The agent does not invent new analysis. It applies existing doctrine at machine speed, surfacing the factors that human planners must then weigh with professional judgment.

**Course of Action Generation Agent.** This is where Blue moves from analysis to synthesis. The agent generates multiple courses of action, which are then refined and tested through AI simulation. Each COA represents a distinct approach to accomplishing the mission, with different risk profiles, resource requirements, and decision points — exactly as MDMP prescribes.[^45]

The architecture mirrors the staff structure that MDMP was designed for. Just as a brigade S-2 handles intelligence, an S-3 handles operations, and an S-4 handles logistics, Blue assigns specialized agents to specialized tasks. The orchestration challenge — ensuring these agents produce coherent, synchronized output — is the same challenge that a Chief of Staff manages on a battle staff.

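The pipeline shape is worth seeing concretely. Below is a minimal sketch of how such a chain of specialized planning agents might be wired together; the function names and stubbed outputs are mine for illustration, not Exia's implementation.

```python
def receipt_of_mission(documents: list[str]) -> dict:
    """Parse OPORDs/WARNORDs into structured fields (stubbed LLM call)."""
    return {"enemy": "...", "terrain": "...", "friendly": "...", "tasks": documents}

def mission_analysis(mission: dict) -> dict:
    """Apply METT-TC / OAKOC style analysis to the structured mission."""
    return {**mission, "constraints": ["..."], "key_terrain": ["..."]}

def generate_coas(analysis: dict, n: int = 3) -> list[dict]:
    """Produce n distinct courses of action for human comparison."""
    return [{"coa": i, "risk": f"profile {i}", "basis": analysis["constraints"]}
            for i in range(1, n + 1)]

# Each agent's structured output is the next agent's input -- the same
# handoff discipline a chief of staff enforces between staff sections.
coas = generate_coas(mission_analysis(receipt_of_mission(["OPORD 26-04"])))
print(len(coas), "courses of action ready for the commander")
```
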
### Testing with the Force

Blue is not a PowerPoint concept. Units from the 101st Airborne Division and the Washington Army National Guard are testing the system in both garrison environments and field exercises.[^46] Exia Labs has also established a Cooperative Research and Development Agreement with the United States Military Academy at West Point, focused on developing AI red agents for strategic wargaming.[^47] And through the Defense Innovation Unit's Blue Object Management Challenge, Exia delivered its Keystone product to the U.S. Navy, extending the MDMP automation concept to the Joint Planning Process for cross-service application.[^48]

The progression is deliberate: prove the concept with Army planners, validate with the force, extend to joint operations, and build the next generation of planners at West Point. This is how doctrine propagates.

### The CGSC Experiment: Proof of Concept

The most rigorous validation of AI-augmented MDMP came in November 2025, when faculty at the U.S. Army Command and General Staff College conducted a wargaming experiment integrating large language models into the MDMP process. The results were published in Small Wars Journal in January 2026.[^49]

The experiment simulated a four-star Joint Task Force operation in an Indo-Pacific theater, tasked with defeating a peer adversary's anti-access/area-denial network and restoring maritime access amid contested logistics and cyber threats. The research team employed a custom AI agent called Vantage with a 128,000-token context window loaded with the full Joint Task Force exercise scenario, relevant Joint Publications, enemy battle books, and missile-mathematics probability tables developed for multi-domain operations.[^50]

Two findings from this experiment have direct implications for the broader AI agent industry.

**First: simplified, intent-focused prompts produced "far more realistic adjudications" than detailed prompts.** This was counterintuitive. The research team initially provided exhaustive, step-by-step instructions to the AI — the equivalent of micromanaging a staff officer. The detailed prompts paradoxically constrained the AI's reasoning. When the team shifted to providing commander's intent and allowing the AI to apply its doctrinal knowledge base autonomously, the outputs improved dramatically.[^51]

This finding validates a core principle of MDMP that has been taught at every Army school for decades: tell subordinates *what* to accomplish and *why*, not *how*. Commander's intent enables initiative. Detailed instructions constrain it. The same principle, it turns out, applies to AI agents.

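The contrast is easy to illustrate. The two prompt fragments below are paraphrases I constructed to show the shape of the difference; they are not the CGSC team's actual prompts.

```python
# Illustrative only -- paraphrasing the contrast, not the experiment's prompts.

micromanaged = """Step 1: list every enemy unit in the scenario.
Step 2: for each unit, compute attrition using table 3.
Step 3: format the result as a 12-column matrix.
Step 4: ..."""
# Detailed process instructions: the 'how' is dictated, reasoning is constrained.

intent_based = """You are the wargame adjudicator. Commander's intent:
determine whether the friendly battalion can fix the enemy force while
preserving combat power. Apply joint doctrine and the loaded
probability-of-kill tables. Show your reasoning."""
# Task, purpose, and end state: the 'what' and 'why', with the 'how' released.

# The CGSC finding: the second style produced far more realistic
# adjudications than the first.
```
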
**Second: the AI correctly applied doctrinal effects calculations that human staff had underweighted.** In one case, the AI adjudicated that a friendly battalion of three companies could fix enemy forces of five-plus companies while retaining 75 percent combat power. The human staff doubted the assessment. Upon inquiry, the AI's reasoning proved sound — it had properly calculated Close Air Support and artillery effects per doctrinal probability-of-kill tables, revealing a hidden assumption in the staff's analysis that underweighted joint fires.[^52]

This is not a story about AI being smarter than humans. It is a story about AI being more disciplined. The human planners subconsciously discounted the impact of joint fires because their experience was weighted toward ground-centric operations. The AI had no such bias. It applied the doctrine as written.

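The arithmetic at issue is straightforward expected-value math. The sketch below uses entirely notional numbers of my own choosing (not doctrinal probability-of-kill tables, and omitting friendly attrition) to show why a 3:5 force ratio can still support a fixing mission once joint fires are counted:

```python
# Notional numbers for illustration only -- not actual doctrinal pK tables.
enemy_companies = 5.0
friendly_companies = 3.0

# Joint fires applied before and during the fix: each mission attrits an
# expected fraction of an enemy company (pK x coverage, expected value).
cas_sorties, kill_per_sortie = 6, 0.25      # 6 CAS sorties, 0.25 companies each
arty_missions, kill_per_mission = 8, 0.15   # 8 fire missions, 0.15 companies each

enemy_after_fires = enemy_companies - (
    cas_sorties * kill_per_sortie + arty_missions * kill_per_mission
)
# 5.0 - (1.5 + 1.2) = 2.3 effective enemy companies left to fix

ratio = friendly_companies / enemy_after_fires
print(f"effective ratio {ratio:.2f}:1")     # ~1.30:1 -- feasible for a fixing force
```

A staff that eyeballs 3 companies against 5 sees an infeasible mission; a calculator that applies the fires tables first sees roughly 3 against 2.3. That is the hidden assumption the AI surfaced.
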
A separate Small Wars Journal article in February 2026, by LTC (Ret.) Thad Weist and Majors Skyler Kepley and Braxton Musgrove, reinforced the finding: "The most critical lesson is that AI is a tool to augment, not replace, professional military judgment. The human 'gut check' remains the ultimate arbiter of realism."[^53]

### The GenWar Initiative

The CGSC experiment was not isolated. Johns Hopkins Applied Physics Laboratory established its GenWar Lab, integrating large language models with the Advanced Framework for Simulation, Integration, and Modeling to make high-fidelity simulations accessible through natural language interaction. GenWar enables running potentially hundreds of wargames in a matter of days, exploring a far wider range of scenarios than conventional methods allow. Players describe intended actions in natural language, and the system translates them into executable simulations.[^54] A dedicated GenWar Lab facility at APL's Laurel campus is expected to open in 2026.

### Connecting the Thesis

MDMP front-loads the thinking. Military commanders spend 40 percent of their planning time on Mission Analysis alone — understanding the problem before generating solutions. Blue validates this principle computationally: the AI does better work when given commander's intent than when given step-by-step instructions. The more you constrain the agent, the worse it performs. The more you define the *what* and *why* and release the *how*, the better the output.

This is not a new insight for anyone who has trained a battalion staff through a BCTP rotation or led a planning team through a Joint Planning Group. But it is a profound insight for the AI industry, which has spent years building increasingly detailed prompt chains, instruction sets, and guardrails — the computational equivalent of micromanagement. The Prussian army codified this lesson as Auftragstaktik in the nineteenth century. The AI industry is learning it now.

---

## 6. THE DEFENSE AI INDUSTRIAL BASE

Behind every military AI initiative sits an industrial base that is reshaping the defense technology landscape. The companies building this infrastructure are not traditional prime contractors iterating on Cold War platforms. They are software-first companies that treat AI as a core capability, not an add-on feature.

Five companies define the contours of this emerging industrial base.

### Palantir Technologies

Palantir occupies a position in the defense AI ecosystem that no other company can match. It is the connective tissue between intelligence, operations, and decision-making at the highest classification levels.

The centerpiece is Maven Smart System. Originally awarded at $480 million, the contract was raised to nearly $1.3 billion through 2029 after the Department of Defense cited "growing demand" — a $795 million increase that reflects operational appetite, not bureaucratic momentum.[^34] Maven deploys across five combatant commands: Central Command, European Command, Indo-Pacific Command, Northern Command/NORAD, and Transportation Command. It uses AI algorithms and machine learning to scan, identify, and prioritize enemy systems by fusing data from intelligence, surveillance, and reconnaissance sources. As of early 2026, there are more than 20,000 active Maven users across more than 35 military service and combatant command software tools in three security domains.[^35]

The speed of adoption is remarkable. When NATO inked a deal for Maven Smart System on March 25, 2025, the alliance began using it within 30 days — one of the most expeditious procurements in NATO history, taking only six months from outlining the requirement to fielding the system.[^36]

Palantir's reach extends well beyond Maven. The company won a $178 million contract for TITAN — the Tactical Intelligence Targeting Access Node — a ground station connecting Army units to high-altitude and space sensors, using AI and machine learning to automate target recognition and reduce sensor-to-shooter timelines. The program delivers both "advanced" systems connecting to space assets and "basic" tactical systems at lower echelons.[^37]

In July 2025, the Army awarded Palantir a single Enterprise Service Agreement worth up to $10 billion over ten years, consolidating 75 existing contracts — 15 prime contracts and 60 related subcontracts — into one vehicle. The consolidation eliminates reseller pass-through fees and accelerates software delivery to the force.[^38]

The most consequential Palantir program may be the least visible. Gotham, Palantir's intelligence platform, serves as the central digital infrastructure for the Golden Dome missile defense system. The system integrates satellite and sensor data from multiple technology partners, with AI models that ingest, organize, and act upon vast data flows — predicting, flagging anomalies, and modeling threat trajectories before they materialize.[^39]

Palantir holds Impact Level 6 accreditation, meaning its systems are authorized to process secret-level national security data. This accreditation enabled the Anthropic Claude integration through Palantir's platform — the same integration that powered AI support during the January 2026 Maduro operation and triggered the subsequent Pentagon-Anthropic dispute.[^40]

The Artificial Intelligence Platform (AIP) extends Palantir's capabilities into autonomous agent territory. AIP enables military operators to deploy reconnaissance drones, generate courses of action, and devise tactical responses through AI-assisted workflows — moving from intelligence fusion to operational decision support within a single platform.

### Shield AI

If Palantir provides the brain, Shield AI provides the reflexes. The company's flagship product, Hivemind, is autonomy software that pilots aircraft without GPS, communications, or a remote pilot. When the datalink drops, Hivemind does not loiter and wait for reconnection. It continues the mission.

In February 2026, the U.S. Air Force selected Shield AI as the mission autonomy provider for the Collaborative Combat Aircraft program, integrating Hivemind into Anduril's Fury drone (designated YFQ-44A). This marks the first time mission autonomy software has been formally decoupled from the aircraft platform in a major defense program — the software brain chosen independently of the hardware body.[^41]

The Navy demonstration in December 2025 was equally significant. At Point Mugu Sea Range, two BQM-177A subsonic drones flew autonomously under Hivemind control, connected to a Live Virtual Constructive environment with a virtual F/A-18 acting as mission lead. When simulated adversary aircraft attempted to penetrate protected airspace, the autonomous drones independently maneuvered to counter the threats — executing defensive combat maneuvers without continuous pilot input. The remote operator's role was reduced to safety oversight while Hivemind handled perception, decision-making, and maneuver execution.[^42]

In January 2026, India selected Shield AI to provide V-BAT unmanned aircraft systems with Hivemind autonomy software to the Indian Army, including domestic manufacturing through JSW Defence at a $90 million facility in Hyderabad.[^43] The HII partnership, announced at DSEI 2025, extends Hivemind into the maritime domain, pairing with HII's Odyssey autonomy suite for cross-domain operations across air, land, surface, and undersea environments.[^44]

Hivemind's design philosophy embodies a principle the military has enforced for decades: systems must work when communications fail. In contested environments — where GPS is jammed, radio frequencies are denied, and satellite links are degraded — autonomy is not a feature. It is a survival requirement.

### Anduril Industries

Anduril builds the infrastructure that connects sensors, shooters, and decision-makers. Its Lattice operating system is an AI-powered command and control platform with an open software development kit supporting both REST and gRPC APIs — meaning any developer can build applications that integrate with the military's sensor and weapons networks.[^45]

SOCOM awarded Anduril a three-year, $86 million contract as "Mission Autonomy Systems Integration Partner," providing Lattice as the foundation for SOCOM's autonomy management infrastructure. The contract covers development and deployment of mission autonomy software for multi-domain uncrewed systems, enabling coordination of control systems, sensors, weapons, and payloads across multiple vehicles.[^46]

The Army selected Anduril's Lattice for the Integrated Battle Command System Maneuver (IBCS-M) program — the service's next-generation fire control platform for counter-UAS missions. During testing at Yuma Proving Ground, Lattice integrated a previously undisclosed sensor and effector within hours, executed live-fire intercepts achieving four out of four kills, and demonstrated autonomy-enhanced fire control with distributed tracking.[^47]

Anduril's Fury drone (YFQ-44A) completed its first flight on October 31, 2025, at a California test site. The flight was semi-autonomous — "no operator with a stick and throttle flying the aircraft behind the scenes," as Anduril stated.
Weapons integration is underway, with the first live shot planned for 2026.[^48]

The scale of Anduril's ambition is visible in steel and concrete. Arsenal-1, a 5-million-square-foot manufacturing facility on 500 acres near Rickenbacker International Airport in Ohio, represents nearly $1 billion of Anduril's own investment. Production begins in July 2026 with three autonomous air systems: Fury, Roadrunner, and Barracuda. It is the largest single job-creation project in Ohio history, with 4,008 employees projected by 2035.[^49]

Anduril also partners on Thunderforge, Scale AI's flagship wargaming initiative, providing Lattice as part of the integrated planning ecosystem alongside Microsoft.

### Scale AI

Scale AI occupies the data layer of the defense AI stack. While Palantir integrates intelligence and Anduril integrates platforms, Scale provides the data infrastructure, labeling, evaluation, and generative AI capabilities that make military AI systems accurate and deployable.

Thunderforge is Scale's marquee defense program. Awarded by the Defense Innovation Unit, Thunderforge integrates AI into military operational and theater-level planning, fusing cutting-edge modeling and simulation tools. The system deploys AI agents that simulate wargaming scenarios and refine proposed courses of action, delivered initially to U.S. Indo-Pacific Command and U.S. European Command for campaign development, theater-wide resource allocation, and strategic assessment.[^50]

Scale's Donovan platform operates on classified networks including SIPRNet and JWICS, with FedRAMP High authorization and Department of Defense Impact Level 4 certification. Donovan integrates LLM-based tools including geospatial analysis, text-to-API translation, and retrieval-augmented generation to support operational decisions "at mission speed."[^51]

The contract portfolio is substantial: a $99 million Army Research and Development contract through Aberdeen Proving Ground covering data labeling, annotation, generative AI dataset creation, model development, and testing and evaluation through 2030. A separate $100 million five-year agreement through the Chief Digital and Artificial Intelligence Office provides AI tools on Top Secret and Sensitive Compartmented Information networks.[^52]

Scale's position in the defense ecosystem reflects a deeper truth about military AI: the models are only as good as the data they train on and the evaluations that validate them. The military learned this lesson with intelligence analysis decades ago. Bad intelligence produces bad plans. Bad data produces bad AI.

### Helsing

Helsing represents the European dimension of the defense AI industrial base — and a reminder that the United States does not have a monopoly on this technology.

Valued at 12 billion euros following a 600-million-euro Series D round in June 2025 — led by Spotify co-founder Daniel Ek's Prima Materia, with participation from Lightspeed Ventures, Accel, and Saab — Helsing is Europe's most valuable defense startup at just four years old.[^53]

Helsing's Centaur AI produced a milestone that will define the era. In May and June 2025, working with Saab, Helsing flew a Gripen E fighter jet autonomously in beyond-visual-range combat against a human-piloted Gripen D in a trial known as "Project Beyond." This was the first publicly known instance of an AI-piloted fighter jet executing autonomous combat maneuvers against a human adversary.[^54]

The implications are not theoretical.
Helsing's CA-1 Europa is an autonomous uncrewed combat aerial vehicle weighing approximately four tonnes, designed for high subsonic speeds, with a weapons capacity of 500 kilograms and an operational range of 1,400 to 1,800 kilometers. First flight is targeted for 2027, with entry into service by 2029. The CA-1 operates either independently, in swarms, or as a wingman to crewed fighters — powered by the same Centaur AI that flew the Gripen.[^34]

Below the surface, Helsing's Lura system uses a large acoustic model to detect underwater acoustic signatures 10 times quieter than other AI models and at speeds 40 times faster than human operators, deployed on SG-1 Fathom autonomous underwater gliders that can patrol for 90 days without resurfacing. A single operator can task and monitor hundreds of gliders from a Maritime Headquarters at 10 percent of the cost of crewed anti-submarine warfare patrols.[^35]

### THE INDUSTRIAL BASE SUMMARY

| Company | Key Platform | Representative Contract Value | Primary Military Application |
|---------|-------------|-------------------------------|------------------------------|
| **Palantir** | Maven Smart System / Gotham / AIP | $1.3B (Maven) + $10B (Army EA) + $178M (TITAN) | Intelligence fusion, C2, missile defense |
| **Shield AI** | Hivemind | USAF CCA program (value undisclosed) | Autonomous flight without GPS/comms |
| **Anduril** | Lattice OS / Fury CCA | $86M (SOCOM) + Arsenal-1 ($1B investment) | Autonomous C2, counter-UAS, manufacturing |
| **Scale AI** | Thunderforge / Donovan | $99M (Army R&D) + $100M (CDAO) | Data infrastructure, wargaming, classified AI |
| **Helsing** | Centaur / CA-1 Europa / Lura | EUR 12B valuation (Series D) | Autonomous air combat, undersea detection |

These five companies share a design philosophy that distinguishes them from the traditional defense industrial base. They build software first and integrate hardware second. They treat AI as the platform, not the feature. They iterate at commercial speed and deploy at military scale. And they are collectively building the infrastructure for a defense establishment that will be fundamentally different from the one that exists today.

The combined contract values approach $15 billion. The combined capabilities span every domain of warfare: air, land, sea, undersea, space, and cyber. The combined workforce is building not just products but an industrial capacity — factories, classified networks, training programs, and evaluation frameworks — that will define military AI for the next generation.

What none of them have built yet is the coordination layer that makes all of these systems work together. That is the problem that military doctrine was designed to solve — and the subject of the remaining sections of this paper.

---

## 7. THE MADURO RAID — WHEN AI GOES TO WAR

At 10:46 PM Eastern Standard Time on January 2, 2026, President Donald Trump gave the execute order. Operation Absolute Resolve — months of planning and rehearsal compressed into a single night — went live. Delta Force operators, supported by more than 150 aircraft operating from 20 bases and ships, launched from multiple staging areas. Navy, Air Force, and Marine Corps fighter jets and bombers provided air cover. Helicopters from the 160th Special Operations Aviation Regiment — the legendary Night Stalkers — began ingress at 2:01 AM Venezuelan local time, carrying assault teams low over the Caribbean and into the heart of Caracas. The ground time was approximately two hours.
By 5:21 AM Venezuelan time, President Nicolas Maduro and his wife Cilia Flores were in American custody, extracted from a compound defended by Cuban military and intelligence personnel who had served as Maduro's personal security force.[^55] Seven U.S. service members were injured. No Americans were killed. The U.S. government assessed approximately 75 people died in the operation, including Venezuelan security forces and 32 Cuban soldiers — among them Colonel Humberto Roca, who had previously been responsible for Fidel Castro's personal security. Small Wars Journal published its analysis within days: "Operation Absolute Resolve: Anatomy of a Modern Decapitation Strike."[^56]

The operation succeeded. By every conventional military metric — objective achieved, friendly casualties minimized, enemy capability neutralized — Absolute Resolve was a textbook special operations raid.

What followed was not a military failure. It was a doctrine failure. And it revealed every gap that this paper is designed to address.

### The Phone Call

Sometime after the raid, a senior executive from Anthropic — the company that builds Claude, the AI system central to this paper's analysis — contacted a senior executive at Palantir Technologies. The question was direct: had Claude been used during the operation to capture Maduro?[^36]

The inquiry was, by one Pentagon official's account, "raised in such a way to imply that Anthropic might disapprove of their software being used during that raid." The Palantir executive was "alarmed." The Pentagon was alarmed for different reasons. The inquiry "caused real concerns" across the Department of War — not about the operation, which had achieved every objective, but about the reliability of a critical technology partner in the middle of a strategic pivot toward AI-enabled warfare.[^37]

Claude had been deployed through Anthropic's partnership with Palantir Technologies, whose Maven Smart System holds Impact Level 6 accreditation — authorization for secret-level national security data. Palantir's Maven platform is operational across five combatant commands under a contract that has grown from an initial $480 million to $1.3 billion. The precise role Claude played during Absolute Resolve has not been publicly confirmed. Anthropic's spokesperson stated: "We cannot comment on whether Claude, or any other AI model, was used for any specific operation, classified or otherwise."[^38]

What is publicly confirmed is that an AI system built by a company that positions itself as the safety-first leader in artificial intelligence was integrated into a classified military platform, deployed during a live combat operation in which people were killed, and the company that built it did not know — until it asked — whether its technology had been used to help capture a foreign head of state.

### The Escalation

Pentagon Chief Technology Officer Emil Michael responded publicly. He urged Anthropic to "cross the Rubicon" — to commit irrevocably to supporting military applications without the restrictions Anthropic had imposed in its terms of service. "What we're not going to do is let any one company dictate a new set of policies above and beyond what Congress has passed," Michael said. "That is not democratic. That is giving any one company control over what new policies are, and that's for the president, that's for Congress, and that's for the agencies to determine how to implement those rules."[^39]

The language escalated.
"You can't have an AI company sell AI to the Department of War and don't let it do Department of War things," Michael continued, "because we're in the business of defending the country and defending our troops."[^40] Defense Secretary Pete Hegseth moved to designate Anthropic a "supply chain risk" — a classification normally reserved for foreign adversaries and hostile state-linked entities. The designation would require every company doing business with the Pentagon to certify that it does not use Claude in its workflows. The direct contract at risk: up to $200 million. But the cascading effects would reach far beyond that number. Anthropic has stated that eight of the ten largest U.S. companies use Claude. A supply chain risk designation would force internal audits, rapid tool replacement, and compliance certifications across every major defense contractor and their subcontractors — a disruption that one senior Pentagon official acknowledged would be "an enormous pain in the ass to disentangle."[^41] Chief Pentagon spokesperson Sean Parnell made the position formal: "The Department of War's relationship with Anthropic is being reviewed. Our nation requires that our partners be willing to help our warfighters win in any fight. Ultimately, this is about our troops and the safety of the American people."[^42] ### Anthropic's Position Anthropic CEO Dario Amodei had articulated his position weeks before the crisis erupted. In his January 2026 essay "The Adolescence of Technology," Amodei identified two applications of AI that his company would not support under any circumstances: fully autonomous weapons and mass surveillance of domestic populations.[^43] On autonomous weapons, Amodei's concern is structural, not sentimental. His core argument: "too small a number of 'fingers on the button,' such that one or a handful of people could essentially" control a lethal force without requiring the cooperation of other humans. The constitutional protections embedded in military command structures — the expectation that service members would refuse illegal orders, the chain of accountability from trigger-puller to commander-in-chief — depend on humans remaining in the decision chain. Autonomous weapons controlled by AI bypass that entire architecture of restraint.[^44] On mass surveillance, Amodei acknowledged that the Fourth Amendment already prohibits it but warned that AI capabilities could create situations existing legal frameworks were never designed to handle. He advocated for new legislation — potentially even a constitutional amendment — to address the gap between what AI can do and what the law anticipated.[^45] Anthropic's official statement maintained careful distance: "Anthropic has not discussed the use of Claude for specific operations with the Department of War." The company indicated willingness to loosen its current terms of use — but not to agree to unrestricted military employment of its technology.[^46] ### The Doctrine Gap The Maduro raid is not a story about whether AI should be used in military operations. AI will be used in military operations. It is being used now. The question this paper addresses is whether that employment is guided by doctrine, process, and quality frameworks — or whether every deployment is an ad-hoc improvisation where the participants discover the rules of engagement in real time, under fire, through trial and error. Examine what happened through the lens of every framework this paper describes. 
**No METT-TC(IT) analysis of the AI tool.** The "terrain" for Claude's employment — legal, ethical, contractual, operational — was not analyzed before deployment. Palantir integrated Claude into Maven. The military employed Maven during Absolute Resolve. Nobody conducted the equivalent of a terrain walk for the AI capability: What are the contractual boundaries? What happens when the provider disagrees with the employment? What escalation path exists? What are the provider's stated red lines, and does this operation cross them?

**No Mission Analysis of how the AI fits.** Anthropic assumed its terms of service governed employment. The military assumed "all lawful purposes" governed employment. Palantir assumed its IL-6 accreditation authorized whatever the military directed. Three parties operating under three different understandings of the mission, with no reconciliation — the doctrinal equivalent of a coalition operation where allied forces operate under different rules of engagement without a combined planning conference.

**No shared doctrine between provider, integrator, and user.** This is the central failure. When the Army deploys a weapons system, the manufacturer, the maintenance contractor, and the operating unit all reference the same technical manual, the same doctrine, and the same employment parameters. When the military deployed Claude through Palantir in a combat operation, no equivalent shared framework existed. The manufacturer (Anthropic) learned about the employment after the fact. The integrator (Palantir) was caught between its customer and its supplier. The user (JSOC) operated the capability without reference to the manufacturer's constraints — because no doctrine required it.

**No AAR process established in advance.** The After Action Review is not optional in military operations. It is the institutional learning mechanism that converts operational experience into improved doctrine. There is no structured AAR process for AI employment in combat. The lessons of the Maduro raid will dissipate into press narratives, congressional hearings, and contract negotiations instead of being codified into doctrine that prevents the next crisis.

**No ROE equivalent for AI employment.** Rules of Engagement define what a force is authorized and not authorized to do. They are briefed before every operation, understood by every participant, and enforced by the chain of command. There is no equivalent for AI employment. No "Weapons Hold" (AI analyzes but takes no action), "Weapons Tight" (AI operates within pre-approved rulesets), or "Weapons Free" (AI operates autonomously within defined boundaries) framework that all parties — provider, integrator, and military user — reference from the same document.

In the first paper of this series, I described what I called the "Anthropic Paradox": Anthropic built the most capable multi-agent AI system in the world. It published the architecture, the benchmarks, and the engineering lessons. It partnered with Palantir to bring that capability to the military. And then it discovered — in real time, under operational conditions — that capability without doctrine produces exactly the chaos this paper describes.[^47]

The Maduro raid proved the paradox under combat conditions. Every gap in the paper's framework — METT-TC(IT), Mission Analysis, shared doctrine, AAR, ROE — manifested in a single operation. The operation succeeded militarily.
The aftermath produced a strategic-level crisis between the Department of War and one of the four AI companies holding $200 million frontier AI contracts, a threatened supply chain disruption reaching into eight of the ten largest American corporations, and a public debate that conflated legitimate questions about AI governance with accusations of disloyalty.

Consider the counterfactual. Before Operation Absolute Resolve, a combined planning conference — standard practice for coalition operations — includes representatives from the AI provider, the integration platform, and the military user. They agree on employment parameters. They establish escalation procedures for edge cases. They document what the AI will and will not be used for. They brief rules of engagement for the technology — not just for the operators. They establish an AAR protocol that includes the technology stack.

The operation proceeds identically. The military outcome is the same. The phone call never happens — because all parties already know how the AI was employed, within agreed parameters. The supply chain risk designation is never threatened. The $200 million contract is never at risk. The strategic relationship between the Department of War and one of the four frontier AI companies is never damaged.

None of this required new technology. None of it required new legislation. It required doctrine — the same planning discipline that the military applies to every other aspect of operations. The tools exist. They have existed for decades. They were not applied — because nobody built the bridge between AI capability and military employment doctrine.

That bridge is what this paper proposes. The Maduro raid is what happens without it.

---

## 8. AOC 49B — BUILDING THE HUMAN INFRASTRUCTURE

On October 31, 2025, the United States Army established Area of Concentration 49B — AI/ML Officer — creating the first dedicated career field for artificial intelligence and machine learning in the history of the American military. The first selection of officers through the Voluntary Transfer Incentive Program opened January 5, 2026, with applications accepted through February 6. Selected officers will be reclassified by the end of fiscal year 2026.[^48]

The Army did not create 49B because artificial intelligence is fashionable. The Army created 49B because it recognized — at the institutional level — that AI agents require dedicated human management by uniformed professionals who understand both the technology and the operational environment.

### What 49B Officers Will Do

The scope of 49B duties reads like a requirements document for the digital battle staff:

- **Operationalize AI/ML capabilities across the range of military operations.** Not experiments. Not demonstrations. Operational employment — in garrison, in the field, in combat.
- **Accelerate battlefield decision-making.** Enable commanders to make faster, more informed decisions in complex environments. This is the MDMP acceleration that Blue (Section 5) demonstrated with the 101st Airborne: AI compressing the planning timeline from days to hours.
- **Streamline logistics and optimize supply chains.** The unsexy work that determines whether soldiers eat, receive ammunition, and get fuel. AI applied to the S-4 shop.
- **Support fielding of robotics and autonomous systems.** The hardware dimension — integrating AI-controlled platforms into tactical units.
- **Establish and manage AI systems.** Cradle-to-grave lifecycle management — the 49B officer owns the system from deployment through sustainment through decommissioning.[^49]

### How They Will Be Trained

Selected officers will undergo graduate-level education combined with hands-on experience building, deploying, and maintaining AI-enabled systems. The Army's language is precise: these officers will be "practitioners, not just theorists." The training pipeline emphasizes construction, deployment, and sustainment — the full lifecycle of an AI system under operational conditions.[^50]

The Army is also exploring expanding the 49B pathway to warrant officers — a signal that the institution recognizes the need for deep technical expertise at the hands-on level, not just officer-level leadership and management. Warrant officers in the U.S. Army are the technical masters of their domains. A 49B warrant officer would be the AI equivalent of a maintenance warrant who knows every system on the aircraft, every failure mode, every workaround — the person the commander calls at 2 AM when the system breaks.[^51]

### Historical Precedent

The military has walked this road before. Every time technology outpaced existing career structures, the institution created new specialties to manage the gap.

In 2014, the Army created the Cyber Branch — Career Management Field 17 — recognizing that cyberspace operations required dedicated professionals, not signal officers with additional training. The 17-series absorbed electronic warfare specialists from the 29-series in 2018, consolidating offensive and defensive capabilities under a single professional community. Before that, information operations, psychological operations, and civil affairs each spun off from parent branches when the complexity of their missions exceeded what generalists could manage.[^52]

The pattern is consistent. Technology arrives. Generalists adapt. The gap between what generalists can manage and what the technology demands grows. The institution creates a new specialty. The new specialty professionalizes the domain. The cycle repeats.

49B follows this pattern exactly. AI arrived. Signal officers, intelligence officers, and operations officers adapted — adding AI tools to their existing competencies. The gap between what these generalists can manage and what operational AI systems demand is now too large. The institution responded.

### The Civilian Gap

Here is the question the 49B decision raises for every technology company, every AI startup, and every enterprise deploying multi-agent systems: who is your 49B equivalent?

In most organizations, the answer is nobody. The data scientist builds the model. The software engineer deploys it. The product manager defines requirements. The security team monitors threats. Nobody owns the operational employment of AI agents across the full lifecycle — from mission analysis through deployment through sustainment through after-action review. Nobody is trained to think about AI as an operational capability that requires doctrine, quality assurance, and process discipline.

The Army recognized this gap and created a career field. The civilian AI industry has not. That gap is where the 40% project cancellation rate that Gartner predicts originates. That gap is where the 41% to 86.7% multi-agent failure rates documented by UC Berkeley live.
That gap is where the Anthropic-Pentagon crisis was born.[^53]

### The Recruitment Problem

The Army faces a structural challenge: competing with the private sector for AI talent. A senior AI engineer at a major technology company commands $300,000 or more in total compensation. A captain in the United States Army — the grade at which a 49B officer would typically serve after initial selection — earns approximately $85,000 in base pay plus benefits. The gap is real.[^54]

The military's counter-offer is not financial. It is mission, purpose, scale, and consequence. A 49B officer will deploy AI systems that support decisions affecting thousands of lives. They will operate at classification levels that no commercial employer can match. They will work on problems where failure means dead Americans, not a dip in quarterly revenue. They will lead teams under conditions of uncertainty, ambiguity, and lethal risk — the conditions that forge the judgment that AI systems cannot provide for themselves.

For a certain type of practitioner, that offer is compelling. The Army is betting that enough of them exist to fill the ranks. The bet is not without risk — but the alternative, operating AI-enabled warfare without dedicated professionals, carries far greater risk.

### The Training Pipeline Challenge

The 49B career field faces a bootstrapping problem that every new military specialty confronts: the people who will train the first generation of 49B officers do not yet exist as 49B officers. The Army must build the training cadre from officers who developed AI expertise in other branches — signal, intelligence, operations research — and formalize that expertise into a repeatable education pipeline.

The initial VTIP window targeted officers with "strong technical or academic backgrounds." This is the seed cadre — the first generation that will define the professional culture, establish the standards, and write the doctrine that every subsequent 49B officer inherits. The quality of this seed cadre will determine whether 49B becomes a prestigious, high-impact specialty that attracts top talent — or an administrative backwater that talented officers avoid.

Military history offers both outcomes. The Cyber Branch (CMF 17) launched in 2014 with similar uncertainty and has grown into one of the Army's most competitive career fields, with selection rates for qualified applicants consistently below 50 percent. Special Forces (18-series) officers carry an institutional prestige that shapes the entire culture of Army special operations. By contrast, some functional area designations struggle to attract competitive officers because the institutional culture does not reward the specialty.

The 49B trajectory depends on three factors: whether AI-enabled operations deliver measurable battlefield results that senior leaders can point to, whether 49B officers are assigned to positions that matter rather than staff backwaters, and whether the promotion system rewards AI expertise rather than penalizing officers who departed their basic branch. The Army's decision to open the VTIP window barely three months after establishing the AOC suggests institutional urgency. Whether urgency translates into institutional commitment remains the open question.

### What This Means for the Digital Battle Staff

The 49B officer is the human node in the digital battle staff. The AI agents provide speed, scale, and tireless analytical capacity.
The 49B officer provides what the AI cannot: operational judgment under uncertainty, accountability for outcomes, the authority to override AI recommendations when ground truth contradicts the model, and the doctrinal expertise to employ AI systems in accordance with mission requirements, rules of engagement, and the law of armed conflict.

Without 49B officers — or their equivalent in other services and in the civilian sector — the digital battle staff operates without a human commander. The agents execute. Nobody commands. And as the MAST taxonomy documented, 68% of multi-agent failures occur precisely because nobody defined the mission properly or verified the results. The 49B officer is the fix for that 68%.

---

## 9. THE OFF-RAMP — FROM AI DEPENDENCE TO AUTONOMOUS OPERATIONS

## 9.1 The Maturation Imperative: AI as Scaffold, Not Structure

Every preceding section argues that AI requires human-built frameworks to function at scale. MDMP provides planning architecture. Lean Six Sigma provides quality control. METT-TC(IT) provides situational analysis. The Cadre model provides human oversight. None of these frameworks address the question practitioners confront the moment they operationalize AI: when does the AI stop doing the work?

Operational experience reveals the answer. Every AI-assisted cycle generates two outputs: the deliverable and the institutional knowledge of how to produce that deliverable. The deliverable has immediate value. The institutional knowledge compounds — because it converts AI-dependent processes into deterministic, codified operations that execute without AI involvement.

Military doctrine embeds this principle. The Army does not keep senior advisors permanently attached to a unit learning basic battle drills. Advisors train the cadre, the cadre trains the formation, and the advisors move to the next problem. AI occupies the advisor role. The objective is not permanent AI integration — it is AI graduation: systematically reducing AI involvement as human-built systems absorb the repeatable logic.[^57]

This distinction carries economic urgency. The Kim et al. study cited earlier quantified token efficiency collapse across multi-agent architectures — from 67.7 tasks per 1,000 tokens for single agents to 13.6 for hybrid systems. Every operation that remains permanently AI-dependent pays that token tax every cycle. Every operation that graduates to deterministic code pays it once and never again. The Off-Ramp Model converts the Kim et al. findings from a research curiosity into an organizational imperative: automate what you can, codify what you learn, and reserve AI for what defies codification.

## 9.2 Commander's Intent Two Levels Up: Why AI Needs the Big Picture

Army doctrine requires every staff officer and subordinate commander to understand the mission and commander's intent two echelons above their own. A company commander executing a movement-to-contact knows not just the battalion commander's intent but the brigade commander's as well. This principle exists for one reason: when the plan falls apart — and it will — the subordinate who understands the purpose two levels up makes decisions that advance the overall mission rather than optimizing a local objective that no longer matters.

AI systems face this identical challenge. An AI agent executing a specific task — classifying files, consolidating tags, generating reports — will encounter ambiguity. Data does not fit the expected pattern. A rule produces conflicting results.
The context window truncates critical information. At that decision point, the AI's behavior depends entirely on whether it understands *why* it received the task, not just *what* the task requires. An AI that knows only its immediate instruction — "merge duplicate tags" — will merge aggressively, potentially destroying meaningful distinctions. An AI that understands the higher intent — "we maintain a lean taxonomy that enables rapid retrieval while preserving semantic precision" — flags the ambiguity and asks for guidance rather than executing a technically correct but strategically wrong action.

The UC Berkeley MAST taxonomy (Section 8) validates this with data. Two of the fourteen failure modes — "Disobey Task Specification" and "Task Derailment" — occur precisely when agents lack sufficient context to distinguish between letter-of-the-law compliance and intent-aligned execution. The 37% of multi-agent failures that originate in specification and system design are, at their root, failures of intent communication. The agent received a task without purpose, and when ambiguity arose, it optimized locally rather than globally.

This maps directly to how MDMP structures planning. Step 1 of Mission Analysis requires the staff to "thoroughly analyze the higher headquarters' plan or order to determine how their unit — by task and purpose — contributes to the mission, commander's intent, and concept of operations of the higher headquarters." The key phrase is *by task and purpose*. Task alone produces mechanical execution. Purpose produces adaptive execution. AI systems that receive only tasks produce brittle automation. AI systems that receive task, purpose, and intent two levels up produce resilient operations that degrade gracefully when conditions change.[^58]

The practical implementation: every AI session, every prompt, every automated rule includes three layers of context. **Task:** what to do. **Purpose:** why this task matters. **Commander's Intent:** the desired endstate two levels above the immediate operation. This is not overhead. This is the difference between an AI that executes and an AI that executes correctly when things go wrong.
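To make the three-layer packet concrete, here is a minimal sketch of how a tasking might carry task, purpose, and intent together in code. The structure and field names are illustrative assumptions, not a prescribed schema; any agent harness could carry the same layers in its system prompt.

```python
from dataclasses import dataclass

@dataclass
class TaskContext:
    """Three-layer context attached to every AI tasking (illustrative)."""
    task: str               # what to do
    purpose: str            # why this task matters
    commanders_intent: str  # desired endstate two levels up

    def to_prompt(self) -> str:
        # Intent leads, so it survives context truncation better than
        # material appended at the end of a long session.
        return (
            f"COMMANDER'S INTENT (two levels up): {self.commanders_intent}\n"
            f"PURPOSE: {self.purpose}\n"
            f"TASK: {self.task}\n"
            "If the task conflicts with the purpose or intent, "
            "STOP and request guidance instead of executing."
        )

tag_merge = TaskContext(
    task="Merge duplicate tags in the vault.",
    purpose="Reduce taxonomy sprawl so retrieval stays fast.",
    commanders_intent=("Maintain a lean taxonomy that enables rapid "
                       "retrieval while preserving semantic precision."),
)
print(tag_merge.to_prompt())
```

The explicit stop condition is the point of the exercise: it converts intent from commentary into behavior the harness can audit.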
## 9.3 Rules of Engagement: The Sandbox That Keeps AI Honest

The U.S. military defines Rules of Engagement as "directives issued by competent military authority that delineate the circumstances and limitations under which forces will initiate and/or continue combat engagement." ROE exist because capability without boundaries creates catastrophic risk. A soldier with a rifle and no ROE is a liability. A soldier with a rifle and clear ROE is a precision instrument.

AI operates under this same dynamic. An AI agent with full system access and no constraints will take actions that optimize for the immediate instruction at the expense of the broader system. It will delete files that appear redundant without checking dependencies. It will reclassify notes based on keyword matching without understanding context. It will execute batch operations that cannot be reversed. Capability without boundaries.

The Anthropic-Pentagon crisis documented in Section 7 illustrates this at strategic scale. Claude was deployed in a military operation without agreed rules of engagement between the AI provider and the military operator. The result was not a technical failure — the system performed as designed. It was a governance failure — nobody defined what the AI could and could not do, under what conditions, and with what approval authority. The absence of AI ROE created a strategic-level crisis from a tactical-level success.[^59]

The Off-Ramp Model requires explicit AI Rules of Engagement — a constraints framework that defines what the AI can and cannot do, under what conditions, and with what approval authority. These rules mirror the military's tiered approach to force authorization:

| ROE Level | Military Equivalent | AI Application |
|-----------|-------------------|----------------|
| **Weapons Hold** | Engage only in self-defense or in response to a formal order | Read-only. AI analyzes and reports but takes no action. Requires explicit human approval for every change. |
| **Weapons Tight** | Engage only targets positively identified as hostile per established criteria | Rule-bound execution. AI acts only within pre-approved rulesets. Flags anything outside established patterns for human review. |
| **Weapons Free** | Engage any target not positively identified as friendly | Full autonomous operation within defined boundaries. AI executes, logs, and reports. Human reviews exceptions only. |

**The Two-Strike Rule.** Borrowing from escalation-of-force doctrine, the Off-Ramp Model implements a two-strike boundary for AI operations. On the first anomaly — an unexpected result, a classification conflict, a rule that produces contradictory outputs — the AI flags the issue and continues operating under tightened constraints. On the second anomaly within the same operational cycle, the AI halts execution, generates a detailed incident report, and waits for human intervention. This prevents cascade failures — the 17.2x error amplification documented in the Kim et al. study[^60] — by building automatic circuit breakers into the system. The MAST taxonomy's Category 3 failures — premature termination, incomplete verification, incorrect verification — all emerge from systems that lacked these circuit breakers.

**Geographic and Temporal Restraints.** Military ROE routinely impose geographic boundaries: forces cannot cross a phase line without authorization, cannot fire into a designated no-fire zone, cannot operate beyond a specified area of operations. AI ROE impose equivalent constraints. An AI operating on a knowledge base does not touch configuration files. An AI running tag consolidation does not modify file content. An AI executing batch operations does not process more than a defined number of items without a checkpoint. These are not arbitrary restrictions — they are control measures that preserve the commander's ability to manage risk.
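The ROE tiers and the two-strike rule translate directly into code. A minimal sketch, assuming a hypothetical batch runner; the essential design choice is that the harness, not the model, enforces the circuit breaker.

```python
from enum import Enum
from typing import Callable, Iterable

class ROE(Enum):
    WEAPONS_HOLD = "hold"    # read-only: analyze and report, change nothing
    WEAPONS_TIGHT = "tight"  # act only within pre-approved rulesets
    WEAPONS_FREE = "free"    # autonomous within defined boundaries

class TwoStrikeBreaker:
    """Circuit breaker implementing the two-strike rule described above."""

    def __init__(self) -> None:
        self.strikes = 0

    def record_anomaly(self, description: str) -> None:
        self.strikes += 1
        if self.strikes == 1:
            # Strike one: flag it, tighten constraints, keep working.
            print(f"STRIKE 1 - {description}; continuing under tightened constraints")
        else:
            # Strike two in the same cycle: halt and wait for a human.
            raise RuntimeError(f"STRIKE 2 - {description}; halting for human review")

def execute_batch(items: Iterable[str], roe: ROE,
                  rule: Callable[[str], None]) -> None:
    """Run a batch under ROE; the harness, not the model, owns the breaker.
    (Weapons Tight and Free differ in approval flow, elided in this sketch.)"""
    breaker = TwoStrikeBreaker()
    for item in items:
        if roe is ROE.WEAPONS_HOLD:
            print(f"ANALYZE ONLY: {item}")  # no write actions under Hold
            continue
        try:
            rule(item)
        except ValueError as exc:  # the rule surfaces anomalies as ValueError
            breaker.record_anomaly(str(exc))

def demo_rule(item: str) -> None:
    if "ambiguous" in item:
        raise ValueError(f"cannot classify {item!r}")
    print(f"processed {item}")

# The first anomaly logs a strike and continues; the second raises and halts.
execute_batch(["note-1", "ambiguous-2", "note-3", "ambiguous-4"],
              ROE.WEAPONS_TIGHT, demo_rule)
```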
## 9.4 Risk Management: Security, Edge Cases, and the Threats AI Creates

Section 6 introduced risk management and after-action reviews as essential AI governance mechanisms. The Off-Ramp Model intensifies this requirement because each phase introduces distinct risk profiles that demand different mitigation strategies.

**Phase 1 Risks: The AI as Operator**

*Data exposure.* AI agents operating on knowledge bases, email systems, and financial documents access sensitive information. Every API call, every context window, every logged session represents a potential data leakage vector. Mitigation: enforce read-only access by default, require explicit permission escalation for write operations, and audit every session's scope against the minimum-necessary-access principle.

*Irreversible actions.* An AI that deletes 1,337 files — even correctly — executes an operation that demands verification before execution, not after. Mitigation: mandatory pre-execution review for any destructive operation, staged rollouts (process 10% first, validate, then continue), and automatic backup creation before batch operations.

*Context window hallucination.* AI agents operating near the limits of their context window begin losing track of earlier instructions, constraints, and accumulated state. The agent does not report this degradation — it continues executing with degraded awareness. The MAST taxonomy identifies this as FM-1.4: "Loss of Conversation History" — one of the five specification failures that account for 37% of all multi-agent system breakdowns.[^61] Mitigation: enforce session length limits, implement checkpoint-and-resume protocols, and require the AI to summarize its current understanding of constraints before executing critical operations.

**Phase 2 Risks: The Codification Trap**

*Premature codification.* Converting a pattern into a rule before sufficient data validates the pattern produces brittle automation. A rule derived from 50 observations that fails on the 51st creates more damage than manual processing would have — because the rule executes at machine speed across the entire dataset before anyone detects the error. Mitigation: require a minimum observation threshold before any pattern becomes a production rule. Validate rules against a holdout dataset before deployment.

*Edge case blindness.* Deterministic rules handle the 80% case. The remaining 20% contains the exceptions that destroy systems. A classification rule that correctly sorts 800 of 1,000 notes but misclassifies 200 into the wrong PARA category creates a cleanup operation larger than the original problem. Mitigation: every rule includes explicit exception-handling logic and a "cannot classify" pathway that routes ambiguous items to a human review queue rather than forcing a wrong answer.

**Phase 3 Risks: Autonomous Operations**

*Drift without detection.* An autonomous system that runs without regular human oversight will drift from its intended purpose. Rules that made sense when codified become stale as the underlying data changes. Categories that worked for a 1,000-file vault may not work for a 5,000-file vault. Mitigation: the battle rhythm's quarterly review cycle exists specifically to detect and correct drift. The monthly AI analytical layer provides early warning. Both are non-negotiable — removing them to "save time" reintroduces the risks the entire model was built to eliminate.

*Security surface expansion.* Every automated integration point — GitHub webhooks, API connections, scheduled scripts, cloud sync — expands the attack surface. A vault that lives only on a local machine has one security boundary. A vault connected to GitHub, synced via Obsidian Git, processed by scheduled Python scripts, and analyzed by AI through API calls has dozens. Mitigation: apply the principle of least privilege at every integration point. Audit connections quarterly. Assume every external touchpoint is a potential compromise vector and design accordingly.[^62]
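A sketch of the two Phase 2 guardrails just described: a minimum-observation threshold with holdout validation before a pattern is promoted to a production rule, and a "cannot classify" pathway that routes ambiguity to a human queue instead of forcing an answer. The threshold and precision gate are illustrative assumptions, not doctrine.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Rule:
    category: str
    applies: Callable[[str], bool]  # deterministic predicate over a note

MIN_OBSERVATIONS = 200  # illustrative threshold before codification

def promote(rule: Rule, observations: list[str],
            holdout: list[tuple[str, str]]) -> Optional[Rule]:
    """Promote a candidate pattern to a production rule only if it has
    enough supporting observations and survives a held-out sample."""
    if len(observations) < MIN_OBSERVATIONS:
        return None  # premature codification: keep humans/AI in the loop
    claimed = [(note, truth) for note, truth in holdout if rule.applies(note)]
    if not claimed:
        return None  # rule never fires on the holdout: nothing validated
    precision = sum(truth == rule.category for _, truth in claimed) / len(claimed)
    return rule if precision >= 0.98 else None  # illustrative quality gate

def classify(note: str, rules: list[Rule],
             review_queue: list[str]) -> Optional[str]:
    """Deterministic classification with an explicit 'cannot classify' pathway."""
    matches = {rule.category for rule in rules if rule.applies(note)}
    if len(matches) == 1:
        return matches.pop()
    review_queue.append(note)  # zero or conflicting matches: human review
    return None
```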
## 9.5 The Three-Phase Off-Ramp Model

With commander's intent, rules of engagement, and risk management established as foundational requirements, the Off-Ramp Model defines three phases of AI-to-autonomous maturation. Each phase reduces AI involvement while increasing system reliability. The model mirrors the military's approach to capability transfer: demonstrate, coach, hand off.

| Phase | AI Role | Human Role | Output |
|-------|---------|-----------|--------|
| **Phase 1: Operate** | Identifies problems, decides solutions, executes changes, tracks outcomes. Operates under Weapons Hold or Weapons Tight ROE. | Commander's intent, ROE definition, quality review, approval authority. | Deliverables + documented decision logic (rule candidates) + risk register. |
| **Phase 2: Codify** | Converts repeatable patterns into deterministic rules. Validates against edge cases. Builds exception-handling pathways. | Rule validation, edge case identification, acceptance testing, security review. | Codified rulebook + application logic + test suite + exception queue design. |
| **Phase 3: Analyze** | Inspects app-generated reports, flags anomalies, recommends adjustments. Operates as Inspector General, not operator. | System owner, exception handler, doctrine updater, security auditor. | Autonomous operations with AI-augmented exception handling + quarterly doctrine review. |

The critical insight: Phase 1 generates the raw material for Phase 2. Every AI-executed operation that follows MDMP discipline — with defined mission variables, quality checkpoints, and after-action reviews — produces documentation that converts directly into application logic. Organizations that skip the discipline in Phase 1 never accumulate the institutional knowledge required to reach Phase 3. They remain permanently AI-dependent, consuming tokens indefinitely to perform tasks that deterministic code handles faster, cheaper, and more reliably.

The MAST taxonomy reinforces this point from the failure perspective. Category 3 failures — premature termination, incomplete verification, incorrect verification — are precisely the failures that Phase 1 discipline prevents. An operation conducted with QASAS-model quality assurance (Section 5) and structured after-action reviews (Section 6) produces verified deliverables *and* the verification criteria that Phase 2 codifies into automated testing. Skip the discipline, and Phase 2 has nothing to codify.

## 9.6 The Battle Rhythm: Cyclic Operations and AI Reduction

Military units operate on a battle rhythm — a recurring cycle of activities that maintains operational tempo regardless of leadership presence. The Tactical Operations Center runs shift changes, intelligence updates, logistics reports, and commander's update briefs on a fixed schedule. Everyone knows the cycle. Deviations trigger notifications automatically.

AI-integrated systems demand this identical discipline. The battle rhythm defines which operations execute at which frequency, and which tier of intelligence handles each cycle. The principle: AI involvement decreases as cycle frequency increases.

| Cycle | Operations | Execution | AI Role |
|-------|-----------|-----------|---------|
| **Daily** | Inbox processing, classification, link validation, hygiene checks | 100% deterministic. Rule-based scripts, scheduled triggers. | Zero. App runs autonomously. |
| **Weekly** | Tag audit, orphan detection, project status review, session archival | 90% deterministic with AI summary. App scans; AI interprets edge cases. | ~10%. Reads findings, flags anomalies. |
| **Monthly** | Full health assessment, structural balance, archive candidates, trend analysis | 70% deterministic with AI analysis. App compiles metrics; AI assesses strategy. | ~30%. Pattern recognition, recommendations. |
| **Quarterly** | Taxonomy evolution, rule refinement, app logic updates, security audit, doctrine review | 50% collaborative. Human and AI review performance, update rules, refine criteria. | ~50%. Full analytical partnership. |
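Read one way, the battle rhythm table is a dispatch configuration: deterministic handlers always run, and the AI layer is invoked only for the exception share of the slower cycles. A minimal sketch, with handler names and findings invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class RhythmEntry:
    name: str
    deterministic: Callable[[], dict]  # the app's scripted portion: zero tokens
    ai_share: float                    # fraction of the cycle reserved for AI judgment

def run_cycle(entry: RhythmEntry, ai_review: Callable[[dict], None]) -> None:
    findings = entry.deterministic()        # always runs, deterministically
    if entry.ai_share > 0 and findings.get("exceptions"):
        ai_review(findings)                 # AI sees only the exceptions

BATTLE_RHYTHM = [
    RhythmEntry("daily-inbox-processing", lambda: {"exceptions": []}, 0.0),
    RhythmEntry("weekly-tag-audit", lambda: {"exceptions": ["orphan: note-42"]}, 0.10),
    RhythmEntry("monthly-health-report", lambda: {"exceptions": ["drift: Resources/"]}, 0.30),
]

for entry in BATTLE_RHYTHM:
    run_cycle(entry, ai_review=lambda f: print("AI review:", f["exceptions"]))
```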
The inverse relationship between frequency and AI involvement reflects operational efficiency. High-frequency operations run without AI because token costs, latency, and availability constraints make AI-dependent daily operations unsustainable at scale. The battle rhythm forces the organization to codify daily operations first, creating a natural maturation pipeline that pushes AI toward its highest-value function: judgment on the exceptions that deterministic code cannot resolve.[^63]

The protocol standardization described earlier in this paper accelerates this maturation. MCP provides the plumbing that connects deterministic applications to AI analytical layers — a standardized interface between the S3 shop (the app running the battle rhythm) and the commander's analytical advisor (the AI system conducting exception analysis). A2A enables the quarterly doctrine reviews to incorporate insights from multiple AI systems without manual context transfer. The protocols do not replace the battle rhythm — they make it implementable at scale.

## 9.7 The S3 Shop Model: Staff Processes That Run Without the Commander

In battalion and brigade operations, the S3 (Operations) section runs the TOC battle rhythm whether the commander stands in the TOC or sleeps in his rack. The duty officer tracks the common operating picture. Reports flow on schedule. Significant activities get logged. Deviations trigger notifications up the chain. The commander walks in, checks the board, makes decisions on the exceptions, and moves to the next priority.

This is the target architecture. The application becomes the S3 shop — running the battle rhythm autonomously, maintaining the common operating picture, and flagging exceptions. The AI becomes the commander's analytical advisor — called in when the S3 identifies something that requires judgment beyond established rules.

The failure mode is obvious: organizations that never build the S3 shop force the commander to run the TOC personally. Every shift. Every day. Forever. In AI terms, the practitioner remains permanently tethered to the AI interface, manually executing operations that a properly built application handles autonomously. The practitioner who builds their own S3 shop achieves operational freedom. The AI serves them. They do not serve the AI.[^64]

## 9.8 Case Study: Knowledge Base Operations as Proof of Concept

A Personal Knowledge Management system provides an instructive microcosm. The author's Obsidian vault — organized under the PARA methodology (Projects, Areas, Resources, Archives) — underwent AI-assisted operations that illustrate the three-phase progression in practice.

**Phase 1 (AI Operates):** Over 32 days, 39 AI sessions executed under Weapons Tight ROE — read access to all files, write access only with approval per batch. The vault generated 1,697 commits on the main branch, with 97 tracked tasks and 71 session handoff documents. AI conducted a full vault audit, identified structural issues including a Resources folder bloated with 469 imported bookmark files, and executed a purge of 1,337 redundant files (reducing the vault from 2,281 to 966 files). Tag consolidation achieved a 79% reduction from 459 to 96 unique tags. Four separate git index contamination incidents — concurrent AI sessions corrupting each other's staging areas — drove the adoption of atomic commit patterns, explicit-path commits, and multi-instance session coordination protocols.[^65]
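The commit pattern that ended those incidents codifies directly. A sketch using standard git behavior (committing an explicit pathspec bypasses whatever another session has staged in the shared index); the session-tag convention and file path are illustrative assumptions:

```python
import subprocess
import uuid
from pathlib import Path

def atomic_commit(paths: list[str], message: str) -> None:
    """Commit exactly the named paths: never 'git add .' in a shared worktree."""
    if Path(".git/index.lock").exists():
        # Another session is mid-operation; deconflict instead of clobbering.
        raise RuntimeError("index.lock present: concurrent git operation in progress")
    session = uuid.uuid4().hex[:8]  # UUID tag for session deconfliction
    # An explicit pathspec makes the commit independent of the shared index,
    # so concurrent sessions cannot poison each other's commits.
    subprocess.run(
        ["git", "commit", "-m", f"[{session}] {message}", "--", *paths],
        check=True,
    )

# Example (run inside a repository):
# atomic_commit(["notes/inbox/2026-02-14.md"], "classify daily inbox")
```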
Commander's intent two levels up: "build a knowledge base that enables rapid retrieval, supports long-term intellectual compounding, and serves as the foundation for a life operating system." Every tactical decision — which tags to merge, which files to purge, which session protocols to adopt — traced back to that intent.

**Phase 2 (AI Codifies):** Decision logic from Phase 1 operations converts into deterministic rules. Tag taxonomy rules define which tags merge and why. PARA classification criteria specify folder assignments. Frontmatter validation enforces consistency across all 5,980 files. Orphan detection identifies broken wikilinks. Session coordination rules — atomic git operations, staged file checks, UUID-based deconfliction — codify the solutions to problems discovered during Phase 1 execution. Each rule includes exception-handling logic and a "cannot classify" pathway. The two-strike rule applies: two consecutive classification failures halt the batch and generate a review request.

The institutional knowledge generated during Phase 1 is enormous: 85 agent and skill configuration files, 8 hook scripts for automated quality gates, 10 utility scripts, and 161 documented lessons learned with root causes. Every one of these artifacts represents a decision that was made under AI-assisted conditions, validated through operational use, and codified for deterministic execution.

**Phase 3 Target (AI Analyzes):** A standalone application runs codified rules on the battle rhythm. Daily classification, weekly audits, monthly health reports. The application generates findings that surface only the exceptions — notes that defy classification, structural anomalies, trend data suggesting taxonomy evolution. AI reads the report and provides analytical judgment. The practitioner reviews recommendations and updates doctrine. The cycle repeats.

The token economics validate the model. A single Phase 1 session consumes approximately 50,000-100,000 tokens to classify and organize files that a Phase 3 application handles for zero tokens indefinitely. With 39 sessions over 32 days averaging 95 commits per day, the total Phase 1 token investment is substantial — but it amortizes across every future execution cycle. Organizations that skip the discipline pay the full token cost every cycle, forever.

## 9.9 The Satisficing Threshold: When "Good Enough" Earns Its Keep

The preceding subsections imply a clean march from Phase 1 through Phase 3. No military operation in history has ever executed that cleanly. The Off-Ramp Model describes a direction of travel, not a guaranteed destination. Practitioners who treat Phase 3 as an obligation rather than a conditional outcome waste resources pursuing automation that costs more than the problem it solves.

Five operational realities constrain the model's application. Each determines whether a given operation advances to the next phase or remains — permanently and correctly — at its current level.

**The 70% Rule.** Patton said a good plan violently executed now beats a perfect plan next week. The Army institutionalized this — the 1/3-2/3 rule allocates two-thirds of available time to subordinates, which means the commander's plan never reaches full maturity before execution begins.
Practitioners plan to 70%, execute, and adjust. The same principle governs automation: deploy the system at 70% capability, run it through one battle rhythm cycle, and refine based on actual performance rather than theoretical completeness. Waiting for 100% codification before deployment guarantees the requirements change before the system goes live.[^66]

**The Turnover Test.** Military SOPs exist because the next person runs the TOC, not the person who built it. A battalion S3 shop builds a battle rhythm over six months. The S3 PCS's. The replacement inherits SOPs they did not write, for systems they do not fully understand. Within 90 days, drift begins. Within 180, the battle rhythm reflects the new S3's preferences, not the original architecture. The Turnover Test imposes a hard standard: if a new operator cannot run the system within one battle rhythm cycle — daily operations within one day, weekly operations within one week — the system carries too much complexity. Automation that requires its architect to maintain it is not automation. It is a single point of failure wearing a disguise.

**The Energy Budget.** Every system competes for finite resources: the practitioner's time, compute costs, cognitive bandwidth, and willingness to sustain maintenance. No commander ever has enough of anything. Units constantly rob Peter to pay Paul — pulling mechanics off maintenance for a guard roster, diverting training funds to cover an unplanned deployment, burning their best NCO on a detail because higher directed it. A Phase 2 codification effort that consumes 40 hours to save 2 hours per week breaks even at week 20. If requirements shift before week 20, that investment evaporates. The disciplined practitioner calculates this ROI before committing resources, not after (a worked sketch follows at the end of this subsection). The idea that an organization will dedicate sustained resources to perfecting an autonomous system ignores the reality that something else is always on fire.

**The Diminishing Returns Cliff.** In Lean Six Sigma terms, moving from 3 sigma to 4 sigma quality costs exponentially more than moving from 2 sigma to 3. Automation follows the same curve. Codifying the first 60% of vault operations delivers massive ROI — classification rules that handle the obvious cases, validation scripts that catch structural errors, scheduled hygiene that prevents drift. Codifying the next 20% delivers moderate ROI — edge case handling, exception routing, anomaly flagging. Codifying the last 20% — the ambiguous classifications, the context-dependent decisions, the cases that resist deterministic logic — often costs more than letting AI handle them every cycle. That last 20% is the AI's permanent job. Not because codification failed, but because codification there destroys more value than it creates. Forcing deterministic rules onto genuinely ambiguous situations produces the kind of brittle, over-fitted automation that breaks spectacularly when conditions change.[^67]

**The Satisficing Decision.** Phase 3 is the destination for operations where the math works. Everything else lives permanently in a Phase 1-2 hybrid — and that is not failure. That is resource-informed decision making. Daily inbox processing that consumes 30 minutes? Automate it. The ROI compounds in weeks. Quarterly taxonomy review that takes 4 hours? Keep the AI in the loop. The codification cost exceeds the time saved because the rules change faster than any system can harden them. The Off-Ramp Model does not say "automate everything." It says "automate what earns its keep, and know the difference."
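The energy-budget arithmetic generalizes into a satisficing gate that can be computed before the hours are committed. A back-of-the-envelope sketch; the horizon parameter is an assumed stand-in for how long the requirements can be trusted to stay stable:

```python
def break_even_weeks(codify_hours: float, hours_saved_per_week: float) -> float:
    """Weeks until a codification effort pays for itself."""
    return codify_hours / hours_saved_per_week

def worth_codifying(codify_hours: float, hours_saved_per_week: float,
                    horizon_weeks: float) -> bool:
    """Satisficing gate: codify only if break-even lands inside the window
    you trust the requirements to stay stable."""
    return break_even_weeks(codify_hours, hours_saved_per_week) <= horizon_weeks

# The energy-budget example above: 40 hours invested to save 2 hours per week.
print(break_even_weeks(40, 2))                    # 20.0 weeks
print(worth_codifying(40, 2, horizon_weeks=12))   # False: keep the AI in the loop
print(worth_codifying(4, 2.5, horizon_weeks=12))  # True: automate it
```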
It says "automate what earns its keep, and know the difference." ## 9.10 The Four Tiers of AI Value The Off-Ramp Model's phased approach implies a natural question: what, exactly, does AI contribute at each level of operational maturity? The answer is not uniform. AI delivers value across four distinct tiers, and practitioners who understand which tier they occupy make better decisions about where to invest their automation resources. | Tier | AI Function | Examples | |------|------------|---------| | **1: Speed** | AI executes known processes faster than humans. Zero judgment. Pure execution velocity. | Day trading bots, email classification, batch file operations, scheduled scripts. | | **2: Pattern Recognition** | AI surfaces what humans miss across datasets too large for manual scanning. AI does not judge — it surfaces. | Anomaly detection, vault health reports flagging structural drift, market scanners identifying correlation shifts. | | **3: Synthesis** | AI connects information across compartmentalized domains. Cross-domain correlations buried in separate data streams. | Linking tag growth rate to project completion decline. Correlating health data with productivity patterns. Connecting financial trends across separate accounts. | | **4: Judgment** | AI weighs genuinely ambiguous situations where rules do not apply, data conflicts, and context resists codification. The Inspector General function. | Recommending taxonomy restructuring based on shifting usage patterns. Flagging strategic misalignment between stated goals and operational behavior. | Ninety percent of current AI usage sits at Tier 1. That is not a criticism — it is a statement of maturity. Speed carries real value. An organization that automates its Tier 1 operations frees cognitive bandwidth for Tier 2 pattern recognition. An organization that codifies its Tier 2 patterns opens capacity for Tier 3 synthesis. Each tier becomes autonomous before the practitioner graduates to the next. Skipping tiers — attempting Tier 4 judgment before codifying Tiers 1 through 3 — produces the same brittle outcomes as deploying a unit to combat before it completes basic training. The Off-Ramp Model's phases map directly to this tier structure. Phase 1 operations typically deliver Tier 1 and 2 value. Phase 2 codification captures Tier 1 and 2 logic into deterministic systems. Phase 3 analysis represents the transition to Tier 3 and 4, where AI contributes synthesis and judgment rather than speed and pattern detection. The practical implication: most operations reach their highest sustainable value at Tier 2, and the satisficing threshold determines which operations justify the investment to reach Tiers 3 and 4. ## 9.11 The Compounding Principle: Why Discipline Now Pays Exponentially Later Every AI session conducted under MDMP discipline — with defined mission variables, commander's intent, ROE, quality checkpoints, and after-action reviews — deposits into a compounding account of institutional knowledge. Practitioners who maintain this discipline accumulate codifiable rules at an accelerating rate because each new rule simplifies discovery of the next. The mathematics favor the disciplined. A Phase 1 operation consuming 50,000 tokens to classify 100 files produces a classification ruleset that a Phase 3 application executes for zero tokens indefinitely. The initial AI investment amortizes across every future execution. 
This is the ultimate convergence. The Army builds doctrine so units operate independently. Lean Six Sigma builds processes so organizations sustain quality without constant intervention. The Off-Ramp Model builds autonomous systems so practitioners redirect AI toward its highest sustainable value tier — whether that is speed, pattern recognition, synthesis, or judgment — while deterministic code handles everything the satisficing threshold validates for automation.[^68]

The superintelligent five-year-old does not run the house. The adults build the house, and the five-year-old contributes extraordinary insights about how to make it better. But nobody pretends every room needs the five-year-old's constant attention. Some rooms run themselves. Some rooms need periodic inspection. A few — the genuinely ambiguous, the contextually unique, the strategically significant — earn the five-year-old's full analytical power. The disciplined practitioner knows which rooms belong in which category.

Rest assured: AI is not taking jobs away. It is changing them in fundamental ways. The market is shifting. Practitioners who conduct an honest SWOT analysis of their own capabilities recognize the threat and the opportunity simultaneously. Those who acquire working knowledge of AI — not to become engineers, but to become informed operators — position themselves in an economy where knowledge and judgment become the central currencies. The ones who dismiss the shift, who wait for someone else to figure it out, will find themselves outpaced by colleagues who understood that the tool changed, so the tradecraft had to change with it.

---

## 10. THE ADVERSARY CALCULUS

The United States is not the only nation building a digital battle staff. The adversary calculus shapes every decision about AI adoption speed, doctrine development, and acceptable risk.

### China: Civil-Military Fusion as Structural Advantage

The Pentagon's December 2025 report to Congress stated plainly that Beijing is "catching up" in generative AI. The report assessed that "China's commercial and academic AI sectors made progress on large language models and LLM-based reasoning models, which has narrowed the performance gap between China's models and the U.S. models currently leading the field."[^34]

The narrowing is not abstract. It is operational. A Reuters investigation in late 2025 documented the People's Liberation Army's rapid integration of DeepSeek — China's breakthrough open-weight AI model — into military weapons systems. The scope of integration is staggering:

- **Norinco P60:** China's state-owned defense giant unveiled a military vehicle in February 2026 capable of autonomously conducting combat-support operations at 50 kilometers per hour, powered by DeepSeek.
Communist Party officials described it as an early showcase of how Beijing is using AI to compete in the arms race with the United States.[^35] - **Drone swarm decision-making:** Beihang University, China's premier military aviation research institution, is using DeepSeek to improve drone swarm targeting of "low, slow, small" threats — military shorthand for the drones and light aircraft that dominate modern battlefields from Ukraine to the Red Sea.[^55] - **Battlefield planning compression:** Researchers at Xi'an Technological University reported that their DeepSeek-powered system assessed 10,000 battlefield scenarios — each with different variables, terrain conditions, and force deployments — in 48 seconds. A conventional team of military planners would require 48 hours for the same analysis. That is a 3,600-to-1 compression ratio.[^56] - **Autonomous target recognition:** Two dozen military tenders and patents reviewed by Reuters show the PLA integrating AI into drones for autonomous target recognition, tracking, and formation operations with minimal human intervention.[^36] - **Robot dog packs:** AI-powered robot dogs deployed in packs for scouting and explosive clearance — the PLA issued tenders in November 2024, and state media has published images of armed robot dogs from manufacturer Unitree in military drills.[^37] The U.S. State Department's assessment is blunt: "DeepSeek has willingly provided and will likely continue to provide support to China's military and intelligence operations." The assessment cited more than 150 references to DeepSeek in procurement records for PLA entities and defense industrial base affiliates.[^38] ### The Strategic Comparison A January 2026 analysis by the Foreign Policy Research Institute and a companion piece in The Diplomat framed the U.S.-China AI competition along two distinct axes.[^39] | Dimension | United States | China | |-----------|--------------|-------| | **Model development** | Leads in frontier models | Narrowing gap, especially in open-weight | | **Innovation structure** | Decentralized, competitive private sector | Centralized via civil-military fusion | | **Adoption mechanism** | "Diffusion race" — experiment to fielded use | Direct routing: commercial AI to military via streamlined procurement | | **Scaling challenge** | Federated defense enterprise without centralized political control | State direction enables rapid, uniform deployment | | **Government AI projects** | Pace-Setting Projects (7 PSPs, demonstrator-to-replication model) | 81 LLM government projects in first half of 2024 alone | | **Competitive advantage** | Capability edge at the frontier | Speed edge in adoption and integration | The distinction matters. The United States builds the best models. China deploys models fastest. In a competition where the decisive variable is not capability but diffusion — how quickly a promising technology moves from experiment to trusted, fielded use — China's civil-military fusion provides a structural advantage that the U.S. private-sector model cannot easily replicate.[^40] The Maduro raid illustrates the American side of this equation. The capability existed — Claude is arguably the most capable AI model in the world. The integration existed — Palantir's Maven platform is operational across five combatant commands. But the employment doctrine did not exist. The provider-integrator-user framework did not exist. 
The result: a successful operation that produced a strategic-level crisis between the Pentagon and a critical technology partner. China does not have this problem. When DeepSeek provides AI to Norinco, there is no phone call afterward asking whether the AI was used as intended. The state directs. The company complies. The military deploys. Civil-military fusion eliminates the doctrine gap by eliminating the doctrinal negotiation. This is not an argument for the Chinese model. It is an argument for closing the doctrine gap within the American model — before the speed advantage compounds into a capability-plus-speed advantage that the United States cannot overcome. ### Russia: Disruption Over Matching Russia's approach to AI-enabled warfare reflects a different calculus. A February 2026 CSIS report by Kateryna Bondar documented that Russia "is not chasing technological elegance or conceptual completeness but rather applying AI selectively and ruthlessly in service of battlefield effectiveness."[^41] Russia's AI strategy is shaped by resource constraints, technological gaps, and three years of combat data from Ukraine: - **Systematic data collection.** The Russian military launched a 2025 data collection effort focused on unmanned operations and strike outcomes. The infrastructure aggregates UAS video feeds, operator telemetry, strike effects, and individual pilot performance metrics — each linked to unique personal identifiers. This is the raw material for machine learning at scale.[^42] - **LLM integration.** Planned 2025-2026 upgrades add large language model support for smart assistants and more autonomous AI systems. By mid-2025, Russian defense companies had moved from isolated LLM experiments to full-scale deployment, with retrieval-augmented generation becoming the dominant architecture — mentioned in more than a third of AI-related defense industry vacancies.[^43] - **Tactical focus over strategic ambition.** Russia abandoned the pursuit of a comprehensive automated command-and-control architecture comparable to Western joint concepts. Instead, it reallocated effort toward tactical, task-specific software driven by battlefield necessity. The "Svod" Tactical Situational Awareness Complex, announced August 2025, represents this pragmatic approach — solving concrete problems rather than building elegant systems.[^44] - **Electronic warfare and cyber disruption.** Where Russia cannot match Western AI capability, it aims to degrade it. Electronic warfare systems that jam communications, spoof GPS, and disrupt data links between AI-dependent command nodes represent Russia's asymmetric counter to the digital battle staff. If the AI needs data to function, deny the data.[^45] Russia's strategy creates a specific risk for AI-enabled Western forces: dependence on AI systems that assume network connectivity, data availability, and electromagnetic spectrum access — assumptions that Russian electronic warfare capabilities are specifically designed to invalidate. ### NATO: The Alliance Response NATO's Defence Innovation Accelerator for the North Atlantic — DIANA — announced its largest-ever cohort in December 2025: 150 companies from 24 NATO nations selected to work on ten critical defense and security challenges. 
Starting January 2026, these companies gain access to DIANA's network of 16 accelerator sites and more than 200 test centers across all 32 NATO member states.[^46] The alliance response is broader than any single nation's effort:

- **United Kingdom:** The Ministry of Defence is running multi-agent pilots for logistics and intelligence analysis — applying the orchestrator-worker pattern to staff functions that consume the most analyst hours.[^47]
- **France:** Pursuing "strategic AI" with a European-autonomy focus — building sovereign capability that does not depend on American models or American providers. The Anthropic-Pentagon crisis validated this approach in real time.[^48]
- **Germany:** Cautious and deliberate. Limiting AI employment to defensive use cases. Germany's position reflects a political calculation — the risks of AI in offensive operations outweigh the benefits given Germany's historical sensitivity to autonomous weapons.[^49]
- **Helsing:** Europe's most valuable defense startup ($12 billion valuation) proved that European AI capability is real. On May 28, 2025, Helsing's "Centaur" AI agent flew a Saab Gripen E fighter jet autonomously — executing complex maneuvers in a beyond-visual-range combat environment against a human-piloted Gripen D. It was the first publicly confirmed case of an AI system flying a frontline fighter jet in BVR scenarios. Helsing's training regime gave the agent the equivalent of 50 years of human pilot experience, and the maiden flight came less than six months after the project's inception.[^50]
- **ALFA 2026:** Turkey hosted NATO's AI, Quantum, and Autonomous Systems exercise in Istanbul, bringing together 50 organizations from multiple nations to develop solutions for detecting and neutralizing drifting mines, protecting critical undersea infrastructure, and applying quantum technologies to defense challenges. Thirteen Turkish defense firms showcased autonomous systems.[^51]

### The Calculus

The adversary calculus reduces to three propositions.

First, China is not building a Napoleonic staff model. China's civil-military fusion gives it a structural speed advantage in AI adoption. The U.S. advantage is in model capability — but capability without doctrine, as the Maduro raid demonstrated, is not superiority. A technically inferior AI system deployed with doctrine, training, and process discipline will outperform a technically superior system deployed without them. The 3,600-to-1 planning compression ratio that Xi'an Technological University demonstrated with DeepSeek is not about model quality. It is about integration discipline.

Second, Russia's strategy targets the assumptions underlying AI-enabled warfare. Every digital battle staff concept assumes persistent connectivity, reliable data, and electromagnetic spectrum access. Russia's electronic warfare and cyber capabilities are designed to invalidate those assumptions. Doctrine that does not account for degraded AI performance — that does not have a manual fallback, a "Weapons Hold" mode for the AI when the network goes down — is doctrine that will fail in contested environments.

Third, NATO's response is diffuse but accelerating. DIANA provides the experimentation infrastructure. Helsing proved the capability. The UK and France are running pilots. But no NATO nation has yet produced the doctrinal framework that connects AI capability to military employment in a way that all alliance members can reference.
The interoperability challenge that NATO has managed for decades with weapons systems, communications, and logistics now extends to AI agents — and the standards (MCP for agent-to-tool, A2A for agent-to-agent) are being written by Silicon Valley, not by SHAPE.[^52]

The race is not between American AI and Chinese AI. The race is between American doctrine and Chinese integration speed. The nation that first produces a coherent framework for AI employment in military operations — doctrine that governs the relationship between AI provider, integrator, and warfighter; training that produces professionals capable of managing AI systems under operational conditions; process discipline that prevents the ad-hoc improvisation that characterized the Maduro raid — will hold the decisive advantage.

The 49B career field is a start. SOCOM's agentic AI experimentation is a start. The seven Pace-Setting Projects are a start. But starts are not doctrine. Doctrine is the bridge between capability and employment. That bridge does not yet exist. Building it is not optional.

---

## 11. THE BRIDGE: WHAT CIVILIAN AI CAN LEARN FROM THE MILITARY

The preceding sections established that the U.S. military is building AI agent systems, forming the doctrine to govern them, creating the career fields to operate them, and deploying them in combat operations — all while the civilian AI industry discovers through expensive failure what military doctrine codified decades ago. This section maps the translation. Military concept to AI equivalent. Doctrine to protocol. Experience to engineering requirement. The mapping is not metaphorical. It is structural.

### 11.1 The Pattern Map

Every military planning and coordination mechanism has a direct functional equivalent in multi-agent AI system design. The table below presents the complete mapping — each military concept paired with its AI counterpart and the specific problem it solves.

| Military Concept | AI Equivalent | What It Solves |
|-----------------|---------------|----------------|
| Commander's Intent | System prompt / agent persona | Enables independent action when plans break down. The subordinate who understands intent two levels up makes adaptive decisions. The agent that understands purpose beyond its immediate task degrades gracefully under ambiguity. |
| OPORD / FRAGO | Structured task format / A2A protocol | Standardized inter-agent communication. The five-paragraph order — Situation, Mission, Execution, Sustainment, Command and Signal — survived seven decades of combat because it eliminates ambiguity between sender and receiver.[^53] |
| WARNO (Warning Order) | Pre-task notification / parallel preparation | Enables downstream agents to begin staging before the full plan exists. The one-third/two-thirds rule gives subordinates maximum preparation time. AI systems that wait for complete plans before starting preparation waste the same time military units wasted before WARNOs became doctrine.[^54] |
| ROE (Rules of Engagement) | Permission boundaries / guardrails | Prevents unauthorized actions without paralyzing capability. A soldier with a rifle and clear ROE is a precision instrument. An AI agent with clear constraints executes within defined boundaries. Remove the ROE from either, and you get the Anthropic-Pentagon crisis. |
| AAR (After Action Review) | Post-execution analysis / feedback loops | Four questions: What did we plan? What happened? Why? What next? This structured reflection converts operational experience into institutional knowledge. AI systems generate enormous operational data but lack systematic mechanisms to convert that data into improvement.[^34] |
| COP (Common Operating Picture) | Shared state / context management | Everyone sees the same truth. When agents operate on different versions of reality — stale context, truncated history, conflicting data — the result is the multi-agent equivalent of fratricide: agents working at cross-purposes because nobody maintained a shared picture. |
| Battle Rhythm | Scheduled synchronization cycles | Prevents drift between independent agents. The TOC runs shift changes, intelligence updates, logistics reports, and commander's update briefs on a fixed schedule. AI agents operating without synchronization cycles drift from shared objectives exactly as military units drift without a battle rhythm.[^35] |
| MDMP (7 steps) | Structured planning framework | Front-loads the thinking. Commanders spend 40% of planning time on Mission Analysis alone — understanding the problem before generating solutions. AI systems do the opposite. They generate solutions immediately and discover problems during execution.[^55] |
| Staff specialization (S-1 through S-4) | Agent role specialization | Right agent for right task. The S-2 (Intelligence) does not conduct logistics planning. The S-4 (Logistics) does not produce the intelligence estimate. Role clarity prevents the MAST taxonomy's "Disobey Role Specification" failure mode — agents abandoning their defined function and behaving like another agent.[^56] |
| Synchronization Matrix | Dependency management / DAG | Coordinates timing across parallel operations. The synchronization matrix plots every element of combat power against time and geography. A directed acyclic graph plots every agent task against dependencies and sequencing. Both solve the same problem: when does what happen, and what must finish before the next action begins. |

This mapping is not the product of theoretical analysis. It emerged from operational experience. After 26 years of watching military formations succeed and fail at coordination, and months of building multi-agent AI workflows that exhibited the exact same failure patterns, the convergence became impossible to ignore. The solution was never more intelligence or faster processing. It was doctrine.

### 11.2 Standards Convergence: The Doctrine Center Emerges

Military doctrine standardizes through publications. FM 5-0 defines planning. ADP 6-0 defines command. FM 6-0 defines staff organization. The five-paragraph OPORD format standardizes communication between echelons. These standards required decades of institutional effort, combat experience, and iterative refinement. Every service fought for its own formats before joint doctrine imposed standardization.

The AI industry is compressing that timeline into months. Two protocols have emerged that together form the beginning of a standardized agent doctrine at the infrastructure level.

Anthropic's **Model Context Protocol (MCP)**, released in late 2024 and donated to the Linux Foundation in December 2025, standardizes how agents connect to tools, data sources, and external context. MCP handles the vertical relationship — an agent reaching down to use a tool, query a database, or access a file system. By early 2026, MCP has achieved 97 million monthly SDK downloads and powers over 10,000 active public servers. ChatGPT, Cursor, Gemini, Microsoft Copilot, Visual Studio Code, Replit, and Sourcegraph have all adopted it.
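Part of that adoption curve is explained by how little code the vertical relationship demands. The sketch below is a minimal MCP tool server, assuming the official `mcp` Python SDK and its FastMCP helper; the server name and the tool itself are invented for illustration, not part of any shipped server.

```python
# A minimal MCP tool server: the agent-to-tool ("Technical Manual") side.
# Assumes the official `mcp` Python SDK (pip install "mcp[cli]").
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("vault-health")  # illustrative server name

@mcp.tool()
def count_untagged_notes(folder: str) -> int:
    """Report how many notes in `folder` lack tags (stubbed for the sketch)."""
    # A real server would scan the folder; a constant keeps the sketch runnable.
    return 0

if __name__ == "__main__":
    mcp.run()  # serves over stdio; any MCP-capable client can discover and call the tool
```

Everything above the `run()` call is the entire integration surface: the client reads the tool's schema from the type hints and docstring, then calls it.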
Over 50 enterprise partners — including Salesforce, ServiceNow, Workday, Accenture, and Deloitte — are building MCP integrations.[^36]

Google's **Agent2Agent (A2A) Protocol**, announced in April 2025 with over 100 technology partners and donated to the Linux Foundation in June 2025, standardizes how agents communicate with each other. A2A handles the horizontal relationship — agents discovering, coordinating with, and delegating to peer agents across organizational boundaries. The protocol uses Agent Cards for capability discovery, defines task lifecycle management, and supports text, audio, video, and structured data exchange. Founding members under Linux Foundation governance include Amazon Web Services, Cisco, Google, Microsoft, Salesforce, SAP, and ServiceNow.[^37]

| Protocol | Military Analogy | Function |
|----------|-----------------|----------|
| **MCP** | Technical Manual (TM) | How an agent interfaces with its tools and data |
| **A2A** | OPORD / FRAGO | How agents coordinate missions with each other |
| **MCP + A2A** | Complete C2 stack | Full interoperability from tool use to inter-agent coordination |

In December 2025, the Linux Foundation announced the **Agentic AI Foundation (AAIF)** — housing Anthropic's MCP, Block's goose framework, and OpenAI's AGENTS.md standard. Platinum members: AWS, Anthropic, Block, Bloomberg, Cloudflare, Google, Microsoft, and OpenAI.[^38]

This is the AI industry building its Combined Arms Center — the doctrinal headquarters where standards are written, tested, and enforced. The parallel is precise: just as the Army's Combined Arms Center at Fort Leavenworth produces the doctrine that enables joint operations, the AAIF produces the protocol standards that will enable multi-agent interoperability. The difference is that it took the Army decades to standardize the OPORD format across all services. The AI industry is attempting the same standardization under market pressure, driven not by institutional wisdom but by the catastrophic cost of operating without it.

The adoption numbers confirm the trajectory. Gartner projects that 40 percent of enterprise applications will feature AI agents by 2026 — up from less than 5 percent in 2025. Adobe, S&P Global, and ServiceNow are already building on A2A. One-third of agentic AI implementations will combine agents with different skills by 2027.[^39]

### 11.3 The Failure Mode Map

The bridge between military doctrine and AI system design becomes most visible when examining how systems fail. UC Berkeley's MAST taxonomy — 14 failure modes across 1,600+ execution traces, earning a Spotlight at NeurIPS 2025 — provides the empirical data. Military doctrine provides the solutions. The alignment is not approximate. It is direct.[^40]

**Communication failures map to standardized orders formats.** When agents fail to share critical information (MAST FM-2.4: Information Withholding) or ignore peer inputs (FM-2.5: Ignored Other Agent's Input), they reproduce the intelligence stovepiping that the joint OPORD format was designed to eliminate. The OPORD's "Situation" paragraph exists specifically to ensure every subordinate receives the same intelligence picture. When agents lack a standardized information-sharing format, they withhold or ignore information exactly as stovepiped intelligence sections do.

**Planning failures map to MDMP.** Thirty-seven percent of all multi-agent failures originate in specification and system design — the planning phase.
MDMP front-loads 40 percent of available time on Mission Analysis precisely because the planning phase is where most operational failures originate. When agents receive tasks without defined purpose, success criteria, or constraints (FM-1.1: Disobey Task Specification), they fail at the same rate as military operations launched without a complete OPORD.

**Coordination failures map to the synchronization matrix and battle rhythm.** When agents derail from assigned objectives (FM-2.3: Task Derailment) or repeat completed steps (FM-1.3: Step Repetition), they reproduce the coordination breakdowns that the synchronization matrix prevents. The matrix plots every element against time and geography. Without it, elements duplicate effort, miss sequencing, and drift from the scheme of maneuver.

**Oversight failures map to commander's intent, ROE, and human-in-the-loop checkpoints.** When agents terminate prematurely (FM-3.1), verify incorrectly (FM-3.3), or lose conversation history (FM-1.4), they reproduce the assessment failures that the AAR process and battle damage assessment procedures address. The military builds verification into every phase of operations — not because soldiers are unreliable, but because complex operations under uncertainty produce conditions where verification is the only defense against cascading error.

The Google/MIT study quantifies the cascade rate: 17.2 times error amplification in decentralized multi-agent systems, 4.4 times in centralized architectures.[^41] Those are not software metrics. Those are casualty rates for information integrity. Military doctrine was built to survive exactly this kind of degradation.

### 11.4 The Practitioner's Observation

Here is what 26 years of military service and months of building AI agent workflows taught me about the convergence.

The AI agents I deploy fail the same way platoons fail. They fail when nobody tells them the purpose behind the task. They fail when they lack a common operating picture. They fail when they operate without constraints. They fail when nobody checks their work. They fail when they lose communication with higher headquarters.

They succeed the same way platoons succeed. They succeed with clear intent, defined left and right limits, a known end state, and a leader who checks progress at defined intervals.

The key insight is this: military doctrine works because it assumes imperfect actors operating under uncertainty with incomplete information. A 19-year-old private does not have perfect judgment. A staff officer does not have complete intelligence. A commander does not have unlimited time. Doctrine accounts for all three limitations simultaneously. It builds redundancy into communication, structure into planning, and verification into execution — not because any individual element is unreliable, but because the system must function when any individual element fails.

That description applies perfectly to AI agents. They have imperfect reasoning. They operate with incomplete context. They face computational constraints. The doctrine designed for human formations under these conditions applies with minimal adaptation to AI agent formations under the same conditions.

The civilian AI industry spent 2025 discovering this through failure. Cursor's flat-team architecture failed because it violated span-of-control doctrine. Gartner predicts 40 percent of agentic AI projects will be canceled by 2027 because they launched without mission analysis.
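The structures in question are small enough to sketch. What follows is one illustrative shape for a five-paragraph-order task envelope for a worker agent, carrying intent, left and right limits, an end state, and a checkpoint interval; the field names are hypothetical, not any framework's API.

```python
# An OPORD-shaped task envelope for a worker agent. A sketch, not a framework API.
from dataclasses import dataclass

@dataclass
class AgentOrder:
    situation: str             # shared intelligence picture: the COP extract this task relies on
    mission: str               # one task, one purpose
    intent: str                # commander's intent two levels up: why the task matters
    constraints: list[str]     # ROE: actions that are out of bounds
    end_state: str             # verifiable definition of done
    checkpoint_every: int = 5  # steps between judge/human progress checks

def violates_roe(proposed_action: str, order: AgentOrder) -> bool:
    """Screen a proposed action against the stated left and right limits."""
    return any(limit in proposed_action for limit in order.constraints)

order = AgentOrder(
    situation="Repo X, branch main, CI green as of 0600.",
    mission="Refactor module Y without changing public signatures.",
    intent="Unblock team Z's feature work; stability outweighs elegance.",
    constraints=["force-push", "edit outside module Y"],
    end_state="All existing tests pass; public API diff is empty.",
)
assert violates_roe("git force-push to main", order)
```

Nothing in the envelope is clever. Its value is that every field answers a question the MAST traces show agents are routinely never asked.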
UC Berkeley documented 41 to 86.7 percent failure rates across seven frameworks because none of them implemented the planning and verification structures that military doctrine mandates.[^42]

The military did not design doctrine for AI. But it designed doctrine for the conditions AI operates under. That is the bridge.

### 11.5 The Translation Challenge

The bridge exists. Crossing it requires translation. Military professionals speak in acronyms, numbered paragraphs, and doctrinal references. AI engineers speak in APIs, architectures, and benchmarks. Neither community reads the other's literature. Neither attends the other's conferences. Neither hires from the other's talent pool.

This gap is the single largest obstacle to applying military coordination expertise to AI system design. The concepts translate directly. The communities do not.

Three initiatives are narrowing the gap. The Army's establishment of 49B — the AI/ML Officer career field — creates a bilingual cadre: military professionals who understand both doctrine and AI systems.[^43] SOCOM's agentic AI experimentation at Avon Park in April 2026 brings AI system designers into direct contact with operational requirements.[^44] The Command and General Staff College's integration of LLMs into MDMP wargaming — where simplified, intent-focused prompts produced "far more realistic adjudications" than detailed prompts — demonstrates that military planning frameworks improve AI performance when properly applied.[^45]

But these are early steps. The translation requires sustained institutional investment from both sides. Military doctrine professionals must publish the mapping between their frameworks and AI system design — not as analogy but as engineering specification. AI system designers must study C2 frameworks with the same rigor they apply to transformer architectures.

The doctrine exists. The protocols are forming. The technology is deployed. What remains is the disciplined work of translation — the bridge between two communities that solved the same problem without knowing the other existed.

---

## 12. RISK AND RESTRAINT

The preceding sections make the case for applying military doctrine to AI agent coordination. This section applies military risk management to the AI systems themselves. The risks are real. They are manageable. But only if the same discipline that governs weapons employment governs AI deployment.

### 12.1 The Legal Framework

The legal governance of military AI draws from established law of armed conflict (LOAC) principles, adapted for autonomous and semi-autonomous systems. Four principles form the foundation.[^46]

**Human decision authority.** Humans remain the decision authority on lethal force and strategic decisions. No AI system — regardless of speed, accuracy, or confidence — replaces the commander's authority to direct the employment of lethal force. This principle is not a policy preference. It is a legal requirement under LOAC and customary international law.

**Transparency.** AI-assisted decisions must be explainable to human decision-makers. A commander who cannot understand why an AI system recommended a particular course of action cannot exercise informed judgment. The requirement for explainability is not a technical nice-to-have. It is a legal prerequisite for command responsibility.

**Accountability.** Humans, not AI systems, are accountable for outcomes.
When an AI-recommended strike causes civilian casualties, the commander who approved the strike bears legal responsibility — not the algorithm, not the developer, not the integrator. This principle creates an irreducible demand for human understanding of AI-generated recommendations.

**Proportionality.** AI systems must respect LOAC constraints on proportionality — the requirement that anticipated civilian harm not be excessive relative to the concrete military advantage expected. Proportionality analysis requires contextual judgment that current AI systems cannot reliably provide. Research at the European Conference on Cyber Warfare and Security has begun modeling proportionality assessment through agent-based simulation, but the work remains experimental.[^47]

### 12.2 The Autonomous Weapons Debate

The United States maintains that AI cannot independently select targets for autonomous weapons systems. This position is unilateral policy, not treaty-bound obligation. No international treaty currently prohibits lethal autonomous weapons systems, though the UN General Assembly passed a historic resolution in November 2025 calling for negotiations on a legally enforceable agreement by the Seventh Review Conference in 2026, with 156 nations supporting.[^48]

The policy distinction creates a gray area that the Maduro operation illuminated. AI can be used for decision support, planning, and intelligence analysis. It cannot autonomously select targets for engagement. But what about an AI system that generates strike recommendations — identifying targets, assessing proportionality, recommending weapons-target pairing — and a human commander approves the recommendation in seconds? Is that autonomous targeting? Or is it decision support with a human in the loop?

The answer depends on whether the human approval constitutes meaningful human control or rubber-stamping. If a commander approves every AI recommendation without independent analysis, the human is not exercising decision authority. The human is providing legal cover for algorithmic targeting. The speed of AI-generated recommendations compounds this risk: when AI compresses decision timelines from hours to seconds, the time available for independent human judgment compresses proportionally.[^49]

Anthropic's position draws two hard limits: no fully autonomous weaponry operating without meaningful human oversight, and no mass surveillance of American citizens. CEO Dario Amodei has framed these limits as constitutional safeguards — preventing AI from making a "mockery" of the First and Fourth Amendments. Anthropic is willing to negotiate on the boundary between these limits and military operational requirements, but not on the limits themselves.[^50]

The Pentagon's position is "all lawful purposes" — including weapons development, intelligence collection, and battlefield operations. Pentagon CTO Emil Michael characterized Anthropic's restrictions as "not democratic," arguing that a private company should not limit how a democracy employs tools for national defense. The designation of Anthropic as a potential "supply chain risk" — a classification normally reserved for foreign adversaries — represents the most aggressive government response to an AI company's self-imposed constraints in the short history of military AI.[^51]

Neither position is unreasonable. Both positions are incomplete. The resolution requires the kind of structured negotiation that military doctrine provides — shared definitions, agreed constraints, defined escalation procedures, and verification mechanisms.
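Each of those elements reduces, in software, to something checkable. Below is a minimal sketch of agreed constraints as an explicit permission table with an escalation posture; the posture ladder borrows the air-defense weapons-control terms, and the action names are invented for illustration.

```python
# Agreed constraints as an explicit, auditable permission table: a sketch.
# Posture names borrow the air-defense ladder; action names are invented.
from enum import Enum

class Posture(Enum):
    HOLD = 0   # most restrictive: observe and report only
    TIGHT = 1  # act only on pre-approved, positively identified cases
    FREE = 2   # act on anything not explicitly protected

ALLOWED = {
    Posture.HOLD:  {"read", "summarize"},
    Posture.TIGHT: {"read", "summarize", "draft", "flag_for_human"},
    Posture.FREE:  {"read", "summarize", "draft", "flag_for_human", "execute"},
}

def authorize(action: str, posture: Posture) -> bool:
    """Every action is checked; every denial is an escalation, not a failure."""
    return action in ALLOWED[posture]

assert not authorize("execute", Posture.TIGHT)  # escalate to a human instead
```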
That, in military terms, is agreed ROE between the provider and the operator. The absence of such an agreement produced the current crisis.

### 12.3 Classification and Deployment

Military AI deployment operates across three classification tiers, each with distinct constraints.

**Unclassified (Impact Level 5).** Commercial AI models are permitted with usage monitoring. GenAI.mil operates at IL-5, serving 1.1 million unique users across five of six military branches with models from Google (Gemini), OpenAI (ChatGPT), and xAI (Grok). Anthropic's Claude remains pending amid the current dispute.[^52]

**Classified (SECRET, Impact Level 6).** Only DoD-authorized models deployed on accredited infrastructure. Anthropic's Claude, deployed through Palantir Technologies, achieved IL-6 accreditation — making it the first frontier AI model deployed on classified Pentagon networks. Palantir's Maven Smart System provides the deployment platform, with contracts totaling $1.3 billion across five combatant commands through 2029.[^53]

**Top Secret (JWICS).** Scale AI's Donovan platform operates on both SIPRNet (SECRET) and JWICS (Top Secret) networks, providing natural language queries against classified battlefield data. Access at this level requires specialized infrastructure and the most restrictive security protocols.[^54]

The classification framework creates an operational tension. The most capable commercial AI models — the ones that achieve 90.2 percent improvement through multi-agent orchestration — are restricted at the classified level. The models available on classified networks are constrained by the accreditation process, which moves slower than the technology it evaluates. The result: operators at the unclassified level have access to frontier capability, while operators handling the most sensitive decisions work with constrained tools.

### 12.4 Safety Risks for Military AI Agents

Four categories of risk demand specific mitigation strategies.

**Adversary manipulation of inputs.** Poisoned intelligence feeds could mislead AI agent teams into flawed assessments. An adversary who understands how an AI agent processes intelligence data can craft inputs designed to produce specific, incorrect outputs. This is not speculative. Adversarial attacks on machine learning systems are well-documented in academic literature and are an explicit area of Chinese military AI research.[^34]

**Information cascades.** When one agent reports an assessment, downstream agents may treat that assessment as confirmed intelligence rather than a single-source estimate. "Agent B confirmed" becomes "confirmed" — stripping the caveat, amplifying the confidence, and producing a collective assessment that no single source supports. The Google/MIT study quantified this: 17.2 times error amplification in decentralized multi-agent systems. In military terms, this is the intelligence echo chamber that produced some of the worst analytical failures in recent history.[^35]

**Overconfidence.** Commanders may over-rely on AI-generated recommendations, applying insufficient critical thinking to outputs that arrive with computational precision and apparent certainty. An AI assessment formatted as a decision matrix with probability estimates carries an authority that its underlying uncertainty does not justify.
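Both failure modes, the stripped caveat and the overconfident decision matrix, share a structural mitigation: require every inter-agent assessment to carry its source chain and confidence, so repetition can never manufacture certainty. A minimal sketch follows; the field names and the relay discount are assumptions, not any framework's API.

```python
# Provenance-carrying assessments: a sketch of caveat-preserving agent messages.
from dataclasses import dataclass

@dataclass(frozen=True)
class Assessment:
    claim: str
    source: str        # which agent or sensor originated this
    confidence: float  # 0.0-1.0, set by the originator, never raised by a relay

RELAY_DISCOUNT = 0.9   # assumed policy: second-hand reporting never gains confidence

def relay(a: Assessment, relaying_agent: str) -> Assessment:
    """A relay appends itself to the source chain and discounts confidence."""
    return Assessment(
        claim=a.claim,
        source=f"{a.source} -> {relaying_agent}",
        confidence=a.confidence * RELAY_DISCOUNT,
    )

raw = Assessment("Vehicle at grid 123456 is hostile", source="agent_b", confidence=0.6)
echoed = relay(relay(raw, "agent_c"), "agent_d")
# echoed.confidence is roughly 0.49: two relays take 0.6 down, not up, and the
# source chain shows exactly who is repeating whom.
```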
Neither risk is unique to AI — military professionals recognized the same pattern with satellite imagery, signals intelligence, and every other technical collection platform that produces precise-looking outputs from uncertain data.[^55]

**Decision speed outpacing oversight.** When AI compresses decision timelines from days to minutes, the time available for human review, legal counsel, and proportionality analysis compresses proportionally. The battle rhythm that enables thoughtful decision-making at human speed may not survive AI-enabled tempo. If the adversary operates at AI speed and friendly forces operate at human speed, the pressure to remove human checkpoints becomes operationally compelling and strategically catastrophic.

The RAND Corporation's analysis of AI's impact on military competition frames the strategic dimension. In "How Artificial Intelligence Could Reshape Four Essential Competitions," the authors identify four axes of competition: quantity versus quality, hiding versus finding, centralized versus decentralized command and control, and cyber offense versus defense. AI shifts the balance on every axis. AI-enabled uncrewed systems create "precise mass" and "affordable mass" — changing the quantity-quality calculus. AI-enhanced sensors shift the hiding-finding balance toward finding. AI enables both more centralized and more distributed C2. The conclusion: the U.S. military "might need to change important aspects of how it traditionally operates" to exploit AI's potential.[^56]

### 12.5 The Academic Foundation

The academic community has begun producing the analytical frameworks that doctrine development requires.

**Command-Agent** (ScienceDirect, 2025) presents a battlefield command simulation architecture integrating large language models with digital twin technology. The system constructs realistic operational environments through real-time simulation and multi-source data fusion, enabling autonomous command through the Observe-Orient-Decide-Act (OODA) feedback loop. The framework leverages LLMs' natural language capabilities to replace traditional command interfaces, enabling intelligent command through natural language interaction.[^36]

**Multi-Agent System for COA Comparison** (European Conference on Cyber Warfare and Security, 2025) combines the Analytic Hierarchy Process with fuzzy logic, enabling agents to collaboratively evaluate hierarchical decision criteria — the same function that MDMP Step 5 (COA Comparison) performs through a decision matrix. The experimental results demonstrate that AI-based COA evaluation adapts effectively to changing operational conditions.[^37]

**Agent-Based Model for Proportionality Assessment** (ECCWS, 2025) introduces a model designed to simulate proportionality assessment in military operations, capturing interactions between decision-makers, environmental variables, and operational factors. This addresses the most legally sensitive dimension of military AI — the assessment of whether anticipated civilian harm is excessive relative to the expected military advantage.[^38]

The **Belfer Center** (Harvard Kennedy School, 2024) defines agentic AI as "multiple autonomy-based technologies working synergistically" that can "perceive its environment and define a course of action on its own to achieve a given goal."
The authors — a 2024 National Security Fellow and a 2024 Air Force National Defense Fellow — argue that military leaders should accelerate experimentation and adoption of agentic AI tools into joint operational planning, working to mitigate risks as they arise rather than waiting for a perfect product.[^39]

### 12.6 The Risk Management Assessment

The risks are real. They are not unprecedented.

Every weapons system in military history has required risk management. Artillery required accuracy standards, safety zones, and fire control procedures. Close air support required terminal attack control, restricted operating zones, and abort criteria. Nuclear weapons required two-person integrity, permissive action links, and a chain of command that begins with the President. Each system's destructive potential demanded a governance framework proportional to its risk.

AI agents are no different. Their risk profile is distinct — information integrity rather than kinetic lethality — but the risk management methodology is identical. The Army's four-step process applies directly: identify hazards, assess probability and severity, develop controls, implement controls.[^40]

The question is not whether military AI agents pose risks. The question is whether the rate of deployment outpaces the rate of doctrinal development. SOCOM experiments with agentic AI in April 2026. Formal doctrine updates are expected in 2027-2028. The gap between capability deployment and doctrinal governance is where risk lives. The military calls this "doctrine by exception" — units in the field developing procedures that work, which are later codified into formal doctrine. It is a valid and historically successful approach. It is also dangerous, because the exceptions that do not work produce casualties — or in the AI context, strategic crises — before doctrine catches up.

---

## 13. CONCLUSION: THE DIGITAL BATTLE STAFF IS ALREADY HERE

The digital battle staff is not a future concept. It is being built right now, by programs with budgets, timelines, and operational requirements.

### The Programs Exist

Exia Labs' **Blue** automates the Military Decision Making Process with AI agents assigned to each MDMP step — from Receipt of Mission through COA generation. It is being tested by the 101st Airborne Division and the Washington Army National Guard.[^41] Palantir's **Maven Smart System** provides AI-powered intelligence fusion across five combatant commands, with contracts exceeding $1.3 billion through 2029.[^42] Scale AI's **Thunderforge** and **Donovan** platforms operate on classified networks, enabling natural language queries against battlefield data.[^43] Shield AI's **Hivemind** pilots aircraft autonomously without GPS, communications, or a remote pilot.[^44] Anduril's **Lattice OS** integrates sensors and weapons through an AI-powered C2 system with an open SDK.[^45] The Chief Digital and AI Officer's **GenAI.mil** has reached 1.1 million users across five service branches.[^46]

These are not demonstrations. They are deployed systems with military users operating under operational conditions.
### The Doctrine Is Forming

The Department of War's AI Strategy establishes seven Pace-Setting Projects and directs the military to become an "AI-first warfighting force."[^47] CSIS identifies three AI-enabled staff models — Networked, Relational, and Adaptive — and recommends the Adaptive model for its resilience and learning capacity.[^48] The Army's 49B career field creates the first dedicated AI/ML officer specialty, with graduate-level training in AI system development, deployment, and sustainment.[^49] The Command and General Staff College has integrated AI into MDMP wargaming, with plans to make AI-enabled wargaming the default method for all planning exercises starting academic year 2026-2027.[^50]

### Technology Outpaces Doctrine

SOCOM experiments with agentic AI at Avon Park in April 2026. Formal doctrine updates to FM 6-0 and ADP 6-0 are expected in 2027-2028. Claude has already been deployed in a combat operation before the doctrine governing its use exists. The Maduro operation is the proof point: capability deployed without shared doctrine between provider, integrator, and operator produces a post-operation crisis that threatens to reshape the entire military-AI industry relationship.

This is the pattern military professionals recognize. Capability precedes doctrine. Units in the field adapt. Some adaptations succeed. Some fail. The failures produce after-action reviews. The after-action reviews produce lessons learned. The lessons learned produce doctrine. The doctrine arrives two years after the units needed it.

AI compresses this timeline. The technology evolves monthly. Doctrine that takes two years to publish addresses a capability that has transformed three times since the writing began. The doctrinal development model that worked for generations of weapons systems may not survive the velocity of AI capability development.

### The Two-Sentence Summary

The U.S. military has two centuries of experience coordinating specialized elements under uncertainty with imperfect information — the exact problem AI agent systems face today. The question is not whether the digital battle staff will exist; it is whether we build it with doctrine or discover its requirements through failure.

### The Call to Action

**To military professionals:** Accelerate doctrine development. Do not wait for the 2027-2028 formal updates to FM 6-0. The AI agents are already deployed. The operators need doctrine now. Publish interim guidance. Run experiments. Conduct AARs. Feed the doctrinal pipeline with operational experience, because the doctrine that emerges from combat employment will be better than the doctrine that emerges from working groups.

**To the technology industry:** Study military C2 frameworks. They work. They scale. They survived contact with reality — which is more than can be said for most multi-agent architectures currently in production. The hierarchical orchestrator-worker pattern that every major AI company independently discovered is the military command-and-control model. The military has been refining that model for two centuries. Read FM 5-0. Read ADP 6-0. Read the MAST taxonomy alongside the AAR format. The frameworks are open-source, battle-tested, and free.

**To policymakers:** Resolve the Anthropic impasse. A frontier AI capability deployed on classified networks through a $200 million contract, now threatened with "supply chain risk" designation over a dispute about usage terms, is not a technology problem. It is a governance failure.
The adversary is not waiting for American institutions to settle their disagreements about AI ethics. Define the ROE. Negotiate the boundaries. Get the capability into the hands of the warfighter with agreed constraints, not unresolved ambiguity.

**To practitioners in every domain:** Apply MDMP to AI system design. Front-load the thinking. Spend 40 percent of your time on Mission Analysis before you write the first prompt. Build in the checkpoints. Conduct the AAR after every complex operation. Codify what you learn. The Off-Ramp Model described in this paper converts AI-dependent operations into autonomous systems that run without AI involvement — but only if you maintain the discipline during Phase 1.

### The Close

Scaling AI agents is herding cats. The cats are brilliant, tireless, and fast — but nobody told them where the barn is, and half of them are chasing mice that do not exist.

The military told them where the barn is two hundred years ago. It is time the AI industry read the manual.

---

## CITATIONS

[^1]: Thiebault, Paul. "Development of the French Staff System from Ancien Regime to the Revolution." The Napoleon Series, Military Organization. See also: "Napoleon's transition to a staff model," The War College Library, 2009.
[^2]: "Staff (military)," Wikipedia. "Most NATO nations, including the United States and most European nations, use the Continental Staff System which has origin in Napoleon's military."
[^3]: For Anthropic's multi-agent architecture: "How we built our multi-agent research system," Anthropic Engineering Blog, June 2025. For OpenAI Agents SDK: OpenAI documentation, March 2025. For Microsoft: "Agent Framework merges AutoGen with Semantic Kernel," October 2025. For Google: Agent Development Kit documentation. For Cursor and Gastown production results: see Paper 1 (Marshall, "The Super Intelligent Five-Year-Old," February 2026), Section 8.3.
[^4]: Kim et al., "Towards a Science of Scaling Agent Systems," Google Research / Google DeepMind / MIT, December 2025. arXiv:2512.08296v1. 180 controlled experiments across five architecture types and three model families.
[^5]: Cemri, Mert, Melissa Z. Pan, Shuyi Yang, et al., "Why Do Multi-Agent LLM Systems Fail?" UC Berkeley Sky Computing Lab, March 2025. arXiv:2503.13657. NeurIPS 2025 Datasets and Benchmarks Track, Spotlight designation. MAST-Data: 1,600+ annotated traces across 7 MAS frameworks.
[^6]: Jones, Nate B. AI News & Strategy Daily, January 2026 analysis. Jones is an AI-first product strategist and former Head of Product at Amazon Prime Video. See also: "Agent Wars: The Hype, Hope, and Hidden Risks with Nate B. Jones," AI Explained Podcast, Fiddler AI.
[^7]: Marshall, Jeep. Original formulation. See also Paper 1, Section 8.3.
[^8]: "Grande Armée," Wikipedia. Napoleon's invasion force for the Russian campaign of 1812 numbered approximately 600,000 soldiers from France and allied nations.
[^9]: "French Army Staff and Officers I," War History. See also: "The command and control of the Grand Armee," Master's thesis, U.S. Army Command and General Staff College.
[^10]: The corps system created "miniature armies" — each corps was composed ordinarily of three infantry divisions and a division of light cavalry with its own staff and administrative services, permitting "rapid, long-distance marches without clogging up roads and exhausting supplies." "Grande Armée," Wikipedia.
[^11]: "Staff (military)," Wikipedia. "The Continental Staff System which has origin in Napoleon's military.
Derived from the Prussian Grosse Generalstab (Great General Staff)." [^12]: Jensen, Benjamin and Matthew Strohmeyer. "Agentic Warfare and the Future of Military Operations: Rethinking the Napoleonic Staff." Center for Strategic and International Studies, Futures Lab, July 2025. Available at: https://www.csis.org/analysis/rethinking-napoleonic-staff [^13]: Ibid. The three models are derived from distinct social science theoretical frameworks: Bruno Latour's actor-network theory, Harrison White's network sociology, and Andrew Abbott's theory of professional knowledge creation. [^14]: Ibid. Networked Staff model description. "Smaller staff elements work through functional agents ingesting data including live updates, doctrine, history, and military theory." [^15]: Ibid. Relational Staff model description. Harrison White, *Identity and Control: How Social Formations Emerge* (Princeton University Press, 2008). [^16]: Ibid. Adaptive Staff model description. Andrew Abbott, *The System of Professions: An Essay on the Division of Expert Labor* (University of Chicago Press, 1988). [^17]: Jensen and Strohmeyer, "Rethinking the Napoleonic Staff": "AI now automates intelligence fusion, refines threat assessments, and recommends actions, compressing decision timelines from days to minutes." [^18]: Ibid. "China's strategy to disrupt U.S. decision networks through cyber, electronic, and long-range strikes makes traditional, centralized staffs vulnerable." [^19]: Jensen, Benjamin, Dan Tadross, and Matthew Strohmeyer. "Agentic Warfare Is Here. Will America Be the First Mover?" War on the Rocks, April 2025. https://warontherocks.com/2025/04/agentic-warfare-is-here-will-america-be-the-first-mover/ [^20]: Jensen and Strohmeyer, "Rethinking the Napoleonic Staff," recommendations section. [^21]: "Artificial Intelligence Strategy for the Department of War," Office of the Secretary of Defense, January 9, 2026. Available at: https://media.defense.gov/2026/Jan/12/2003855671/-1/-1/0/ARTIFICIAL-INTELLIGENCE-STRATEGY-FOR-THE-DEPARTMENT-OF-WAR.PDF [^22]: Ibid. Seven Pace-Setting Projects with "single accountable leader and aggressive timelines." [^23]: "Pentagon Seeks $13.4 bn for AI and Autonomy FY 2026 Budget Request," CDO Magazine, February 2026. See also: "$9.8 Billion in Autonomy Spending Hits the AI-Boosted Defense Supply Chain," GlobeNewsWire, February 13, 2026. [^24]: "DOD's $66B IT budget pivots to AI and efficiency," Government Executive / Washington Technology, February 2026. Navy AI spending increase: $308M, 22.7% year-over-year. [^25]: DoW AI Strategy, January 2026. "Incorporate standard 'any lawful use' language into any Department of War contract through which AI services are procured within 180 days." [^26]: Ibid. Section: "Clarifying 'Responsible AI' at the Department of War — Out with Utopian Idealism, In with Hard-Nosed Realism." [^27]: Hegseth, Pete. Remarks at the Pentagon, January 12, 2026. Reported in: "Grok is in, ethics are out in Pentagon's new AI-acceleration strategy," Defense One, January 2026; "'Accelerate like hell': Hegseth moves to reshape DOD's AI and tech hubs," DefenseScoop, January 13, 2026. [^28]: "5 out of 6 military branches have elevated GenAI.mil as their go-to enterprise AI platform," DefenseScoop, February 2, 2026. 1.1 million unique users confirmed. See also: "Pentagon's GenAI.mil Platform Hits 1.1 Million Military Users," AI Unfiltered. [^29]: "ChatGPT will be available to 3 million military users on GenAI.mil," Breaking Defense, February 2026. 
"Pentagon Partners With OpenAI to Add ChatGPT to GenAI.mil," ExecutiveGov. Contracts worth up to $200 million per company. [^30]: "Anthropic, Google and xAI win $200M each from Pentagon AI chief for 'agentic AI,'" Breaking Defense, July 2025. "Anthropic, Google, OpenAI and xAI granted up to $200 million for AI work from Defense Department," CNBC, July 14, 2025. [^31]: "Pentagon pushes AI companies to deploy unrestricted models on classified military networks," The Decoder, February 2026. "Pentagon CTO urges Anthropic to 'cross the Rubicon' on military AI use cases," DefenseScoop, February 19, 2026. [^32]: "Pentagon rolls out major reforms of R&D, AI," Breaking Defense, January 2026. "Pentagon Consolidates DIU, CDAO Under R&E to Streamline Innovation," GovCIO Media & Research. "'Accelerate like hell': Hegseth moves to reshape DOD's AI and tech hubs," DefenseScoop, January 13, 2026. [^33]: DoW AI Strategy, January 2026. Hegseth rejected "the legacy 'linear' model that moves from lab to program of record over many years." See also Jensen and Strohmeyer, "Rethinking the Napoleonic Staff," recommendation for "rapid learning cycles." [^34]: DefenseScoop, "New Pentagon Report on China's Military Notes Beijing's Progress on LLMs," December 26, 2025. Report: Department of Defense, "Military and Security Developments Involving the People's Republic of China, 2025," Annual Report to Congress, December 2025. Quote: "China's commercial and academic AI sectors made progress on large language models and LLM-based reasoning models, which has narrowed the performance gap." [^35]: "Helsing unveils Lura and SG-1 Fathom —autonomous mass to surveil and defend the depths," Helsing.ai, May 2025. Lura detects signatures 10x quieter, 40x faster than human operators. SG-1 Fathom: 90-day patrol endurance, 1.95m length, 60kg. See also: The Defense Post, May 14, 2025. [^36]: Anthropic. "Donating the Model Context Protocol and Establishing the Agentic AI Foundation." December 9, 2025. https://www.anthropic.com/news/donating-the-model-context-protocol-and-establishing-of-the-agentic-ai-foundation. Adoption metrics: 97M+ monthly SDK downloads, 10,000+ active servers. See also: Pento, "A Year of MCP: 2025 Review," https://www.pento.ai/blog/a-year-of-mcp-2025-review. [^37]: Linux Foundation. "Linux Foundation Launches the Agent2Agent Protocol Project." June 23, 2025. Founding members: AWS, Cisco, Google, Microsoft, Salesforce, SAP, ServiceNow. Google Developers Blog. "Announcing the Agent2Agent Protocol (A2A)." April 9, 2025. https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/. Specification: https://a2a-protocol.org/latest/specification/. [^38]: Reuters, via CNBC, "DeepSeek Aids China's Military and Evaded Export Controls, US Official Says," June 24, 2025. State Department official: "DeepSeek has willingly provided and will likely continue to provide support to China's military and intelligence operations." More than 150 procurement record references cited. [^39]: Farnell, R. and Coffey, K. "AI's New Frontier in War Planning: How AI Agents Can Revolutionize Military Decision-Making." Belfer Center for Science and International Affairs, Harvard Kennedy School. October 11, 2024. https://www.belfercenter.org/research-analysis/ais-new-frontier-war-planning-how-ai-agents-can-revolutionize-military-decision. Definition: agentic AI as "multiple autonomy-based technologies working synergistically." [^40]: Ibid. 
FPRI analysis: "the Department of Defense is treating AI adoption as an operational race in which the decisive variable is diffusion — how quickly a promising capability moves from experiment to trusted, fielded use." China's 81 LLM government projects in H1 2024 cited in FPRI and Pentagon annual report. [^41]: Axios, "Pentagon Threatens to Label Anthropic's AI a 'Supply Chain Risk,'" February 16, 2026. Anthropic stated "eight of the ten largest U.S. companies" use Claude. The "enormous pain in the ass" quote from a senior Pentagon official per Axios. The $200 million contract value reported in CNBC, Axios, and Breaking Defense. [^42]: Gartner, Inc. "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027." Press Release, June 25, 2025. https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027. Cursor. "Scaling Agents." cursor.com/blog/scaling-agents. October 2025. Cemri et al. (see [^40]). [^43]: Amodei, Dario. "The Adolescence of Technology." darioamodei.com. January 2026. The essay identifies four AI-enabled tools that could entrench autocratic power, and draws an "absolute line" against two of them: mass surveillance and autonomous weapons. [^44]: Ibid. Amodei on autonomous weapons: "too small a number of 'fingers on the button,' such that one or a handful of people could essentially" operate lethal force without human cooperation. The constitutional protections argument is the author's framing of Amodei's stated concern that autonomous weapons eliminate the requirement for human participation in lethal decisions. [^45]: Small Wars Journal. "AI-Enabled Wargaming at the U.S. Army Command and General Staff College: Its Implications for PME and Operational Planning." January 16, 2026. https://smallwarsjournal.com/2026/01/16/ai-enabled-wargaming-cgsc/. Finding: simplified, intent-focused prompts produced "far more realistic adjudications" than detailed prompts. [^46]: NBC News, "Tensions Between the Pentagon and AI Giant Anthropic Reach a Boiling Point," February 2026. Anthropic spokesperson: "We cannot comment on whether Claude, or any other AI model, was used for any specific operation, classified or otherwise." Anthropic's willingness to negotiate terms reported in CNBC, "Anthropic Is Clashing with the Pentagon over AI Use," February 18, 2026. [^47]: Department of War. "Artificial Intelligence Strategy for the Department of War." January 12, 2026. Seven Pace-Setting Projects. $8.2 billion allocated across PSPs in FY2026. 180-day mandate for "any lawful use" contract provisions. [^48]: U.S. Army, "Army Establishes New AI, Machine Learning Career Path for Officers," army.mil, December 2025. DefenseScoop, "Army Creates AI Career Field, Pathway for Officers to Join," December 30, 2025. The 49B AOC was formally established October 31, 2025. VTIP window: January 5 through February 6, 2026. Reclassification by end of FY2026. [^49]: Jensen, B. and Strohmeyer, M. "Rethinking the Napoleonic Staff: Agentic Warfare and the Future of Military Operations." CSIS Futures Lab, July 2025. https://www.csis.org/analysis/rethinking-napoleonic-staff. The report identifies decision timeline compression from days to minutes as a primary consequence of AI integration. [^50]: Saab, "Saab Achieves AI Milestone with Gripen E," saab.com, June 2025. Helsing, "Helsing AI Agent Successfully Completes Saab Gripen E Test Flight," helsing.ai, June 2025. First flights May 28, 2025. 
BVR combat scenarios confirmed. "50 years of human pilot experience" equivalent cited in Defence News, "Saab, Helsing Let Gripen Fighter Fly with AI in Charge," June 11, 2025. Helsing $12 billion valuation from Tech Funding News, 2025. [^51]: DefenseScoop. "Pentagon CTO urges Anthropic to 'cross the Rubicon' on military AI use cases." February 19, 2026. https://defensescoop.com/2026/02/19/pentagon-anthropic-dispute-military-ai-hegseth-emil-michael/. Axios. "Pentagon threatens to label Anthropic's AI a 'supply chain risk.'" February 16, 2026. Breaking Defense. "Pentagon CTO says 'not democratic.'" February 2026. [^52]: SHAPE: Supreme Headquarters Allied Powers Europe. The observation that agent communication standards are being written by the private sector rather than by military alliance headquarters is the author's analysis. MCP (Anthropic, donated to Linux Foundation December 2025) and A2A (Google, donated to Linux Foundation 2025) are both governed by the Agentic AI Foundation under the Linux Foundation — not by any military or government body. [^53]: Palantir Technologies investor announcement. "Anthropic and Palantir Partner to Bring Claude AI Models to AWS for U.S. Government Intelligence and Defense Operations." 2024. https://investors.palantir.com/news-details/2024/Anthropic-and-Palantir-Partner-to-Bring-Claude-AI-Models-to-AWS-for-U.S.-Government-Intelligence-and-Defense-Operations/. Maven contract: DefenseScoop, "DOD raises Maven to $1B+." 2025. [^54]: Johns Hopkins Applied Physics Laboratory, "GenWar Sim," JHU APL, 2025. See also: "Johns Hopkins APL Establishes AI Wargaming Lab to Boost Strategic National Security Analysis and Planning," October 30, 2025. [^55]: NPR, "What We Know About the Military Operation to Capture Maduro," January 3, 2026. Fox News, "US Military Details Timeline of Operation, Revealing More Than 150 Aircraft Involved," January 4, 2026. Washington Post, "Maduro Raid Killed About 75 in Venezuela, U.S. Officials Assess," January 6, 2026. Cuban casualties: Cuba confirmed 32 of its soldiers killed. Task and Purpose, "Delta Force Soldiers Carried Out Raid to Capture Maduro," January 4, 2026. Air and Space Forces Magazine, "US Airpower Paved Way for Special Ops to Capture Venezuela's Maduro," January 5, 2026. The 160th SOAR timeline: ingress at 0201 local, back over water by approximately 0420 local. [^56]: Burdette, Z., Phillips, D., Heim, J.L., Geist, E., Frelinger, D.R., Heitzenrater, C., and Mueller, K.P. "How Artificial Intelligence Could Reshape Four Essential Competitions in Future Warfare." RAND Corporation, RRA4316-1. 2026. https://www.rand.org/pubs/research_reports/RRA4316-1.html. "Precise mass" and "affordable mass" terminology from report findings. [^57]: The "AI graduation" concept draws from the Army's Security Force Assistance Brigade (SFAB) model, where advisor teams build partner force capacity and then redeploy to the next mission. The objective is institutional self-sufficiency, not permanent advisory presence. See FM 3-22, *Army Support to Security Cooperation* (2013). [^58]: ADRP 5-0, *The Operations Process* (2012). Mission Analysis step requirements including higher headquarters' intent analysis. "Thoroughly analyze the higher headquarters' plan or order to determine how their unit — by task and purpose — contributes to the mission, commander's intent, and concept of operations of the higher headquarters." [^59]: See Section 8 and associated citations. 
[^60]: Kim, Y., et al., "Towards a Science of Scaling Agent Systems." See Section 8, footnote 1. Error amplification: 17.2x for decentralized architectures, 4.4x for centralized.

[^61]: Cemri, M., et al., "Why Do Multi-Agent LLM Systems Fail?" See Section 8, footnote 4. FM-1.4: Loss of Conversation History — one of five Specification & System Design failure modes accounting for 37% of all MAS failures.

[^62]: Risk management framework adapted from AR 385-10, *The Army Safety Program* (2023), and ATP 5-19, *Risk Management* (2014). The four-step process — identify hazards, assess hazards, develop controls, implement controls — applies to AI operations with minimal adaptation. The principle: risk management is continuous throughout the operation, not a one-time planning activity.

[^63]: The battle rhythm concept draws from FM 6-0, *Commander and Staff Organization and Operations* (2022): "The battle rhythm is a deliberate daily cycle of command, staff, and unit activities intended to synchronize current and future operations." The adaptation to AI operations preserves the principle while replacing human shift cycles with computational cycles.

[^64]: The S3 (Operations Officer/Section) model reflects standard Army staff organization at battalion and brigade level. The S3 is responsible for the unit's training and operations — the staff section that "runs the fight." The metaphor applies directly to AI system architecture: the application (S3) runs the operational cycle while the AI (the commander's analytical advisor) provides judgment on exceptions.

[^65]: All vault metrics sourced from live extraction, PARA vault, February 21, 2026. Git index contamination incidents: 4 documented (sessions alpha, cobra, condor, manta), resolved through atomic commit patterns, staged-file pre-checks, and explicit-path commit protocols. See also the 161 documented lessons learned with root causes in `2-RESOURCES/Lessons-Learned/`.

[^66]: The "70% solution" is attributed to General George S. Patton: "A good plan, violently executed now, is better than a perfect plan next week." The 1/3-2/3 rule is codified in FM 5-0, *Planning and Orders Production* (2022): commanders use no more than one-third of available time for their own planning, leaving two-thirds for subordinates.

[^67]: The sigma cost escalation is a core Lean Six Sigma principle. Moving from 3σ (66,807 DPMO) to 4σ (6,210 DPMO) requires approximately 10x the investment of moving from 2σ to 3σ. The "last 20%" codification problem mirrors this exactly: the investment to codify edge cases exceeds the value of eliminating AI from those decisions. See Pyzdek, Thomas, *The Six Sigma Handbook*, 4th ed. (2014).

[^68]: The compounding principle echoes Warren Buffett's rule for protecting compound growth: "Rule No. 1: Never lose money. Rule No. 2: Never forget Rule No. 1." In the Off-Ramp context: every disciplined AI session compounds institutional knowledge; every undisciplined session wastes tokens without compounding anything. The gap between disciplined and undisciplined practitioners widens with every operational cycle.
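For readers who want to check the arithmetic in note 67: the DPMO figures follow directly from the standard normal tail probability under Six Sigma's conventional 1.5σ long-term shift. The sketch below is illustrative only; the `dpmo` helper is this sketch's own naming, not a function from any cited source.

```python
# Minimal sketch of the sigma-to-DPMO arithmetic in note 67.
# Illustrative only: `dpmo` is a hypothetical helper, not from any cited source.
from statistics import NormalDist

def dpmo(sigma_level: float, shift: float = 1.5) -> float:
    """Defects per million opportunities at a given short-term sigma level,
    using Six Sigma's conventional 1.5-sigma long-term shift."""
    # One-sided tail probability beyond (sigma_level - shift), scaled to 1e6.
    return (1 - NormalDist().cdf(sigma_level - shift)) * 1_000_000

for level in (2, 3, 4):
    print(f"{level} sigma -> {dpmo(level):,.0f} DPMO")
# Prints ~308,538 / 66,807 / 6,210 DPMO, matching the figures cited in note 67.
```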
## APPENDIX A: GLOSSARY OF MILITARY AND AI TERMS

| Term | Definition |
|------|------------|
| **A2A** | Agent-to-Agent Protocol. Open standard created by Google and donated to the Linux Foundation, enabling AI agents to discover, communicate with, and delegate to peer agents across organizational boundaries. |
| **AAR** | After Action Review. Structured post-operation assessment using four questions: What did we plan? What happened? Why? What will we do differently? Codified in TC 25-20. |
| **ADP** | Army Doctrine Publication. The Army's highest-level doctrinal documents, providing foundational guidance for operations, planning, and leadership. |
| **AOC 49B** | Area of Concentration 49B. The Army's AI/ML officer career field, established October 2025. First VTIP window: January-February 2026. |
| **BVR** | Beyond Visual Range. Air combat engagement at distances beyond the pilot's visual acquisition capability, typically using radar-guided missiles. Relevant to AI-piloted autonomous combat aircraft. |
| **CCA** | Collaborative Combat Aircraft. Uncrewed aircraft designed to operate alongside manned fighters, controlled by AI autonomy software such as Shield AI's Hivemind. |
| **CDAO** | Chief Digital and Artificial Intelligence Officer. Pentagon office responsible for accelerating DoD adoption of data, analytics, and AI capabilities. Oversees the seven Pace-Setting Projects. |
| **COA** | Course of Action. A potential plan or approach to accomplish a mission. MDMP generates, analyzes, compares, and selects among multiple COAs. |
| **COP** | Common Operating Picture. A single identical display of relevant information shared by multiple commands and agencies, facilitating collaborative planning and situational awareness. |
| **CSIS** | Center for Strategic and International Studies. Washington-based policy research organization. Published "Rethinking the Napoleonic Staff" (2025) on AI-enabled military command structures. |
| **DIANA** | Defence Innovation Accelerator for the North Atlantic. NATO program accelerating innovation across 32 member nations through 16 accelerator sites and 200+ test centers. |
| **DMAIC** | Define, Measure, Analyze, Improve, Control. The Lean Six Sigma improvement methodology that structures continuous process improvement. |
| **DoW** | Department of War. The renamed U.S. Department of Defense under the current administration's directive. |
| **FRAGO** | Fragmentary Order. An abbreviated order issued to change or modify an existing OPORD. Provides timely changes without rewriting the complete order. |
| **GenAI.mil** | The Department of War's enterprise generative AI platform, providing secure LLM access to military personnel. 1.1 million unique users across five service branches as of February 2026. |
| **IHL** | International Humanitarian Law. The body of international law regulating the conduct of armed conflict, including the Geneva Conventions and their Additional Protocols. |
| **IL-5** | Impact Level 5. DoD information classification level for Controlled Unclassified Information (CUI) and mission-critical data. GenAI.mil operates at IL-5. |
| **IL-6** | Impact Level 6. DoD classification level for SECRET-level national security data. Requires accredited infrastructure for deployment. Palantir's Maven platform operates at IL-6. |
| **JAIC** | Joint Artificial Intelligence Center. Pentagon organization (predecessor to CDAO) responsible for accelerating DoD AI adoption. |
| **JSC** | Joint Service Committee on Military Justice. DoD organization responsible for reviewing and proposing changes to the Manual for Courts-Martial and Uniform Code of Military Justice. Provides LOAC compliance guidance relevant to military AI employment. |
| **JWICS** | Joint Worldwide Intelligence Communications System. The Top Secret/SCI-level network used by the intelligence community and DoD for highly classified communications. |
| **LOAC** | Law of Armed Conflict. Legal framework governing the conduct of hostilities, including the principles of distinction, proportionality, military necessity, and humanity. |
| **MCP** | Model Context Protocol. Open standard created by Anthropic and donated to the Linux Foundation, standardizing how AI agents connect to tools, data sources, and external context. 97M+ monthly SDK downloads. |
| **MDMP** | Military Decision Making Process. Seven-step planning methodology: Receipt of Mission, Mission Analysis, COA Development, COA Analysis (War Gaming), COA Comparison, COA Approval, Orders Production. Codified in FM 5-0. |
| **METT-TC(IT)** | Mission, Enemy, Terrain and weather, Troops and support available, Time available, Civil considerations (Information Technology). The framework for analyzing operational variables; IT added as a seventh variable for digital operations. |
| **OODA** | Observe, Orient, Decide, Act. Decision cycle developed by Colonel John Boyd (USAF). Describes the continuous loop of information processing and decision-making in competitive environments. |
| **OPORD** | Operations Order. Five-paragraph format: Situation, Mission, Execution, Sustainment, Command and Signal. The standardized communication format for directing military operations (see the sketch following this table). |
| **PSP** | Pace-Setting Project. One of seven priority AI programs under the Department of War's AI Acceleration Strategy, directed to demonstrate capabilities by July 2026. |
| **ROE** | Rules of Engagement. Directives issued by competent military authority delineating circumstances and limitations under which forces will initiate or continue combat engagement. |
| **SIPRNet** | Secret Internet Protocol Router Network. The SECRET-level network used by DoD and the intelligence community for classified communications below Top Secret. |
| **SOCOM** | Special Operations Command. Unified combatant command responsible for special operations forces across all services. Hosting agentic AI experimentation in April 2026. |
| **SOF** | Special Operations Forces. Military forces organized, trained, and equipped to conduct special operations — unconventional warfare, direct action, special reconnaissance, and related missions. |
| **SWaP-C** | Size, Weight, Power, and Cost. Engineering constraints for military systems, particularly relevant to AI systems deployed at the tactical edge where computational resources are limited. |
| **TE 26-2** | Technical Experimentation 26-2. SOCOM's agentic AI demonstration event, April 13-17, 2026, at Avon Park Air Force Range, Florida. |
| **TITAN** | Tactical Intelligence Targeting Access Node. Palantir's $178 million next-generation data fusion and deep-sensing platform for the U.S. Army. |
| **UAS** | Unmanned Aerial System. The complete system, including the unmanned aircraft, ground control station, and communications links. |
| **UCAV** | Unmanned Combat Aerial Vehicle. An uncrewed aircraft designed for combat operations, including strike missions. Examples: Anduril Fury (YFQ-44A), Helsing CA-1 Europa. |
| **VTIP** | Voluntary Transfer Incentive Program. The Army's process enabling officers to transfer between career fields, used for the initial 49B selection window. |
| **WARNO** | Warning Order. A preliminary notice of an action or order that follows, enabling subordinate units to begin preparation before the detailed OPORD is issued. |
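The OPORD entry above is the paper's clearest bridge from doctrine to agent design: a standardized tasking message with fixed paragraphs. A minimal sketch of that mapping follows; the `OpOrd` class, its field names, and the sample values are a hypothetical illustration of the five-paragraph structure, not a schema from any cited framework.

```python
# Hypothetical sketch: the five-paragraph OPORD as a typed task-order message
# for delegating work to an agent. Illustrative only; not from any cited source.
from dataclasses import dataclass, field

@dataclass
class OpOrd:
    situation: str         # paragraph 1: the context the worker agent needs, nothing more
    mission: str           # paragraph 2: one sentence, task and purpose
    execution: list[str]   # paragraph 3: subtasks with explicit left and right limits
    sustainment: dict = field(default_factory=dict)  # paragraph 4: budgets (tokens, time, tools)
    command_and_signal: str = "report to orchestrator on completion or failure"  # paragraph 5

# Example task order an orchestrator might hand to a single worker agent.
order = OpOrd(
    situation="Repo X, branch main, failing test suite in module Y.",
    mission="Restore the test suite to green without changing public APIs.",
    execution=["reproduce the failures", "patch module Y only", "rerun the full suite"],
    sustainment={"token_budget": 50_000, "wall_clock_minutes": 30},
)
```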
---

## APPENDIX B: ABOUT THE AUTHOR

LTC Jeep Marshall, US Army (Retired), served 26 years in Airborne Infantry and Special Operations. His career spanned assignments across the operational force, culminating in seven years training brigade-level staffs through simulation-driven exercises. In that role, the planning frameworks described in this paper — MDMP, METT-TC, synchronization matrices, battle rhythms, and after-action reviews — were not theoretical constructs. They were tested under operational pressure daily, in exercises where the friction of multi-echelon coordination, imperfect intelligence, and time-constrained decision-making replicated the conditions that doctrine is designed to manage.

He holds a Lean Six Sigma Black Belt certification and currently builds AI-integrated workflows using Claude Code, applying military planning doctrine and LSS process governance to multi-agent AI system design. His personal knowledge management system — an Obsidian vault organized under the PARA methodology — has generated over 1,600 commits across 39 AI sessions in 32 days, serving as an accidental laboratory for the multi-agent coordination problems this paper examines.

He is the author of "The Super Intelligent Five-Year-Old: Why AI Needs Military Doctrine and Lean Six Sigma — Not the Other Way Around" (2026), which applies MDMP, DMAIC, and QASAS-model quality assurance to AI agent system design.

---

## Series Navigation

| | |
|---|---|
| **This paper** | Paper 2 of 7 |
| **Previous** | [[Paper-1-The-Super-Intelligent-Five-Year-Old\|← Paper 1: The Super Intelligent Five-Year-Old]] |
| **Next** | [[Paper-3-The-PARA-Experiment\|Paper 3: The PARA Experiment →]] |
| **Case Study** | [[Case-Study-Session-Close-Automation\|Case Study 1: Session Close Automation]] |
| **Home** | [[Home\|← Series Home]] |

## Related

- [[Index - Herding-Cats-in-the-AI-Age]] — parent folder
- [[Paper-2-Sections-10-12]] — sibling
- [[Paper-2-Sections-4-6]] — sibling